
Large Memory usage for seemingly simple replace

Posted: Mon Mar 06, 2006 6:23 pm
by msiekkinen
I was trying to work with a google sitemap for a site:
https://regionalhelpwanted.com/feeds/site_data.txt

It's all one line, so I wanted to replace "</url>" with "\n". The file is about 5 MB of data, but when I run that replace it eats up a VM footprint of about 1.7 GB, and with the disk thrashing that follows it still takes me about a minute to terminate the process.

Suggestion: a more intelligent regex replace on large data sets, or maybe some special-case optimization for things like my example. I don't really need a regex for that example, but I don't think I can represent newlines otherwise.
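For anyone who needs to get the job done outside the editor in the meantime, a literal replace like this can be streamed in fixed-size chunks so that memory use stays small no matter how long the line is. The following is only a rough sketch in Python, not anything Textpad does internally; the function name replace_stream and the file names in the usage note are made up for illustration.

```python
import io


def replace_stream(src, dst, old, new, chunk_size=64 * 1024):
    """Copy text from src to dst, replacing each literal `old` with `new`.

    Only about chunk_size + len(old) characters are held in memory at once.
    """
    if not old:
        raise ValueError("old must be a non-empty string")
    keep = len(old) - 1  # longest tail that could be the start of a split match
    buf = ""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        parts = (buf + chunk).split(old)
        tail = parts.pop()  # text after the last complete match
        for part in parts:
            dst.write(part)
            dst.write(new)
        # Hold back the last few characters in case a match straddles
        # the boundary between this chunk and the next one.
        cut = max(0, len(tail) - keep)
        dst.write(tail[:cut])
        buf = tail[cut:]
    dst.write(buf)  # leftover is shorter than `old`, so it cannot match


# Small in-memory demonstration with a deliberately tiny chunk size.
demo_out = io.StringIO()
replace_stream(io.StringIO("<url>a</url><url>b</url>"), demo_out, "</url>", "\n", chunk_size=4)
```

With real files this would be called on two open file objects, for example a hypothetical site_data.txt opened for reading and site_data_lines.txt opened for writing.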

Confirmation and Work Around of Issue

Posted: Mon May 22, 2006 6:53 pm
by JPMcCrory-WalMartISD
msiekkinen and Textpad Support:

:!: The Issue :!:
I have experienced this issue continually ever since I moved from Textpad v4.3.2 to 4.7.2.

This issue does not seem to be tied to the replace process itself eating up memory, but rather to the redraw that occurs during the replace. This is evidenced by the fact that when Windows XP finally reaches a critically low memory level, it sometimes steps in and ends the replace process to prevent it from completely filling the memory. When this occurs Textpad finally redraws correctly, but has only completed a partial replace.

This tends to happen on files > 200KB in size with no new lines (CRLF) in the file (that is, one long line of text). The character being replaced usually occurs every 20-100 characters, separating segments of varying length. With EDI data, for example, the ~ (tilde) terminates each segment the way a period terminates a sentence.
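To reproduce the conditions described above without sharing real data, a test file can be generated with a few lines of Python. This is only a sketch: the function name make_single_line_data, the filler character, and the output file name are all made up, and the segments are dummy text rather than real EDI.

```python
import random


def make_single_line_data(min_size=200 * 1024, seed=0):
    """Build one long line of '~'-terminated segments, 20-100 characters each."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    pieces = []
    total = 0
    while total <= min_size:
        segment = "X" * rng.randint(20, 100) + "~"
        pieces.append(segment)
        total += len(segment)
    return "".join(pieces)  # no CR or LF anywhere in the result


if __name__ == "__main__":
    # Hypothetical output name; newline="" keeps Python from adding line endings.
    with open("single_line_test.txt", "w", newline="") as f:
        f.write(make_single_line_data())
```

Opening the resulting file and replacing ~ with a newline should exercise the same case: a file over 200KB consisting of a single line.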

I infer that this is a redraw issue from the fact that it is memory, not CPU, that grows, and not suddenly: there is a small spike when it loads the screen or file into memory, and then a gradual increase in memory usage until Windows memory management steps in, ends the replace, and does a good redraw. (Please note that replaces are CPU intensive and not memory intensive by nature.) The memory that is used is not freed until Textpad is closed. The memory does not show as used in Task Manager until you add the VM Size column to the Processes tab. It is also interesting that the number of page faults increases dramatically while this is occurring, with a PF Delta of between 1000 and 3000.

:idea: Now for the work around at least until this gets fixed :idea:
The work around is simple: just go to the end of the line of text and press Enter. With just one new line (CRLF) inserted, the system does not seem to have an issue.

Yes, it's that simple. The key thing is that it shouldn't be required in the first place.

If an example file or any other information is needed to assist with research, please contact me.

This issue has been escalated to the development services area of Wal-Mart, which purchased the enterprise license of Textpad for its development teams.