Large Memory usage for seemingly simple replace
Posted: Mon Mar 06, 2006 6:23 pm
I was trying to work with a google sitemap for a site:
https://regionalhelpwanted.com/feeds/site_data.txt
It's all one long line, so I wanted to replace "</url>" with "\n". The file is about 5 megs of data, but when I run that replace it balloons to a VM footprint of about 1.7 gigs, and with the disk thrashing that follows it still takes me about a minute to terminate the process.
Suggestion: a more intelligent regex replace on large data sets, or maybe some special-case optimization for cases like my example. I don't really need a regex for that example, but I don't think I can represent newlines otherwise.