Newline characters in brackets
Posted: Thu Nov 20, 2008 9:26 am
Hi,
I'm trying to clean up a huge XML file (150 MB, 30,000 root elements, 200,000+ lines) by throwing out all tags I don't need.
My strategy was to match every root element using regular expressions, but I can't get this to work because:
1. I can't include newline characters inside brackets;
2. I can't seem to replace every newline with an empty string;
I have replaced every newline character with an empty string using UltraEdit, but then I can't open the file in TextPad: "Line too long". When I finally removed the newlines between subroot elements, but not between root elements, the file opened fine, but then TextPad starts crashing on me every time I use regular expressions...
My questions are:
- Why can't I use newline characters inside brackets with regular expressions?
- Is there any workaround for this? (Other than moving the newlines outside the brackets, a solution I found on this forum that is of no use to me.)
Thanks in advance!
Jo3p