Reg Ex Find and Replace help

Posted: Fri Sep 29, 2006 9:26 pm
by burtonfigg
Hi,

I have a big, long web page with lots of web addresses in it, e.g.:

<a href="http://www.google.co.uk/search?q=mysql+ ... -8">Google for MySQL Tutorials</a>

Would it be possible to do a find and replace, using a reg ex which finds all ampersands contained within the hyperlinks in any <a href> tags, and replaces them with "&amp;"?

Thanks

Jim

Posted: Fri Sep 29, 2006 11:03 pm
by ben_josephs
Tricky. I don't believe you can do that in TextPad in a single step, or even in a single repeated step.

But you can do it in WildEdit:
Find what: &(?!amp;)(?=[^<]*>)
Replace with: &amp;

[X] Regular expression
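
For readers without WildEdit, the same pattern can be tried in any engine that supports lookahead. The following is a minimal sketch using Python's re module, not WildEdit itself; the function name and the sample line are made up for illustration. Note that the pattern looks for an ampersand inside any tag (followed by ">" with no "<" in between), not only inside <a href> tags, and it assumes there is no stray ">" in the text between tags.

```python
import re

# The pattern from the post above: an ampersand that is not already the
# start of "&amp;" and that is followed by ">" with no "<" in between,
# i.e. an ampersand that sits inside a tag.
AMP_IN_TAG = re.compile(r"&(?!amp;)(?=[^<]*>)")


def escape_ampersands_in_tags(html: str) -> str:
    """Replace bare ampersands inside tags with "&amp;" (hypothetical helper)."""
    return AMP_IN_TAG.sub("&amp;", html)


# Hypothetical sample line, not taken from the original page.
sample = '<a href="http://example.com/?a=1&b=2&amp;c=3">A & B</a>'
print(escape_ampersands_in_tags(sample))
```

The bare ampersand in the URL is escaped, the existing "&amp;" is left alone because of the (?!amp;) check, and the ampersand in the link text is not touched.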

Posted: Tue Oct 03, 2006 9:40 am
by Boris
How about this multi-step solution:
(as ever, use POSIX regular expression syntax)

Search : (<a href.*)\x26(.*>)
Replace with: \1###amp;\2

[X] Regular expression

Repeatedly using "Replace All" will replace every ampersand (hex 26) within an <a href > line with "###amp;".

When no more matches are found, then:

Search : ###
Replace with: &

[ ] Regular expression

Hope this helped.
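
The two steps above can be emulated outside TextPad to see what they do to a given line. This is only a sketch in Python's re module, assuming its greedy ".*" behaves like TextPad's on a single line; the function name and sample line are hypothetical. It also assumes the text does not already contain "###".

```python
import re

# The search pattern from the post above; "\x26" is "&" written as a hex escape.
STEP_ONE = re.compile(r"(<a href.*)\x26(.*>)")


def multi_step_replace(text: str) -> str:
    """Emulate the two-step procedure (hypothetical helper)."""
    # Step 1: repeat "Replace All" until no more matches are found.
    # Each pass replaces only the last matching ampersand on a line,
    # because the first ".*" is greedy.
    while True:
        replaced = STEP_ONE.sub(r"\1###amp;\2", text)
        if replaced == text:
            break
        text = replaced
    # Step 2: plain (non-regex) replacement of the placeholder.
    return text.replace("###", "&")


# Hypothetical sample line, not taken from the original page.
sample = '<a href="http://example.com/?a=1&b=2">A & B</a>'
print(multi_step_replace(sample))
```

On this sample the ampersand in the link text is converted as well as the one in the URL, since the pattern runs from "<a href" to the last ">" on the line.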

Posted: Tue Oct 03, 2006 11:42 am
by ben_josephs
(<a href.*)\x26(.*>) or, more simply, (<a href.*)&(.*>) does not match just <a> tags containing ampersands. It matches everything from an <a href to the last > on the same line, provided there's an ampersand in between. So it will replace ampersands within the bodies of single-line <a> elements and between <a> elements on the same line as well as ampersands in <a> tags. This is not what the original poster asked for.

You might try changing .* into [^<>]*
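
As a rough check of that suggestion, here is a sketch in Python's re module (not TextPad) of the repeated replace with .* changed into [^<>]*. The function name and sample line are invented for illustration, and the "###" placeholder is assumed not to occur in the text already.

```python
import re

# Same idea as the multi-step search, but neither part may cross "<" or ">",
# so a match has to stay inside a single <a href ...> tag.
INSIDE_A_TAG = re.compile(r"(<a href[^<>]*)&([^<>]*>)")


def escape_in_a_tags(text: str) -> str:
    """Repeat the replace until nothing changes, then restore "&" (hypothetical helper)."""
    while True:
        replaced = INSIDE_A_TAG.sub(r"\1###amp;\2", text)
        if replaced == text:
            break
        text = replaced
    return text.replace("###", "&")


# Hypothetical sample line, not taken from the original page.
sample = '<a href="http://example.com/?a=1&b=2&c=3">A & B</a> & more'
print(escape_in_a_tags(sample))
```

Only the ampersands inside the tag are changed here; the ones in the link text and after the element are left alone. Unlike the WildEdit pattern, this version has no check for an existing "&amp;", so an ampersand that is already escaped would come out as "&amp;amp;".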