Regular Expression Find and Replace on HTML tags

Posted: Wed Jun 01, 2005 6:41 pm
by burtonfigg
I have a table with "news" in it. Some of the content contains links to other sites in the form of HTML markup. I have extracted the data into a text file, and I want to tidy it up so that it validates as XHTML Strict.

So, if I had this for example:

Here is some news about <a href=http://www.google.com>google</a> which I read today

I would want to do a Find and Replace in TextPad to convert it to:

Here is some news about <a href="http://www.google.com">google</a> which I read today

That's all it is - to put the URL in speech marks. I could do half of it with a simple find and replace:

find: <a href=http and replace with: <a href="http

But then the complicated bit is to close the speech marks.

I was hoping that this is something that could be done using a regular expression in the Find and Replace section, but I can't work out how to do it.

I tried something like

find: <a href=[*]> and replace with <a href="[*]">

But obviously this is way too simplistic, and wrong.

If anyone can advise how I can do this, it would be much appreciated. I have read guides and tutorials on regular expressions but find them totally incomprehensible!

Thanks

Jim

Posted: Wed Jun 01, 2005 8:58 pm
by s_reynisson
Find (<a href=)([^>]+)(>[^>]+>)
Replace \1"\2"\3
That should get you started. Please note that I have used POSIX regular expression syntax, which can be selected on the Editor page of the Preferences dialog box. HTH
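
For anyone who would rather script this than run it in TextPad, here is a minimal Python sketch of the same find and replace. The POSTED pattern and the replacement string are the ones given above. The function name quote_hrefs and the STRICT variant are my own assumptions and not part of the original answer. STRICT is only there because the posted pattern would wrap an already quoted URL in a second pair of quotes if the file were processed twice.

```python
import re

# Pattern exactly as posted above; group 2 captures the unquoted URL.
POSTED = re.compile(r'(<a href=)([^>]+)(>[^>]+>)')

# Hypothetical stricter variant (an assumption, not from the thread):
# the value may not contain quotes or whitespace, so links that are
# already quoted are left untouched.
STRICT = re.compile(r'(<a href=)([^"\'>\s]+)(>)')


def quote_hrefs(text, pattern=POSTED):
    """Wrap unquoted href values in double quotes."""
    # \1 = '<a href=', \2 = the URL, \3 = the rest of the match
    return pattern.sub(r'\1"\2"\3', text)


if __name__ == "__main__":
    line = ("Here is some news about "
            "<a href=http://www.google.com>google</a> which I read today")
    print(quote_hrefs(line))
    # Running the strict variant on its own output changes nothing.
    print(quote_hrefs(quote_hrefs(line, STRICT), STRICT))
```

Both patterns only look at tags written exactly as <a href=...>, which matches the example in the question; tags with extra attributes or different spacing would need a different expression.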

Thanks!

Posted: Thu Jun 02, 2005 5:57 am
by burtonfigg
Thanks very much - that worked without a problem! Excellent...

Jim