Need help with Find/Replace Regular Expressions
Posted: Fri Apr 27, 2001 3:08 am
Hi, I hope someone here can help me. I am attempting to save some time by creating a global find/replace procedure that will automatically add "title" attributes to every HTML hyperlink tag in a page/site. Obviously, each link is different and I want to take advantage of the tagged expressions to "remember" the specific URLs, etc.
The link structure presently looks like this:
<a href="table.html" target="_top" class="main"><b>Application Information</b></a>
I want it to look like this:
<a href="table.html" target="_top" class="main" title="Link to Application Information page"><b>Application Information</b></a>
... where 'title="Link to XXXX page"' is the part I want generated automatically from the link text.
I have worked out the following find/replace expressions:
FIND: <a href="\(.*\)">\(.*\)</a>
REPLACE: <a href="\1" title="Link to \2 page">\2</a>
So far it works, but it also copies any formatting tags, such as boldface (<b>) or (<font face=XXX>), that occur between the <a> and </a> tags into the title attribute. This is my first problem: how to strip the extra tags from the "\2" expression I use in the title attribute.
My second major problem is that the wildcard ".*" always finds the largest matching expression on a line, so if 2 or more links occur on the same line, it finds and highlights the whole batch of links and treats it as one matching object!
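To make the greedy-match problem concrete outside the editor, here is a minimal Python sketch. It is an illustration only: Python's `re` module writes groups as `(...)` rather than `\(...\)`, the sample line is made up, and nothing here assumes the editor's regex engine supports the same constructs. It compares the original greedy pattern with a bounded one that uses `[^>]*` (cannot run past the end of the opening tag) and `.*?` (stops at the first `</a>`).

```python
import re

# Hypothetical sample: two links on the same line.
line = ('<a href="a.html"><b>One</b></a> '
        '<a href="b.html"><b>Two</b></a>')

# Greedy: each ".*" grabs as much of the line as it can,
# so both links are swallowed into a single match.
greedy = re.findall(r'<a href="(.*)">(.*)</a>', line)

# Bounded: "[^>]*" stays inside one opening tag, and ".*?"
# (non-greedy) ends at the first closing </a> it meets.
bounded = re.findall(r'<a ([^>]*)>(.*?)</a>', line)

print(len(greedy))   # number of matches with the greedy pattern
print(len(bounded))  # number of matches with the bounded pattern
```

The bounded pattern treats each link as its own match, which is the behaviour wanted for the replace.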
One of my more rigorous examples is:
<td width="124" align="left" valign="middle"><font face="Arial" size="1">
<a href="http://root/directory/Main_Homepage.htm" target="_top" class="red_no"><strong>Home</strong></a></font></td></tr><tr><td width="8" align="left"><font face="Arial" size="1">• </font></td><td width="132" align="left" colspan="2"><font face="Arial" size="1"><a href="Applications.html" target="_top" class="black_no"><b>Applications</b></a></font></td>
(Yes, I know the code is messy, but someone else developed it in FrontPage and it's not my job to fix it all.) Can anyone help with the find/replace part of my problem? Are there any handy macros already out there?
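For reference, the desired result can be sketched as a small Python script rather than an editor find/replace. This is a rough sketch under assumptions, not a ready-made macro: the names `add_title` and `add_titles` are hypothetical, the sample is a shortened version of the example above, and regular expressions are not a full HTML parser (nested or malformed links, or link text containing quotes, would need more care).

```python
import re

# One opening <a ...> tag, its contents, and the first closing </a>.
# "[^>]*" keeps the attributes inside a single tag; ".*?" is non-greedy.
LINK = re.compile(r'<a\s+([^>]*)>(.*?)</a>', re.IGNORECASE | re.DOTALL)
# Any tag at all, used to strip <b>, <strong>, <font ...> from the link text.
TAG = re.compile(r'<[^>]+>')

def add_title(match):
    attrs, inner = match.group(1), match.group(2)
    # Leave a link unchanged if it already has a title attribute.
    if re.search(r'\btitle\s*=', attrs, re.IGNORECASE):
        return match.group(0)
    # Plain link text for the title; the visible markup is kept as-is.
    # Note: text containing a double quote is not escaped here.
    text = TAG.sub('', inner).strip()
    return f'<a {attrs} title="Link to {text} page">{inner}</a>'

def add_titles(html):
    return LINK.sub(add_title, html)

sample = (
    '<a href="http://root/directory/Main_Homepage.htm" target="_top" '
    'class="red_no"><strong>Home</strong></a></font></td>'
    '<td><a href="Applications.html" target="_top" '
    'class="black_no"><b>Applications</b></a></font></td>'
)
result = add_titles(sample)
print(result)
```

Running the function a second time on its own output changes nothing, because links that already have a title are skipped.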
Thanks in advance,
Ben
