
Regex help: look ahead

Posted: Wed Jun 23, 2004 10:59 am
by mo
Here's one I am guessing we can do with this look-ahead thing, which I would like to learn if someone will show an example. This is a case where I could have used it. I did a search and replace on every other occurrence of "&amp;", replacing it with "&".

<tr>
<td class="dbdr">&amp;#7749;</td>
<td class="dbdr">¼</td>
<td class="dbdr">lc n underdot</td>
<td class="dbdr">1E47</td>
<td class="dbdr">&amp;#7749;</td>
<td class="dbdr">.n</td>
</tr>
<tr>
<td class="dbdr">&amp;#209;</td>
<td class="dbdr">½</td>
<td class="dbdr">Cap N tilde</td>
<td class="dbdr">00D1</td>
<td class="dbdr">&amp;#209;</td>
<td class="dbdr">~N</td>
</tr>
<tr>
<td class="dbdr">&amp;#241;</td>
<td class="dbdr">¾</td>
<td class="dbdr">lc n tilde</td>
<td class="dbdr">00F1</td>
<td class="dbdr">&amp;#241;</td>
<td class="dbdr">~n</td>
</tr>

Posted: Wed Jun 23, 2004 11:39 am
by ben_josephs
Here's a starting point:

Code:

&amp;(?=(#[0-9]+;).*\1)
with the option "'.' does not match a newline character" not selected.

This will fail if there's more than one section for the same character entity, and it will be rather slow on large input. It's better to specify the context more narrowly, so that a newline-inclusive .* doesn't repeatedly try to run to the end of the text.
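
For anyone who wants to try this outside the editor, here is a rough sketch of the same expression in Python. It is only an illustration: the trimmed sample rows and the names ROW_209, ROW_241 and unescape_first are made up for this example, and re.DOTALL is assumed to play the role of leaving the "'.' does not match a newline character" option unselected.

```python
import re

# Trimmed-down rows in the same layout as the table in the first post
# (hypothetical sample data, shortened to three cells per row).
ROW_209 = (
    '<tr>\n'
    '<td class="dbdr">&amp;#209;</td>\n'
    '<td class="dbdr">00D1</td>\n'
    '<td class="dbdr">&amp;#209;</td>\n'
    '</tr>\n'
)
ROW_241 = (
    '<tr>\n'
    '<td class="dbdr">&amp;#241;</td>\n'
    '<td class="dbdr">00F1</td>\n'
    '<td class="dbdr">&amp;#241;</td>\n'
    '</tr>\n'
)

# re.DOTALL lets '.' cross line breaks, so the look-ahead can find
# the second copy of the entity further down in the same row.
LOOKAHEAD = re.compile(r'&amp;(?=(#[0-9]+;).*\1)', re.DOTALL)


def unescape_first(text):
    """Replace '&amp;' with '&' wherever the same entity recurs later."""
    return LOOKAHEAD.sub('&', text)


# One section per entity: only the first cell of each row changes.
result = unescape_first(ROW_209 + ROW_241)

# Two sections for the same entity: every occurrence except the very
# last one has a later copy, so too many get replaced.
repeated = unescape_first(ROW_209 + ROW_209)
```

The second call shows the failure case mentioned above: with two sections for the same entity, three of the four occurrences are replaced instead of two.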

Posted: Wed Jun 23, 2004 1:30 pm
by mo
OK I think I see what it is saying now. Thanks for the example relating to my situation.

Posted: Wed Jun 23, 2004 1:58 pm
by ben_josephs
mo wrote: OK I think I see what it is saying now.
It's saying:

Code:

match:

  &amp;           "&amp;" literally
  (?=             if it's followed by
    (             (beginning of marked bit)
      #[0-9]+;    a hash, a digit sequence, a semicolon
    )             (end of marked bit)
    .*            anything (expensive)
    \1            a copy of whatever matched the marked bit
  )
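
The breakdown above can also be written as a commented pattern in languages that support a free-spacing mode. A minimal sketch in Python, assuming re.VERBOSE and re.DOTALL (the names PATTERN, text and match are made up for this example). Note that in this mode a bare # starts a comment, so the hash has to be escaped:

```python
import re

PATTERN = re.compile(
    r"""
    &amp;            # "&amp;" literally
    (?=              # if it's followed by
      (              #   (beginning of marked bit)
        \#[0-9]+;    #   a hash (escaped in verbose mode), digits, a semicolon
      )              #   (end of marked bit)
      .*             #   anything (expensive)
      \1             #   a copy of whatever matched the marked bit
    )
    """,
    re.VERBOSE | re.DOTALL,
)

text = "&amp;#241; ~n &amp;#241;"

# The look-ahead consumes nothing: the match itself is just "&amp;".
match = PATTERN.search(text)

# Only the first "&amp;" has a later copy of "#241;" after it.
replaced = PATTERN.sub("&", text)
```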

Posted: Wed Jun 23, 2004 3:48 pm
by MudGuard
In this case, lookahead seems unnecessary:

Find
<tr>\n<td class="dbdr">&amp;

Replace
<tr>\n<td class="dbdr">\&

It should be possible to do this in TextPad.

If you want to replace the second occurrence in a row, use

Find
&amp;(#[0-9]+);</td>\n<td class="dbdr">(.*)</td>\n</tr>

Replace by
\&\1;</td>\n<td class="dbdr">\2</td>\n</tr>
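
As a rough cross-check of both variants, here is a sketch in Python rather than TextPad syntax. The function names and the trimmed sample row are made up for this example. In Python's replacement strings & has no special meaning, so the \& escape is not needed, and the unchanged context is put back with a group reference instead of being retyped.

```python
import re

# One trimmed row in the layout from the first post (hypothetical sample).
ROW = (
    '<tr>\n'
    '<td class="dbdr">&amp;#241;</td>\n'
    '<td class="dbdr">00F1</td>\n'
    '<td class="dbdr">&amp;#241;</td>\n'
    '<td class="dbdr">~n</td>\n'
    '</tr>\n'
)


def unescape_first_cell(text):
    """Anchor on the row start: only the first cell's '&amp;' is replaced."""
    return re.sub(r'(<tr>\n<td class="dbdr">)&amp;', r'\1&', text)


def unescape_second_to_last_cell(text):
    """Anchor on the row end: only the next-to-last cell's '&amp;' is replaced."""
    return re.sub(
        r'&amp;(#[0-9]+;</td>\n<td class="dbdr">.*</td>\n</tr>)',
        r'&\1',
        text,
    )


first = unescape_first_cell(ROW)
second = unescape_second_to_last_cell(ROW)
```

Because '.' does not match a newline here, the .* stays within a single cell, which is what keeps the second pattern from running past the end of the row.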



Btw, why do you set a class for each td?
Why not set it for the whole table (or, if there are other rows in the table, for the tr)?

Then, instead of
td.dbdr
you could use

table.dbdr td
or
tr.dbdr td
as CSS selector.

Posted: Wed Jun 23, 2004 9:28 pm
by mo
MudGuard,

I see the method you are suggesting; nevertheless I'm glad I asked about look-ahead; I'm sure this will be useful.
MudGuard wrote: Btw, why do you set a class for each td?
Why not set it for the whole table (or, if there are other rows in the table, for the tr)?
Truth is I've been learning this whole thing (if not my entire life) just sufficiently to get done what I want to get done. CSS was a major leap for me and I was happy to get anything that did what I wanted and passed the validators.

I'm still not sure if I could do what I want your way though. I want a border around each cell. It was my understanding that this should be done at the <td> level.

Posted: Wed Jun 23, 2004 9:53 pm
by MudGuard
The selectors I gave do select td elements...