Can someone explain this

General questions about using TextPad

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

louiscar
Posts: 3
Joined: Wed Jan 06, 2010 6:47 pm

Post by louiscar »

Hi,

I discovered something odd here. The following expression matches multiple closing square brackets, and I can't seem to stop this by putting a ? after the '\]'.

<!\[CDATA\[(.+)\]>

So for instance this will find:

<![CDATA[Eastborne Marina]>
<![CDATA[Eastborne Marina]]>
<![CDATA[Eastborne Marina]]]]]>

The regex setting is POSIX, btw.

I'm not sure I understand why this is happening, as it wouldn't with other characters.

Can someone explain why \] appears to be a special case?
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

It isn't a special case, and \] doesn't match more than one bracket.

The repetition operators are greedy: they match as much as possible (while not preventing their containing regular expressions from matching). So the subexpression .+ matches as many characters as possible, so long as a ]> still follows. That is, the .+ in your regex matches everything between the [ after CDATA and the last ], including any earlier ] characters, and the \] matches only that last one.

Try
<!\[CDATA\[([^]]+)\]>
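
To see the difference outside TextPad, here is a minimal sketch in Python. It assumes Python's re module, which is not the POSIX engine TextPad uses, but which treats greedy .+ and the bracket expression [^]] the same way for these patterns. The helper name first_group is made up for illustration.

```python
import re

# The three sample lines from the question.
lines = [
    "<![CDATA[Eastborne Marina]>",
    "<![CDATA[Eastborne Marina]]>",
    "<![CDATA[Eastborne Marina]]]]]>",
]

# Original pattern: greedy .+ is free to swallow ] characters.
greedy = re.compile(r"<!\[CDATA\[(.+)\]>")

# Suggested pattern: [^]]+ matches one or more characters that are not ].
# A ] placed first inside a bracket expression is taken literally.
negated = re.compile(r"<!\[CDATA\[([^]]+)\]>")


def first_group(pattern, text):
    """Return what group 1 captured, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


greedy_groups = [first_group(greedy, s) for s in lines]
negated_groups = [first_group(negated, s) for s in lines]

for s, g, n in zip(lines, greedy_groups, negated_groups):
    print(f"{s!r}: greedy captured {g!r}, negated captured {n!r}")
```

With the greedy pattern, the extra brackets end up inside the captured group; with the negated class, only the line with a single closing bracket matches at all.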