Keep only the text between certain tags (<a ...> and </a>)

steve1040
Posts: 39
Joined: Fri Oct 13, 2006 2:19 am

Post by steve1040 »

How do I keep only the text that is between <a ...> and </a>?

The only constant part of the website address is: href="http://www.website.com/xxxxx/wrwrwr/

The only text I want to keep in the following example is:
Current Gen finder (CG)

<b><a
href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current
Gen finder (CG)</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
steve1040
Posts: 39
Joined: Fri Oct 13, 2006 2:19 am

Post by steve1040 »

^bump
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

This is not something to which TextPad is ideally suited. In particular, its regular expression engine is incapable of matching text containing an arbitrary number of newlines.

But you can do it with WildEdit (http://www.textpad.com/products/wildedit/), which uses a far more powerful regular expression engine than TextPad. Try something like
Find what: .*?<a.+?>(.*?)</a>.*?(?=<a|\Z)
Replace with: $1\n

[X] Regular expression
[X] Replacement format

Options
[ ] '.' does not match a newline character [i.e., not selected]
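
For readers without WildEdit, the same search and replace can be sketched in Python. This is only an illustration of the expression above, not WildEdit's own behaviour: it assumes that Python's re.DOTALL flag corresponds to leaving the "'.' does not match a newline character" option unselected, and the function name, the sample URL ending (page.html) and the shortened trailing text are made up for the example.

```python
import re

# Same expression as in the post above. re.DOTALL lets '.' match newlines,
# which is assumed here to correspond to leaving the WildEdit option
# "'.' does not match a newline character" unselected.
ANCHOR_RE = re.compile(r".*?<a.+?>(.*?)</a>.*?(?=<a|\Z)", re.DOTALL)


def keep_anchor_text(html: str) -> str:
    """Replace each '<a ...>text</a> plus trailing junk' chunk with 'text' and a newline."""
    return ANCHOR_RE.sub(r"\1\n", html)


# Hypothetical sample modelled on the original post; the real URL is elided
# there, so 'page.html' is an invented placeholder.
sample = (
    "<b><a\n"
    'href="http://www.website.com/xxxxx/wrwrwr/page.html">Current\n'
    "Gen finder (CG)</a></b> - A bunch of text here.<o:p></o:p></p>\n"
)

print(keep_anchor_text(sample), end="")
```

Note that the captured text keeps any newline that was inside the <a> element, so "Current" and "Gen finder (CG)" still come out on separate lines here. The expression is also not a real HTML parser: "<a" would match the start of other tag names beginning with "a" as well.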
nitinmukesh123
Posts: 5
Joined: Fri May 19, 2006 6:50 am

Post by nitinmukesh123 »

<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current Gen finder (CG)</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current Gen finder (CG)1</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
Find what: .*<a.*>(.*)</a>.*\n
Replace with: \1\n

Result:
Current Gen finder (CG)
Current Gen finder (CG)1
I'm a newbie at regex, so I'm not sure it will work for all the conditions you might have.

TextPad v4.7.3
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

That doesn't work if the <a> elements straddle newlines, as in the original poster's example, or if there is more than one <a> element on a line.
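
Both failure cases can be reproduced with a small Python sketch. It assumes that Python's default behaviour, where '.' does not match a newline, is a fair stand-in for TextPad's line-based matching; the function name and the sample strings are invented for the illustration.

```python
import re

# The line-based expression from the earlier post. Without re.DOTALL,
# '.' stops at a newline, so every match is confined to a single line.
LINE_RE = re.compile(r".*<a.*>(.*)</a>.*\n")


def line_based(html: str) -> str:
    """Apply the single-line replacement (capture group 1 plus a newline)."""
    return LINE_RE.sub(r"\1\n", html)


# Case 1: the <a> element straddles newlines (hypothetical URL ending).
straddling = (
    "<b><a\n"
    'href="http://www.website.com/xxxxx/wrwrwr/page.html">Current\n'
    "Gen finder (CG)</a></b> - A bunch of text here.\n"
)

# Case 2: two <a> elements on the same line.
two_on_a_line = '<a href="x">One</a> junk <a href="y">Two</a> tail\n'

print(line_based(straddling) == straddling)  # nothing matches, text is unchanged
print(repr(line_based(two_on_a_line)))       # only the last link text survives
```

In the first case no single line contains both "<a" and a later ">", so nothing is replaced. In the second case the greedy leading ".*" runs to the last "<a" on the line, so only "Two" is kept and "One" is lost.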
textpad-fan
Posts: 25
Joined: Sat Apr 21, 2007 5:41 pm

Post by textpad-fan »

Detagger can do that and much more.