How do I keep only the text that is between <a ...> and </a>?
The only constant part of the website address is: href="http://www.website.com/xxxxx/wrwrwr/
The only text I want to keep in the following example is:
Current Gen finder (CG)
<b><a
href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current
Gen finder (CG)</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
Keep only text that is between certain tag <a </a>
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
This is not something to which TextPad is ideally suited. In particular, its regular expression engine is incapable of matching text containing an arbitrary number of newlines.
But you can do it with WildEdit (http://www.textpad.com/products/wildedit/), which uses a far more powerful regular expression engine than TextPad. Try something like
Find what: .*?<a.+?>(.*?)</a>.*?(?=<a|\Z)
Replace with: $1\n
[X] Regular expression
[X] Replacement format
Options
[ ] '.' does not match a newline character [i.e., not selected]
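For anyone without WildEdit, the same search and replace can be sketched in Python. This is only an illustration under assumptions: Python's `re` module is not the engine WildEdit uses, the function name `keep_link_text` is made up for this example, `$1` becomes `\1` in Python's replacement syntax, and `re.DOTALL` stands in for leaving the "'.' does not match a newline character" option unselected. The sample text is the one from the question, with the shortened URL left exactly as posted.

```python
import re

# Same pattern as in the "Find what" box above.
# re.DOTALL lets '.' match newlines, so a tag split across lines still matches.
LINK_TEXT = re.compile(r'.*?<a.+?>(.*?)</a>.*?(?=<a|\Z)', re.DOTALL)


def keep_link_text(html: str) -> str:
    """Replace everything with the text of each <a ...>...</a>, one per match."""
    return LINK_TEXT.sub(r'\1\n', html)


# Sample from the question; the URL is abbreviated in the original post.
sample = (
    '<b><a\n'
    'href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current\n'
    'Gen finder (CG)</a></b> - A bunch of text here.<o:p></o:p></p>\n'
)

if __name__ == "__main__":
    print(keep_link_text(sample))
```

Note that the captured text keeps any line break that was inside the link text, so "Current" and "Gen finder (CG)" still come out on two lines for this sample.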
-
- Posts: 5
- Joined: Fri May 19, 2006 6:50 am
Find what: .*<a.*>(.*)</a>.*\n
Replace with: \1\n
Test text:
<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current Gen finder (CG)</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current Gen finder (CG)1</a></b> - A bunch of text here weder erer ere find skjewkrje wdlwerje that you are looking for.<o:p></o:p></p>
Result:
Current Gen finder (CG)
Current Gen finder (CG)1
Newbie at regex stuff, so not sure it will work for all conditions you might have.
TextPad v4.7.3
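A rough way to try this line-by-line expression outside TextPad is the Python sketch below. It is an approximation only: Python's `re` is not TextPad's engine, the helper name `keep_link_text_per_line` is invented for the example, and the shortened URL is kept as posted. Without `re.DOTALL`, '.' stops at a newline, which mimics a pattern that only works within a single line.

```python
import re

# Line-oriented pattern from the post: '.' does not cross newlines here.
PER_LINE = re.compile(r'.*<a.*>(.*)</a>.*\n')


def keep_link_text_per_line(text: str) -> str:
    """Reduce each line that holds a complete <a ...>...</a> to its link text."""
    return PER_LINE.sub(r'\1\n', text)


# Two single-line entries, shaped like the test text in the post.
single_line = (
    '<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">'
    'Current Gen finder (CG)</a></b> - A bunch of text here.<o:p></o:p></p>\n'
    '<b><a href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">'
    'Current Gen finder (CG)1</a></b> - A bunch of text here.<o:p></o:p></p>\n'
)

# The layout from the original question, where the tag spans several lines.
multi_line = (
    '<b><a\n'
    'href="http://www.website.com/xxxxx/wrwrwr/Pla ... l">Current\n'
    'Gen finder (CG)</a></b> - A bunch of text here.<o:p></o:p></p>\n'
)

if __name__ == "__main__":
    print(keep_link_text_per_line(single_line))
    print(keep_link_text_per_line(multi_line) == multi_line)
```

In this sketch the two single-line entries reduce to their link text, while the multi-line layout from the question is left unchanged, because no single line contains both the opening `<a ...>` and the closing `</a>`.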