Anyone have any leads on creating a macro to strip specific HTML tags? I've got a library of HTML code, and I need to strip out all SPAN and DIV tags. Most are formatted like this:
<span class=body>text is here</span>
<div class="body">text is here</div>
Any help would be much appreciated.
Strip HTML tag
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
Use Search | Replace...
Just the tags?
Find what: </?(span|div)[^>]*>
Replace with: [Nothing]
[X] Regular expression
Or whole elements?
Find what: <span[^>]*>[^<]*</span>|<div[^>]*>[^<]*</div>
Replace with: [Nothing]
[X] Regular expression
These will only work if the entire tags and elements are on one line.
This assumes you are using POSIX regular expression syntax:
Configuration | Preferences | Editor
[X] Use POSIX regular expression syntax
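If you would rather run this over the whole library outside the editor, the same two patterns can be tried in a script. The following is only a rough sketch in Python, not anything TextPad provides: the function names strip_tags and strip_elements are made up for illustration, and the case-insensitive flag is an assumption added so that SPAN and DIV in upper case are caught as well.

```python
import re

# Same pattern as "Just the tags?": any opening or closing span/div tag.
# re.IGNORECASE is an assumption, added so <SPAN> and <DIV> also match.
TAG_RE = re.compile(r"</?(span|div)[^>]*>", re.IGNORECASE)

# Same pattern as "Or whole elements?": a span/div element whose content
# contains no other tag (the [^<]* part stops at the first "<").
ELEMENT_RE = re.compile(
    r"<span[^>]*>[^<]*</span>|<div[^>]*>[^<]*</div>",
    re.IGNORECASE,
)


def strip_tags(html):
    """Remove span/div tags but keep the text between them (hypothetical helper)."""
    return TAG_RE.sub("", html)


def strip_elements(html):
    """Remove span/div elements together with their text (hypothetical helper)."""
    return ELEMENT_RE.sub("", html)


sample = '<span class=body>text is here</span>\n<div class="body">text is here</div>'
print(strip_tags(sample))
print(repr(strip_elements(sample)))
```

One difference from the editor: in Python the negated classes [^>] and [^<] also match line breaks, so a tag or a simple element split across lines is still matched. Nested elements are still not handled by the second pattern.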
-
- Posts: 10
- Joined: Tue May 17, 2005 1:53 pm
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
It's kind of you to say so! I'm glad my posts are useful.
The bible for these things is
Friedl, Jeffrey E F
Mastering Regular Expressions, 2nd ed
O'Reilly, 2002
ISBN: 0596002890
http://regex.info/
In this book you will find that there is much that can be done with modern extended regular expression recognisers that can't be done with the rather weak recogniser that TextPad uses. WildEdit (http://www.textpad.com/products/wildedit/) uses a far more powerful one (Boost).