Replace all HTML tags with a comma

General questions about using TextPad

rrhandle
Posts: 11
Joined: Thu Mar 23, 2006 4:03 pm

Post by rrhandle »

I need to replace the tags here with commas so I can turn it into a .csv file. I know I will end up with multiple commas, but that is OK. I can do another replace and change ,, to ,
rrhandle
Posts: 11
Joined: Thu Mar 23, 2006 4:03 pm

Found it

Post by rrhandle »

<.*?>
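
For readers who want to try the two-step replace outside TextPad, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for TextPad's regex engine (the two may differ in details), and the sample row and the function name `tags_to_commas` are made up for illustration.

```python
import re

# The pattern from the post above: "<", then as few characters as possible, then ">".
TAG = re.compile(r"<.*?>")


def tags_to_commas(html):
    """Replace every tag with a comma, then collapse runs of commas."""
    with_commas = TAG.sub(",", html)
    # Second replace, as described in the question: ",," (or longer runs) -> ","
    return re.sub(r",{2,}", ",", with_commas)


# Hypothetical input, not taken from the original post.
row = "<tr><td>Alice</td><td>30</td></tr>"
print(tags_to_commas(row))  # ,Alice,30,
```

Note that a leading and a trailing comma remain; whether that matters depends on what reads the .csv file.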
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

<span title="-->">

is legal in HTML. Your regex would only match up to the first >, i.e. <span title="-->, and would leave the trailing "> behind.

If you can guarantee that there is no > in an attribute value, then your regex works. But it does not work for HTML in general.
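
The problem can be reproduced with a short check. This is only a sketch using Python's `re` module in place of TextPad's engine; the variable names are illustrative.

```python
import re

NAIVE_TAG = re.compile(r"<.*?>")

sample = '<span title="-->">'
match = NAIVE_TAG.match(sample)

matched = match.group(0)          # what the regex takes to be the tag
leftover = sample[match.end():]   # what remains after the match

print(repr(matched))   # '<span title="-->'
print(repr(leftover))  # '">'
```

The lazy `.*?` stops at the first `>` it sees, even though that `>` sits inside a quoted attribute value.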
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

<([^"]|"[^"]*")+?>
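
As a quick check of this pattern against the earlier counterexample, here is a sketch in Python (again assuming `re` behaves like TextPad's engine for this pattern, which is an assumption rather than something tested in TextPad):

```python
import re

# Either a single non-quote character, or a complete "..." string, repeated.
QUOTE_AWARE_TAG = re.compile(r'<([^"]|"[^"]*")+?>')

sample = '<span title="-->">'
match = QUOTE_AWARE_TAG.match(sample)

# The whole tag is matched, because the quoted value is consumed as one unit.
print(match.group(0) == sample)  # True
```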
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

<img title="-->" alt='-->'>

or, to make it more tricky,

<img title="'-->" alt='"-->'>

;-)


If you really want to cover all variants, the regex must be quite complicated ...

Up to HTML 4 (i.e. as long as HTML was based on SGML), even <p/bla/ is legal HTML, considered the same as <p>bla</p> ...
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

<([^"']|"[^"]*"|'[^']*')+?>

Leaving out the trailing > is just being unreasonable!
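
To see the difference between the two patterns on the single-quote examples, here is a small comparison sketch. It uses Python's `re` as an assumed stand-in for TextPad's engine; the names `DOUBLE_ONLY`, `BOTH_QUOTES` and `matched_text` are illustrative.

```python
import re

DOUBLE_ONLY = re.compile(r'<([^"]|"[^"]*")+?>')
BOTH_QUOTES = re.compile(r"""<([^"']|"[^"]*"|'[^']*')+?>""")

samples = [
    """<img title="-->" alt='-->'>""",
    """<img title="'-->" alt='"-->'>""",
]


def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = pattern.match(text)
    return m.group(0) if m else None


for s in samples:
    print(repr(matched_text(DOUBLE_ONLY, s)), repr(matched_text(BOTH_QUOTES, s)))
```

In this sketch the double-quote-only pattern stops early on the first sample and finds nothing on the second, while the pattern that handles both quote styles matches each tag in full.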
pbaumann
Posts: 28
Joined: Fri Jun 27, 2014 9:43 am

Have you tried the example for macros?

Post by pbaumann »

You could use a variation of the first example for macros. The basic version selects the entire tag. The example explains how to use the DEL key to delete an HTML tag, and that you can repeat this until all HTML tags are gone. So in addition to pressing DEL, you type a comma.

If the macro is defined properly, it will run until all the replacements have been made.
pbaumann
AmigoJack
Posts: 533
Joined: Sun Oct 30, 2016 4:28 pm
Location: グリーン ヒル ゾーン

Post by AmigoJack »

I can top that: an HTML element (or "tag") can span multiple lines, and under certain conditions attribute values don't have to be enclosed in quotation marks - this is perfectly legal:

<a
 href="one"
 title=two
 style='color: yellow'
>
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

The regular expression I suggested above handles that.
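
A sketch of that claim in Python follows. The assumption here is that a negated character class such as `[^"']` also matches line breaks, which holds for Python's `re`; whether TextPad behaves the same way may depend on its settings. The trailing "link text</a>" is made up for illustration.

```python
import re

TAG = re.compile(r"""<([^"']|"[^"]*"|'[^']*')+?>""")

# The multi-line tag from the post above, with one unquoted attribute value.
multiline_tag = """<a
 href="one"
 title=two
 style='color: yellow'
>"""

match = TAG.match(multiline_tag)
print(match.group(0) == multiline_tag)  # True

# Replacing tags with commas, as in the original question.
print(TAG.sub(",", multiline_tag + "link text</a>"))  # ,link text,
```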