Changing HTML tag case?

Eric Vitiello

Changing HTML tag case?

Post by Eric Vitiello »

I've been trying to create a regular expression to force all HTML tags to lower case, but have been wholly unsuccessful. Given the following HTML tag:

<TAGNAME PARAM="value" PARAM=value NOVALPARAM>

All capital text should be forced to lower case using the regex, and all lowercase text (the values) should be left untouched.

Anyone have a solution?
Randall McDougall

Re: Changing HTML tag case?

Post by Randall McDougall »

I don't think you can with a single Regexp, since the number of possible "values" to be left untouched is indeterminate ... (assuming you mean Uppercase in the values stays Upper -- only Keywords change) ... you could do it as three separate iterations, however (maybe two if you're careful ...). One regexp to change all the Tagnames -- fairly simple:
<\(/?[^ >]+\)
replaced as:
<\L\1
and then a second pass repeated until failure to change the Keywords ... combining those that have parms (key=val) with the standalones (nkey) is possible but makes the regexp more complex than "off the top of my head" ... as does setting it so that it fails when done ... I'll do it up for you later today if you'd like ...
Jeff Epstein

Re: Changing HTML tag case?

Post by Jeff Epstein »

This specific question is addressed in my TextPad Regular Expression FAQ:

http://www.jeffyjeffy.com/textpad/docum ... l_tag_text

Please let me know if it's not clear, or there's any other information you need.


Jeff
http://www.jeffyjeffy.com/textpad
Andreas

Re: Changing HTML tag case?

Post by Andreas »

Jeff, sorry to disappoint you but the given solution is wrong!
The given solution would transform
<IMG SRC="Pictures/MyPicture.jpg">
to
<img src="pictures/mypicture.jpg">
which is a problem if you try to upload the page to a Unix-based web server (file names are case-sensitive under Unix).
There could be other cases as well where the case of an attribute value is important!
The given solution also fails if an attribute value contains a > character!

I think there is no easy solution to this.
I think I remember TidyHTML has this as a feature...
Andreas
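
For illustration, here is a minimal Python sketch of the failure mode described above. It assumes a naive rule that lowercases everything between a < and the next >. That rule is only a stand-in for the behaviour described in this post, not the FAQ's actual expression, and the function name is made up for the example.

```python
import re

def lowercase_whole_tag(html):
    """Naive rule (an assumption for this example): lowercase every
    character from a '<' up to the next '>'."""
    return re.sub(r"<[^>]*>", lambda m: m.group(0).lower(), html)

# The file name inside the attribute value loses its case.
print(lowercase_whole_tag('<IMG SRC="Pictures/MyPicture.jpg">'))

# A '>' inside an attribute value ends the match too early, so the
# rest of the tag is left untouched.
print(lowercase_whole_tag('<IMG ALT="a > B" SRC="X.jpg">'))
```

Both problems come from treating the tag as one undifferentiated string, which is why tag names, attribute names, and attribute values need to be handled separately.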
Randall McDougall

Re: Changing HTML tag case?

Post by Randall McDougall »

Ok, to transform all the tags/keywords to lower case ... REPLACE ALL
<\(/?[^ >]+\)
with
<\L\1

Then REPLACE ALL
\(<[^>]*[[:space:]]\)\([a-z]*[A-Z]+[a-zA-Z]*\)\([=>[:space:]]\)
with
\1\L\2\E\3

repeatedly until the replace can't find the REGEXP anymore ... even for a complex tag that shouldn't take more than a dozen tries (so it could be put into a Macro pretty easily) ... That's the best you can do, I think, without some way of controlling the scan point that the replace picks up at. That would be nice: it would let the second regexp work in a single pass, and could even let the two be combined into a single REPLACE ALL ... but that's neither here nor there since we don't have the capability.
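
The two passes above can be sketched outside TextPad as well. The following is a rough Python port, not the macro itself: `\s` stands in for `[[:space:]]` (and, unlike the TextPad class as described in this thread, also matches newlines), and the names `lowercase_tags` and `max_passes` are invented for the example.

```python
import re

# Pass 1: the tag name right after '<' (with an optional '/').
TAG_NAME = re.compile(r"<(/?[^ >]+)")

# Pass 2: one attribute name containing at least one capital letter,
# preceded by whitespace and followed by '=', '>' or whitespace.
ATTR_NAME = re.compile(r"(<[^>]*\s)([a-z]*[A-Z]+[a-zA-Z]*)([=>\s])")

def lowercase_tags(html, max_passes=50):
    # Lowercase every tag name in a single pass.
    html = TAG_NAME.sub(lambda m: "<" + m.group(1).lower(), html)
    # Each pass fixes at most one attribute name per tag, so repeat
    # until a pass changes nothing (or the safety limit is reached).
    for _ in range(max_passes):
        html, count = ATTR_NAME.subn(
            lambda m: m.group(1) + m.group(2).lower() + m.group(3), html)
        if count == 0:
            break
    return html

print(lowercase_tags('<TAGNAME PARAM="value" PARAM=value NOVALPARAM>'))
print(lowercase_tags('<IMG SRC="Pictures/MyPicture.jpg">'))
```

The loop is needed because a greedy match consumes the whole tag up to the last matching attribute, so every pass works one attribute further towards the front. One limitation carries over from the pattern itself: a capitalised word inside a quoted value is still lowercased when it is followed by whitespace (for example the Big in ALT="My Big Picture").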
Andreas

Re: Changing HTML tag case?

Post by Andreas »

Uuups, I just noted that my examples got interpreted as HTML...
The given solution would transform
<IMG SRC="Pictures/MyPicture.jpg">
to
<img src="pictures/mypicture.jpg">

Andreas

<a href="http://www.djh-freeweb.de/~andreas.waechter/"><img src="http://www.djh-freeweb.de/~andreas.waec ... ><br>Meine Homepage</a>
Andreas

Re: Changing HTML tag case?

Post by Andreas »

So img-tags get through the forum software, but anchors don't...

And I just noticed that my last posting caused more confusion than it cleared up.
My posting from 02-06-01 06:59 does relate to my posting from 02-05-01 19:31.

The solution given by Randall McDougall (at 02-06-01 01:51) works - as far as I tested it - except in two cases:
The first is when the attribute name stands at the end of a line ([:space:] does not contain \n according to the TextPad help).
But this can easily be repaired by adding \n to the last [].
The second case is not so easily solved:
an attribute that is not on the same line as the opening <.
I can't see a solution to this one, as TextPad does not support \n in combination with * or +

Andreas
Randall McDougall

Re: Changing HTML tag case?

Post by Randall McDougall »

A quick bit of fine tuning to my previous solution:

regexp = <\(/?[^[:space:]!>]+\)
repexp = <\L\1

regexp = <\([a-z]+[^>]*[[:space:]]\)\([a-z]*[A-Z]+[a-zA-Z]*\)\([=>[:space:]]\)
repexp = <\1\L\2\E\3

The changes keep it from messing with text in a Comment (they also disable it from the !DOCTYPE superTag, but my personal preference is not to mess with that anyway) ...
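
As a quick check of what the fine-tuned expressions skip, here is a small Python port of the pair. It is only a sketch: `\s` replaces `[[:space:]]`, and the function name and the `max_passes` limit are assumptions made for the example.

```python
import re

# Tag names, except anything starting with '!' (comments, !DOCTYPE).
TAG_NAME = re.compile(r"<(/?[^\s!>]+)")

# Attribute names, only inside tags whose name starts with lowercase
# letters (i.e. tags already handled by the first expression).
ATTR_NAME = re.compile(r"<([a-z]+[^>]*\s)([a-z]*[A-Z]+[a-zA-Z]*)([=>\s])")

def lowercase_tags_keep_comments(html, max_passes=50):
    html = TAG_NAME.sub(lambda m: "<" + m.group(1).lower(), html)
    for _ in range(max_passes):
        html, count = ATTR_NAME.subn(
            lambda m: "<" + m.group(1) + m.group(2).lower() + m.group(3),
            html)
        if count == 0:
            break
    return html

print(lowercase_tags_keep_comments('<!-- KEEP This --><P ALIGN=Center>'))
print(lowercase_tags_keep_comments('<!DOCTYPE HTML PUBLIC "x">'))
```

The comment and the DOCTYPE line come back unchanged because neither expression can match a tag that begins with <!.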

As Andreas notes, if the Tag has an embedded \n, it messes this up ... [:space:] is supposed to allow for carriage returns or linefeeds, but *doesn't* match the CRLF combo that marks a newline (and you can't include a \n in a class expression directly) ... so aside from first running some Replace to force all Tags onto a single line (which is easy enough, but more of a change than you asked for) there's no solution for that.

I've created a macro to do the piece above and put it at:
http://www.connection.com/~rsm/textpad/LowerTag.TPM

it does the above, with the second exp applied a large number of times ...
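
The "force all Tags onto a single line" step mentioned above could look roughly like this in Python. This is an assumption about what such a replace would do, not a copy of the macro, and `join_tag_lines` is a made-up name.

```python
import re

def join_tag_lines(html):
    """Replace each line break inside <...> (plus the whitespace around
    it) with a single space; text outside tags is left alone."""
    def collapse(match):
        return re.sub(r"\s*\r?\n\s*", " ", match.group(0))
    return re.sub(r"<[^>]*>", collapse, html)

print(repr(join_tag_lines('<IMG\r\n   SRC="a.jpg">\r\ntext')))
```

Note that this simple version also joins multi-line comments, and it shares the earlier weakness with a > inside an attribute value.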
Andreas

Re: Changing HTML tag case?

Post by Andreas »

Why not submit it to support@textpad.com so it will end up on the add-ons / macros page (just to keep all textpad add-ons in one place)?

Andreas
Jeff Epstein

Re: Changing HTML tag case?

Post by Jeff Epstein »

---Andreas (---.mch.tli.de) wrote:
> Jeff, sorry to disappoint you but the given solution (on your web page) is wrong!

Yup :' (. I misunderstood the question. The example on the web page solves a much simpler problem.


Jeff
http://www.jeffyjeffy.com/textpad
Randall McDougall

Re: Changing HTML tag case?

Post by Randall McDougall »

-- Andreas --

I've taken your suggestion and sent an updated version in (this one forces the Tags onto a single line and includes detailed docs and a description of how to reconstruct it to force all UpperCase) ... hopefully see it up soon ... ^_^

--

Randall