Character Class not recognized in Regular expression

Posted: Wed Feb 18, 2004 11:18 am
by forest78
Hi,

I want to insert a newline before each uppercase character. I have already tried the following:

replace \([:upper:]\) with \n\1

The "Regular Expression" checkbox is enabled. However, instead of treating [:upper:] as a character class, TextPad replaces all instances of u, p, e and r.

Is this expected behaviour?

Thanks,

- R a l p h -

Posted: Wed Feb 18, 2004 11:41 am
by michaeldinning
Two sets of [] around :upper: are needed.

i.e.: replace \([[:upper:]]\) with \n\1

'Match Case' must be checked, otherwise :upper: also matches all lowercase characters.
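
For anyone who wants to try the same replacement outside TextPad, here is a rough sketch in Python. Python's re module does not support POSIX classes such as [[:upper:]], so the ASCII range [A-Z] stands in for it here. The function name and the sample strings are made up for illustration. The second call shows roughly why 'Match Case' matters: with case-insensitive matching, the class matches lowercase letters too.

```python
import re

def newline_before_upper(text, ignore_case=False):
    """Insert a newline before every ASCII uppercase letter.

    Assumption: [A-Z] stands in for TextPad's [[:upper:]], which
    Python's re module does not support.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # The group captures the letter; \1 puts it back after the newline.
    return re.sub(r"([A-Z])", r"\n\1", text, flags=flags)

# Case-sensitive: only the capitals get a newline in front.
case_sensitive = newline_before_upper("fooBarBaz")

# Case-insensitive: every letter matches, like leaving 'Match Case' off.
case_insensitive = newline_before_upper("aB", ignore_case=True)
```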

Posted: Wed Feb 18, 2004 11:48 am
by Jens Hollmann
Yes. For some unknown reason, these character classes are only valid inside a group of characters, which must itself be enclosed in square brackets.

The correct expression to look for a single upper case character therefore is:

[[:upper:]]

HTH

Jens
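
The effect described in the original question can be reproduced in other regex engines as well. The sketch below uses Python's re module purely as an illustration (it is not TextPad's engine, and the sample text is invented): without the outer brackets, [:upper:] is read as an ordinary set of the literal characters ':', 'u', 'p', 'e' and 'r'.

```python
import re

# Without the outer brackets, [:upper:] is just a set of the
# literal characters ':', 'u', 'p', 'e' and 'r'.
bare_class = re.compile(r"[:upper:]")

# Only the lowercase u, p, e, r are found; the capitals S and U are not.
matches = bare_class.findall("Super User")
```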

Posted: Wed Feb 18, 2004 2:09 pm
by s_reynisson
This is called a POSIX bracket expression, and it can contain a special, range-like class.
It is special because it uses the current locale.
So [a-z] finds just that, a to z, but [[:lower:]] finds a to z plus letters like áýúóþæðö,
or whatever lowercase characters are defined in your locale.
[:lower:] is the POSIX character class, and the extra [] is the
POSIX bracket expression. From Mastering Regular Expressions, 2nd Ed.
Hope I mangled this together correctly! ;)
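
To make the difference between an explicit range and a broader lowercase class a bit more concrete, here is a small Python sketch. It is only an approximation: Python's re module has no [[:lower:]], so str.islower(), which follows Unicode rather than the current locale, is used as a stand-in. The sample string is made up.

```python
import re

sample = "aáþZ9"

# The explicit range only covers the 26 ASCII letters a to z.
ascii_lower = re.findall(r"[a-z]", sample)

# Assumption: str.islower() (Unicode-based) approximates what a
# locale-aware [[:lower:]] would accept; it is not locale-driven.
broad_lower = [ch for ch in sample if ch.islower()]
```

Here ascii_lower holds only "a", while broad_lower also picks up "á" and "þ".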