Why RE "137[^\x5d]" found pattern "137]"

tn7077
Posts: 5
Joined: Tue Sep 26, 2006 4:59 pm

Post by tn7077 »

My file contained the following 2 lines:
1. source\fi_output_msg.c(408): case INTERROGATE_TRK_REQ_FC: /* fUNCTION CODE 137 */
2. source\library_lookup.c(779): 5142, /* itbl_atan[ 0][137] */

I used POSIX regular expression syntax and found the following:
a. RE "137\x5d" found "137]" in line 2 as expected.
b. RE "137[\x5d]" found no occurrence. I expected it to find "137]" in line 2 as in a.
c. RE "137[^\x5d]" found both "137 " in line 1 and "137]" in line 2. I expected it to find only "137 " in line 1.

I wanted to search for "137" not followed by ']'.

Questions
1. Are my expectations correct?
2. Am I missing something?

Artifacts
TextPad version 5.0.3
Windows XP with SP2 installed
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

137\x5d matches 137]. You can use the simpler regex 137] for that.
137[\x5d] matches 137 followed by any one of the characters \ x 5 d. This is not what you want.
137[^\x5d] matches 137 followed by any one character that is not one of \ x 5 d. This also is not what you want.
What you want is 137[^]].
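
For anyone who wants to check this outside TextPad, here is a rough sketch in Python. It is not TextPad's engine: Python's re module treats \x5d inside brackets as an escape for ], so the POSIX-style reading (a class holding the four characters \ x 5 d) is emulated by doubling the backslash. The helper name hits is made up for this illustration.

```python
import re

# The two lines from the original post, as raw strings.
lines = [
    r"source\fi_output_msg.c(408): case INTERROGATE_TRK_REQ_FC: /* fUNCTION CODE 137 */",
    r"source\library_lookup.c(779): 5142, /* itbl_atan[ 0][137] */",
]

def hits(pattern):
    """Return every match of pattern across the two sample lines."""
    return [m.group(0) for line in lines for m in re.finditer(pattern, line)]

# Case a: plain 137] (same as 137\x5d).
print(hits(r"137]"))          # ['137]']

# Case b: class containing \ x 5 d (backslash doubled for Python).
print(hits(r"137[\\x5d]"))    # []

# Case c: any character except \ x 5 d, so both ' ' and ']' qualify.
print(hits(r"137[^\\x5d]"))   # ['137 ', '137]']

# The fix: a ] placed first in a negated class is a literal ].
print(hits(r"137[^]]"))       # ['137 ']
```

The last pattern matches only the occurrence in line 1, which is the result the original question was after.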
tn7077
Posts: 5
Joined: Tue Sep 26, 2006 4:59 pm

Post by tn7077 »

Thanks, Ben.

Here's what I learned then:

1. "\" in [] is a literal "\", not an escape character.

2. It looks to me that
to find "137[", one uses RE "137\["
to find "137]", one uses RE "137]"

Question: when does one need to use "\x<hexdigit><hexdigit>"?
an example would be appreciated.

Thanks again.
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

You need to quote a literal [ to indicate that it's not the beginning of a character class expression. A ] is only special to the right of a [, so one on its own doesn't need to be quoted. A similar rule applies to ( and ). On the other hand, a { is special only when it starts a legal interval operator. This anomaly is the result of the history of regex development.
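
The same quoting rules can be tried in Python as a rough sketch. This assumes Python's re module, which happens to agree with the description above for [, ] and {, but it is a different engine from TextPad's and differs in other respects. The sample string is made up for illustration.

```python
import re

sample = "itbl_atan[ 0][137] {x}"

# A literal [ must be quoted, otherwise it opens a character class.
open_bracket = re.search(r"\[137", sample).group(0)

# A ] on its own is not special, so no quoting is needed.
close_bracket = re.search(r"137]", sample).group(0)

# A { that does not start a valid interval such as {2,3} is taken literally.
brace = re.search(r"{x}", sample).group(0)

# An unquoted [ with no closing ] is rejected as an unterminated class.
try:
    re.compile(r"137[")
    unquoted_ok = True
except re.error:
    unquoted_ok = False

print(open_bracket, close_bracket, brace, unquoted_ok)
```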

You use the \xdd notation to represent characters that are difficult (for example, most control characters, such as ESC - \x1B) or impossible (for example, NUL - \x00) to represent literally.
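
As a small illustration of the \xdd notation, here is a sketch using Python's re module, which also accepts \xdd in patterns. It is an assumption that the behaviour carries over unchanged to TextPad. The sample text is invented and contains an ESC and a NUL character.

```python
import re

# Invented sample: an ESC (0x1B) after "start" and a NUL (0x00) after "end".
text = "start\x1b[0mend\x00tail"

# Neither character can be typed comfortably in a search box,
# so the pattern names each one by its hex code.
esc = re.search(r"\x1b", text)
nul = re.search(r"\x00", text)

print(esc.start(), nul.start())   # 5 12
```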
Kaizyn
Posts: 6
Joined: Tue Jul 31, 2007 7:50 pm

When to use hex codes for characters?

Post by Kaizyn »

http://www.regular-expressions.info/quickstart.html

Basically, if you want to match a non-printable character, use the hex codes. Otherwise, they're not needed.
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

But be aware that the regular expression recogniser used by TextPad is very old and rather weak by the standards of recent tools. Much of what you will find at that site and the many other regex sites on the web is not available in TextPad, so you may get frustrated if you discover a handy trick that doesn't work in TextPad. The recogniser that WildEdit (http://www.textpad.com/products/wildedit/) uses (Boost) is far more powerful.
tn7077
Posts: 5
Joined: Tue Sep 26, 2006 4:59 pm

Post by tn7077 »

Appreciate the info.