RE help

General questions about using TextPad

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

Joe Scuderi

RE help

Post by Joe Scuderi »

I'd like to find & replace this string:

G:\HBP\x\s\bf\

but TextPad won't find it as either a regular expression or a literal. I think the \H means hex? I really should learn this syntax. Where can I find some more examples of how to escape different characters in regular expressions? Thanks!
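
For comparison, here is a minimal sketch of the same problem in Python's re module, which is not TextPad's engine and has a somewhat different syntax. It assumes the cause is that each backslash in the path starts an escape sequence; the sample line and the variable names are made up for illustration.

```python
import re

# Hypothetical sample line; only the G:\HBP\x\s\bf\ prefix comes from the post.
text = "copy G:\\HBP\\x\\s\\bf\\index.htm to backup"
literal = "G:\\HBP\\x\\s\\bf\\"  # the characters G:\HBP\x\s\bf\

# Used unescaped, the backslashes start escape sequences such as \H or \s.
# Python rejects the unknown escape outright; other engines may just not match.
try:
    re.compile(literal)
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# re.escape() puts a backslash in front of every special character,
# so each \ in the path becomes \\ in the pattern.
pattern = re.escape(literal)
found = re.search(pattern, text)

print(unescaped_ok)
print(pattern)
print(found.group(0))
```

The general rule carries over to most regex dialects: to match a literal backslash, write two of them.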
Joe Scuderi

RE: RE help

Post by Joe Scuderi »

Ok, this worked as a regular expression, with each backslash escaped by a second one: G:\\HBP\\x\\s\\bf\\

but now I'm stuck on the RE to replace

state_xx_xx.htm

with

state_xx.htm

where xx may be anything. I don't yet understand how to use \0 to \9: "Substitute the text matching tagged expression 0 through 9. \0 is equivalent to &." Would this be my solution? Is there an example somewhere?
Joe Scuderi

RE: RE help

Post by Joe Scuderi »

Ok, got it. In the Find field you can define "tags" that you can then use in the Replace field. Adding tags does not change what is found.

to replace
\state_xx_xx.htm

with

\state_xx.htm

Find: \state_\(..\)_..
Replace: \state_\1

The \( in Find is the start of tag #1 and the \) is the end of that tag. What is tagged is .. which is any two characters. Replacing with tag #1, written \1, does the real trick. Cool.
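
The same tag-and-substitute idea, sketched in Python's re module for comparison. Python writes a group as ( ) rather than \( \), and the file name below is a made-up example.

```python
import re

# Hypothetical file name; "maps" and "02" are made up for the example.
text = "maps\\state_ak_02.htm"

# Python writes a group as ( ) where the Find string above uses \( \).
# \\ matches one literal backslash; each . matches any single character.
pattern = r"\\state_(..)_.."

# In the replacement, \\ is a literal backslash and \1 is the first group.
result = re.sub(pattern, r"\\state_\1", text)
print(result)
```

Only the first pair of characters is captured, so the second "_.." is matched and then dropped by the replacement.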
Joe Scuderi

RE: RE help

Post by Joe Scuderi »

Ok, another one I'm stuck on. Since I'm just learning Regular Expressions, this may help someone else.

I want to find this:

<a href="whatever\state_ak.htm">Alaska

and move

whatever to replace Alaska. Alaska will always be Alaska, but whatever can be any text of different lengths. I also want to capitalize whatever.

Result

<a href="state_ak.htm">Whatever
Joe Scuderi

RE: RE help

Post by Joe Scuderi »

In simply trying to select <a href="whatever\state_ak.htm">Alaska
here are some things that DID NOT work:

<a href="*\state_ak.htm">Alaska
<a href="[:word:]\state_ak.htm">Alaska
<a href="[a-z]\state_ak.htm">Alaska
<a href="/([a-z]/)\state_ak.htm">Alaska

This does work: <a href=".*\state_ak.htm">Alaska

The asterisk is not a wildcard for any number of characters; it means "zero or more occurrences of the smallest possible preceding regular expression". Since . matches one character, .* matches zero or more characters.
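
A small sketch of the difference, using Python's re module as a stand-in (its handling of * is the same in this respect, though it is not TextPad's engine):

```python
import re

line = '<a href="whatever\\state_ak.htm">Alaska'

# "* means zero or more double quotes, not "any text".
star_only = re.search(r'<a href="*\\state_ak\.htm">Alaska', line)

# .* means zero or more of any character.
dot_star = re.search(r'<a href=".*\\state_ak\.htm">Alaska', line)

print(star_only)
print(dot_star.group(0))
```

The first search finds nothing because the * applies only to the quote character in front of it; the second matches the whole line.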

SOLUTION
Find: <a href="\(.*\)\state_ak.htm">Alaska
Replace: <a href="state_ak.htm">\u\1

As before, the \( marks the start of the first tag and the \) ends it. In the Replace line, \1 now stands for the text matched by the first tag, in this case .* or whatever. Finally, the \u before the \1 changes the next character to upper case. This takes time to learn, but learning these few regular expression items saves me hours of cutting and pasting. SO COOL!
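
For readers working outside TextPad, a rough Python equivalent is sketched below. Python's replacement strings have no \u, so this sketch assumes a replacement function is an acceptable substitute; the function name is made up.

```python
import re

line = '<a href="whatever\\state_ak.htm">Alaska'
pattern = r'<a href="(.*)\\state_ak\.htm">Alaska'

def move_and_capitalize(match):
    word = match.group(1)
    # Upper-case only the first character, like \u before \1.
    return '<a href="state_ak.htm">' + word[:1].upper() + word[1:]

result = re.sub(pattern, move_and_capitalize, line)
print(result)
```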
Joe Scuderi

RE: RE help

Post by Joe Scuderi »

More problems with RE. In changing this

<a href="whatever\state_$$.htm">Replaceme</a></i>

where whatever is multiple characters and $ is any single character

Desired result:

<a href="state_$$.htm">Whatever</a></i>

Find: <a href="\(.*\)\state_\(.?.?\).htm">\(.*\)<[^>]*a>
Replace: <a href="state_\2.htm">\u\1</a>

It wasn't so easy to figure out that .? matches a single optional character (zero or one of anything), but I guess most people using TextPad already know regular expressions from programming languages, UNIX, etc.
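
A sketch of how .? behaves, followed by the whole substitution, again in Python's re module rather than TextPad. The state code "ny" and the function name are made up, and the period before htm is escaped here so that it matches only a literal period.

```python
import re

# .? is "zero or one of any character", so .?.? accepts 0, 1 or 2 characters.
lengths_ok = [bool(re.fullmatch(r".?.?", s)) for s in ["", "a", "ak", "akx"]]
print(lengths_ok)

line = '<a href="whatever\\state_ny.htm">Replaceme</a></i>'
pattern = r'<a href="(.*)\\state_(.?.?)\.htm">(.*)<[^>]*a>'

def rebuild(match):
    folder, code = match.group(1), match.group(2)
    # Keep the state code, move the folder name into the link text, capitalized.
    return '<a href="state_' + code + '.htm">' + folder[:1].upper() + folder[1:] + "</a>"

result = re.sub(pattern, rebuild, line)
print(result)
```

Because .?.? also accepts fewer than two characters, a stricter pattern would use .. when the code is always exactly two characters long.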
Roy Beatty

RE: RE help

Post by Roy Beatty »

You might be missing a \ ...
old Find: <a href="\(.*\)\state_\(.?.?\).htm">\(.*\)<[^>]*a>
new Find: <a href="\(.*\)\state_\(.?.?\)\.htm">\(.*\)<[^>]*a>

Ok, ok, it's not likely your data would yield a false-positive match, but as you're learning RE's ...
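
A quick way to see the false positive, sketched in Python's re module with a made-up decoy string:

```python
import re

# Hypothetical text that should NOT count as a link to state_ak.htm.
decoy = "state_akXhtm"

unescaped = re.search(r"state_.?.?.htm", decoy)   # . matches the X
escaped = re.search(r"state_.?.?\.htm", decoy)    # \. needs a real period

print(unescaped is not None)
print(escaped is not None)
```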

Roy

PS: You're right about ".?" Consider sending a TextPad enhancement request to include an illustration for it in the Help file.