RegEx Help - Select 5 digit postal code from address string

steve1040
Posts: 39
Joined: Fri Oct 13, 2006 2:19 am

Post by steve1040 »

I have several strings like this:

10010 Kennerly Road, St. Louis, MO 63128
5 Maryland Plaza, St Louis, MO 63108-1501

How do I select only the 5-digit postal codes?
63128
63108

ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

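TextPad's regex engine is too weak to do this properly: it has no look-behind, so it can't pick out a ZIP code without also matching house numbers or including some of the surrounding text.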

This regex
\<[0-9]{5}\>
matches every standalone sequence of exactly 5 digits, so it finds the ZIP codes, but it matches 5-digit house numbers (such as 10010 in your first line) as well.
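
If you want to see the problem outside TextPad, here is a rough Python sketch. Python's re module has no \< or \>, so \b (word boundary) stands in for both; that substitution and the helper name five_digit_runs are my own assumptions for illustration, not TextPad syntax.

```python
import re

# Sample lines taken from the question.
ADDRESSES = [
    "10010 Kennerly Road, St. Louis, MO 63128",
    "5 Maryland Plaza, St Louis, MO 63108-1501",
]

# \b is Python's word-boundary anchor; it plays the role of both \< and \>.
FIVE_DIGITS = re.compile(r"\b[0-9]{5}\b")


def five_digit_runs(line):
    """Return every standalone run of exactly five digits in the line."""
    return FIVE_DIGITS.findall(line)


if __name__ == "__main__":
    for address in ADDRESSES:
        print(five_digit_runs(address))
```

The first line yields both the house number and the ZIP code, which is exactly the false positive described above.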

You need to specify that the matched text is not at the beginning of a line (or something similar). You can specify that the matched text is at the beginning of a line with the anchor ^. An anchor is a regex that matches at specific places but consumes zero characters. The only anchors recognised by TextPad's engine are:
^ - beginning of a line
$ - end of a line
\< - beginning of a word
\> - end of a word
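
To illustrate what "consumes zero characters" means, here is a small Python sketch. It is only an analogy: Python writes the word anchors as \b rather than \< and \>, and the helper name anchor_spans is hypothetical.

```python
import re

LINE = "5 Maryland Plaza"


def anchor_spans(pattern, text):
    """Return the (start, end) positions of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    # Each span has start == end: the anchor matches a position, not text.
    print(anchor_spans(r"^", LINE))
    print(anchor_spans(r"$", LINE))
    print(anchor_spans(r"\b", LINE))
```

Every span that comes back is empty, because an anchor only tests where the match is and never takes any characters for itself.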

With a modern regex engine, such as the one used by WildEdit (http://www.textpad.com/products/wildedit/), you could use something like
(?<!^)\<[0-9]{5}\>
which uses the negative look-behind operator in an anchor (?<!^) that matches anywhere other than at the beginning of a line. The \< keeps the regex from matching the last 5 digits of a longer number. TextPad doesn't have a modern regex engine and won't let you do this.
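
As a sketch of the look-behind idea in an engine that supports it, here is the same approach in Python. The assumptions are mine: \b replaces the word anchors (with one added in front of the digits so the tail of a longer number is not picked up), re.MULTILINE makes ^ match at the start of every line, and zip_codes is a hypothetical helper name.

```python
import re

# The two sample lines from the question, as one multi-line string.
TEXT = (
    "10010 Kennerly Road, St. Louis, MO 63128\n"
    "5 Maryland Plaza, St Louis, MO 63108-1501"
)

# (?<!^) succeeds only where ^ would NOT match, i.e. not at a line start.
NOT_AT_LINE_START = re.compile(r"(?<!^)\b[0-9]{5}\b", re.MULTILINE)


def zip_codes(text):
    """Return standalone 5-digit runs that do not begin a line."""
    return NOT_AT_LINE_START.findall(text)


if __name__ == "__main__":
    print(zip_codes(TEXT))
```

The house number 10010 is skipped because it sits at the beginning of its line. A 5-digit number elsewhere in the line that isn't a ZIP code would still be matched.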

In TextPad you will have to make do with matching the ZIP code together with some context that determines that it is a ZIP code, for example, its preceding space:
_[0-9]{5}\> [Replace the underscore with a space.]
Note that the matched text then includes the space as well as the 5 digits.
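
Here is a rough Python sketch of that workaround, showing that the whole match carries the leading space while a capture group isolates the digits. The capture group, the use of \b for \>, and the helper name zip_with_context are assumptions added for illustration.

```python
import re

# A literal space, then five captured digits, then a word boundary.
SPACE_THEN_ZIP = re.compile(r" ([0-9]{5})\b")


def zip_with_context(line):
    """Return (whole match, digits only) for the first hit, or None."""
    match = SPACE_THEN_ZIP.search(line)
    if match is None:
        return None
    return match.group(0), match.group(1)


if __name__ == "__main__":
    print(zip_with_context("10010 Kennerly Road, St. Louis, MO 63128"))
    print(zip_with_context("5 Maryland Plaza, St Louis, MO 63108-1501"))
```

The house number at the start of the first line has no space before it, so it is never a candidate.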