I have several strings like this:
10010 Kennerly Road, St. Louis, MO 63128
5 Maryland Plaza, St Louis, MO 63108-1501
How do I select only the 5-digit ZIP codes:
63128
63108
RegEx Help - Select 5 digit postal code from address string
TextPad's regex engine is too weak to do this properly.
This regex
\<[0-9]{5}\>
matches every stand-alone sequence of exactly 5 digits, so it finds the ZIP codes but also 5-digit house numbers such as 10010.
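To see the problem concretely, here is a minimal sketch in Python rather than TextPad. Python's re module has no \< or \>, so \b (a generic word boundary) stands in for both; the variable names are purely illustrative.

```python
import re

addresses = [
    "10010 Kennerly Road, St. Louis, MO 63128",
    "5 Maryland Plaza, St Louis, MO 63108-1501",
]

# \b is Python's word-boundary anchor, used here in place of \< and \>.
five_digit_word = re.compile(r"\b[0-9]{5}\b")

# Collect every stand-alone run of exactly five digits.
matches = [m for line in addresses for m in five_digit_word.findall(line)]
print(matches)
```

The house number 10010 is reported alongside both ZIP codes, which is exactly the false positive described above.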
You need to specify that the matched text is not at the beginning of a line (or something similar). You can specify that the matched text is at the beginning of a line with the anchor ^. An anchor is a regex that matches at specific places but consumes zero characters. The only anchors recognised by TextPad's engine are:
^ - beginning of a line
$ - end of a line
\< - beginning of a word
\> - end of a word
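The "consumes zero characters" property can be checked with a small Python sketch. This is not TextPad's engine: Python spells both word anchors as \b, and anchor_positions is a hypothetical helper written only for this illustration.

```python
import re

def anchor_positions(pattern, text):
    """Return the offsets where a zero-width pattern matches in text."""
    positions = []
    for m in re.finditer(pattern, text):
        # An anchor matches a position, so start and end coincide.
        assert m.start() == m.end()
        positions.append(m.start())
    return positions

sample = "MO 63128"
print(anchor_positions(r"^", sample))    # beginning of the line
print(anchor_positions(r"$", sample))    # end of the line
print(anchor_positions(r"\b", sample))   # every word boundary
```

Every match is empty: the anchors only say where a match may happen, and the digits still have to be matched by the rest of the regex.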
With a modern regex engine, such as the one used by WildEdit (http://www.textpad.com/products/wildedit/), you could use something like
(?<!^)[0-9]{5}\>
which uses the negative look-behind operator to build an anchor, (?<!^), that matches anywhere other than at the beginning of a line. TextPad doesn't have a modern regex engine and won't let you do this.
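As a rough illustration of the look-behind approach, here is a sketch in Python, whose re module accepts the same (?<!^) construct. It is not WildEdit's engine; \b replaces \>, and re.MULTILINE is needed so that ^ applies to every line rather than only to the start of the whole string.

```python
import re

text = (
    "10010 Kennerly Road, St. Louis, MO 63128\n"
    "5 Maryland Plaza, St Louis, MO 63108-1501"
)

# (?<!^) rejects any match that would begin at the start of a line.
not_at_line_start = re.compile(r"(?<!^)[0-9]{5}\b", re.MULTILINE)

zip_codes = not_at_line_start.findall(text)
print(zip_codes)
```

With the two sample addresses this skips the house number 10010 and keeps 63128 and 63108. A house number with six or more digits could still produce a partial match, so treat it as a sketch rather than a complete ZIP code validator.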
In TextPad you will have to make do with matching the ZIP code together with some context that determines that it is a ZIP code, for example, its preceding space (which then becomes part of the match):
_[0-9]{5}\> [Replace the underscore with a space.]
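The same "match with context" idea can be sketched in Python, where a capture group lets you keep only the digits and discard the leading space. The function name zip_codes is hypothetical, and \b is Python's substitute for \>.

```python
import re

def zip_codes(line):
    """Return 5-digit groups that directly follow a space."""
    # One literal space, five captured digits, then a word boundary.
    return re.findall(r" ([0-9]{5})\b", line)

print(zip_codes("10010 Kennerly Road, St. Louis, MO 63128"))
print(zip_codes("5 Maryland Plaza, St Louis, MO 63108-1501"))
```

Because the house number at the start of the line has no space before it, it is skipped. Any other 5-digit number that follows a space elsewhere in the address would still be picked up, so the context only narrows the match.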