Replacement Expression Nuance With Sub-expressions

kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

I'm doing some data cleansing with WildEdit (WE), but I ran into the same regex issue with TextPad (TP).

I'm changing dates in a text-qualified CSV file from DDMMMYYYY ("22Jan2014") to YYYYMMDD ("20140122").

This Regex Replace did not work.
Find what: (")(\d{2})(Jan)(\d{4})(")
Replace with: $1$401$2$5

[X] Regular expression

Replace All
I'm guessing TextPad reads $401 as a reference to sub-expression 401 and concludes, "There aren't 401 sub-expressions. PICNIC."

This Regex Replace worked.
Find what: (")(\d{2})(Jan)(\d{4})(")
Replace with: $1$4\x30\x31$2$5

[X] Regular expression

Replace All
This Regex Replace also worked, and I find it easier to read.
Find what: (")(\d{2})(Jan)(\d{4})(")
Replace with: ${1}${4}01${2}${5}

[X] Regular expression

Replace All
If you're following along, and you need this for something, this only fixed the January dates. I had to rinse and repeat for the other 11 months.
(2[Bb]|[^2].|.[^Bb])

That is the question.
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

To bind the 4, but not the following digits, to the $, either enclose the index in braces:
${4}
or use parentheses to group subexpressions in the normal way:
($4)
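
For anyone doing the same cleanup in a script rather than in TextPad, the same ambiguity shows up in other regex engines. Here is a minimal sketch in Python, which is my choice for illustration and not something used in this thread. Python's `re` module spells the delimited reference `\g<4>` rather than `${4}`. The sample line is made up.

```python
import re

# Same pattern as in the post: quote, day, "Jan", year, quote.
pattern = re.compile(r'(")(\d{2})(Jan)(\d{4})(")')

# Hypothetical sample line, not taken from the original CSV file.
line = '"22Jan2014","widget","03Jan2015"'

# \g<4>01 means "group 4, then the literal text 01".
# Writing \401 would not be read as group 4 followed by "01",
# which is the same trap as $401 in the TextPad replacement.
result = pattern.sub(r'\g<1>\g<4>01\g<2>\g<5>', line)
print(result)
```

The delimited form keeps the group number and the literal digits visibly separate, which is the same reason `${4}01` reads more easily than `$4\x30\x31`.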

To make all your substitutions in a single step, use conditional replacement expressions:
Find what: (?<=\d{2})(?:(JAN)|(FEB)|(MAR)|(APR)|(MAY)|(JUN)|(JUL)|(AUG)|(SEP)|(OCT)|(NOV)|(DEC))(?=\d{4})
Replace with: ?1(01):?2(02):?3(03):?4(04):?5(05):?6(06):?7(07):?8(08):?9(09):?10(10):?11(11):?12(12)
But you're unlikely to bother to do this unless you're going to do this sort of thing often.
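
If this sort of thing does come up often, a script may be easier to maintain. The sketch below is in Python and does not use conditional replacement expressions, which Python's `re` module does not support. It swaps in a different technique: a replacement function with a dictionary lookup. Unlike the expression above, which only replaces the month name, it also reorders the fields into YYYYMMDD in the same pass. The names `MONTHS`, `to_yyyymmdd`, and `convert_dates` and the sample data are hypothetical.

```python
import re

# Month abbreviations, lower-cased so the lookup is case-insensitive.
MONTHS = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12",
}

# Quoted DDMMMYYYY field: capture day, month abbreviation, and year.
DATE_RE = re.compile(r'"(\d{2})([A-Za-z]{3})(\d{4})"')


def to_yyyymmdd(match):
    """Build the quoted YYYYMMDD replacement for one matched field."""
    day, month, year = match.groups()
    number = MONTHS.get(month.lower())
    if number is None:
        # Not a known month abbreviation: leave the field unchanged.
        return match.group(0)
    return f'"{year}{number}{day}"'


def convert_dates(text):
    """Convert every quoted DDMMMYYYY field in text, in a single pass."""
    return DATE_RE.sub(to_yyyymmdd, text)


# Hypothetical sample data.
sample = '"22Jan2014","05DEC2013","12Xyz2000"'
converted = convert_dates(sample)
print(converted)
```

Fields whose three letters are not a month abbreviation are passed through untouched, so a stray value is left for manual review rather than silently mangled.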

Edit: Made minor corrections.
Last edited by ben_josephs on Fri Jul 31, 2015 7:45 pm, edited 1 time in total.

Post by kengrubb »

Oh that is just TOO wickedly cool not to use from time to time.