need help with complex replace

**speedilad** » Wed Dec 08, 2010 5:35 pm

I have tried a lot of combinations but have not been able to do this replace.

Each line of the text file is variable in length and ends with an amount that has 2 decimal places and is followed by "\par". Examples: " 158.00\par" and " 9.53\par"

Note that the whole-dollar part of the amount can have a variable number of digits.

I want to replace the space in front of the first whole-dollar digit with the character "~". Examples: "~158.00\par" and "~9.53\par"

Thanks for any help...

**ben_josephs** » Wed Dec 08, 2010 6:02 pm

Use "Posix" regular expression syntax:

Configure | Preferences | Editor

[X] Use POSIX regular expression syntax

Search | Replace... (<F8>):

Find what: _([0-9]+\.[0-9]{2}\\par) [Replace the underscore with a space]
Replace with: ~\1

[X] Regular expression

Replace All
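
For anyone who wants to try the same replacement outside TextPad, here is a minimal sketch using Python's `re` module. The sample lines and the name `mark_amount` are made up for illustration, and Python's regex engine is not TextPad's, so treat this as an analogy rather than a description of what TextPad does internally.

```python
import re

# Same pattern as in "Find what", with a real space in place of the underscore.
AMOUNT = re.compile(r" ([0-9]+\.[0-9]{2}\\par)")

def mark_amount(line: str) -> str:
    """Replace the space in front of the trailing amount with a tilde."""
    # "~\1" means: a literal tilde, then whatever the parentheses captured.
    return AMOUNT.sub(r"~\1", line)

# Hypothetical sample lines, not taken from the original poster's file.
samples = [r"Rent for December 158.00\par", r"Coffee 9.53\par"]
for sample in samples:
    print(mark_amount(sample))
```

Lines that do not end in an amount followed by `\par` are left unchanged, because the pattern simply does not match them.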

**speedilad** » Wed Dec 08, 2010 6:37 pm

I never knew about that posix... Not sure what that means, but it worked great... Thank you so very much...

One question to help me understand: what does the {2} mean? Two digits? If so, how does the left side of the decimal work with a variable number of digits? Thanks for taking the time... Jim

**ben_josephs** » Thu Dec 09, 2010 12:13 am

"Posix" syntax (TextPad uses the term incorrectly) doesn't change the expressive power of TextPad's regular expressions: it just changes the way that backslashes are used. In most cases regular expressions are more readable in this syntax than in the default syntax.

_([0-9]+\.[0-9]{2}\\par) [replace the underscore with a space] matches

_           [the underscore is really a space] a space
(           (start of captured text number 1)
  [0-9]+    a non-empty sequence of digits (see below)
  \.        a dot
  [0-9]{2}  two digits (see below)
  \\        a backslash
  par       the literal text "par"
)           (end of captured text number 1)

where
[0-9]+ matches

[0-9]       any digit
+           ... any non-zero number of times

and
[0-9]{2} matches

[0-9]       any digit
{2}         ... two times
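
The difference between the two quantifiers can be checked in isolation. The sketch below uses Python's `re` module purely as a stand-in; the helper names `is_whole_dollars` and `is_cents` are hypothetical, and `fullmatch` is used so that the entire string has to fit the pattern.

```python
import re

WHOLE = re.compile(r"[0-9]+")    # one or more digits: any number of whole dollars
CENTS = re.compile(r"[0-9]{2}")  # exactly two digits

def is_whole_dollars(text: str) -> bool:
    return WHOLE.fullmatch(text) is not None

def is_cents(text: str) -> bool:
    return CENTS.fullmatch(text) is not None

# "+" accepts 1, 3, or 7 digits alike, but never zero digits.
print([is_whole_dollars(t) for t in ["9", "158", "1234567", ""]])
# "{2}" accepts two digits only; one or three digits fail.
print([is_cents(t) for t in ["53", "5", "530"]])
```

This is why the left side of the decimal point copes with a variable number of digits while the right side is pinned to two.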

We replace the matched text with
~\1
which is composed of

~     a tilde
\1    captured text number 1
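
The piece-by-piece breakdown in this post maps closely onto an annotated pattern. As a rough illustration, Python's `re.VERBOSE` flag allows comments inside a regex; this is a Python feature and is assumed here only for demonstration, not something TextPad's Replace dialog offers. The function name `replace_with_tilde` is hypothetical.

```python
import re

PATTERN = re.compile(
    r"""
    [ ]          # a space (in a class, since verbose mode ignores bare spaces)
    (            # start of captured text number 1
      [0-9]+     #   a non-empty sequence of digits
      \.         #   a dot
      [0-9]{2}   #   two digits
      \\         #   a backslash
      par        #   the literal text par
    )            # end of captured text number 1
    """,
    re.VERBOSE,
)

def replace_with_tilde(line: str) -> str:
    # Replacement: a tilde followed by captured text number 1.
    return PATTERN.sub(r"~\1", line)

print(replace_with_tilde(r"Groceries 42.10\par"))
```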

**speedilad** » Fri Dec 10, 2010 4:30 pm

Ben, thanks for that very detailed explanation.... I fully understand all of it, but still have one question...

in the last part:
We replace the matched text with
~\1
which is composed of
Code:
~ a tilde
\1 captured text number 1

what would be a "captured text 2"?

**ben_josephs** » Fri Dec 10, 2010 7:30 pm

Whatever would have been matched by a second parenthesised part of the regex if there had been one.

If part of a regex is parenthesised, whatever text that part matches is captured for use in the replacement expression. The parentheses are numbered from left to right by the position of their open-parenthesis symbols. In the replacement expression, \1 represents what was captured by the first parenthesised expression, \2 represents what was captured by the second, and so on.
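
To make the numbering concrete, here is a small sketch with more than one parenthesised part, again using Python's `re` module as a stand-in for TextPad. The sample text, the replacement wording, and the name `describe` are invented for illustration.

```python
import re

# Two parenthesised parts: \1 is the whole dollars, \2 is the cents.
TWO_GROUPS = re.compile(r" ([0-9]+)\.([0-9]{2})\\par")

def describe(line: str) -> str:
    return TWO_GROUPS.sub(r"~\1 dollars and \2 cents", line)

print(describe(r"Total 158.00\par"))

# Nested parentheses are still numbered by where each "(" opens:
# group 1 is the whole amount, group 2 the dollars, group 3 the cents.
NESTED = re.compile(r"(([0-9]+)\.([0-9]{2}))")
nested_groups = NESTED.search("9.53").groups()
print(nested_groups)
```

The nested case shows that an outer parenthesis gets the lower number because it opens first, even though it closes last.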