Need to replace a multiple line large string

General questions about using TextPad

DerMajo
Posts: 4
Joined: Mon Sep 26, 2011 1:11 pm

Post by DerMajo »

Hi

My source text is, for example, this Java code:

    /**
     * Return the current page's binding container.
     * @return the current page's binding container
     */
    public static BindingContainer getBindingContainer1(String name) {
        System.out.println("getBindingContainer1; " + "#{bindings." + name +
                           "}");
        return (BindingContainer)resolveExpression("#{bindings." + name + "}");
    }

Now I want to replace the following lines:

System.out.println("getBindingContainer1; " + "#{bindings." + name +
                           "}");

with this:

if (log.isTraceEnabled()) log.trace("getBindingContainer1; " + "#{bindings." + name + "}");

I have built a regular expression that works fine in every regex tester I have found, but not in TextPad :/

My regular expression:

System\.out\.println\((.*?)[\s]{1,2}(.*?)\);

Replacement expression:

if (log.isTraceEnabled()) log.trace(\1\2);

Does anybody know why it does not work in TextPad?

Regards
Majo
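
For comparison, here is a minimal sketch of how the expression above behaves in an engine that does support \s and the lazy quantifier *?, using java.util.regex. The class name PrintlnToTrace and the method rewrite are hypothetical, chosen only for this illustration; the pattern and the replacement are the ones from the post, with $1 and $2 being Java's spelling of \1 and \2.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original code base.
final class PrintlnToTrace {
    // The expression from the post, unchanged apart from Java string escaping.
    static final Pattern PRINTLN =
        Pattern.compile("System\\.out\\.println\\((.*?)[\\s]{1,2}(.*?)\\);");

    /** Replaces every println call that wraps onto a second line with a log.trace call. */
    static String rewrite(String source) {
        return PRINTLN.matcher(source)
                      .replaceAll("if (log.isTraceEnabled()) log.trace($1$2);");
    }

    public static void main(String[] args) {
        String source =
            "System.out.println(\"getBindingContainer1; \" + \"#{bindings.\" + name +\n"
            + "                           \"}\");";
        System.out.println(rewrite(source));
    }
}
```

Because dot does not match a line break here, the first group is forced to run to the end of the first line and [\s]{1,2} consumes the line break (plus one more whitespace character if there is one). The rest of the indentation in front of "}" lands in the second group, so the output is a single line that still contains that run of spaces.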
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

1. For an expression in this style to work at all, you must use POSIX syntax:
Configure | Preferences | Editor

[X] Use POSIX regular expression syntax
2. TextPad's regex recogniser doesn't support \s . Use [ \t] (or just a space, if appropriate) instead.

3. TextPad's regex recogniser doesn't support non-greedy repetition operators such as *? . It may be possible instead to use a more restrictive repetition, such as [^ \t]* .

4. TextPad's regex recogniser can't match text containing an arbitrary number of newlines. Is the number of newlines in the text to be matched fixed or arbitrary? If it's arbitrary, can you remove them first?

If all newlines in the text to be matched have been removed, you might try something like this:
Find what: System\.out\.println\(([^ \t]*)[ \t]{1,2}(.*)\);
Replace with: if (log.isTraceEnabled()) log.trace(\1\2);

[X] Regular expression

Replace All
DerMajo
Posts: 4
Joined: Mon Sep 26, 2011 1:11 pm

Post by DerMajo »

Thank you for your fast answer.

I have found the solution myself.

System\.out\.println\((.*?)\n(.*?)\);

Because the newline characters \n and \r are only two of the characters in the class \s, the regex also matches \f, \t and \v, which I don't want to find.

So I read up on it and tried \r, \r\n and plain \n to match the newline, and with \n it works.

May I ask how you know which syntax TextPad's regex recogniser supports?

Regards
Majo
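
The point about \s can be checked in an engine that supports it. The sketch below uses java.util.regex, whose \s is documented as [ \t\n\x0B\f\r]; other regex testers may define the class slightly differently, and TextPad's recogniser does not support it at all. The class name WhitespaceClass and the method isRegexWhitespace are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: lists which characters Java's \s accepts.
final class WhitespaceClass {
    /** Returns true if the single character c is matched by \s in java.util.regex. */
    static boolean isRegexWhitespace(char c) {
        return Pattern.matches("\\s", String.valueOf(c));
    }

    public static void main(String[] args) {
        char[] candidates = {' ', '\t', '\n', '\u000B', '\f', '\r', 'a'};
        for (char c : candidates) {
            // Print the code point so that control characters stay readable.
            System.out.printf("U+%04X -> %b%n", (int) c, isRegexWhitespace(c));
        }
    }
}
```

Every candidate except 'a' is reported as whitespace, which is why matching the line break explicitly with \n is the narrower choice.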
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

Good. So the number of newlines is fixed.

Note that the .*? subexpression is not doing what you think it's doing. As I wrote before, TextPad's regex recogniser does not support non-greedy repetition operators. So .*? is just an optional .*, which is entirely equivalent to (but probably slower than) .*.

Information on TextPad's regex recogniser is available in the help under
Reference Information | Regular Expressions,
Reference Information | Replacement Expressions and
How to... | Find and Replace Text | Use Regular Expressions.

Be aware that the regular expression recogniser used by TextPad is very weak compared with modern tools; much that works elsewhere won't work in TextPad.
DerMajo
Posts: 4
Joined: Mon Sep 26, 2011 1:11 pm

Post by DerMajo »

Hi

I don't know the meaning of "non-greedy repetition operators". If you have the time and the inclination, can you explain why this (.*?) operation is not doing what I thought? In practice the replacement completed successfully. I am a bit confused.

Thank you and regards
Majo
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

When I Googled non greedy regular expression it returned 13 000 000 results. I'm sure something in there will answer your question.
DerMajo
Posts: 4
Joined: Mon Sep 26, 2011 1:11 pm

Post by DerMajo »

Um, I know how to use Google to find the meaning of "non-greedy regular expression". But my question was not "Can you explain the meaning of..." but rather "Why does my replacement succeed, when you said it wouldn't because of the non-greedy operators?"

It's not important, because my issue is solved. It would just have been nice to know why it succeeds if non-greedy operators are not supported.

Thank you so far for your help and hints :)
Bye Majo
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

I did not say your regex wouldn't work; I said it "is not doing what you think it's doing". You used .*?, which is a non-greedy regex. But TextPad doesn't know about non-greedy regexes and therefore doesn't recognise it as one. It recognises it as an optional (?) possibly empty sequence (*) of characters. But an optional possibly empty sequence is just a possibly empty sequence; so the ? is redundant.

Dot (.) may or may not match a newline. With TextPad's recogniser, dot does not match a newline.

If dot does not match a newline, this particular regex will usually match the same text, regardless of whether the repetitions are greedy or non-greedy, as the longest (greediest) match and the shortest (non-greediest) match must both be within a line, and the first repetition must match up to the end of the line. But if dot does match a newline, the repetitions may match across many lines if they are greedy, but may match within a single line if they are non-greedy.

It is usually wise to avoid using non-greedy repetitions if possible, even when they are available, as they are often slower than greedy repetitions.
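
The effect of dot matching or not matching a newline can be reproduced in java.util.regex, where the DOTALL flag switches it on. This is only a sketch of the general behaviour, not of TextPad's recogniser; the class name GreedyVersusLazy, the helper firstMatch and the sample text are made up for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; TEXT is invented sample input.
final class GreedyVersusLazy {
    static final String GREEDY = "System\\.out\\.println\\((.*)\\n(.*)\\);";
    static final String LAZY = "System\\.out\\.println\\((.*?)\\n(.*?)\\);";

    static final String TEXT =
        "System.out.println(\"a\" +\n"
        + "    \"b\");\n"
        + "int x = 1;\n"
        + "foo(x);\n";

    /** Returns the first match of regex in TEXT, or null if nothing matches. */
    static String firstMatch(String regex, int flags) {
        Matcher m = Pattern.compile(regex, flags).matcher(TEXT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Dot stops at newlines: both variants match only the println statement.
        System.out.println(firstMatch(GREEDY, 0).equals(firstMatch(LAZY, 0))); // true
        // Dot matches newlines: the lazy variant still stops at the first ");".
        System.out.println(firstMatch(LAZY, Pattern.DOTALL));
        // The greedy variant runs on through the last ");" in the text, i.e. foo(x);
        System.out.println(firstMatch(GREEDY, Pattern.DOTALL));
    }
}
```

With the default flags the two expressions are interchangeable on this input; with DOTALL the greedy one swallows the unrelated lines that follow the println statement.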
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

jerry0503214 wrote: May I ask how you know which syntax TextPad's regex recogniser supports?
I know the idea is absolutely absurd, but you might take a look at the TextPad Help. Help Topics, Content, Reference Information, Regular Expressions.
ak47wong
Posts: 703
Joined: Tue Aug 12, 2003 9:37 am
Location: Sydney, Australia

Post by ak47wong »

You're responding to a spam posting, MudGuard. Those lines were copied from an earlier post in this thread. Happy New Year anyway!
sosimple
Posts: 30
Joined: Sat May 16, 2009 6:54 am

Post by sosimple »

DerMajo wrote: Um, I know how to use Google to find the meaning of "non-greedy regular expression". But my question was not "Can you explain the meaning of..." but rather "Why does my replacement succeed, when you said it wouldn't because of the non-greedy operators?"

It's not important, because my issue is solved. It would just have been nice to know why it succeeds if non-greedy operators are not supported.

Thank you so far for your help and hints :)
Bye Majo

The difference between "greedy" and "non-greedy" is this:
A greedy match will match the LONGEST group of characters possible before continuing to complete matching of the remainder of your regex.

A non-greedy match will match the SHORTEST group of characters possible before continuing to complete matching of the remainder of your regex.

Depending on your regex and the text you are searching, a greedy match and a non-greedy match can easily return the same text.

So, for example, if you were searching:
(greedy)
Regex=^.*at
Text=The cat found a hat on the boat.
Match="The cat found a hat on the boat"

(non-greedy)
Regex=^.*?at
Text=The cat found a hat on the boat.
Match="The cat"

TextPad doesn't understand a "non-greedy" search, so TextPad will match "The cat found a hat on the boat" in either case.

If your regex and the text you are searching are composed in a particular way, the greedy and the non-greedy search will return the same matching text, for example:

(greedy)
Regex=^.*on
Text=The cat found a hat on the boat.
Match="The cat found a hat on"

(non-greedy)
Regex=^.*?on
Text=The cat found a hat on the boat.
Match="The cat found a hat on"

So, in your case, if you are certain that the result of TextPad's search was correct for ".*?", then it was correct only by coincidence: TextPad's greedy search happened to match the same text that you assumed a non-greedy search would match.

Basically, it means that if there is only one possible match, then greedy and non-greedy will match the same text. But, if there are two or more possible ways for the search to match, then the non-greedy search will match the SHORTEST of the possible matches, and the greedy search will match the LONGEST of the possible matches.

Kevin
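
The four examples above can be tried in any engine that supports lazy quantifiers. Here is a minimal sketch using java.util.regex; the class name CatHatBoat and the helper firstMatch are hypothetical, and the results describe Java's engine, not TextPad's.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class that replays the examples from the post.
final class CatHatBoat {
    static final String TEXT = "The cat found a hat on the boat.";

    /** Returns the first match of regex in TEXT, or null if nothing matches. */
    static String firstMatch(String regex) {
        Matcher m = Pattern.compile(regex).matcher(TEXT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("^.*at"));  // The cat found a hat on the boat
        System.out.println(firstMatch("^.*?at")); // The cat
        System.out.println(firstMatch("^.*on"));  // The cat found a hat on
        System.out.println(firstMatch("^.*?on")); // The cat found a hat on
    }
}
```

The text contains "at" three times but "on" only once, which is why the first pair of results differs and the second pair does not.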