Is this a reg exp bug or is it just me?
Search for (regexp non-Posix) ^.*: *\|\\n"
Replace All with nothing
Before:
"PcTestData:\n"
"PcAnswerPosX: 4027727.93728731\n"
After:
\n"
4027727.93728731
I was expecting:
4027727.93728731
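For comparison, here is a minimal Python sketch of what a conventional backtracking engine does with the same search. It assumes Python's re module as a stand-in, which is not TextPad's engine, and it writes the alternation as | rather than the non-Posix \|. The names SAMPLE, PATTERN and strip_labels are made up for the illustration.

```python
import re

# The two lines from the post. In this Python source, \\n is a literal
# backslash followed by the letter n, not a newline character.
SAMPLE = '"PcTestData:\\n"\n"PcAnswerPosX: 4027727.93728731\\n"'

# Same search as above, in Python syntax. MULTILINE lets ^ match at the
# start of every line, which is how a line-based editor treats it.
PATTERN = re.compile(r'^.*: *|\\n"', re.MULTILINE)

def strip_labels(text):
    """Replace every match with nothing, like Replace All with an empty field."""
    return PATTERN.sub('', text)

result = strip_labels(SAMPLE)
print(repr(result))
```

Under these assumptions the first line comes out empty and the second holds only the number, so the leftover \n" on the first line is not something this engine reproduces.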
- s_reynisson
- Posts: 939
- Joined: Tue May 06, 2003 1:59 pm
- Bob Hansen
- Posts: 1516
- Joined: Sun Mar 02, 2003 8:15 pm
- Location: Salem, NH
^[^.0-9]*|\\n" did not work for me.
Try this:
Search for:
^.*: ([0-9]+.[0-9]+)\\n"
Replace with:
\1
Explanation of Search RegEx:
^......................Start from beginning of line
.* ....................Any number of characters
: ..................... A colon followed by a space (there is a space after this colon)
( ......................Beginning of first tagged expression
[0-9]+ .............One or more digits
. ......................Any single character (it matches the decimal point here; \. would match only a literal period)
[0-9]+ ..............One or more digits
) .......................End of first tagged expression
\ ......................Treat next character as normal character
\n" ...................Specific string of characters
Explanation of Replace RegEx:
\1 ..................Contents of first tagged expression
======================================
This is assuming that the double quotes are really on the lines.
This does not eliminate the first line.
This will convert from:
"PcTestData:\n"
"PcAnswerPosX: 4027727.93728731\n"
To:
"PcTestData:\n"
4027727.93728731
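The tagged-expression approach can be sketched in Python as a rough check. This assumes Python's re module rather than TextPad's engine; POSTED is the search string as written above, STRICT is a variant with the dot escaped (an assumption about the intent), and keep_number is a hypothetical helper.

```python
import re

# \\n in this Python source is a literal backslash followed by n.
SAMPLE = '"PcTestData:\\n"\n"PcAnswerPosX: 4027727.93728731\\n"'

# Search string as posted: the dot between the digit groups is unescaped,
# so it matches any single character, not only a period.
POSTED = re.compile(r'^.*: ([0-9]+.[0-9]+)\\n"', re.MULTILINE)

# Variant with an escaped dot, which accepts only a literal period.
STRICT = re.compile(r'^.*: ([0-9]+\.[0-9]+)\\n"', re.MULTILINE)

def keep_number(pattern, text):
    """Replace each match with the contents of the first tagged expression."""
    return pattern.sub(r'\1', text)

posted_result = keep_number(POSTED, SAMPLE)
strict_result = keep_number(STRICT, SAMPLE)

# A line where the two patterns disagree: "a" sits where the period should be.
odd_line = 'x: 12a34\\n"'
posted_odd = keep_number(POSTED, odd_line)
strict_odd = keep_number(STRICT, odd_line)
```

On the sample data both patterns leave the first line alone, since it has no colon followed by a space, and reduce the second line to the number. They differ only on input such as odd_line, where the unescaped dot accepts a non-period character.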
Hope this was helpful.............good luck,
Bob
- s_reynisson
- Posts: 939
- Joined: Tue May 06, 2003 1:59 pm
- Bob Hansen
- Posts: 1516
- Joined: Sun Mar 02, 2003 8:15 pm
- Location: Salem, NH
Hmmm back to s_reynisson.
It looks like I was wrong.
Your Regex just worked for me. It appears that I made an operational error in my testing.
I had done a Find using your RegEx and that did not work. I frequently have done Finds in the past so I would not do a Replace by mistake. But I just now did a Search/Replace for the RegEx, and left the Replace field blank, clicked on Replace All, and it did work as you showed, with the blank line above the number string.
Gotta go back and understand how TextPad is doing that, replacing something different from what shows up as Find. I may have to relearn the way I have been using RegEx with TextPad.
Thanks for your solution.
- s_reynisson
- Posts: 939
- Joined: Tue May 06, 2003 1:59 pm
Ok, I must be getting tired
Using
Code:
"PcTestData:\n"
"PcAnswerPosX: 4027727.93728731\n"
to test, TP simply stops on line 3 and does not report
"can not find regular expression...."
Same if you put an empty line 1, stops on that too.
Edit: tested a little more and
Code:
^.*|\\n" fails
^.+|\\n" works
So the solution ^[^.0-9]+|\\n" is ok if there are empty lines.
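One possible reason for the difference between * and + on empty lines can be sketched in Python. This assumes Python's re module, not TextPad's engine, and checks one line at a time the way a line-based editor would. STAR, PLUS and first_match are names invented for the sketch.

```python
import re

STAR = re.compile(r'^[^.0-9]*|\\n"')   # * allows zero characters
PLUS = re.compile(r'^[^.0-9]+|\\n"')   # + needs at least one character

def first_match(pattern, line):
    """Return the text of the first match on the line, or None if there is none."""
    m = pattern.search(line)
    return None if m is None else m.group(0)

# Test data with an empty line in the middle.
lines = ['"PcTestData:\\n"', '', '"PcAnswerPosX: 4027727.93728731\\n"']

star_hits = [first_match(STAR, line) for line in lines]
plus_hits = [first_match(PLUS, line) for line in lines]

for line, s, p in zip(lines, star_hits, plus_hits):
    print(repr(line), '| * ->', repr(s), '| + ->', repr(p))
```

With *, the empty line still produces a match, but one of zero length. A Find Next that does not step past a zero-length match would sit on that line, which may be what is happening here. With +, the empty line simply has no match and the search moves on.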
Then I open up and see
the person fumbling here is me
a different way to be
- Bob Hansen
- Posts: 1516
- Joined: Sun Mar 02, 2003 8:15 pm
- Location: Salem, NH
Did some more testing on ^[^.0-9]*|\\n", seeing a number of strange (to me) things. Using POSIX, Conditions are text, and regex. Scope is active document. All tests are done from the top of the document.
Using Search, Replace dialog box:
1. If I have multiple lines in the document with some blank lines in between, doing "Replace All" works fine: all characters that are not digits/decimals are eliminated. (Brackets "[]", parentheses "()", operators "+-*\^", pipes "|", and colons ":" are all treated as digits/decimals; they are not eliminated.)
2. If I do Find Next, it will highlight the correct strings until it comes to a blank line. It stops looking at that point. If I move the cursor down to another line with text, then FindNext continues correctly until it finds a blank line and stops again.
3. If I do Replace Next, the first instance is replaced. But it does not find any other instances. This is probably related to #2 above, because the line it is on after the first replace is a blank line.
4. Using Find dialog box, selecting Mark All marks every line. Empty lines, all text, all digits, mixed text/digits.
I did not expect to see the extra "digit/decimal" characters retained.
I did not expect to see the Find Next process stop at a blank line; I thought it would continue to the end of the document.
I did not expect to see every line bookmarked with Mark All; it must be because of the invisible \n.
I suppose these results may be normal, but thought I would detail them for others to also understand this behavior. Can someone please provide an explanation for me?
======================
Editing note after posting: I see that s_reynisson also saw some of these anomalies. He snuck that posting in ahead of me while I was working on this. He also modified the code to continue past blank lines.......atta boy!
Looks like it may be better to use "+" for one or more rather than "*" for any quantity. Only in this instance or similar instances, or as a general rule in TextPad? If also in similar instances, can guidelines be provided to define "similar instances"? What are the tradeoffs of "+" vs. "*"?
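Two of the observations above can be reproduced in a small Python sketch. It assumes Python's re module applied to one line at a time, which is not TextPad's engine; marked, STAR and PLUS are hypothetical names, and the sample lines are invented.

```python
import re

STAR = re.compile(r'^[^.0-9]*|\\n"')
PLUS = re.compile(r'^[^.0-9]+|\\n"')

def marked(pattern, lines):
    """Return the lines on which the pattern matches at least once."""
    return [line for line in lines if pattern.search(line) is not None]

# An empty line, all text, all digits, mixed text and digits.
lines = ['', 'abc', '12345', 'abc 123']

star_marked = marked(STAR, lines)   # every line: * is satisfied by zero characters
plus_marked = marked(PLUS, lines)   # only lines that start with a non-digit character

# The pattern is anchored with ^, so only the run of non-digit characters at
# the start of the line is removed. Anything after the first digit stays.
stripped = PLUS.sub('', 'x = (1+2)*3')
print(star_marked, plus_marked, stripped)
```

Seen this way, * matches on every line because an empty match at the start of the line always succeeds, which would account for Mark All marking everything. The brackets and operators that survive a replace are not being treated as digits; they just come after the first digit, where the anchored alternative can no longer match.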
Well, I came in this morning to find a flurry of responses. Thank you so much. I should point out that I don't actually need a solution to the regexp, but thanks anyway. But can someone tell me either is my original regexp wrong or is there a bug in the TP regexp code? From Bob's research I think there are some anomalies. Perhaps Helios will take note. Having said that, it's hard to find fault with TP, especially now that Helios have been more actively dealing with problems.
- Bob Hansen
- Posts: 1516
- Joined: Sun Mar 02, 2003 8:15 pm
- Location: Salem, NH
Hi Ed. This is a fair question:
"But can someone tell me either is my original regexp wrong or is there a bug in the TP regexp code?"
I have just spent hours testing various modifications of the RegEx you were asking about. I have been working in the area of the "non ordinary" characters "\" and "n" and the timing associated with those replacements.
I have come up with a number of combinations that step through properly giving correct results with Replace Next, but when doing Replace All, the "\" is still there. The best I can do still ends up with a "\" on the first line.
One quick observation: At times, when I add a real backslash "\", I may end up with message that "missing ( or )" is reason for failure. But the number of matching ( and ) is correct. And that is using \( and \) for non-Posix as you indicated in your first message.
I will try to do some more testing using Posix instead to see if both behave the same.
At one point, I thought I had it figured out. Can't go into details right now, but now I also wonder if there may be a bug here. Will try to do some more testing, don't want to cry "Wolf!" Although I think you were provided with a solution, your question is still valid; on the surface, the Regex looks like it should work.
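As a side note on the "\" and "n" characters, the sketch below shows how the two spellings differ in a conventional engine. It assumes Python's re module, where \\n" means backslash, n, quote and \n" means a newline followed by a quote. It says nothing about how TextPad handles Replace All, and the sample strings are invented.

```python
import re

LITERAL = re.compile(r'\\n"')   # a backslash, the letter n, a double quote
NEWLINE = re.compile(r'\n"')    # a newline character, then a double quote

typed = 'PcTestData:\\n"'    # text that contains a backslash and an n
broken = 'PcTestData:\n"'    # text that contains a real line break

checks = {
    'literal_on_typed': LITERAL.search(typed) is not None,
    'newline_on_typed': NEWLINE.search(typed) is not None,
    'literal_on_broken': LITERAL.search(broken) is not None,
    'newline_on_broken': NEWLINE.search(broken) is not None,
}
print(checks)
```

Each pattern finds only its own kind of text, so a search that drops one of the two backslashes ends up looking for a line break instead of the characters \n".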