I Can Consistently Crash TP 7.0.4 with RE

General questions about using TextPad

kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

I can consistently crash TP 7.0.4 using an RE Find, and I have sent the .dmp file to support@textpad.com

I am fairly adept with RE, and thus far the move to Perl RE isn't stopping me. In fact, it makes a lot of things easier.

I wanted to change the first \ in a line to a tab, and then change the last \ in a line to a tab.

The RE Find syntax that crashes TP 7.0.4 is this:

^([^\]{1,})\\

I wanted to replace the match with this:

\1\t

I ended up changing \ to ÿ (\xFF), made my changes, then changed all remaining ÿ back to \

Can anyone help me with a better workaround?
(2[Bb]|[^2].|.[^Bb])

That is the question.
bbadmin
Site Admin
Posts: 1020
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

Ken, "^([^\]{1,})\\" is an invalid regular expression, and that is what triggers the crash; the crash will be fixed. Any literal "\" must be entered as "\\"; otherwise it escapes the next character. In this case it causes the closing "]" to be treated as a literal, so the opening "[" is never closed.

Try searching for: "([^\\\r\n]*)\\(.+)\\"
and replacing with: "$1\t$2\t"
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

I can reproduce that. The Boost regex library throws an exception: "Unmatched [ or [^ in character class declaration" but TextPad doesn't catch it.

You need to quote the backslash in the character set:
^([^\\]{1,})\\

Your regex requires a matched line to begin with a character that isn't a backslash; I assume that's intentional.

It's equivalent to
^([^\\]+)\\

But negated character sets match newline characters, so the whole regex matches across newlines. You need
^([^\\\r\n]+)\\
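
To illustrate the newline point, here is a minimal sketch in Python. Python's re module is only a stand-in for the Boost library that TextPad uses, and the sample text is made up, but negated character sets behave the same way in this respect:

```python
import re

# Hypothetical sample: two lines, only the second contains a backslash.
text = "abc\ndef\\ghi"

# Without \r\n in the negated set, the set also consumes the newline,
# so the match starts on line 1 and runs into line 2.
loose = re.search(r"^([^\\]+)\\", text, flags=re.MULTILINE)

# Excluding \r and \n keeps the match on a single line.
strict = re.search(r"^([^\\\r\n]+)\\", text, flags=re.MULTILINE)

loose_group = loose.group(1)    # spans both lines
strict_group = strict.group(1)  # stays on the second line

print(repr(loose_group))
print(repr(strict_group))
```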

The placeholder symbol is now $, not \. So your replacement should be
$1\t

If the lines always contain at least two backslashes, you can do it all in one go:
Find what: ^([^\\\r\n]+)\\(.*)\\
Replace with: $1\t$2\t
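
For anyone who wants to try the one-pass version outside TextPad, here is a rough sketch in Python. It assumes Python's re module as a stand-in for Boost, so the replacement uses \1 and \2 where TextPad 7 uses $1 and $2. The function name and the sample path are invented for the example:

```python
import re

# Same "Find what" expression as above.
PATTERN = r"^([^\\\r\n]+)\\(.*)\\"


def first_and_last_backslash_to_tab(text: str) -> str:
    """Turn the first and last backslash on each line into tabs.

    Lines with fewer than two backslashes do not match and are left alone.
    """
    # Python writes backreferences as \1 and \2; TextPad 7 writes $1 and $2.
    return re.sub(PATTERN, r"\1\t\2\t", text, flags=re.MULTILINE)


sample = r"C:\Users\ken\file.txt"
result = first_and_last_backslash_to_tab(sample)
print(repr(result))
```

The greedy (.*) is what pushes the second backslash match to the last one on the line; any backslashes in between stay inside the second group.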
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

Snap!
kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

Very much appreciated.

So "Not the literal \" is:

[^\\]

Rather than:

[^\]

This actually may clear up one or two similar issues where I was unable to find what I wanted with RE from the past.
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

The regex recogniser used by TextPad before version 7 did not require a backslash in a character set to be quoted with a backslash. So [^\] was the correct expression for any one character that isn't a backslash (note: this is not the same as "not a backslash").

The regex recogniser used by TextPad version 7 allows a character set to contain constructs that use backslash as an escape, such as \n and \x41. Therefore if a backslash is being used as a literal it must be quoted. So [^\\] is now the correct expression for any one character that isn't a backslash.
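
The difference can be checked with any Perl-style engine. The sketch below uses Python's re module, which is an assumption on my part as a stand-in for Boost; the helper name is made up. The unquoted form is rejected because \] is read as a literal ] and the set is never closed, while the quoted form compiles:

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if the engine accepts the pattern, False if it rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


unquoted = r"^([^\]{1,})\\"   # \] is a literal ], so the [ is never closed
quoted = r"^([^\\]{1,})\\"    # \\ is a literal backslash, ] closes the set

print(compiles(unquoted))
print(compiles(quoted))
```

Note that Python raises an error that the caller can catch, which is the step TextPad 7.0.4 was missing.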

Note that the backslash has a split personality. With Perl-style regexes (as in TextPad 7) there is a consistent rule. Before an alphanumeric character the backslash escapes the character: it changes it from a literal to something special. Before a non-alphanumeric character the backslash quotes the character: it changes it from something special to a literal.
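
A small sketch of that rule, again using Python's re module as an assumed stand-in for the engine in TextPad 7. The variable names are only for illustration:

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# Alphanumeric: the backslash turns a literal into something special.
plain_d_matches_digit = matches(r"d", "7")      # d is just the letter d
escaped_d_matches_digit = matches(r"\d", "7")   # \d means any digit

# Non-alphanumeric: the backslash turns something special into a literal.
plain_dot_matches_x = matches(r".", "x")        # . means any character
quoted_dot_matches_x = matches(r"\.", "x")      # \. is only a full stop
quoted_dot_matches_dot = matches(r"\.", ".")

print(plain_d_matches_digit, escaped_d_matches_digit)
print(plain_dot_matches_x, quoted_dot_matches_x, quoted_dot_matches_dot)
```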