Find string within range

gcotterl
Posts: 252
Joined: Wed Mar 10, 2004 8:43 pm
Location: Riverside California USA

Post by gcotterl »

To find rows with '684556' between positions 718 and 1198, I'm using this regular expression:

^.{717,1198}684556

But a dialog box is displayed saying "The complexity of matching the regular expression exceeded the predefined bounds. Try refactoring the regular expression to make each choice made by the state machine unambiguous. This exception is thrown to prevent "eternal" matches that take an indefinite period time to locate."

What would be a better expression?
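
A note on the bounds, as a rough sketch rather than a definitive answer: `.{717,1198}` counts the characters that come before the match, so the regex above accepts '684556' starting anywhere from column 718 to column 1199 (1-based). The Python snippet below checks this on synthetic rows. The helper names `make_row` and `matches` and the `x` padding are made up for illustration, and Python's `re` module is not the engine TextPad uses, so this only illustrates the column arithmetic, not the "complexity" error.

```python
import re

# Same pattern as in the post, compiled with Python's re module (an assumption:
# TextPad's own regex engine may differ in other respects).
PATTERN = re.compile(r"^.{717,1198}684556")
ROW_LENGTH = 1274  # row width reported in this thread


def make_row(start_col: int, needle: str = "684556") -> str:
    """Build a synthetic row with `needle` starting at 1-based column `start_col`."""
    prefix = "x" * (start_col - 1)
    suffix = "x" * (ROW_LENGTH - len(prefix) - len(needle))
    return prefix + needle + suffix


def matches(start_col: int) -> bool:
    """Return True if the pattern matches a row whose needle starts at `start_col`."""
    return PATTERN.search(make_row(start_col)) is not None
```

If the intent is that the whole six-character string must end by column 1198, the upper bound would instead be 1192 (1198 minus the 6 characters of the string), i.e. `^.{717,1192}684556`.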
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

I can't reproduce this.

Is that the exact entire regex?
Please describe the target text (but don't post huge quantities of it).
gcotterl

Post by gcotterl »

Except for the string and the range, it has the same syntax as the regex in your previous reply; see:

http://forums.textpad.com/viewtopic.php ... ght=068345

In my file, each row has 1,274 characters; in some of the rows, the string is located within the bounds specified.
ben_josephs

Post by ben_josephs »

Please answer my questions explicitly.

Is ^.{717,1198}684556 the exact and entire regex you're using?
Please provide one line that elicits the problem.

I have tried many things, but I can't make this search fail or even just run slowly.

Are you using TextPad 7.2.0?
gcotterl

Post by gcotterl »

I'm using TextPad 7.2.0 (32-bit edition).
The exact and entire regex I'm using is: ^.{717,1198}684556
The "Regular Expression" box is ticked.
My file contains 985,945 rows and each row contains 1,274 characters.
The search string ('684556') exists in 175,296 rows within the specified range.
Here is one line:

00961805522013UNIT 1 CM 181/073 INT. IN COMM IN LOT 1-P OF TR 33576 MB 407/001 Y01105201249560004210COPPER CANYON RD PALM SPRINGS 000092262LND000090270STR000206550 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000010000000002968200351280000003847103900100000005921045121000000296826826170000001680068263700000033946684556000000003320000009999999999900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000210986000000000000002109860000000000000UNPAID UNPAID 00000905200000000000000000000000000UNPAID

However, TextPad finds thousands of lines containing the string before the dialog box with the "The complexity..." message is displayed.
ben_josephs

Post by ben_josephs »

I'm still unable to reproduce your problem. I even got TextPad to mark 200 000 long lines matching your regex. It took 9 seconds.

Perhaps the issue only arises with the 32-bit version of TextPad. I'm using the 64-bit version.
gcotterl

Post by gcotterl »

I just tried the regex again.

This time, when I clicked "Mark All", TextPad marked 22 rows in the first 13,822 rows before the "The complexity..." dialog box was displayed.

All of the rows look the same (except for the actual data) and no "weird" characters exist.
gcotterl

Post by gcotterl »

When I placed the cursor on the row below the 22nd marked row and pressed "Find Next", the "The complexity..." dialog box was displayed immediately.
ben_josephs

Post by ben_josephs »

How many rows below the 22nd marked row is the first row that the regex should match?

If it's not too many, post all those lines from, but not including, the 22nd marked row up to and including the next one that the regex should match.

Enclose them in a [code]...[/code] block.
gcotterl

Post by gcotterl »

About 603,000 rows are between the 22nd marked row and the next row that contains the string.
ben_josephs

Post by ben_josephs »

I still can't reproduce your problem.
Which line is the cursor on when the error message is displayed?
Please post that line and (if it's a different line) the next line on which the regex should match. Enclose them in [code]...[/code] blocks.
As I suggested earlier, the issue might arise only with the 32-bit version of TextPad.
gcotterl

Post by gcotterl »

The cursor is on line 13810 (the first line that contains the search string).

Here are lines 13810 and 13811 (both contain the search string and should match the regex):

[code]00961805522013UNIT 1 CM 181/073 INT. IN COMM IN LOT 1-P OF TR 33576 MB 407/001 Y01105201249560004210COPPER CANYON RD PALM SPRINGS 000092262LND000090270STR000206550 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000010000000002968200351280000003847103900100000005921045121000000296826826170000001680068263700000033946684556000000003320000009999999999900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000210986000000000000002109860000000000000UNPAID UNPAID 00000905200000000000000000000000000UNPAID
00961805632013UNIT 2 CM 181/073 INT. IN COMM IN LOT 1-P OF TR 33576 MB 407/001 Y01105201249560004230COPPER CANYON RD PALM SPRINGS 000092262LND000094000STR000159000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000010000000002530000351280000003279103900100000005047045121000000253006826170000001680068263700000033946684556000000003320000009999999999900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000183608000000000000001836080000000000000UNPAID UNPAID 00000905300000000000000000000000000UNPAID
[/code]
ben_josephs

Post by ben_josephs »

... and deselect Disable BBCode in this post.
gcotterl

Post by gcotterl »

Here you go:

The cursor is on line 13810 (the first line that contains the search string).

Here are lines 13810 and 13811 (both contain the search string and should match the regex):

00961805522013UNIT 1 CM 181/073 INT. IN COMM IN LOT 1-P OF TR 33576 MB 407/001 Y01105201249560004210COPPER CANYON RD PALM SPRINGS 000092262LND000090270STR000206550 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000010000000002968200351280000003847103900100000005921045121000000296826826170000001680068263700000033946684556000000003320000009999999999900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000210986000000000000002109860000000000000UNPAID UNPAID 00000905200000000000000000000000000UNPAID 
00961805632013UNIT 2 CM 181/073 INT. IN COMM IN LOT 1-P OF TR 33576 MB 407/001 Y01105201249560004230COPPER CANYON RD PALM SPRINGS 000092262LND000094000STR000159000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000 000000000010000000002530000351280000003279103900100000005047045121000000253006826170000001680068263700000033946684556000000003320000009999999999900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000183608000000000000001836080000000000000UNPAID UNPAID 00000905300000000000000000000000000UNPAID 
 
ben_josephs

Post by ben_josephs »

Sorry, I'm still unable to get 64-bit TextPad to fail in this way. I have no more ideas.
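
One possible fallback outside TextPad, offered only as a sketch: since the rows are fixed-width, the same column check can be done without any regex engine, using a plain substring search restricted to a window of the row. The function name `rows_with_needle` and its parameters are hypothetical. The defaults mirror what `^.{717,1198}684556` accepts, namely a match starting in columns 718 to 1199 (1-based).

```python
from typing import Iterable, Iterator, Tuple


def rows_with_needle(
    lines: Iterable[str],
    needle: str = "684556",
    first_col: int = 718,        # earliest 1-based column where needle may start
    last_start_col: int = 1199,  # latest 1-based column where needle may start
) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, row) for rows where `needle` starts within the column range."""
    lo = first_col - 1                     # 0-based index of the earliest start
    hi = last_start_col - 1 + len(needle)  # exclusive end of the search window
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\r\n")
        # str.find only reports hits that lie entirely inside row[lo:hi].
        if row.find(needle, lo, hi) != -1:
            yield number, row
```

Passing an open file object as `lines` streams through the file one row at a time, so a file of roughly a million rows does not need to be held in memory at once.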