Deleting batches by lines

harrycornelius
Posts: 7
Joined: Tue Jan 16, 2007 12:06 pm

Post by harrycornelius »

Hi All

I have a rather strange request.

I need to delete lines 2-40, 42-80, 82-120 and so on...

And I then need to delete lines 51-2000, 2051-4000, 4051-6000 and so on...

Can anyone shed some light on the best way to do this? If I can crack this one, it will mean a great deal.

Thanks to everyone in advance.

Harry
Bob Hansen
Posts: 1516
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

Regarding this section:
"And I then need to delete lines 51-2000, 2051-4000, 4051-6000 and so on..."
It looks like you are trying to keep the first 50 lines of every block of 2000 lines and delete the other 1950.

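Since TextPad RegEx cannot find a count of \n, you could replace \n with another unique character, like "~", and then do a search for something like ([^~]*~){1950} blocks. (Use [^~]* rather than .* there: once the newlines are gone the whole file is one line, and a greedy .* would run past the next "~".)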

Go to line 51
Find, find next, delete, find next, find next, delete, find next, find next, delete, etc....

When done, replace the "~" with \n and you should be done.
-----------------------------------------

Hmmmm, won't work as described, because after each deletion you still need to skip the next 50 lines, which are to be kept. Maybe do two finds: the first one finds the 50 lines to keep, then the next one finds the 1950 to be deleted.
Starting from line 1: find 50, find 1950, delete, find 50, find 1950, delete, etc.

This probably sounds more confusing than it is. I don't have access to TextPad to test this out right now, but maybe the ideas are making sense and you can make it happen.

ben_josephs will probably show up with a much easier method for you.
Hope this was helpful.............good luck,
Bob
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

These suggestions may be easier, but it could be that they're not the suggestions you want. :-)

This isn't the sort of thing that TextPad is best at. I would write a quick script to do it. For example, in Perl, for the first job:

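my $i = 1 ;
# Read one line at a time rather than loading the whole file into a list.
while ( my $line = <> )
{ # Keep lines 1, 41, 81, ...
  if ( ( $i % 40 ) == 1 )
  { print $line ;
  }
  ++ $i ;
}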
or, more obscurely,

for ( my $i = 1 ; <> ; ++ $i )
{ ( ( $i % 40 ) == 1 ) and print $_ ;
}
For the second job:

my $i = 1 ;
while ( my $line = <> )
{ # Keep lines 1-50, 2001-2050, 4001-4050, ...
  if ( ( ( $i - 1 ) % 2000 ) < 50 )
  { print $line ;
  }
  ++ $i ;
}
or

for ( my $i = 1 ; <> ; ++ $i )
{ ( ( ( $i - 1 ) % 2000 ) < 50 ) and print $_ ;
}
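
For comparison, the same "keep the first few lines of every block" logic can be sketched in Python. This is only an illustrative sketch and not part of the original reply; the function name keep_lines and its parameters keep and period are made up here. It handles lines one at a time, so it should not need to hold a large file in memory.

```python
def keep_lines(lines, keep, period):
    """Yield the first `keep` lines of every block of `period` lines.

    Hypothetical helper for illustration; `lines` is any iterable of strings.
    """
    for index, line in enumerate(lines):  # index is 0-based
        if index % period < keep:
            yield line


# First job: keep lines 1, 41, 81, ... (i.e. delete 2-40, 42-80, ...)
sample = [f"line {n}\n" for n in range(1, 121)]
first_job = list(keep_lines(sample, keep=1, period=40))

# Second job: keep lines 1-50, 2001-2050, ... (i.e. delete 51-2000, 2051-4000, ...)
big_sample = [f"line {n}\n" for n in range(1, 4001)]
second_job = list(keep_lines(big_sample, keep=50, period=2000))
```

To filter a real file, an open file object can be passed instead of a list, for example sys.stdout.writelines(keep_lines(sys.stdin, 1, 40)).
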
Alternatively, you could use WildEdit (http://www.textpad.com/products/wildedit/):
Find what: (.*\r?\n)(.*\r?\n){39}
Replace with: $1

[X] Regular expression
[X] Replacement format

Options
[X] '.' does not match a newline character
and
Find what: ((.*\r?\n){50})(.*\r?\n){1950}
Replace with: $1
The \r?\n bit allows it to cope with files with either DOS or Unix line endings.

You'll have to buy a licence for WildEdit to use it for files of the size you've indicated.
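
To see what the two patterns do, and that \r?\n copes with both kinds of line ending, here is a small check using Python's re module. This is a sketch under assumptions: Python's engine is not WildEdit's, the replacement is written \1 rather than $1, and the numbered test text and the helper name make_text are invented for the example.

```python
import re

# Same patterns as above. In Python's re, '.' does not match a newline by default.
STAGE1 = re.compile(r"(.*\r?\n)(.*\r?\n){39}")
STAGE2 = re.compile(r"((.*\r?\n){50})(.*\r?\n){1950}")


def make_text(line_count, newline):
    """Build a numbered test file with the given line ending (hypothetical helper)."""
    return "".join(f"{n}{newline}" for n in range(1, line_count + 1))


# First pattern: keeps line 1 of every 40 lines.
stage1_unix = STAGE1.sub(r"\1", make_text(120, "\n"))
stage1_dos = STAGE1.sub(r"\1", make_text(120, "\r\n"))

# Second pattern: keeps the first 50 of every 2000 lines.
stage2_unix = STAGE2.sub(r"\1", make_text(4000, "\n"))
stage2_dos = STAGE2.sub(r"\1", make_text(4000, "\r\n"))
```

In both cases the surviving lines keep whatever line ending they had, because the ending is part of the captured group.
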

Post by harrycornelius »

Thanks for the help.

I have purchased WildEdit, and will try it in there.

I'll pass on any relevant info.

Thanks everyone,

Harry

Post by harrycornelius »

Hi Ben

That has worked a treat. I can't thank you enough, as I've been trying to find a solution to this for ages.

I'm writing an article for a company that provide height data to local authorities - and I'll give WildEdit/TextPad a plug, if this is OK with you. This article goes out to all Local Authorities in the UK.

It means that a height data file of more than 100 MB at 5m intervals now becomes a much more usable file of about 65 KB at 200m grid intervals - which is good enough for most applications, especially where the land coverage is large. These grids are then draped with JPEG aerial photography and brought into VRML, where buildings, trees, digital photos etc. can be added.
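
As a rough check of those figures, the arithmetic can be sketched as below. The assumptions are mine, not stated in the thread: the file is thinned by the same factor along each of the two grid axes, and "100Mb" is read as about 100 decimal megabytes.

```python
# Figures taken from the post above; the decimal-megabyte reading is an assumption.
original_spacing_m = 5
target_spacing_m = 200
original_size_bytes = 100 * 1000 * 1000  # "100Mb plus", read as roughly 100 MB

# Thinning by this factor along each of the two grid axes...
factor_per_axis = target_spacing_m // original_spacing_m  # 40
# ...keeps one point in factor_per_axis squared.
overall_reduction = factor_per_axis ** 2  # 1600

estimated_size_kb = original_size_bytes / overall_reduction / 1000
print(f"keep 1 point in {overall_reduction}, roughly {estimated_size_kb:.1f} KB")
```

The factor of 40 matches the first replacement (one line kept, 39 dropped), and the estimate lands in the same range as the 65 KB quoted.
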

I thought you might be interested in the application.

Thanks again,
Harry

Post by ben_josephs »

You're welcome.

I have no problem with your plugging Helios's products. I have no connection with them except that I use TextPad. :-)

A new problem..

Post by harrycornelius »

OK, I'm now trying to create a 100m grid using the following syntax:

Stage1: (.*\r?\n)(.*\r?\n){19}
Replace with: $1

and

Stage2: ((.*\r?\n){100})(.*\r?\n){1900}
Replace with: $1

The first stage works fine, but the second stage 'hangs' and gives me a 'memory exhausted' message. Does anyone have any ideas?

Thanks,
Harry

Post by ben_josephs »

I don't actually use WildEdit, so I haven't got a licence for it. Therefore I can't test it on files as big as yours.

But you might try
((?:.*\r?\n){100})(?:.*\r?\n){1900}
which doesn't capture subexpressions you're not going to use.

If your files have DOS line endings you can use
((?:.*\r\n){100})(?:.*\r\n){1900}
and if they have Unix line endings you can use
((?:.*\n){100})(?:.*\n){1900}
although I doubt that will make much difference.
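
A quick way to convince yourself that the non-capturing form selects exactly the same lines is to run both patterns over a small synthetic file. The sketch below uses Python's re module and made-up numbered lines, so it only checks that the results are equivalent; it says nothing about how much memory WildEdit itself would use.

```python
import re

# The Stage2 pattern from the question and the non-capturing variant suggested above.
CAPTURING = re.compile(r"((.*\r?\n){100})(.*\r?\n){1900}")
NON_CAPTURING = re.compile(r"((?:.*\r?\n){100})(?:.*\r?\n){1900}")

# Synthetic input: 4000 numbered lines (invented for this example).
text = "".join(f"{n}\n" for n in range(1, 4001))

with_captures = CAPTURING.sub(r"\1", text)
without_captures = NON_CAPTURING.sub(r"\1", text)

# Lines that survive: 1-100 and 2001-2100.
kept = without_captures.splitlines()
```

Only group 1 is used in the replacement, so dropping the other captures changes what the engine has to remember, not what it matches.
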