Deleting batches by lines
-
- Posts: 7
- Joined: Tue Jan 16, 2007 12:06 pm
Hi All
I have a rather strange request.
I need to delete lines 2-40, 42-80, 82-120 and so on...
And I then need to delete lines 51-2000, 2051-4000, 4051-6000 and so on...
Can anyone shed some light on the best way to do this? If I can crack this one, it will mean a great deal.
Thanks to everyone in advance.
Harry
- Bob Hansen
- Posts: 1516
- Joined: Sun Mar 02, 2003 8:15 pm
- Location: Salem, NH
- Contact:
Regarding this section:
"And I then need to delete lines 51-2000, 2051-4000, 4051-6000 and so on..."
It looks like you are trying to delete every other block of 1950 lines.
Since TextPad RegEx cannot find a count of \n, you could replace \n with another unique character, like "~", and then do a search for something like (.*~){1950} blocks.
Go to line 51
Find, find next, delete, find next, find next, delete, find next, find next, delete, etc....
When done, replace the "~" with \n and you should be done.
-----------------------------------------
Hmmmm, won't work, because you will still need to move another 50 characters for each next find. Maybe do two finds. The first one finds 2000 characters, then next one does the 1950 to be deleted.
Find 2000, find 1950, delete, find 2000, find 1950, delete, etc.
This probably sounds more confusing than I mean it to. I don't have access to TextPad to test this out right now, but maybe the ideas are making sense and you can make it happen.
ben_josephs will probably show up with a much easier method for you.
Hope this was helpful.............good luck,
Bob
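For anyone who wants to try the marker idea outside TextPad, here is a rough Python sketch. It assumes "~" never occurs in the data, and the names (thin_with_marker, keep, drop) are made up for illustration. Rather than alternating manual finds and deletes, it matches a kept block and a deleted block together and puts back only the kept one:

```python
import re

def thin_with_marker(text, keep, drop, marker="~"):
    """Join lines with a marker, drop blocks by regex, then restore newlines."""
    if marker in text:
        raise ValueError("marker already occurs in the text")
    joined = text.replace("\n", marker)
    m = re.escape(marker)
    # After the replacement, one original line is a run of
    # non-marker characters followed by the marker.
    unit = "[^" + m + "]*" + m
    pattern = "((?:%s){%d})(?:%s){%d}" % (unit, keep, unit, drop)
    # Keep the first `keep` lines of each block, discard the next `drop`.
    thinned = re.sub(pattern, r"\1", joined)
    return thinned.replace(marker, "\n")
```

For the job in question the call would be thin_with_marker(text, 50, 1950). A final block shorter than keep + drop lines is left as it is.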
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
These suggestions may be easier, but it could be that they're not the suggestions you want.
This isn't the sort of thing that TextPad is best at. I would write a quick script to do it. For example, in Perl, for the first job:
Code:
my $i = 1 ;
# Read one line at a time and keep lines 1, 41, 81, ...
while ( my $line = <> )
{   if ( ( $i % 40 ) == 1 )
    {   print $line ;
    }
    ++ $i ;
}
or, more obscurely,
Code:
for ( my $i = 1 ; <> ; ++ $i )
{   ( ( $i % 40 ) == 1 ) and print $_ ;
}
For the second job:
Code:
my $i = 1 ;
# Keep the first 50 lines of every block of 2000
while ( my $line = <> )
{   if ( ( ( $i - 1 ) % 2000 ) < 50 )
    {   print $line ;
    }
    ++ $i ;
}
or
Code:
for ( my $i = 1 ; <> ; ++ $i )
{   ( ( ( $i - 1 ) % 2000 ) < 50 ) and print $_ ;
}
Alternatively, you could use WildEdit (http://www.textpad.com/products/wildedit/):
Find what: (.*\r?\n)(.*\r?\n){39}
Replace with: $1
[X] Regular expression
[X] Replacement format
Options
[X] '.' does not match a newline character
and
Find what: ((.*\r?\n){50})(.*\r?\n){1950}
Replace with: $1
The \r?\n bit allows it to cope with files with either DOS or Unix line endings.
You'll have to buy a licence for WildEdit to use it for files of the size you've indicated.
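For readers without Perl, the same modulo test can be sketched in Python. This is an illustrative equivalent, not something posted in the thread, and the names keep_line and thin are made up. Lines are numbered from 1, and a line is kept when it falls among the first `keep` lines of its block of `period` lines:

```python
def keep_line(n, period, keep):
    """True if 1-based line number n is among the first `keep`
    lines of its block of `period` lines."""
    return (n - 1) % period < keep

def thin(lines, period, keep):
    """Lazily yield only the lines that keep_line() accepts."""
    for n, line in enumerate(lines, start=1):
        if keep_line(n, period, keep):
            yield line
```

The first job corresponds to thin(lines, 40, 1) and the second to thin(lines, 2000, 50). Passing an open file object as `lines` processes the file one line at a time, so the whole file never has to sit in memory.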
-
- Posts: 7
- Joined: Tue Jan 16, 2007 12:06 pm
Hi Ben
That has worked a treat, I can't thank you enough as I've been trying to find a solution to this for ages.
I'm writing an article for a company that provides height data to local authorities - and I'll give WildEdit/TextPad a plug, if this is OK with you. This article goes out to all Local Authorities in the UK.
It means that a previously 100 MB plus height data file at 5m intervals now becomes a much more usable file at 65 KB and 200m grid intervals - which is good enough for most applications, especially where the land coverage is large. These grids are then draped with JPEG aerial photography and brought into a VRML where buildings, trees, digital photos etc can be added.
I thought you might be interested in the application.
Thanks again,
Harry
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
-
- Posts: 7
- Joined: Tue Jan 16, 2007 12:06 pm
A new problem..
OK, I'm now trying to create a 100m grid using the following syntax:
Stage1: (.*\r?\n)(.*\r?\n){19}
Replace with: $1
and
Stage2: ((.*\r?\n){100})(.*\r?\n){1900}
Replace with: $1
The first stage works fine, but the second stage 'hangs' and gives me a 'memory exhausted' message. Does anyone have any ideas?
Thanks,
Harry
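The two stages above can also be combined into a single streaming pass, which may sidestep the regex engine's memory use altogether. A minimal Python sketch follows. It assumes stage 2 is applied to the output of stage 1, as described, and the function and parameter names are hypothetical:

```python
def two_stage(lines, step=20, period=2000, keep=100):
    """Stage 1: keep every `step`-th line (1, 21, 41, ...).
    Stage 2: of the survivors, keep the first `keep` of every `period`."""
    survivors = 0  # number of lines that have passed stage 1 so far
    for n, line in enumerate(lines, start=1):
        if (n - 1) % step != 0:
            continue  # dropped by stage 1
        survivors += 1
        if (survivors - 1) % period < keep:
            yield line  # passed stage 2 as well
```

With the defaults, each run of 40,000 input lines yields 100 output lines: stage 1 leaves 2,000 of them, and stage 2 keeps the first 100 of those.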
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
I don't actually use WildEdit, so I haven't got a licence for it. Therefore I can't test it on files as big as yours.
But you might try
((?:.*\r?\n){100})(?:.*\r?\n){1900}
which doesn't capture subexpressions you're not going to use.
If your files have DOS line endings you can use
((?:.*\r\n){100})(?:.*\r\n){1900}
and if they have Unix line endings you can use
((?:.*\n){100})(?:.*\n){1900}
although I doubt that will make much difference.
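The non-capturing pattern can be tried on a small synthetic input before it is run on a large file. The sketch below uses Python's re module rather than WildEdit, so it only illustrates what the expression matches and says nothing about WildEdit's memory behaviour. The name thin_blocks is made up, and Python writes the replacement as \1 where WildEdit's replacement format uses $1:

```python
import re

# Only the first 100 lines are captured; the (?:...) groups
# match text without storing it.
PATTERN = re.compile(r"((?:.*\r?\n){100})(?:.*\r?\n){1900}")

def thin_blocks(text):
    """Keep the first 100 lines of each complete block of 2000 lines."""
    return PATTERN.sub(r"\1", text)
```

One thing to watch for: a final block of fewer than 2000 lines does not match and is left unchanged, so the end of the output is worth checking if the line count is not a multiple of 2000.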