Delete each line with less than 4 words and...

Posted: Mon May 02, 2011 11:06 am
by beric
I need to delete every line that contains fewer than 4 words or more than 10.
How can I do that?

Posted: Mon May 02, 2011 11:47 am
by SteveH
Can you provide some more information on the sort of text you want to search? Is this structured data or English prose? Is it punctuated in some way?

If this is prose rather than data, then words can be separated by spaces, commas, semicolons, question marks, etc.

Please post an example.

Posted: Mon May 02, 2011 11:48 am
by ben_josephs
Use POSIX regular expression syntax:
Configure | Preferences | Editor

[X] Use POSIX regular expression syntax

Remove lines that contain fewer than 4 words:
Find what: ^(\<[a-z0-9_']+\>[^a-z0-9_']*){1,3}\n
Replace with: [nothing]

[X] Regular expression

Replace All

Remove lines that contain more than 10 words:
Find what: ^(\<[a-z0-9_']+\>[^a-z0-9_']*){11,}\n
Replace with: [nothing]

[X] Regular expression

Replace All

Posted: Mon May 02, 2011 11:54 am
by ben_josephs
These are more accurate:

Remove lines that contain fewer than 4 words:
Find what: ^[^a-z0-9_']*([a-z0-9_']+[^a-z0-9_']+){0,2}[a-z0-9_']+[^a-z0-9_']*\n
Replace with: [nothing]

[X] Regular expression

Replace All

Remove lines that contain more than 10 words:
Find what: ^[^a-z0-9_']*([a-z0-9_']+[^a-z0-9_']+){10,}[a-z0-9_']+[^a-z0-9_']*\n
Replace with: [nothing]

[X] Regular expression

Replace All
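
For anyone who needs the same filtering outside the editor, here is a minimal Python sketch of the idea. It is not part of the original answer. It assumes the same notion of a "word" as the patterns above (letters, digits, underscores and apostrophes) and assumes case-insensitive matching. The function name `filter_by_word_count` and its parameters are made up for illustration. Unlike the patterns above, it also drops lines with no words at all.

```python
import re

# Same character class as the patterns above. IGNORECASE is an assumption:
# it mirrors a search with case matching turned off.
WORD = re.compile(r"[a-z0-9_']+", re.IGNORECASE)


def filter_by_word_count(text, min_words=4, max_words=10):
    """Keep only the lines whose word count lies in [min_words, max_words]."""
    kept = []
    for line in text.splitlines():
        count = len(WORD.findall(line))  # each maximal run of word characters is one word
        if min_words <= count <= max_words:
            kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "too short",                              # 2 words: removed
    "one two three four",                     # 4 words: kept
    "don't split on the apostrophe, please",  # 6 words: kept
    "a b c d e f g h i j",                    # 10 words: kept
    "a b c d e f g h i j k",                  # 11 words: removed
])

print(filter_by_word_count(sample))
```

If blank lines should survive, as they do with the patterns above, add `or count == 0` to the condition.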

Posted: Tue May 03, 2011 8:42 pm
by beric
Great, thanks!

Posted: Wed Jul 06, 2011 8:26 pm
by actroid
Thanks, I followed your instructions and they worked well.
I have a file like this:

mybuyeremail@email.com:mybuyeraddress
mybuyeremail3@email.com:
mybuyeremail4@email.com:
mybuyeremail1@email.com:mybuyeraddress1


How can I delete all lines that do not contain mybuyeraddress?

Thank you.

Posted: Wed Jul 06, 2011 9:47 pm
by ben_josephs
Search | Find... (<F5>):
Find what: mybuyeraddress

Mark All
<Esc>
Search | Invert All Bookmarks
Edit | Delete | Bookmarked Lines
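
The bookmark steps above amount to "keep only the lines that contain the search text". A minimal Python sketch of that filter follows, for cases where the file is processed outside the editor. It is not part of the original answer, and both function names are hypothetical. The second function rests on an assumption about the sample data: that "mybuyeraddress" is a placeholder and the real goal is to drop lines with nothing after the colon.

```python
def keep_lines_containing(text, needle):
    """Return only the lines of `text` that contain `needle`."""
    return "\n".join(line for line in text.splitlines() if needle in line)


def keep_lines_with_value(text, sep=":"):
    """Return only the lines that have non-blank text after the first `sep`.

    Assumption: an empty field after the separator is what marks a line
    for deletion, as in the sample data.
    """
    kept = []
    for line in text.splitlines():
        _, found, value = line.partition(sep)
        if found and value.strip():
            kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "mybuyeremail@email.com:mybuyeraddress",
    "mybuyeremail3@email.com:",
    "mybuyeremail4@email.com:",
    "mybuyeremail1@email.com:mybuyeraddress1",
])

print(keep_lines_containing(sample, "mybuyeraddress"))
print(keep_lines_with_value(sample))
```

On the sample data both functions keep the same two lines. They differ once the real addresses no longer share a common substring.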