Find text that does not contain other text

Posted: Thu May 08, 2008 5:46 pm
by melvers
Several times now, it would have been very useful to have a 'not' type operator in the Find dialog that works on words/strings (not just a single character).

Something like Find: !\(if \)something

This would find all occurrences of the word 'something' that are not preceded by 'if '.

Posted: Fri May 09, 2008 3:45 am
by HerNameWasTextPad
Depending upon what \(if \) is, you might be able to construct an expression that does in fact exclude. For example, consider this expression:

.*[^i][^f] something

The above will find all instances of "something," except for those that are preceded exactly by the three characters "if " (without the quotes).

If (excuse the pun) you cannot be that precise, then you will need to construct a messier expression, in order to cover further possibilities.

If you need even greater flexibility, however, then TextPad's implementation of regular expressions will just have to be considered less than what you need.

I myself need an implementation that pays me five bucks every time I run it--but I haven't found one yet.

By the way, you may find the key sequence "F5, Enter, Esc, RightArrow" useful when using the .* construct in the above way, once the appropriate expression is loaded in the "Find what:" text box of the "Find" dialog box.

Edit:

Unfortunately, I have just discovered that the above expression will also exclude such items as "of something" or "is something," so this is NOT the solution that it appears to be.
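
To see why the expression over-excludes, here is a minimal sketch that runs it through Python's re module rather than TextPad (an assumption: this particular expression uses only syntax that both engines read the same way). The sample lines and the name FIRST_TRY are made up for illustration.

```python
import re

# The expression from the post above, unchanged.
FIRST_TRY = re.compile(r".*[^i][^f] something")

# Hypothetical sample lines, chosen to show the over-exclusion.
samples = ["if something", "of something", "is something", "do something"]

for line in samples:
    found = FIRST_TRY.search(line) is not None
    print(f"{line!r}: {'found' if found else 'missed'}")
```

Each character class is tested on its own, so a line is rejected when the first character is "i" OR the second is "f", not only when both are true together.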

Posted: Fri May 09, 2008 5:15 am
by HerNameWasTextPad
Just to spin your wheels, think about this:

- Make two copies of the file you need to search
- In a separate instance of TextPad, open both copies
- On Copy 1, Find Next "if something", just to load "if something" into the "Find what" text box of the "Find" dialog box
- On that same Copy 1, run a macro (select "repeat to end of file" on Macros -> Multi-play...) that goes F5, Enter, Esc, Delete in order to delete from Copy 1 all instances of "if something"
- After the macro has finished, on that same Copy 1, Mark All instances of "something"

Now, you have all the lines that contain instances of "something" bookmarked--excluding "if something"--so, at least, you can more easily go to the appropriate line numbers in Copy 2 to edit manually.

Every little bit helps.

Posted: Fri May 09, 2008 5:47 am
by HerNameWasTextPad
OK, last try:

- Replace all instances of "something" by "replacement"
- Replace all instances of "if replacement" by "if something"

I think you're done.
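
The two replacements above can be sketched in Python as plain string operations. The function name and its parameters are hypothetical, and the sketch assumes the text does not already contain "if replacement", since the second step would turn that into "if something" as well.

```python
def replace_unless_preceded(text, target="something", prefix="if ",
                            replacement="replacement"):
    """Replace target everywhere, then undo the replacement after prefix."""
    # Step 1: replace all instances of the target.
    text = text.replace(target, replacement)
    # Step 2: restore the instances that were preceded by the prefix.
    return text.replace(prefix + replacement, prefix + target)


print(replace_unless_preceded("if something, or something"))
```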

Posted: Fri May 09, 2008 5:43 pm
by ben_josephs
TextPad's regular expression recogniser is very old and is just too weak to do this without a horror such as
([^i]..|.[^f].|..[^ ]|^.?.?)something
(You can do it properly with look-behind assertions, but even WildEdit doesn't offer those.)
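
For comparison, here is a minimal sketch in Python (whose re module does support look-behind; this is not TextPad) that checks the workaround expression above against a negative look-behind on a few made-up lines. The names LOOKBEHIND, WORKAROUND and has_match are illustrative only.

```python
import re

# Negative look-behind: 'something' not immediately preceded by 'if '.
LOOKBEHIND = re.compile(r"(?<!if )something")

# The workaround expression from this post, unchanged.
WORKAROUND = re.compile(r"([^i]..|.[^f].|..[^ ]|^.?.?)something")


def has_match(pattern, line):
    """Return True if the pattern matches anywhere in the line."""
    return pattern.search(line) is not None


# Hypothetical sample lines.
samples = [
    "if something",
    "what if something",
    "of something",
    "is something",
    "something",
    "a something",
    "something and also if something",
]

for line in samples:
    print(f"{line!r}: look-behind={has_match(LOOKBEHIND, line)}, "
          f"workaround={has_match(WORKAROUND, line)}")
```

The two agree on whether each of these lines contains a hit, but the workaround's match also swallows up to three characters in front of 'something', so the matched text itself differs.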

But you can bookmark the lines in four steps. To mark lines containing 'something' that don't contain 'if something':

mark lines containing 'something'
invert all bookmarks (Search | Invert All Bookmarks)
mark lines containing 'if something'
invert all bookmarks

This isn't perfect, because it will not mark lines that contain 'something' both with the 'if ' and without the 'if ', such as
something and also if something

But why does it work at all?

If the set of lines containing 'something' is A, and the set of lines containing 'if something' is B, you want to find the lines in A that are not in B, that is, A intersect not B; in symbols: A & ¬ B (using ¬ for not and & for intersection).

TextPad lets you invert bookmarks and form the union of two sets of bookmarks (if you mark a set of lines A, and mark another set B, you have marked the union of A and B; in symbols: A + B (using + for union)). But it doesn't let you form the intersection, which is what you need.

However, A & ¬ B = ¬ ( ¬ A + B ) , that is, not ( ( not A ) union B ) .

So:

mark lines containing 'something' ( A )
invert all bookmarks ( ¬ A )
mark lines containing 'if something' ( ¬ A + B )
invert all bookmarks ( ¬ ( ¬ A + B ) )
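
The four steps can be checked with Python sets of line numbers standing in for TextPad's bookmarks. This is only a sketch of the set identity; the function name and the sample lines are made up, and substring tests stand in for the Find dialog.

```python
def four_step_bookmarks(lines, wanted="something", unwanted="if something"):
    """Simulate mark / invert / mark / invert on 1-based line numbers."""
    universe = set(range(1, len(lines) + 1))
    a = {n for n, text in enumerate(lines, start=1) if wanted in text}
    b = {n for n, text in enumerate(lines, start=1) if unwanted in text}

    marks = set()               # start with no bookmarks at all
    marks |= a                  # mark lines containing 'something'     (A)
    marks = universe - marks    # invert all bookmarks                  (not A)
    marks |= b                  # mark lines containing 'if something'  (not A + B)
    marks = universe - marks    # invert all bookmarks                  (not (not A + B))
    return marks, a - b         # a - b is A & not B, computed directly


lines = [
    "do something",
    "if something",
    "nothing here",
    "something and also if something",
]
marks, direct = four_step_bookmarks(lines)
print(sorted(marks), sorted(direct))
```

Line 4 ends up unmarked because it belongs to B, which is the imperfection described above.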

Edit: Corrected typo.

Posted: Fri May 09, 2008 7:09 pm
by HerNameWasTextPad
Excellent set analysis, Ben.

I was toying with something similar, but gave up when I realized that it didn't cover ALL lines containing "something"; however, I like yours better:

- "Mark All" of what you DON'T want (B)
- Search -> Invert All Bookmarks (not B)
- "Find Next" what you DO want (A)

Then, whenever you have BOTH a "bookmark" AND a "find" on the same line (A & not B), you've found what you were looking for; however, this can miss an instance of what you DO want that occurs on the same line as an instance of what you DON'T want.
----------

If bookmarking is your objective, then try this particular algorithm:

- Replace All instances of "something" by "abcxyzabc"
- Replace All instances of "if abcxyzabc" by "if something"
- Mark All "abcxyzabc"
- Replace All "abcxyzabc" by "something"
- Enjoy the bookmarks and use F2 to move through the file
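
As a rough sketch of the same placeholder idea in Python, the function below returns the line numbers that would end up bookmarked. The function name is hypothetical, and it assumes the placeholder "abcxyzabc" does not already occur anywhere in the text.

```python
def bookmark_lines(lines, target="something", prefix="if ",
                   placeholder="abcxyzabc"):
    """Return 1-based numbers of lines with target not preceded by prefix."""
    marked = []
    for number, line in enumerate(lines, start=1):
        # Replace all instances of the target by the placeholder.
        swapped = line.replace(target, placeholder)
        # Put the target back wherever the prefix came first.
        swapped = swapped.replace(prefix + placeholder, prefix + target)
        # Any placeholder left over is an instance we want to bookmark.
        if placeholder in swapped:
            marked.append(number)
    return marked


lines = [
    "if something",
    "do something",
    "something and also if something",
    "nothing here",
]
print(bookmark_lines(lines))
```

Unlike the invert-bookmarks route, this also marks a line that has 'something' both with and without the 'if '.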

I knew Helios had a good reason to forgo the not operator. It just took me 7 years to figure out what it was.

Posted: Tue May 13, 2008 5:36 am
by MudGuard
Ben, you forgot the first step:

Clear all Bookmarks

(Mark All does not remove any existing bookmarks; it just adds bookmarks on lines containing the search expression.)

Posted: Mon Jun 16, 2008 2:14 pm
by melvers
Thanks for the suggestions. They will be useful when I only have a couple of files to deal with. However, more often I need to be able to search 100s of files. Fortunately, this isn't an everyday occurrence. More an every now-and-then type thing.