find words with the same beginning
hi there,
i don't know if i'm just too stupid or if i just didn't search as i should, but i didn't find any answer to my problem.
i have a text. in this text, there are normally a lot of names, beginning always with the same 3 letters, f. ex
ab- (then, there are some more letters, f.ex the whole name is ab-cdef-gh-1)
and i found out how to mark the line, where these words are, and i can cut the marked lines, but what i need is just all the names that begin with this ab- ...
can anyone help or give me a search - hint?
thx
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
You haven't provided enough information.
If the names can be anywhere in the line, and
if there is only one matching name on any line, and
if the characters in the names are restricted to letters, digits and hyphens, and
if names embedding the prefix (such as cd-abef-gh-1) are not to be matched,
then
try this:
Use "Posix" regular expression syntax:
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
Search | Replace... (<F8>):
Find what: (^|.*[^a-z0-9-])(ab-[a-z0-9-]+).*
Replace with: \2
[X] Regular expression
Replace All
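For anyone who wants to check how this expression behaves outside TextPad, here is a minimal Python sketch. It assumes Python's `re` module treats this particular pattern closely enough to TextPad's POSIX syntax; the function name `keep_name` is made up for illustration and the names are assumed to be lowercase.

```python
import re

# The "Find what" expression from the post above, unchanged.
PATTERN = re.compile(r"(^|.*[^a-z0-9-])(ab-[a-z0-9-]+).*")


def keep_name(line: str) -> str:
    """Replace a whole line with the name it contains (group 2), if any.

    Lines without a matching name are returned unchanged.
    """
    return PATTERN.sub(r"\2", line, count=1)


print(keep_name("bla bla ab-cdef-gh-1 bla"))  # a line with one name
print(keep_name("xx-ab-cd-1 bla"))            # ab- embedded in a longer word
```

Note that this keeps at most one name per line, which is why the "only one matching name on any line" condition matters, and that lines without a name are left as they are.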
Last edited by ben_josephs on Tue May 17, 2011 11:24 am, edited 1 time in total.
In the preferences, make sure you have enabled POSIX regular expressions.
If you perform the following search and replace, it will delete all (nearly all, see below) lines containing the string 'ab-':
Find what: ^.*ab-.*\n
Replace with: [nothing]
[X] Regular expression
Replace All
This matches, from the start of a line (^), any characters (.) repeated zero or more times (*), followed by the string 'ab-', then any further characters up to and including the line feed (\n) that ends the line.
Where this won't work is if the last line of the file contains one of the names. This is because there is no line feed there.
It might be worth checking out TextPad's help file on regular expression searching, as it's a good introduction.
Hope this helps.
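To see the last-line limitation concretely, here is a small Python sketch of the same expression. It is only an illustration: `re.MULTILINE` is used so that ^ matches at the start of every line, and the sample text is invented.

```python
import re

# Same expression as in the post; MULTILINE lets ^ match at each line start.
DELETE_LINES = re.compile(r"^.*ab-.*\n", re.MULTILINE)

# Invented sample: the last line has a name but no trailing line feed.
text = "keep me\nbla ab-cdef-gh-1 bla\nkeep me too\nab-last-gh-2"

result = DELETE_LINES.sub("", text)
print(result)
```

The second line is removed together with its line feed, while the unterminated last line stays because the expression requires a \n.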
Running TextPad 5.4 on Windows XP SP3 and on OS X 10.7 under VMWare or Crossover.
The OP wants to keep the matching words, not delete the lines containing them.
And ^.*ab-.*\n matches lines containing words that embed ab- as well as lines containing words that begin with it.
Last edited by ben_josephs on Tue May 17, 2011 11:29 am, edited 1 time in total.
I've just re-read your reply and realised that you probably don't want to cut all the lines that contain the 'ab-' strings as I first thought.
ben_josephs' answer is appropriate if you want to list them. I'll leave my incorrect answer up in case it's of interest.
As an aside, is there a way to delete the last line if it contains the name?
A line not terminated with a line terminator is arguably not a line.
I don't believe that using a single TextPad regex you can match both terminated lines (with their terminators) and an unterminated last line, because TextPad's weak regex recogniser doesn't allow a \n to be in an alternation, to be quantified, or to be contained in a parenthesised expression. (It appears to allow \n?, but that doesn't work.)
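Outside TextPad, a regex engine that lets \n be quantified can handle both cases in one expression. The following is a minimal Python sketch under that assumption, not something TextPad itself accepts; the names `DELETE_LINES_ANY` and `delete_matching_lines` are hypothetical.

```python
import re

# \n? makes the line terminator optional, so an unterminated
# last line containing 'ab-' is matched as well.
DELETE_LINES_ANY = re.compile(r"^.*ab-.*\n?", re.MULTILINE)


def delete_matching_lines(text: str) -> str:
    """Remove every line containing 'ab-', terminated or not."""
    return DELETE_LINES_ANY.sub("", text)


print(delete_matching_lines("keep\nab-x-1 bla\nkeep too\nab-last-2"))
```

The line feed of the preceding kept line is left in place, so the result ends with a terminated line.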
hi there,
thx for your answers...
indeed i wanted to be able to copy these words... with ^.*ab-.* i can find the lines that have these expressions, but is there a way to just mark these words, and not the whole line?
what i am trying to do is to copy a whole text and just search for the names and copy them in a new textfile...
thx
pfistyle wrote: what i am trying to do is to copy a whole text and just search for the names and copy them in a new textfile...
Can you provide a sample of the file format to see what is required? If you can show this as Code to keep the formatting, it will help.
For anyone to provide a solution, you need to give some guidance by answering ben_josephs' questions about the positions of the names and whether the 'ab-' string can occur elsewhere (such as cd-efab-gh-1).
hi there,
sorry for being absent for a while.
the text looks f. ex. like this:
bla bla bla ab-cdef-gh-01 bla bla bla bla bla
ab-ijkl-gh-07 bla bla bla ab-ijkl-gh-08 bla
bla bla bla bla ab-mnop-gh-003 bla bla bla
bla bla ab-qrst-gh-05 bla bla:
ab-uvw-gh-01
bla bla bla...
hope this helped to make clear the structure... now the expressions i look for are always beginning with the same two letters, a "-", then 3 to 4 letters random, another "-", then the same two letters and in the end there is always a 2 or 3 digit number.
thanks!
Please explain precisely in what way my earlier suggestion isn't suitable.
Here's another, more restrictive, suggestion that should work with the style of text in your example:
Find what: (^|.* )(ab-[a-z]+-gh-[0-9]+).*
Replace with: \2
[X] Regular expression
Replace All
Or even
Find what: .*\<(ab-[a-z]+-gh-[0-9]+).*
Replace with: \1
[X] Regular expression
Replace All
These assume you are using "Posix" regular expression syntax:
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
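Since the sample has a line with two names on it, and a replace of this kind keeps only one name per line, here is a hedged Python sketch that lists every name instead. It is an alternative outside TextPad, not part of the suggestions above; the lookbehind `(?<![a-z0-9-])` is an assumption standing in for the "not embedded in a longer word" condition, and `extract_names` is a made-up helper name.

```python
import re

# The example text posted earlier in the thread.
SAMPLE = """\
bla bla bla ab-cdef-gh-01 bla bla bla bla bla
ab-ijkl-gh-07 bla bla bla ab-ijkl-gh-08 bla
bla bla bla bla ab-mnop-gh-003 bla bla bla
bla bla ab-qrst-gh-05 bla bla:
ab-uvw-gh-01
bla bla bla...
"""

# ab-, letters, -gh-, digits; not preceded by a letter, digit or hyphen.
NAME = re.compile(r"(?<![a-z0-9-])ab-[a-z]+-gh-[0-9]+")


def extract_names(text: str) -> list[str]:
    """Return every matching name in order of appearance."""
    return NAME.findall(text)


names = extract_names(SAMPLE)
print("\n".join(names))
```

The resulting list can then be pasted or written into a new text file, one name per line.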
The other thing to try is to make a TextPad file containing your example text and run the search and replace on that.
If it works on that text (it should) and not on the 'real' file then your example may not represent the real file.
bla bla bla ab-cdef-gh-01 bla bla bla bla bla
ab-ijkl-gh-07 bla bla bla ab-ijkl-gh-08 bla
bla bla bla bla ab-mnop-gh-003 bla bla bla
bla bla ab-qrst-gh-05 bla bla:
ab-uvw-gh-01
bla bla bla