I'm trying to tidy up a large body of text which has 'em' dashes in it, by adding spaces around each 'em' dash. But I want to exclude dashes which already have spaces around them. Here's an example with both cases in it:
This is--in my opinion--the best White Castle burger I've ever had -- and I've eaten a lot of them.
which should become:
This is -- in my opinion -- the best White Castle burger I've ever had -- and I've eaten a lot of them.
I've tried various regular expressions as suggested in the Help file, but I can't find the solution.
Thanks in advance!
RE search-replace dashes
-
Stephan
Re: RE search-replace dashes
Click "Search | Replace" (or hit F8)
Find what:
\([^[:space:]]\)\(--\)\([^[:space:]]\)
Replace with:
\1 \2 \3
Note: No spaces before '\1' and after '\3' in the above line
BTW, check the 'regular expression' box.
That works on your example, so I think it should work...
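For anyone who wants to try the same replacement outside the editor, here is a minimal Python sketch. It assumes Python's `re` module, where groups are written with plain parentheses and `\S` stands in for `[^[:space:]]`; the name `space_dashes` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import re

# Same three groups as the editor pattern: one non-whitespace character,
# the double dash, one non-whitespace character.
PATTERN = re.compile(r"(\S)(--)(\S)")

def space_dashes(text: str) -> str:
    """Add a space on each side of a '--' that sits directly between
    two non-whitespace characters."""
    return PATTERN.sub(r"\1 \2 \3", text)

sample = ("This is--in my opinion--the best White Castle burger "
          "I've ever had -- and I've eaten a lot of them.")
print(space_dashes(sample))
```

The dash that already has spaces around it is left alone, because neither neighbouring character satisfies `\S`.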
Hope that helped,
Stephan
-
Paul Havemann
Re: RE search-replace dashes
Works like a champ. Thanks much!
Perhaps, one day, I'll figure out *why* it works. ;}
-
Stephan
Re: RE search-replace dashes
Well that's not that complicated (just finished "Mastering Regular Expressions"):
'\(' and '\)' mark the start and end of a group of characters _and_ remember what that group matched, so it can be referenced later.
Now, the RE
\([^[:space:]]\)\(--\)\([^[:space:]]\)
has 3 such parts:
1. \([^[:space:]]\)
This matches one character that is not whitespace. The outer []'s delimit a character class, the '^' negates it, and [:space:] is a named class for whitespace characters (spaces, tabs, line breaks).
Note that this requires exactly one character to match, no more and no less, so a '--' at the very start or end of a line is not matched.
2. \(--\)
This matches and 'remembers' just '--'
3. \([^[:space:]]\)
Hey we've already seen that.
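To see the three parts at work, here is a small sketch in Python rather than in the editor (an assumption on my side: Python's `re` syntax, with `(...)` for groups and `\S` for "not whitespace"). It searches a shortened piece of the example sentence and prints what each group remembered.

```python
import re

# Find the first '--' that has a non-whitespace character on each side.
match = re.search(r"(\S)(--)(\S)", "This is--in my opinion")

print(match.groups())   # the three remembered pieces, in order
print(match.group(0))   # the whole text the pattern matched
```

The first and third groups each hold exactly one character, which is why the replacement has to put them back around the dash.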
The remembered text is available through back-references such as \1, \2 and so on.
Groups are numbered by their opening '\(', counting from the left: the first '\(' goes into \1, the second into \2...
So if you search '\(aaa\(bbb\)\)' in
aaaabbbbb
and replace it with '*\1#\2+' you'll end up with
a*aaabbb#bbb+bb
as \1 is 'aaabbb' and \2 is 'bbb', because the groups are nested: the outer '\(' opens first, so it gets the lower number.
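The nested example can be checked with a short Python sketch. This assumes Python's `re` module, which writes the groups as plain `(` and `)` but numbers them the same way, by opening parenthesis from the left.

```python
import re

# Outer group is \1 ('aaabbb'), inner group is \2 ('bbb').
result = re.sub(r"(aaa(bbb))", r"*\1#\2+", "aaaabbbbb")
print(result)  # a*aaabbb#bbb+bb
```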
Now that I think about it, it's likely that there's a more elegant solution:
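Find '\>--\<' and replace with ' -- '
At least it works with your example, too. \> matches the end of a word, \< matches the beginning of a word, so this only touches a '--' that sits directly between two words; a '--' next to punctuation is left alone.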
Check the help file about this.
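A rough Python equivalent of the word-boundary version, as a sketch only: Python's `re` has no `\<` and `\>`, so this assumes lookarounds on `\w` are a fair stand-in, and Python's idea of a word character may differ from the editor's. The name `space_word_dashes` is hypothetical.

```python
import re

# '--' with a word character directly before it and directly after it.
# The lookarounds check the neighbours without consuming them.
WORD_DASH = re.compile(r"(?<=\w)--(?=\w)")

def space_word_dashes(text: str) -> str:
    """Add spaces around a '--' that sits directly between two words."""
    return WORD_DASH.sub(" -- ", text)

print(space_word_dashes("This is--in my opinion--the best"))
print(space_word_dashes('He said "no"--then left'))  # unchanged: '"' is not a word character
```

Because nothing around the dash is consumed, back-to-back cases such as `a--b--c` are both handled, which the three-group pattern would not do in a single pass.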
Happy regexing!
Stephan