RE search-replace dashes

Post by **Paul Havemann** » Fri Dec 07, 2001 3:39 pm

I'm trying to tidy up a large body of text which has 'em' dashes in it, by adding spaces around each 'em' dash. But I want to exclude dashes which already have spaces around them. Here's an example with both cases in it:

This is--in my opinion--the best White Castle burger I've ever had -- and I've eaten a lot of them.

I'd like to replace that with...

This is -- in my opinion -- the best White Castle burger I've ever had -- and I've eaten a lot of them.

I've tried various regular expressions as suggested in the Help file, but I can't find the solution.

Thanks in advance!

Post by **Stephan** » Fri Dec 07, 2001 3:52 pm

Click "Search | Replace" (or hit F8)

Find what:
\([^[:space:]]\)\(--\)\([^[:space:]]\)

Replace with:
\1 \2 \3

Note: No spaces before '\1' and after '\3' in the above line

BTW, check the 'regular expression' box.

That works on your example, so I think it should work...
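
For anyone who wants to try the same replacement outside the editor, here is a minimal Python sketch. It is a translation under assumptions, not the editor's own engine: Python's `re` module has no `[:space:]` class, so `\S` stands in for `[^[:space:]]`, and groups are written `( )` instead of `\( \)`. The name `space_dashes` is made up for illustration.

```python
import re

# Python counterpart of \([^[:space:]]\)\(--\)\([^[:space:]]\):
# \S plays the role of [^[:space:]], and ( ) replaces \( \).
TIGHT_DASH = re.compile(r"(\S)(--)(\S)")

def space_dashes(text: str) -> str:
    """Put a space on each side of '--' when it touches non-space characters."""
    return TIGHT_DASH.sub(r"\1 \2 \3", text)

sample = ("This is--in my opinion--the best White Castle burger "
          "I've ever had -- and I've eaten a lot of them.")
result = space_dashes(sample)
print(result)
```

One thing to keep in mind, at least in Python: each match consumes the character on either side of the dashes, so in a string like `a--b--c` the second `--` is skipped on a single pass, because the `b` it needs was already used up by the first match.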

Hope that helped,

Stephan

Post by **Paul Havemann** » Fri Dec 07, 2001 6:48 pm

Works like a champ. Thanks much!

Perhaps, one day, I'll figure out *why* it works. ;}

Post by **Stephan** » Fri Dec 07, 2001 11:04 pm

Well, it's not that complicated (I just finished "Mastering Regular Expressions"):

'\(' and '\)' mark the start and end of a group of characters _and_ remember what that group matched, so it can be referenced later.

Now, the RE

\([^[:space:]]\)\(--\)\([^[:space:]]\)

has 3 such parts:

1. \([^[:space:]]\)

This matches one character that is not whitespace. The outer []'s delimit a character class, the leading '^' negates it, and [:space:] is a 'name' for whitespace characters (space, tab, newline and so on).
Note that this requires exactly one character to match, no more and no less.

2. \(--\)

This matches and 'remembers' just '--'

3. \([^[:space:]]\)

Hey, we've already seen that.
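
To see the three remembered parts concretely, here is a small Python sketch. It assumes `\S` is an acceptable stand-in for `[^[:space:]]` (Python's `re` does not support the POSIX class name) and simply prints what each group captured for the first match in a shortened version of the example sentence.

```python
import re

pattern = re.compile(r"(\S)(--)(\S)")
match = pattern.search("This is--in my opinion--the best")

# groups() returns the text remembered by each pair of parentheses, in order:
# the character before the dashes, the dashes, the character after them.
parts = match.groups()
print(parts)  # ('s', '--', 'i')
```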

The captured text is then available through backreferences such as \1, \2 and so on.
Groups are numbered by their opening '\(', counting from the left: the first '\(' goes into \1, the 2nd '\(' is stored in \2...

So if you search '\(aaa\(bbb\)\)' in

aaaabbbbb

and replace it with '*\1#\2+' you'll end up with

a*aaabbb#bbb+bb

as \1 is 'aaabbb' and \2 is 'bbb', because the '\(' are nested: the outer group opens first, so it gets \1.
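
The nested-group example can be checked in Python, which numbers groups the same way (by opening parenthesis, left to right). This is only a sketch in Python's syntax, where groups are `( )` rather than `\( \)`.

```python
import re

# The outer group opens first, so it is \1; the inner group is \2.
result = re.sub(r"(aaa(bbb))", r"*\1#\2+", "aaaabbbbb")
print(result)  # a*aaabbb#bbb+bb
```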

Now that I think about it, it's likely that there's a more elegant solution:
Find '\>--\<' and replace with ' -- '

At least it works with your example, too. \> matches the end of a word, \< matches the beginning of a word.
Check the help file about this.
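
A rough Python counterpart of the word-boundary version, for comparison. Python's `re` has no `\<` or `\>`, so this sketch assumes the generic boundary `\b` is close enough: next to a dash, `\b` can only match if the character on the other side is a word character. The function name is hypothetical.

```python
import re

def space_dashes_between_words(text: str) -> str:
    """Replace '--' with ' -- ' only where it sits directly between two words."""
    return re.sub(r"\b--\b", " -- ", text)

sample = ("This is--in my opinion--the best White Castle burger "
          "I've ever had -- and I've eaten a lot of them.")
result = space_dashes_between_words(sample)
print(result)
```

Unlike the three-group version, this one leaves a dash alone when it touches punctuation rather than a word, for example right after a closing quote, so which pattern fits better depends on the text being cleaned up.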

Happy regexing!

Stephan
