Removing hard returns

General questions about using TextPad

quatermass
Posts: 4
Joined: Sun Aug 20, 2006 7:50 pm
Location: Livingston, UK

Removing hard returns

Post by quatermass »

I have an OCR document with narrow columns of text and I need to remove the hard returns.

But using TextPad and its regular expressions isn't working for me.

I type in the find box
[:alpha:]\n
and replace
\1 \n

but it strips out the letter found and puts a \1 in its place!

Why?

It's bugging me rotten. The help file in TextPad doesn't show examples!

:(
Stuart Halliday
Bob Hansen
Posts: 1516
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

Not totally clear what you are trying to do. The search/replace does not match the text explanation. Showing a Before and After sample is usually best for us to help you.

Here are a couple of options:

1. You can change the search from [:alpha:]\n to ^(.+)\n and replace with \1 to remove the return codes.

2. You can remove hard returns by searching for \n and replacing with nothing.

3. You can keep blank lines by first replacing \n\n with ~, then replacing the remaining \n with nothing, then replace the ~ with \n\n (You can use any other character for the ~, just pick a character that is not likely to be in your text).
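For anyone doing this in a script rather than in TextPad, the three-step placeholder idea from option 3 can be sketched in Python. This is only an illustration under some assumptions: the function name join_hard_returns is made up for this example, the placeholder is a NUL character instead of ~, and each remaining hard return is replaced with a space rather than with nothing so that words from adjacent lines do not run together.

```python
def join_hard_returns(text: str, placeholder: str = "\x00") -> str:
    """Join the lines of each paragraph, keeping blank-line paragraph breaks.

    Hypothetical helper for illustration; not part of TextPad.
    """
    # Guard: the placeholder must not already occur in the text,
    # otherwise step 3 would invent paragraph breaks.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")

    # Step 1: protect paragraph breaks (two consecutive newlines).
    protected = text.replace("\n\n", placeholder)

    # Step 2: turn every remaining hard return into a space.
    # (Assumption: a space is wanted; use "" to replace with nothing.)
    joined = protected.replace("\n", " ")

    # Step 3: restore the protected paragraph breaks.
    return joined.replace(placeholder, "\n\n")


if __name__ == "__main__":
    sample = "one\ntwo\n\nthree\nfour"
    print(join_hard_returns(sample))
```

The guard at the top reflects the advice to pick a placeholder that is not likely to be in your text: if it does occur, the function stops instead of silently corrupting the result.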

Hope this was helpful.............good luck,
Bob
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Re: Removing hard returns

Post by MudGuard »

quatermass wrote:
I type in the find box
[:alpha:]\n
and replace
\1 \n
but it strips out the letter found and puts a \1 in its place!

\1 refers to the first () in the find expression. As there is no () in your find, \1 can't refer to anything.

On its own, [:alpha:] means any one of the characters a, h, l, p or :

If you want to find a letter, you have to put the [:alpha:] in a character class, i.e. in [].
Your search should be
[[:alpha:]]\n
And as you want (for whatever reason) to refer back to that character class, it should be in (), thus
([[:alpha:]])\n
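The same capture-and-refer-back idea can be tried outside TextPad. Below is a rough Python sketch, with two assumptions worth flagging: Python's re module does not support the POSIX [[:alpha:]] class, so [A-Za-z] stands in for it (ASCII letters only), and the function name add_space_before_newline is made up for this example.

```python
import re


def add_space_before_newline(text: str) -> str:
    """Insert a space between a trailing letter and its line break.

    Hypothetical helper for illustration; [A-Za-z] is a stand-in for
    the POSIX class [[:alpha:]], which Python's re does not support.
    """
    # The parentheses capture the letter, so \1 in the replacement
    # puts that same letter back, followed by a space and the newline.
    return re.sub(r"([A-Za-z])\n", r"\1 \n", text)


if __name__ == "__main__":
    sample = "abc\ndef\n123\n"
    # Lines ending in a letter gain a trailing space; "123" is untouched.
    print(repr(add_space_before_newline(sample)))
```

Without the parentheses there is no group 1 to refer back to, which is the root of the problem in the original search.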

But if all you want to do is replace a line break with a space + line break, why not simply search for \n and replace with _\n (_ representing the space)?