I have an OCR'd document with narrow columns of text, and I need to remove the hard returns.
But TextPad and its regular expressions aren't working for me.
I type in the find box
[:alpha:]\n
and replace
\1 \n
but it strips out the letter found and puts a \1 in its place!
Why?
It's bugging me rotten. The help file in TextPad doesn't show any examples!
Removing hard returns
Stuart Halliday
- Bob Hansen
It's not totally clear what you are trying to do. The search/replace does not match the text explanation. Showing a Before and After sample is usually the best way for us to help you.
Here are a few options:
1. You can change the search from [:alpha:]\n to ^(.+)\n and replace with \1 to remove the return codes.
2. You can remove hard returns by searching for \n and replacing with nothing.
3. You can keep blank lines by first replacing \n\n with ~, then replacing the remaining \n with nothing, then replacing the ~ with \n\n. (You can use any other character in place of the ~; just pick one that is not likely to be in your text.)
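For anyone who wants to try the three-step idea from option 3 outside TextPad, here is a minimal Python sketch. The function name `unwrap`, the `joiner` parameter and the NUL placeholder are my own choices for illustration, not anything TextPad provides. It also assumes you want a space where each return was removed so that words don't run together; pass `joiner=""` to replace with nothing, exactly as described above.

```python
def unwrap(text: str, placeholder: str = "\x00", joiner: str = " ") -> str:
    """Join hard-wrapped lines while keeping blank lines between paragraphs."""
    # The placeholder plays the role of the ~ above: it must not occur in the text.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text; pick another")
    # Step 1: protect paragraph breaks (two consecutive newlines).
    text = text.replace("\n\n", placeholder)
    # Step 2: replace every remaining hard return with the joiner.
    text = text.replace("\n", joiner)
    # Step 3: put the paragraph breaks back.
    return text.replace(placeholder, "\n\n")


sample = "narrow columns\nof OCR text\n\nsecond\nparagraph"
print(unwrap(sample))
```

The check at the top is there because the trick quietly corrupts the text if the placeholder character already appears in it.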
Hope this was helpful... good luck,
Bob
Re: Removing hard returns
quatermass wrote:
I type in the find box
[:alpha:]\n
and replace
\1 \n
but it strips out the letter found and puts a \1 in its place!
\1 refers to the first () in the find expression. As there is no () in your find, \1 has nothing to refer to.
On its own, [:alpha:] is just a character set that matches any one of the characters a, h, l, p or :
If you want to find a letter, you have to put the [:alpha:] inside a character class, i.e. inside another pair of [].
Your search should be
[[:alpha:]]\n
And as you want (for whatever reason) to refer back to that character class, it should be in (), thus
([[:alpha:]])\n
But if all you want to do is replace a line break with a space + line break, why not simply search for \n and replace with _\n (_ representing the space)?
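If it helps to see the same points outside TextPad, here is a rough Python sketch. Treat it as an analogy rather than TextPad's exact behaviour: Python's `re` module has no POSIX classes, so `[A-Za-z]` stands in for `[[:alpha:]]` (an assumption that only covers ASCII letters), and the sample text is made up.

```python
import re

text = "first line\nsecond line\n"

# Without an enclosing [], [:alpha:] is an ordinary set of the
# characters ':', 'a', 'l', 'p' and 'h'.
literal_set_hits = re.findall(r"[:alpha:]", "help: a")

# No () in the pattern, so there is no group 1. Python rejects the \1
# with an error instead of inserting it literally.
try:
    re.sub(r"[A-Za-z]\n", r"\1 \n", text)
    backreference_error = None
except re.error as exc:
    backreference_error = str(exc)

# With the letter wrapped in (), \1 puts the matched letter back.
with_group = re.sub(r"([A-Za-z])\n", r"\1 \n", text)

# The simpler route: no regex, just a space before every line break.
plain = text.replace("\n", " \n")

print(literal_set_hits)
print(backreference_error)
print(repr(with_group))
print(repr(plain))
```

The two results are the same for this sample, but they differ on lines that end in a digit or punctuation: the grouped pattern leaves those alone, while the plain replacement adds a space there too.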