Removing line returns between quotes

General questions about using TextPad

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

Post Reply
awillson
Posts: 2
Joined: Thu Oct 22, 2009 12:47 pm

Removing line returns between quotes

Post by awillson »

I have a pipe delimited filed from a database extract which I need to import into another tool (Omniscope) unfortunately there is a field containing line returns (real example in bold below) as this field in the application is freetype.

I would like to search and replace all the line returns between |" and "| so that each record is on a single line with 43 fields separated by pipes.

Each of these description fields contain letters from english alphabet, digits and punctuation marks (pretty much anything available on a UK keyboard - annoyingly some also contain a pipe!) but all within the quotes.

AM_WS_SD_CR|RDO - AM|V B|DESKTOP SOFTWARE AND BUSINESS USER APPS|APPLICATION - FUNCTIONALITY / COMPONENT|PERSONAL FOLDERS|LONDON|09/01/2009 04:22:46|END USER TECHNOLOGIES|DESKTOP|SOFTWARE DESKTOP APPLICATION|MICROSOFT OUTLOOK|BLAH|BLAH|Unable to Expand Personal Folder|36029|"Customer unable to expand personal folder.
Personal Folder Name: Archive
File Name: Archive.pst
Location: Clarion server"
|BOB|003|SECONDARY|INC00000123456|0|TITLE|SMITH|09/01/2009 04:22:46|VS|WS_SD_CR|RDO - AM|Low|END USER TECHNOLOGIES|DESKTOP|SOFTWARE DESKTOP APPLICATION|MICROSOFT OUTLOOK|EUROPE|09/01/2009 04:22:41|DESKTOP SOFTWARE AND BUSINESS USER APPS|APPLICATION - FUNCTIONALITY / COMPONENT|PERSONAL FOLDERS|OCS|vb|0|TEMPLATE|IT

To summarise I require a search and replace string to turn the enbolded text above into

|"Customer unable to expand personal folder. Personal Folder Name: Archive File Name: Archive.pst Location: Clarion server"|

I hope I have everything required here, any and all help appreciated.

Many thanks
Andrew
User avatar
talleyrand
Posts: 625
Joined: Mon Jul 21, 2003 6:56 pm
Location: Kansas City, MO, USA
Contact:

Post by talleyrand »

Unfortunately, as you will find in other posts the regular expression parser in TP is rather weak and won't handle the newlines well. Typically, people resolve this by replacing the \n with a character that doesn't exist in the base text, like ~
However, that will bump up against the other issue where the parser is always greedy so it'd find the first quote in row 1 and the last quote in line N. Oh and then replace that character with something like a space.

You might be able to true it up by doing a combo of replacing newlines with an unused character and then resplit the file based on a pipe count (each row has N pipes). Bob/Ben/s_reynisson will probably provide a killer example but I'm still just a novice with regex
I choose to fight with a sack of angry cats.
awillson
Posts: 2
Joined: Thu Oct 22, 2009 12:47 pm

Post by awillson »

Hi,

Many thanks, I have noticed the greed and handling of new lines very frustrating!

thank you for your time.
User avatar
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH
Contact:

Post by Bob Hansen »

Search for: "(.*)\n(.*)\n(.*)\n(.*")
Replace with: "\1 \2 \3 \4

This assumes that you will always have four lines of info in a similar format to begin with.

====================
Use the following settings:
-----------------------------------------
[X] Regular expression
Replace All
-----------------------------------------
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
-----------------------------------------
Hope this was helpful.............good luck,
Bob
Post Reply