Finding character within quotes
Hello all, I have a CSV file where I need to eliminate all the commas that are contained within quotes in a single entry. For example:
AdrTp,6,String,O,,"ADDR,PBOX,HOME,BIZZ,MLTO,DLVY",0/1,
AdrLine,6,Section,N/A,,N/A,0/5,
TradgSsn,5,Section,N/A,,N/A,0/1,
TradgSsnCd,6,String,O,,"ACHO,ACHC,ACHL,WAM1,WMAI,NNET,JNET,TOS1,TOS2",0/1,
Value,7,String,M,,,1/1,
StrtNm,6,String,O,,,0/1,
Becomes
AdrTp,6,String,O,,"ADDR / PBOX / HOME / BIZZ / MLTO / DLVY",0/1,
AdrLine,6,Section,N/A,,N/A,0/5,
TradgSsn,5,Section,N/A,,N/A,0/1,
TradgSsnCd,6,String,O,,"ACHO / ACHC / ACHL / WAM1 / WMAI / NNET / JNET / TOS1 / TOS2",0/1,
Value,7,String,M,,,1/1,
StrtNm,6,String,O,,,0/1,
Note I added the bold just for effect, and the slash could be any character; it just needs to be something other than a comma, as commas are the delimiter for the rest of the file.
Thanks!
Steve
Replace All repeatedly until it's all done.
Find what: ^([^"]*("[^"]*{2})*"[^"]*),
Replace with: \1 /
[X] Regular expression
This assumes you are using POSIX regular expression syntax:
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
It searches for
1. ^                          the beginning of a line
2. [^"]*("[^"]*{2})*"[^"]*    text containing an odd number of quotes,
                              that is:
2.1. [^"]*                    text not containing quotes
2.2. ("[^"]*{2})*             text containing an even number
                              (possibly zero) of quotes
2.3. "                        a quote
2.4. [^"]*                    text not containing quotes
3. ,                          a comma
Last edited by ben_josephs on Tue Oct 10, 2006 8:29 am, edited 1 time in total.
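For anyone who wants to check the idea outside the editor: the breakdown above boils down to "a comma is inside a quoted entry when an odd number of quotes precede it on the line". Below is a minimal Python sketch of that rule as a single pass over the line. The function name and the " / " separator are assumptions made for illustration, not something from the editor, and it assumes quotes are never escaped or doubled inside an entry.

```python
def replace_commas_in_quotes(line, separator=" / "):
    """Replace every comma that sits between a pair of quotes.

    Assumes plain double quotes with no escaped or doubled quotes inside
    an entry (a simplification; real CSV dialects may differ).
    """
    out = []
    inside = False  # True while an odd number of quotes has been seen
    for ch in line:
        if ch == '"':
            inside = not inside
        # Only commas seen while inside a quoted entry are swapped out.
        out.append(separator if inside and ch == "," else ch)
    return "".join(out)


if __name__ == "__main__":
    sample = 'AdrTp,6,String,O,,"ADDR,PBOX,HOME,BIZZ,MLTO,DLVY",0/1,'
    print(replace_commas_in_quotes(sample))
```

Unlike the repeated Replace All, this handles all commas in one pass, but it is only a way to test the logic, not a replacement for the editor search.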
I have one that works, but it is not ideal as you have to run the search and replace multiple times to catch all the instances.
Find what: (".*),(.*")
Replace with: \1 \2
Make sure you have the Regular expression checkbox ticked.
This replaces each comma within quotes with a space. The brackets in the find expression create references that can be used (\1, \2, etc.) within the replacement expression.
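To see what running this one "multiple times" amounts to, here is a rough Python sketch that repeats the same find and replace on a single line until nothing changes. Python's `re` module stands in for the editor's regex engine here, which is an assumption, and the function name is made up for illustration.

```python
import re

# Same expression as in the post: greedy text from a quote up to a comma,
# then greedy text up to a later quote.
SIMPLE = re.compile(r'(".*),(.*")')


def replace_until_stable(line):
    """Repeat the \\1 \\2 replacement until no more matches are found."""
    while True:
        line, count = SIMPLE.subn(r"\1 \2", line)
        if count == 0:
            return line


if __name__ == "__main__":
    print(replace_until_stable(
        'AdrTp,6,String,O,,"ADDR,PBOX,HOME,BIZZ,MLTO,DLVY",0/1,'))
```

One caveat, at least with Python's engine: because `.*` is greedy, a line with more than one quoted entry would also lose the delimiter commas between those entries. The sample lines in this thread have at most one quoted entry each, so they are not affected.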
It's been pointed out that there's an error in the regular expression I suggested above. Despite this, the thing as a whole appears to work, although I'm not sure why!
Apologies for any confusion caused, and thanks, Ronny.
The subexpression
"[^"]*{2}
should be
("[^"]*){2}
and the whole expression should be
^([^"]*(("[^"]*){2})*"[^"]*),
The expression is composed of:
1. ^                              the beginning of a line
2. [^"]*(("[^"]*){2})*"[^"]*      text containing an odd number of quotes,
3. ,                              a comma
2 is composed of:
2.1. [^"]*                        text not containing quotes
2.2. (("[^"]*){2})*               text containing an even number
                                  (possibly zero) of quotes
2.3. "                            a quote
2.4. [^"]*                        text not containing quotes
2.2 is composed of:
2.2.1 ("[^"]*){2}                 text containing 2 quotes
2.2.2 *                           ... any number of times
2.2.1 is composed of:
2.2.1.1 "[^"]*                    text containing 1 quote
2.2.1.2 {2}                       ... twice
2.2.1.1 is composed of:
2.2.1.1.1 "                       a quote
2.2.1.1.2 [^"]*                   text not containing quotes
2.1, 2.4, 2.2.1.1.2 are each composed of:
...1 [^"]                         anything that isn't a quote
...2 *                            ... any number (possibly zero) of times
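For reference, the corrected expression can be tried outside the editor. The sketch below is a Python approximation of "Replace All repeatedly until it's all done", applied to one line at a time. Python's `re` module stands in for the editor's POSIX syntax, and the function name and the " / " separator (with a trailing space, to match the output asked for in the first post) are assumptions.

```python
import re

# The corrected expression from the post, used on a single line at a time.
COMMA_IN_QUOTES = re.compile(r'^([^"]*(("[^"]*){2})*"[^"]*),')


def replace_quoted_commas(line, separator=" / "):
    """Repeat the replacement until no comma inside quotes is left."""
    if "," in separator:
        # A separator containing a comma would keep matching forever.
        raise ValueError("separator must not contain a comma")
    while True:
        # Equivalent of "Replace with: \1" followed by the separator.
        line, count = COMMA_IN_QUOTES.subn(
            lambda m: m.group(1) + separator, line)
        if count == 0:
            return line


if __name__ == "__main__":
    for text in ('AdrTp,6,String,O,,"ADDR,PBOX,HOME,BIZZ,MLTO,DLVY",0/1,',
                 'AdrLine,6,Section,N/A,,N/A,0/5,'):
        print(replace_quoted_commas(text))
```

Each pass replaces only one comma per line (the last one still inside quotes), which is why the replacement has to be repeated.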
ben_josephs wrote: Apologies for any confusion caused, and thanks, Ronny.
You're welcome!
It was a pleasure to help you, Ben.
Cheers
Meisn