Moving section of text to end of line
Posted: Tue Sep 21, 2010 10:35 pm
by BBowers
I have a file with approx. 1000 lines. In over half of them, the field following the third comma begins with RV, similar to this:
3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,
I'd like to move the part that runs from "RV" to the next comma, i.e. "RV text", to the end of the line, after the phone number. The "text" varies: it could be RV Dump, RV Propane, etc. The result should look like this:
3488,10033,some txt, I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,RV text
I've tried but I can't figure out how the replacement piece of the puzzle works to get it right.
Posted: Wed Sep 22, 2010 7:37 am
by ben_josephs
Find what: ^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)
Replace with: \1\3\2
[X] Regular expression
Replace All
This assumes you are using POSIX regular expression syntax:
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
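For anyone who wants to check the result outside the editor, here is a minimal Python sketch of the same find/replace. It is only an illustration, not part of the editor's behaviour: the function names `move_rv_field` and `move_rv_fields` are made up for this example, and the text is processed one line at a time because Python's `[^,]` would otherwise also match a newline.

```python
import re

# Pattern and replacement copied from the post above.
PATTERN = re.compile(r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)")
REPLACEMENT = r"\1\3\2"


def move_rv_field(line: str) -> str:
    """Move an RV field that follows the third comma to the end of the line."""
    return PATTERN.sub(REPLACEMENT, line)


def move_rv_fields(text: str) -> str:
    """Apply the replacement to each line separately (roughly a Replace All)."""
    return "\n".join(move_rv_field(line) for line in text.split("\n"))


sample = "3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,"
print(move_rv_field(sample))
```

Lines whose fourth field does not start with RV do not match the pattern and are left as they are.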
Posted: Wed Sep 22, 2010 4:05 pm
by BBowers
Thanks ben, very helpful.
Posted: Wed Sep 22, 2010 10:15 pm
by BBowers
Ben, I wonder if you could talk me through the syntax of the line you provided, including the "Replace with"? It gets a little confusing, especially when I try to test some of the different methods. A couple of times the \1\2 etc. didn't work the way I'd expect, and then I got stuck. I think I've gotten through part of it, but I get a little lost.
Posted: Thu Sep 23, 2010 10:50 am
by ben_josephs
If part of a regex is parenthesised, whatever text that part matches is captured for use in the replacement expression. The parentheses are numbered from left to right by the position of their open-parenthesis symbols. In the replacement expression,
\1 represents what was captured by the first parenthesised expression,
\2 represents what was captured by the second, and so on.
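One way to see the numbering in action is to print what each pair of parentheses captured. This is a small Python sketch using the sample line from the first post; Python's `re` module numbers groups the same way, by the position of the opening parenthesis.

```python
import re

line = "3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,"
match = re.match(r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)", line)

# groups() returns the captured texts in order: group 1, group 2, group 3.
groups = match.groups() if match else ()
for number, text in enumerate(groups, start=1):
    print("group", number, "=", repr(text))
```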
^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*) matches
^        the beginning of a line
(        start of captured text number 1
[^,]*    any text within a line not containing commas (see below)
,        a comma
[^,]*    any text within a line not containing commas (see below)
,        a comma
[^,]*    any text within a line not containing commas (see below)
,        a comma
 *       [there's a space before the *] a (possibly empty) sequence of spaces
)        end of captured text number 1
(        start of captured text number 2
RV       the literal text "RV"
[^,]*    any text within a line not containing commas (see below)
)        end of captured text number 2
,        a comma
(        start of captured text number 3
.*       any text within a line
)        end of captured text number 3
where
[^,]* matches:
[^,]     any character except newline or comma
*        ... any number (possibly zero) of times
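A quick way to get a feel for `[^,]*` on its own is to try it against a couple of strings. The Python sketch below is only an illustration; note one difference from the description above: in Python, `[^,]` also matches a newline, which does not matter for single-line strings like these.

```python
import re

field = re.compile(r"[^,]*")

# Matches everything up to (but not including) the first comma.
first = field.match("some txt, RV text").group()

# Zero repetitions are allowed, so an empty field still matches.
empty = field.match(",leading comma").group()

print(repr(first), repr(empty))
```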
We stick the matched text back together, reordered as required:
\1\3\2 is composed of:
\1       captured text number 1
\3       captured text number 3
\2       captured text number 2
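If the \1\2 etc. don't behave as expected, it may help to remember that the comma between the second and third parentheses is matched but not captured, so it only reappears if you type it into the replacement. The Python sketch below (an illustration only, using the sample line from the first post) compares three replacement strings.

```python
import re

PATTERN = re.compile(r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)")
line = "3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,"

# The replacement from the thread: the RV field moves to the end.
reordered = PATTERN.sub(r"\1\3\2", line)

# Original order with the uncaptured comma typed back in: the line is unchanged.
unchanged = PATTERN.sub(r"\1\2,\3", line)

# A group that is not referenced is simply dropped from the result.
dropped = PATTERN.sub(r"\1\3", line)

print(reordered)
print(unchanged)
print(dropped)
```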