I have a file of approximately 1000 lines, and in over half of them the field following the third comma begins with RV..., similar to this:
3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,
I'd like to move the part that begins with "RV" and runs to the next comma, i.e. "RV text", to the end of the line, after the phone number. The "text" stands for varying text: it could be RV Dump, RV Propane, etc. The result should look like the following:
3488,10033,some txt, I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,RV text
I've tried but I can't figure out how the replacement piece of the puzzle works to get it right.
Moving section of text to end of line
-
ben_josephs
- Posts: 2464
- Joined: Sun Mar 02, 2003 9:22 pm
Ben, could you talk me through the syntax of the line you provided, including the "Replace with" part? It gets a little confusing, especially when I try to test some of the different methods. A couple of times the \1\2 etc. didn't work the way I'd expect, and then I got stuck. I think I've gotten through part of it, but I get a little lost.
-
ben_josephs
If part of a regex is parenthesised, whatever text that part matches is captured for use in the replacement expression. The parentheses are numbered from left to right by the position of their open-parenthesis symbols. In the replacement expression, \1 represents what was captured by the first parenthesised expression, \2 represents what was captured by the second, and so on.
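To see the numbering in action outside the editor, here is a minimal sketch in Python. Python's `re` module is my choice for illustration only, not what the thread uses, and the sample strings are made up. It shows that groups are numbered by the position of their open parentheses, even when nested, and that the replacement can reuse them in any order.

```python
import re

# Hypothetical sample text, not taken from the thread.
sample = "alpha,beta"

# Two parenthesised parts: group 1 is the text before the comma,
# group 2 is everything after it.
pattern = re.compile(r"^([^,]*),(.*)$")

match = pattern.match(sample)
first, second = match.group(1), match.group(2)

# In the replacement, \2,\1 puts the captured pieces back in swapped order.
swapped = pattern.sub(r"\2,\1", sample)

# Nested parentheses are still numbered by where each "(" appears:
# the outer group opens first, so it is number 1.
nested = re.match(r"((a)(b))", "ab")

print(first, second, swapped, nested.groups())
```

Other regex flavours may write the replacement references differently (for example `$1` instead of `\1`), so check what your tool expects.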
The breakdowns below cover the regex, the [^,]* piece it uses repeatedly, and the replacement \1\3\2, in which we stick the matched text back together, reordered as required.
^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*) matches:
Code: Select all
^        the beginning of a line
(        start of captured text number 1
[^,]*    any text within a line not containing commas (see below)
,        a comma
[^,]*    any text within a line not containing commas (see below)
,        a comma
[^,]*    any text within a line not containing commas (see below)
,        a comma
 *       [there's a space before the *] a (possibly empty) sequence of spaces
)        end of captured text number 1
(        start of captured text number 2
RV       the literal text "RV"
[^,]*    any text within a line not containing commas (see below)
)        end of captured text number 2
,        a comma
(        start of captured text number 3
.*       any text within a line
)        end of captured text number 3
[^,]* matches:
Code: Select all
[^,]     any character except newline or comma
*        ... any number (possibly zero) of times
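A quick way to convince yourself of what [^,]* does is to try it on its own. This is a small sketch in Python, which is an assumption on my part; the inputs are made up apart from the "some txt" field from the question. One caveat: in Python, [^,] also matches a newline, unlike the behaviour described above, so the sketch sticks to single-line strings.

```python
import re

# [^,]* grabs everything up to (but not including) the next comma.
field = re.match(r"[^,]*", "some txt, RV text").group()

# "Possibly zero" times: if the text starts with a comma,
# the match succeeds but is empty.
empty = re.match(r"[^,]*", ",next field").group()

# With no comma at all, it runs to the end of the text.
whole = re.match(r"[^,]*", "no commas here").group()

print(repr(field), repr(empty), repr(whole))
```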
\1\3\2 is composed of:
Code: Select all
\1       captured text number 1
\3       captured text number 3
\2       captured text number 2
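Putting it all together, the same search and replace can be tried outside the editor. The sketch below uses Python's `re` module, which is an assumption for illustration; the thread itself is about an editor's Replace dialog. The function name `move_rv_field` is hypothetical, and the second test line is made up. The first sample line is the one from the original question.

```python
import re

# The regex from the breakdown above, unchanged.
PATTERN = re.compile(r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)")


def move_rv_field(line: str) -> str:
    """Move the RV field after the third comma to the end of the line.

    Lines that do not match the pattern are returned unchanged.
    """
    # \1 = first three fields (plus any spaces), \3 = the rest of the line,
    # \2 = the RV field, now placed last.
    return PATTERN.sub(r"\1\3\2", line)


line = "3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,"
result = move_rv_field(line)
print(result)

# A made-up line with no RV field after the third comma is left alone.
untouched = move_rv_field("1,2,no rv here,x,y")
print(untouched)
```

The comma that followed the RV field is consumed by the match and not put back, while the trailing comma after the phone number is part of \3, which is why the RV text ends up directly after it.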