Moving section of text to end of line

BBowers · Post by **BBowers** » Tue Sep 21, 2010 10:35 pm

I have a file of approx. 1000 lines, and in over half of them the field following the third comma begins with RV..., similar to this:

3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,

I'd like to move the part that runs from "RV" to the next comma, i.e. "RV text", to the end of the line, after the phone number. The "text" stands for different text: it could be RV Dump, RV Propane, etc. The result should look like this:

3488,10033,some txt, I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,RV text

I've tried but I can't figure out how the replacement piece of the puzzle works to get it right.

ben_josephs · Post by **ben_josephs** » Wed Sep 22, 2010 7:37 am

Find what: ^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)
Replace with: \1\3\2

[X] Regular expression

Replace All

This assumes you are using POSIX regular expression syntax:

Configure | Preferences | Editor

[X] Use POSIX regular expression syntax
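
For anyone who wants to check the replacement outside the editor, here is a minimal Python sketch of the same find and replace. Python's `re` module is not a POSIX engine, but this pattern only uses syntax the two have in common. The function name `move_rv_field` is made up for illustration, and the sketch assumes it is handed one line at a time.

```python
import re

# Same expression as the "Find what" field above.
PATTERN = re.compile(r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)")

def move_rv_field(line):
    """Move the RV... field that follows the third comma to the end of the line.

    Lines that do not match the pattern are returned unchanged.
    """
    return PATTERN.sub(r"\1\3\2", line)

sample = "3488,10033,some txt, RV text,I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,"
print(move_rv_field(sample))
# prints: 3488,10033,some txt, I-80 Exit 2,Big Lake,ID,84822,(423)405-1191,RV text
```

A line whose fourth field does not start with RV does not match, so it passes through untouched.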

BBowers · Post by **BBowers** » Wed Sep 22, 2010 4:05 pm

Thanks ben, very helpful.

BBowers · Post by **BBowers** » Wed Sep 22, 2010 10:15 pm

Ben, I wonder if you could talk me through the syntax of this line you provided? Including the "Replace with". It gets a little confusing. Especially when I try to test some of the different methods. A couple of times the \1\2 etc. doesn't work the way I'd expect and then I get stuck. I think I've gotten through part of it, but I get a little lost.

ben_josephs · Post by **ben_josephs** » Thu Sep 23, 2010 10:50 am

If part of a regex is parenthesised, whatever text that part matches is captured for use in the replacement expression. The parentheses are numbered from left to right by the position of their open-parenthesis symbols. In the replacement expression, \1 represents what was captured by the first parenthesised expression, \2 represents what was captured by the second, and so on.
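
The numbering rule can be tried on something smaller first. This is a rough Python sketch, not the editor's own engine, and the pattern and the text "10-20 miles" are made up for the purpose. Note that the outer pair of parentheses opens first, so it is number 1 even though it closes after numbers 2 and 3.

```python
import re

# Groups are numbered by where their "(" appears, left to right:
#   1: ((\d+)-(\d+))   the whole range
#   2: (\d+)           first number
#   3: (\d+)           second number
#   4: (\w+)           the unit
pattern = r"((\d+)-(\d+)) (\w+)"

match = re.match(pattern, "10-20 miles")
print(match.groups())
# prints: ('10-20', '10', '20', 'miles')

# In the replacement, \N stands for whatever group N captured.
swapped = re.sub(pattern, r"\4: \3 down to \2", "10-20 miles")
print(swapped)
# prints: miles: 20 down to 10
```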

^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*) matches

^           the beginning of a line
(           start of captured text number 1
  [^,]*     any text within a line not containing commas (see below)
  ,         a comma
  [^,]*     any text within a line not containing commas (see below)
  ,         a comma
  [^,]*     any text within a line not containing commas (see below)
  ,         a comma
   *        [there's a space before the *] a (possibly empty) sequence of spaces
)           end of captured text number 1
(           start of captured text number 2
  RV        the literal text "RV"
  [^,]*     any text within a line not containing commas (see below)
)           end of captured text number 2
,           a comma
(           start of captured text number 3
  .*        any text within a line
)           end of captured text number 3

where
[^,]* matches:

[^,]        any character except newline or comma
*           ... any number (possibly zero) of times

We stick the matched text back together, reordered as required:
\1\3\2 is composed of:

\1    captured text number 1
\3    captured text number 3
\2    captured text number 2
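
One thing that may explain surprises when experimenting with \1\2 etc.: the comma between the second and third parentheses is matched but not captured, so it never comes back unless you type it into the replacement yourself. A small Python sketch of this, using a shortened, made-up line in the same shape as the sample:

```python
import re

pattern = r"^([^,]*,[^,]*,[^,]*, *)(RV[^,]*),(.*)"
line = "1,2,3, RV Dump,Exit 2,(423)405-1191,"

# \1\3\2 is just the three captured pieces joined in a new order.
match = re.match(pattern, line)
by_hand = match.group(1) + match.group(3) + match.group(2)
reordered = re.sub(pattern, r"\1\3\2", line)
print(reordered)
# prints: 1,2,3, Exit 2,(423)405-1191,RV Dump

# Putting the groups back in their original order does NOT restore the line,
# because the uncaptured comma after "RV Dump" is dropped.
original_order = re.sub(pattern, r"\1\2\3", line)
print(original_order)
# prints: 1,2,3, RV DumpExit 2,(423)405-1191,
```

In the reordered version the missing comma does no harm, because captured text number 3 already ends with the comma that followed the phone number.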