I want to delete the duplicate file names (always JPG). They occur in every group, always the first and last entries. And I want the remaining filename to be followed by a tab and then the descriptive text.
So for example I want this:
20030820-113000-TP01.jpg
Shortly after leaving Kemble station, on the way to Thames Head, the recognised Thames source.
20030820-113000-TP01.jpg

20030820-120300-TP03-43.jpg
Wed 20th Aug 2003. At the source.
20030820-120300-TP03-43.jpg

20030820-120000-TP02.jpg
At the dry source, Wed 20th August 2003.
Here's another line to screw things up.
20030820-120000-TP02.jpg

20030820-120300-TP03.jpg
Source
20030820-120300-TP03.jpg
to become this:
20030820-113000-TP01.jpg Shortly after leaving Kemble station, on the way to Thames Head, the recognised Thames source.
20030820-120300-TP03-43.jpg Wed 20th Aug 2003. At the source.
20030820-120000-TP02.jpg At the dry source, Wed 20th August 2003. Here's another line to screw things up.
20030820-120300-TP03.jpg Source
I see my tabs have not been replicated here. This is what that example looks like in TextPad:
https://dl.dropbox.com/u/4019461/TextPad-RE-1.jpg
So far my attempts have failed. Is it possible in TextPad please?
--
Terry, East Grinstead, UK
Here's one way to do it. First enable POSIX regular expression syntax in Configure > Preferences > Editor.
Step 1: Collapse each group into a single tab-separated line
Find what: (.)\n(.)
Replace with: \1\t\2
Step 2: Delete the final tab and file name from each line
Find what: (.*)\t.*
Replace with: \1
Step 3: If necessary, fix the tab before the extra "line to screw things up"
Find what: ([^\t]*\t)(.*)\t(.*)
Replace with: \1\2_\3 [replace the underscore with a space]
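For anyone who wants to try these three steps outside TextPad, here is a minimal Python sketch of the same idea. It assumes the groups are separated by a blank line, which the Step 1 pattern relies on. The function name collapse_groups is made up for illustration. One deliberate difference: the Step 3 character class is written [^\t\n] rather than [^\t], because a negated class in Python's re module can also match a newline. Step 3 is repeated in a loop so that groups with more than one extra line are handled as well.

```python
import re


def collapse_groups(text: str) -> str:
    """Apply the three find/replace steps to blank-line-separated groups."""
    # Step 1: collapse each group into a single tab-separated line.
    # Blank lines between groups survive because "." never matches "\n".
    text = re.sub(r"(.)\n(.)", r"\1\t\2", text)

    # Step 2: delete the final tab and the repeated file name on each line.
    text = re.sub(r"(.*)\t.*", r"\1", text)

    # Step 3: turn any tab after the first one into a space.
    # Each pass fixes the last such tab on a line, so repeat until none remain.
    while True:
        text, count = re.subn(r"([^\t\n]*\t)(.*)\t(.*)", r"\1\2 \3", text)
        if count == 0:
            break
    return text


if __name__ == "__main__":
    sample = (
        "20030820-113000-TP01.jpg\n"
        "Shortly after leaving Kemble station.\n"
        "20030820-113000-TP01.jpg\n"
        "\n"
        "20030820-120000-TP02.jpg\n"
        "At the dry source, Wed 20th August 2003.\n"
        "Here's another line to screw things up.\n"
        "20030820-120000-TP02.jpg\n"
    )
    print(collapse_groups(sample))
```

Like the TextPad version, this sketch expects every line in a group to be at least two characters long, since (.)\n(.) consumes one character on each side of the line break.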
ben_josephs (Posts: 2464, Joined: Sun Mar 02, 2003 9:22 pm)
I'd already typed this, so I might as well post it, to show a very similar, but slightly different, approach.
1. Delete the filename at the end of each group:
Find what: .*\.jpg\n\n
Replace with: \n
2. Replace the newline following each filename with a tab:
Find what: \.jpg\n
Replace with: .jpg\t
3. Join the lines of each group:
Find what: (.)\n(.)
Replace with: \1 \2
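The same three replacements can be tried in Python as a rough sketch. The function name join_groups is hypothetical. The sketch assumes every group, including the last one, is followed by a blank line, because the first pattern only removes a file name that is followed by two newlines.

```python
import re


def join_groups(text: str) -> str:
    """Apply the alternative three replacements in order."""
    # 1. Delete the file name at the end of each group (the one followed
    #    by a blank line), keeping a single newline in its place.
    text = re.sub(r".*\.jpg\n\n", "\n", text)

    # 2. Replace the newline after each remaining file name with a tab.
    text = re.sub(r"\.jpg\n", ".jpg\t", text)

    # 3. Join any remaining description lines in a group with a space.
    text = re.sub(r"(.)\n(.)", r"\1 \2", text)
    return text


if __name__ == "__main__":
    sample = (
        "20030820-120000-TP02.jpg\n"
        "At the dry source, Wed 20th August 2003.\n"
        "Here's another line to screw things up.\n"
        "20030820-120000-TP02.jpg\n"
        "\n"
    )
    print(join_groups(sample))
```

If the file does not end with a blank line, the last group keeps its trailing file name, so it is worth checking the end of the output.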
