Delete Duplicate Lines (without sorting)

Ideas for new features

How important is this feature for future versions of TextPad?

Important, even if the size/efficiency of TextPad increases/degrades: 3 votes (13%)
Somewhat important: 6 votes (26%)
I'm okay either way: 3 votes (13%)
Somewhat unimportant: 7 votes (30%)
Unimportant, even if the size/efficiency of TextPad remains unaffected: 4 votes (17%)

Total votes: 23

talleyrand
Posts: 625
Joined: Mon Jul 21, 2003 6:56 pm
Location: Kansas City, MO, USA

Post by talleyrand »

I've been using TP for a few years now, and this is the first time I've thought about this feature, so it'll get a "somewhat" vote from me. Still, I think it'd be nice to have the delete-duplicate-lines functionality (for a whole file or for highlighted lines) outside of the sort tool. I can always shell out to an external program to accomplish this (which I'll do now), but if it's not a bother, I could see it being handy in the future.
I choose to fight with a sack of angry cats.
csalsa
Posts: 20
Joined: Mon Jul 14, 2003 1:36 am

Post by csalsa »

This function could easily be written in a scripting language like Perl or Python. Why not write such a script and call it from the 'Tools' menu?

Personally I find Python friendlier than Perl, and Python compiles a script to bytecode automatically when it runs, so there is no separate build step.
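
A minimal sketch of the kind of script suggested here, assuming the goal is to keep the first occurrence of each line and leave the original order untouched. The function name `unique_lines` and the sample data are made up for illustration; how the script would be wired into the 'Tools' menu is left out, since that depends on the TextPad setup.

```python
def unique_lines(lines):
    """Yield each distinct line once, in order of first appearance."""
    seen = set()
    for line in lines:
        # Compare without the line ending, so a final line that lacks a
        # newline still counts as a duplicate of an earlier identical line.
        key = line.rstrip("\r\n")
        if key not in seen:
            seen.add(key)
            yield key


# Hypothetical sample input: "apple" and "pear" each appear twice.
sample = ["apple\n", "pear\n", "apple\n", "fig\n", "pear\n"]
print("\n".join(unique_lines(sample)))
```

The set lookup keeps the work proportional to the number of lines, so no sorting is needed, at the cost of holding every distinct line in memory.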
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

Full multiline regex-support is much more important.

And with that, it would be a simple replacement:

^(.*)$\n\1\n
by
\1\n
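
For readers who want to try the idea outside TextPad, here is a rough sketch using Python's `re` module, which does support multiline matching. The names `ADJACENT_DUPLICATE` and `collapse_adjacent` are illustrative, and the sketch assumes every line, including the last, ends with `\n`. The loop is there because a single pass only halves a run of three or more identical lines.

```python
import re

# The pattern from the post, compiled with re.MULTILINE so that ^ and $
# match at the start and end of every line, not just of the whole text.
ADJACENT_DUPLICATE = re.compile(r"^(.*)$\n\1\n", re.MULTILINE)


def collapse_adjacent(text):
    """Replace two identical consecutive lines with one until none remain."""
    while True:
        collapsed = ADJACENT_DUPLICATE.sub(r"\1\n", text)
        if collapsed == text:
            return collapsed
        text = collapsed


# Only back-to-back repeats are removed; the later, separate "a" survives.
print(collapse_adjacent("a\na\na\nb\na\n"))
```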
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

I think that will only work in one case: when the duplicate lines are directly in sequence, with no other lines in between.

Thoughts:
Assume line 1 has duplicates in the file.

The replacement only works if line 2 is the duplicate.

What if the duplicate line is line 3 or higher?

What if there are two or more duplicates of line 1, say on lines 4, 7 and 15?
Hope this was helpful.............good luck,
Bob
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

Then use

^(.*)$\n(.*\n)?\1\n
by
\1\n\2

If necessary, repeat until no more occurrences exist...
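
A hedged sketch of the same approach in Python's `re` module, generalized so that any number of lines may sit between a line and its later copy. That generalization is an assumption beyond the post, whose optional group allows at most one line in between. The names `LATER_DUPLICATE` and `drop_later_duplicates` are illustrative, and every line is assumed to end with `\n`.

```python
import re

# Group 1 is a whole line. Group 2 lazily captures zero or more complete
# lines standing between that line and its next identical copy.
LATER_DUPLICATE = re.compile(r"^(.*)$\n((?:.*\n)*?)\1\n", re.MULTILINE)


def drop_later_duplicates(text):
    """Delete later copies of a line, repeating until the text stops changing."""
    while True:
        # Keep the first occurrence and the lines in between; drop the copy.
        reduced = LATER_DUPLICATE.sub(r"\1\n\2", text)
        if reduced == text:
            return reduced
        text = reduced


# "a" occurs three times and "b" twice, at varying distances.
print(drop_later_duplicates("a\nb\na\nc\nb\na\n"))
```

Each pass rescans the text, so this is far slower than a script that remembers the lines already seen; it is mainly useful for showing what a find/replace pair could do.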
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

AHA!
"Full multiline regex-support"
I'm not used to having that tool here. Thanks, MudGuard, for a good example.
Hope this was helpful.............good luck,
Bob
iangalbraith
Posts: 3
Joined: Thu Apr 22, 2004 3:26 pm
Location: UK

Post by iangalbraith »

I just picked up on this thread. The ability to use full multiline regex, as in MudGuard's example, would be enormously helpful. De-duping records in a file is a regular requirement of mine, and often hundreds of duplicates are present. (I can always sort to get them adjacent, so their initial order is not important.) Having to write code, or even to use somebody else's, is a pain when in principle a regex find/replace pair could do everything.

Ian
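
When the original order does not matter, as described above, sorting first puts every duplicate next to its twin, and one record per run can then be kept. A small sketch of that idea in Python, with the hypothetical helper name `sorted_unique`:

```python
from itertools import groupby


def sorted_unique(records):
    """Sort the records, then keep one record from each run of equal neighbours."""
    return [record for record, _run in groupby(sorted(records))]


# Hypothetical records; after sorting, equal records sit side by side.
print(sorted_unique(["b", "a", "b", "c", "a"]))
```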