Hello,
It seems it is only possible to delete duplicated lines when you sort a file. I tried to do it with regular expressions, but I did not succeed:
replace:
\(^.*\n\)\([.]*\)\1
by:
\1\2
reports the following error message:
Unmatched '( or }'
Does anyone have another idea?
Regards,
Luoji.
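For what it's worth, if TextPad is set to use the newer (non-POSIX) regular expression syntax, a pattern along these lines should remove a line that is immediately followed by an identical one. This is only a sketch: it assumes the duplicates are adjacent (i.e. the file is already sorted) and that the regex syntax setting in TextPad matches.

    replace:
    ^(.*)\n\1$
    by:
    \1

When the same line appears three or more times in a row, Replace All has to be run more than once, since each pass only collapses pairs.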
Re: Delete duplicated lines
Yes, it would be nice to have a Sort option to "Delete subsequent lines with duplicate keys." Then you could add a padded index column, run the sort on a key other than the index column, and then sort on the index column to restore the starting positions of your records.
I don't see a way to do this, but it would be a great enhancement. You could then use it to massage a data file to cull duplicate keys before loading to a SQL table with a no duplicate constraint.
If you find a way, please post your method!
Roy
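For reference, the same cull can be done outside TextPad without the index-column trick, and without sorting at all. A minimal awk sketch, assuming awk is available on the path; input.txt and output.txt are placeholder file names:

    # Print a line only the first time it is seen; original order is preserved.
    # seen[$0]++ is zero (false) on the first occurrence and non-zero afterwards.
    # input.txt and output.txt are placeholder names.
    awk '!seen[$0]++' input.txt > output.txt

The first occurrence of each line is kept, so records keep their starting positions, which is exactly what the index-column round trip is meant to achieve.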
Re: Delete duplicated lines
I have an awk script which will delete duplicate lines, but I am having a problem setting up TextPad to run it as a tool. At the moment I have to save the file I want to de-dup and run the awk script from a Cygwin bash shell.
How do I set up TextPad to run the awk script? Any ideas? Here is the awk script:
#! D:/Applications/Cygwin/bin/awk -f
# Remember each distinct line, in input order, as it is read.
{
    if (data[$0]++ == 0)
        lines[++count] = $0
}
# After all input has been read, print the distinct lines in their original order.
END {
    for (i = 1; i <= count; i++)
        print lines[i]
}
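As for wiring this up as a TextPad tool, something along these lines should work. It is only a sketch: it assumes the tool is added under Configure > Preferences > Tools, that the script has been saved to a file, and that TextPad's $File macro expands to the full path of the current document:

    Command:    D:/Applications/Cygwin/bin/awk.exe
    Parameters: -f "D:/path/to/dedup.awk" "$File"
    (the path to dedup.awk is a placeholder; point it at wherever the script is saved)

The tool reads the file from disk, so the document still has to be saved first, and since awk writes to standard output the result comes back as tool output rather than replacing the document in place.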
Re: Delete duplicated lines
Search the forum for "dedup": I posted a corrected awk script to remove duplicate lines, and how to implement it, in another thread.