Is it possible to use the Compare Tool for this?
Posted: Mon Jul 08, 2013 8:27 am
by Kelly
I have two lists. The first list (UMM) contains (75) towns and cities in Maine for parcels layers provided by the University of Maine at Machias and in the second list (MEGIS) there are (235) towns and cities in Maine for parcels layers provided by the Maine Office of GIS.
The Compare Files... tool produces output that is slightly bewildering to me, and it falls short of this goal:
Create separate lists showing:
1) Only those towns and cities which are in UMM, but not MEGIS
2) Only those towns and cities which are in MEGIS, but not UMM
3) Only those towns and cities which are in both MEGIS and UMM
I went looking for a macro or add-on that might help create the uncommon / common sets but couldn't find any, and searching the forums didn't turn anything up either ... so I'm still wondering: Is it possible to use the Compare Tool for this?
Thank you very much for any hints and pointers.
Kindest regards,
Kelly
Posted: Tue Jul 09, 2013 8:16 am
by ak47wong
You could use the Compare Files tool, but it's not ideal for this. Here's one solution, which doesn't actually use TextPad and instead requires a separate program, namely the comm utility from UNIX.
The only prerequisite is that your files, UMM.txt and MEGIS.txt, must first be sorted. Then follow these steps:
- Download this set of utilities: http://sourceforge.net/projects/unxutils/
- Extract the file usr\local\wbin\comm.exe from the archive UnxUtils.zip.
- At the Windows Command Prompt, enter these commands:
comm -2 -3 UMM.txt MEGIS.txt >UMM_only.txt
comm -1 -3 UMM.txt MEGIS.txt >MEGIS_only.txt
comm -1 -2 UMM.txt MEGIS.txt >common.txt
This produces the following three files:
1) Only those towns and cities which are in UMM, but not MEGIS:
UMM_only.txt
2) Only those towns and cities which are in MEGIS, but not UMM:
MEGIS_only.txt
3) Only those towns and cities which are in both MEGIS and UMM:
common.txt
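For readers who don't have comm to hand, the same three lists can be sketched in a few lines of Python using sets. This is only an illustration and not something from the thread: the function name split_lists and the sample town names are made up, and it assumes one town per line with trailing white space ignored.

```python
def split_lists(umm_lines, megis_lines):
    """Return (umm_only, megis_only, common) as sorted lists.

    Hypothetical helper for illustration; not part of comm or TextPad.
    """
    # Strip trailing white space and skip blank lines before comparing.
    umm = {line.rstrip() for line in umm_lines if line.strip()}
    megis = {line.rstrip() for line in megis_lines if line.strip()}
    return sorted(umm - megis), sorted(megis - umm), sorted(umm & megis)


# Made-up sample data, not the real UMM / MEGIS lists.
umm_sample = ["Bangor", "Machias", "Orono"]
megis_sample = ["Augusta", "Bangor", "Portland"]

umm_only, megis_only, common = split_lists(umm_sample, megis_sample)
print(umm_only)    # ['Machias', 'Orono']
print(megis_only)  # ['Augusta', 'Portland']
print(common)      # ['Bangor']
```

Unlike comm, this sketch does not need sorted input, and it silently collapses duplicate lines, which may or may not be what you want.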
Perhaps ben_josephs can offer a Perl script to do the same thing!
Posted: Tue Jul 09, 2013 11:39 am
by Kelly
ak47wong,
Thank you very much for the kind reply and suggestion.
I'll give comm a try once I have replaced the CMOS backup battery (CR2032) for my Ubuntu 12.04 LTS box (old Dell Dimension 8300 desktop) - hopefully later today.
The first battery lasted from 2003 to about 2010, when I retired the 8300, but the CR2032's replacement (from Radio Shack) lasted less than 9 months!
Ah, but now I see I was a little confused - this is for Windows! Good to learn. And for anybody else (like me) looking for an expansion of comm, here's one link:
http://www.gnu.org/software/coreutils/m ... ation.html
I'll still want to get that battery soon.
Thanks again! Very nice of you.
Kelly
Posted: Tue Jul 09, 2013 11:40 am
by ben_josephs
ak47wong wrote:Perhaps ben_josephs can offer a Perl script to do the same thing!
I could offer such a script; it would be easy to write.
But I'm not sure there would be much benefit in writing a new command-line tool when, as you have pointed out, a perfectly good one already exists. And it would require the installation of Perl, which might be of no other use to the OP, who I suspect is not a programmer.
I have no idea whether the versions of Unix tools that you recommend are good implementations. I use Cygwin (http://www.cygwin.com/), which is a huge and regularly updated collection of Linux tools for Windows.
Posted: Tue Jul 09, 2013 11:49 am
by Kelly
Hi Ben - you suspect correctly! - I'm no programmer (hello world! is about as far as I got).
And just to be sure, is the default sorting that TextPad 7.0.9 provides sufficient for comm's purposes? For example, should the 'Strip trailing spaces from lines when saving' check box be checked in the Properties dialog box for both files being compared?
Posted: Tue Jul 09, 2013 12:06 pm
by ben_josephs
Yes, if your files use only single-byte characters (not Unicode) then I believe TextPad's ascending, case-sensitive sort is what is required.
And yes, if the files may contain different amounts of trailing white space you should strip that white space, as comm treats lines that differ in the amount of white space as different.
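To make the two requirements above concrete, here is a small Python sketch of the clean-up step. It is an assumption-laden illustration rather than a description of what TextPad or comm does internally: the function name normalize is made up, the sample lines are invented, and it assumes plain ASCII text compared in case-sensitive (code point) order.

```python
def normalize(lines):
    """Strip trailing white space from each line, then sort case-sensitively.

    Hypothetical helper for illustration only.
    """
    stripped = [line.rstrip() for line in lines]
    # sorted() compares strings by code point, so for ASCII text every
    # uppercase letter sorts before every lowercase letter.
    return sorted(stripped)


# Made-up sample lines with stray trailing spaces, tabs and newlines.
raw = ["Orono  \n", "bangor\n", "Machias\t\n", "Bangor\n"]

print(normalize(raw))        # ['Bangor', 'Machias', 'Orono', 'bangor']
print("Orono  " == "Orono")  # False: trailing spaces make the lines differ
```

The last line shows why the stripping matters: two entries that look identical on screen are different strings if one carries trailing spaces.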
Posted: Tue Jul 09, 2013 12:18 pm
by ben_josephs
Of course, if you're using the command line you might use the Windows or Linux command-line sort to sort the files. And you might use sed to remove the trailing white space.
Posted: Tue Jul 09, 2013 12:54 pm
by Kelly
Thanks Ben! Actually, the sorting and trailing-space stripping by TextPad proved sufficient for comm.
The -1, -2, -3 switches are good to know about, but running comm without any switches actually produced three columns showing all three sets in one output (it just needed a couple of extra tabs between columns).
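For anyone curious how that three-column output comes about, here is a rough Python sketch of the idea: walk both sorted lists in step and indent each line according to where it was found. It is a simplified imitation, not comm's actual source; the function name comm_columns and the sample towns are made up, and both inputs are assumed to be already sorted with trailing white space removed.

```python
def comm_columns(first, second):
    """Merge two sorted lists into comm-style, tab-indented lines.

    Hypothetical helper: no tab = only in first, one tab = only in
    second, two tabs = in both.
    """
    out = []
    i = j = 0
    while i < len(first) and j < len(second):
        if first[i] < second[j]:
            out.append(first[i])            # column 1: only in first
            i += 1
        elif first[i] > second[j]:
            out.append("\t" + second[j])    # column 2: only in second
            j += 1
        else:
            out.append("\t\t" + first[i])   # column 3: in both
            i += 1
            j += 1
    # Whatever is left over in either list is unique to that list.
    out.extend(first[i:])
    out.extend("\t" + line for line in second[j:])
    return out


# Made-up sample data, not the real UMM / MEGIS lists.
umm_sample = ["Bangor", "Machias", "Orono"]
megis_sample = ["Augusta", "Bangor", "Portland"]
print("\n".join(comm_columns(umm_sample, megis_sample)))
```

This also shows why the inputs must be sorted the same way: the walk only ever looks at the current line of each list, so an out-of-order entry would be reported as unique even if it appears in both.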
Very cool little utility - thank you both for your help.
Kelly
Posted: Thu Jul 18, 2013 10:57 pm
by jeffy
Great find! Thanks ak47wong!
Posted: Fri Jul 19, 2013 5:09 am
by ak47wong
Thanks jeffy, but it wasn't really a "find" on my part; the comm utility has been a part of UNIX for 40 years now.

Posted: Fri Jul 19, 2013 7:51 am
by Kelly
Hi AK47wong, I'm with Jeff - it was a great find for me, with many thanks to you for sharing it.
Best,
Kelly