Extract Email addresses

General questions about using TextPad

Tommasino
Posts: 2
Joined: Thu Nov 12, 2009 6:40 pm

Post by Tommasino »

I need to extract email addresses from a large, messy text file. The addresses sit between various tags, but the layout is not very consistent and there are no line breaks. Here is an example:

What I have:
---------------------------------------------------
aage.something@clp.noKommunal LandspensjonskasseInsurance Company<_tags/>Asomething@Summitsoemthing.comAngeloSomething Biz MgmtMulti-Dweller OfficeUSARoswellGA(622) 355-3116<_tags/>aa_se@macalusterinstitution.eduMacalester CollegeUniversity<_tags/>cd@bidart-reiman.com

What I need (comma or tab delimited):
---------------------------------------------------
aage.something@clp.no, Asomething@Summitsoemthing.com, aa_se@macalusterinstitution.edu, cd@bidart-reiman.com

Any suggestions for a regex that would get me there (or close to)?
Thanks.
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

Here is an approach based on the simple sample file you supplied.

Take two steps:

1. Insert a line break ("\n") after each tag so that each email address starts at the beginning of a line.
Search for: <_tags/>
Replace with: \0\n


2. Extract the email address from the front of each line.
Search for: ^(.*@.*\.[a-z]{1,3}[^[:upper:]]).*
Replace with: \1

You can then replace each line break with a comma and a space, or with a tab.

Use the following settings:
-----------------------------------------
[X] Match case
[X] Regular expression
Replace All
-----------------------------------------
Configure | Preferences | Editor
[X] Use POSIX regular expression syntax
-----------------------------------------
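
For anyone who would rather script this than run it inside TextPad, here is a rough Python sketch of the same two steps. It is not TextPad syntax, just an illustration under a few assumptions: Python's re module has no [[:upper:]] class, so [^A-Z] stands in for [^[:upper:]]; the split on the tag replaces the "\0\n" step; and the names TAG, EMAIL_AT_FRONT and extract_emails are made up for this example.

```python
import re

TAG = "<_tags/>"

# Stand-in for the step 2 search string. Python's re module does not
# support POSIX classes, so [^A-Z] is assumed in place of [^[:upper:]].
EMAIL_AT_FRONT = re.compile(r"^(.*@.*\.[a-z]{1,3}[^A-Z])")


def extract_emails(text):
    """Return the address found at the front of each tag-delimited record."""
    emails = []
    # Step 1: break the text after each tag (same effect as adding "\n").
    for record in text.split(TAG):
        # Step 2: keep only the address at the front of the record.
        match = EMAIL_AT_FRONT.match(record)
        if match:
            emails.append(match.group(1))
    return emails


sample = (
    "aage.something@clp.noKommunal LandspensjonskasseInsurance Company<_tags/>"
    "Asomething@Summitsoemthing.comAngeloSomething Biz MgmtMulti-Dweller "
    "OfficeUSARoswellGA(622) 355-3116<_tags/>"
    "aa_se@macalusterinstitution.eduMacalester CollegeUniversity<_tags/>"
    "cd@bidart-reiman.com"
)

# Join with ", " for comma-delimited output, or "\t" for tab-delimited.
print(", ".join(extract_emails(sample)))
```

Like the TextPad search string, this relies on each address being followed directly by an uppercase letter or by the end of the record, so it only fits data shaped like the sample.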

Email addresses can be more complex than this, and there are far more elaborate search strings that could be used, but, as noted above, if your addresses follow the format shown in your example, this will probably be OK for you.
Hope this was helpful.............good luck,
Bob
Tommasino
Posts: 2
Joined: Thu Nov 12, 2009 6:40 pm

Post by Tommasino »

Wow! It worked great. Just some minor manual cleaning. Thanks!
Do you know of any resources (online, books) where I can learn more about regex?
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

TextPad Help, under Using Regular Expressions, for help with TextPad's own syntax.

Mastering Regular Expressions, Jeffrey Friedl, O'Reilly Media (2nd or 3rd edition).

Regular Expressions Cookbook, Jan Goyvaerts and Steven Levithan, O'Reilly Media (includes a tutorial).