
New and Frustrated...Need help with RegEx code

Posted: Thu Aug 19, 2010 3:06 am
by Pierce
I have 12K .html files that have variants of this text in them:
named=acantha+leaf'>acantha leaf
named=ambrominas+leaf'>ambrominas leaf
named=angelica+root'>angelica root
named=barley+grass'>barley grass
named=blaestonberry+blossom'>blaestonberry blossom
named=blue+trafel+mushroom'>blue trafel mushroom
named=bright+red+teaberry'>bright red teaberry
named=bunch+of+wild+grapes'>bunch of wild grapes
named=crimson+dragonstalk'>crimson dragonstalk
named=daggerstalk+mushroom'>daggerstalk mushroom
named=fennel+bulb'>fennel bulb
I need the end result to look like this:

acantha leaf, ambrominas leaf, angelica root, barley grass, etc.

If anyone can give me a pointer that will just kill all the data from "named" to ">" it will be a great help. I've been fighting with this for hours and am about ready to toss my laptop across the room.

Thanks.

Posted: Thu Aug 19, 2010 11:40 am
by SteveH
You need to perform the following search and replace:

Find what:
^.*>
Replace with:
Nothing - leave it blank

Do this with regular expressions enabled for the search, and with POSIX regular expression syntax enabled in the general preferences.

This searches for any run of characters ('.' repeated by '*') starting at the beginning of a line ('^') and ending at a '>' character, and deletes it. Because '*' is greedy, the match runs to the last '>' on the line. You could omit the ^ character, but I would leave it in, as it limits the number of matches to one on each line.
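
For anyone who wants to try the same pattern outside the editor, here is a rough Python sketch of it. It is only an illustration: the sample lines are copied from the question, the function name strip_to_last_gt is made up for this example, and Python's re module stands in for the editor's regex engine, so details may differ.

```python
import re

# Three sample lines taken from the original question.
sample = (
    "named=acantha+leaf'>acantha leaf\n"
    "named=ambrominas+leaf'>ambrominas leaf\n"
    "named=angelica+root'>angelica root"
)

def strip_to_last_gt(text):
    """Delete everything from the start of each line up to the last '>'.

    re.MULTILINE makes '^' match at the start of every line, and '.'
    does not cross newlines, so each line gets exactly one match.
    (Hypothetical helper name, used only for this illustration.)
    """
    return re.sub(r"^.*>", "", text, flags=re.MULTILINE)

cleaned = strip_to_last_gt(sample)
print(cleaned)
```

One thing to watch: since the match is greedy, a line that has another '>' after the name (a closing tag, say) would lose the name as well, so check a few real lines first.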

Hope this helps.

Posted: Thu Aug 19, 2010 12:01 pm
by Mike Olds
Hello,

The difference here: I'm not using POSIX syntax, and I'm also joining the lines together.

I can do this in two steps:

This is using Perl regular expressions:

First:

Search:

named=.*'>

Replace:
[empty]

Then:

Search:

[space]\R

Replace: X literal

,[space]
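
The two steps above can be sketched in Python roughly like this. Treat it as an approximation under a few assumptions: Python's re module does not support \R, so \r?\n is used in its place; the optional space before the line break and the stripping of the final line break are my additions; and the function names are invented for this example.

```python
import re

# Sample lines taken from the original question, with a trailing newline.
sample = (
    "named=acantha+leaf'>acantha leaf\n"
    "named=ambrominas+leaf'>ambrominas leaf\n"
    "named=angelica+root'>angelica root\n"
)

def strip_prefix(text):
    """Step 1: delete everything from 'named=' through the closing '>."""
    return re.sub(r"named=.*'>", "", text)

def join_lines(text):
    """Step 2: replace each line break (and a space before it, if any)
    with ', '. The trailing line break is stripped first so the result
    does not end in a dangling comma.
    """
    return re.sub(r" ?\r?\n", ", ", text.strip())

result = join_lines(strip_prefix(sample))
print(result)  # acantha leaf, ambrominas leaf, angelica root
```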



----------------------

PS: When you are battling your way through the learning curve it can help to set up a test file in TextPad, along with a Clip library of regular expression bits and scraps.

Then in WE: >File, Load Test File
and: Edit: Test Find and Test Replace

Saves a lot of grief over using live files.

Posted: Thu Aug 19, 2010 1:01 pm
by Pierce
Thank you both very much - totally worked.

Ordered my copy of Mastering Regular Expressions by Friedl last night. Hopefully I'll be able to contribute in the future.

Made my morning coming in and seeing this solution.

~ Pierce