How to find every nth word?

General questions about using TextPad

Kelly
Posts: 34
Joined: Sat May 28, 2011 8:59 am
Location: Ellsworth, ME

How to find every nth word?

Post by Kelly »

Could anybody please tell me how to write a regular expression to find every nth word?

Thank you very kindly,

Kelly
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

You can search for sequences of, say, 5 words with some variation of
(?:\w+\W+){4}(\w+)\W*

You can create a list of every 5th word using a replacement expression similar to
$1\n

If you don't want to find sequences that straddle newlines, try something like
(?:\w+[^\n\w]+){4}(\w+)[^\n\w]*
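
For anyone who wants to try the expressions above outside TextPad, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for TextPad's regex engine (the two may differ in details such as what `\w` matches), and the sample strings and variable names are made up for illustration.

```python
import re

# Pattern for n = 5: skip four words, then capture the fifth.
EVERY_FIFTH = re.compile(r"(?:\w+\W+){4}(\w+)\W*")

# Variant whose separators exclude newlines, so a group of five
# words cannot straddle a line break.
EVERY_FIFTH_PER_LINE = re.compile(r"(?:\w+[^\n\w]+){4}(\w+)[^\n\w]*")

# Hypothetical sample text with eleven words.
text = "one two three four five six seven eight nine ten eleven"

# findall returns the contents of the single capture group per match.
fifth_words = EVERY_FIFTH.findall(text)

# Equivalent of the "$1\n" replacement; Python writes the group as \1.
# The unmatched trailing word ("eleven") is left in place.
listing = EVERY_FIFTH.sub(r"\1\n", text)

# Hypothetical two-line sample: three words, then six words.
two_lines = "a b c\nd e f g h i"
across_lines = EVERY_FIFTH.findall(two_lines)        # counts across the newline
per_line = EVERY_FIFTH_PER_LINE.findall(two_lines)   # restarts on each line

print(fifth_words, across_lines, per_line)
```

Changing the `{4}` to `{n-1}` gives every nth word. Note that the leftover word at the end of the text survives the replacement here, since nothing in the pattern matches it.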
Kelly
Posts: 34
Joined: Sat May 28, 2011 8:59 am
Location: Ellsworth, ME

Post by Kelly »

Thank you so much Ben!

I'm just beginning to appreciate the power of regex - pretty amazing.

Kelly
ben_josephs
Posts: 2464
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

Here's an improvement. With this regex the replacement will remove the residue of words after all the groups of 5 have been matched:
(?:(?:\w+\W+){4}(\w+)\W*|.+)
(This relies on the fact that in TextPad's (and most other) regex recognisers, alternation (...|...) is not greedy. The alternatives are tried one by one from the left. Once one of them has matched, all subsequent ones are ignored, even if they might have found a longer match.)
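
The left-to-right behaviour of alternation, and its effect on the leftover words, can be checked with a short sketch. It again assumes Python's `re` module (version 3.5 or later, where an unmatched group is replaced by an empty string) rather than TextPad itself, and the sample text is invented.

```python
import re

# Alternatives are tried from the left: "a" matches first, so the
# longer alternative "ab" is never attempted.
ordered = re.match(r"a|ab", "ab").group(0)

# Improved pattern: either a full group of five words, or (failing
# that) whatever is left on the line.
IMPROVED = re.compile(r"(?:(?:\w+\W+){4}(\w+)\W*|.+)")

# Hypothetical sample text with eleven words.
text = "one two three four five six seven eight nine ten eleven"

# When the ".+" alternative matches the residue, group 1 did not take
# part in the match, so \1 contributes nothing and only "\n" is emitted.
listing = IMPROVED.sub(r"\1\n", text)

# Collect just the captured words, skipping the residue match.
fifth_words = [m.group(1) for m in IMPROVED.finditer(text)
               if m.group(1) is not None]

print(repr(ordered), repr(listing), fifth_words)
```

In this sketch the residue "eleven" is removed, but its match still produces the trailing `\n` from the replacement, so the listing ends with an extra blank line.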