Find x in log file, find url, eliminate all lines with that

General questions about using TextPad

Mike Olds
Posts: 226
Joined: Wed Sep 30, 2009 3:27 pm

Post by Mike Olds »

Greetings,

I am trying to clean up my log files for analysis by eliminating crackers who slip past the various ways I have of eliminating listings (robots, etc.), and I would like to create a regex that says:

Find any line containing x, identify the url from that line, and eliminate all other lines with that url.

This is orders of magnitude beyond my knowledge of regex, and I would appreciate any help offered.

Thanks in advance!
Thank you, and
Best Wishes,
Obo
http://buddhadust.net/
check out the What's New? Oblog:
http://buddhadust.net/dhammatalk/dhamma ... ts.new.htm
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

In my opinion, this cannot be done using one regex.

I'd probably use a Perl script with roughly this algorithm:

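open the log file for reading
while not at end of log file
  read a line
  if it contains x
    find the url in this line and add it to the remembered urls
go back to the start of the log file
open a copy of the file for writing
while not at end of log file
  read a line
  if it does not contain any remembered url
    write the line to the copy
close both files
delete log file
rename copy to original name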
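
For anyone who can run a script, here is a minimal Python sketch of that algorithm (not the Perl script mentioned above, which was never posted). It reads the log twice so that lines which come before the first line containing x are also dropped. The function names, the `.tmp` suffix and the `URL_PATTERN` regex are assumptions made for illustration; in particular, the pattern assumes the "url" appears as a full http(s) address, so adjust it to whatever identifies a visitor in your log format.

```python
import os
import re

# Assumption: a "url" is any http:// or https:// token; adapt to your log format.
URL_PATTERN = re.compile(r"https?://\S+")


def remove_lines_for_flagged_urls(lines, marker, url_pattern=URL_PATTERN):
    """Return the lines that contain none of the URLs found on marker lines."""
    # Pass 1: collect every URL that appears on a line containing the marker.
    flagged = set()
    for line in lines:
        if marker in line:
            flagged.update(url_pattern.findall(line))

    # Pass 2: keep only lines that mention none of the flagged URLs.
    # Plain substring matching, so a flagged URL also matches longer URLs
    # that start with it.
    return [line for line in lines if not any(url in line for url in flagged)]


def clean_log_file(path, marker):
    """Rewrite the log at `path` without the lines for flagged URLs."""
    with open(path, encoding="utf-8") as log:
        lines = log.readlines()
    kept = remove_lines_for_flagged_urls(lines, marker)
    copy_path = path + ".tmp"  # hypothetical name for the working copy
    with open(copy_path, "w", encoding="utf-8") as copy:
        copy.writelines(kept)
    os.replace(copy_path, path)  # replace the original with the cleaned copy
```

Note that the line containing x is removed too, since it contains the flagged url itself. Try it on a copy of the log first, for example `clean_log_file("access_copy.log", "x")`, where both the file name and the search text are placeholders.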

Or, if you want to do it without a script:

load the file into TextPad,
search for x,
select the url in the first found line,
search for the url using Mark All,
then delete the bookmarked lines,
and repeat until x is no longer found

Post by Mike Olds »

EDIT 2: A simpler way to do this is to sort first on the IP address (the first column) and then do the search. The other lines from the same IP address are then adjacent to each hit, so they are easy to identify and delete.
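
For comparison, the same idea keyed on the IP address can be sketched in Python. This is only an illustration under the assumption stated above, that the IP address is the first whitespace-separated column of each line; the function names are made up for the example. A script does not need the sort, because it can remember the flagged addresses in a set.

```python
def first_column(line):
    """Return the first whitespace-separated field of a line, or '' if none."""
    fields = line.split(None, 1)
    return fields[0] if fields else ""


def remove_lines_for_flagged_ips(lines, marker):
    """Drop every line whose first column matches the IP of a marker line."""
    # Assumption: the client IP address is the first column of each log line.
    flagged = {first_column(line) for line in lines if marker in line}
    return [line for line in lines if first_column(line) not in flagged]
```

Comparing the whole first column, rather than searching for the address anywhere in the line, keeps 10.0.0.1 from also matching 10.0.0.10.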

EDIT: Thank you again, MudGuard. I am successfully using the two-stroke regex routine.

Best,
mo


Thanks, MudGuard. Scripts are out for me, but I will give your regex suggestions a try and report back.