What I have so far:
def unique_ips():
    f = open('logfile','r')
    ips = set()
    for line in f:
        ip = line.split()[0]
        print ip
        for date in ip:
            logdate = line.split()[3]
            print "\t", logdate
            for entry in logdate:
                info = line.split()[5:11]
                print "\t\t", info
        ips.add(ip)
unique_ips()
The part I am having trouble with is:
for entry in logdate:
    info = line.split()[5:20]
    print "\t\t", info
I have a log file that I have to sort first by IP, then by time, then by errors.
The output should look like:
199.21.99.83
    [30/Jun/2013:07:18:30
        ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"']
but instead I'm getting:
199.21.99.83
    [30/Jun/2013:07:18:30
        ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"']
        ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"']
        ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"']
        ['"GET', '/searchme/index.php?f=man_soweth', 'HTTP/1.1"', '200', '8676', '"-"']
        ...
I'm sure I am running into some sort of syntax issue but would appreciate the help!
Log file looks like:
99.21.99.83 - - [30/Jun/2013:07:15:50 -0500] "GET /lenny/index.php?f=13 HTTP/1.1" 200 11244 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.83 - - [30/Jun/2013:07:16:13 -0500] "GET /searchme/index.php?f=being_fruitful HTTP/1.1" 200 7526 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.83 - - [30/Jun/2013:07:16:45 -0500] "GET /searchme/index.php?f=comparing_themselves HTTP/1.1" 200 7369 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
66.249.73.40 - - [30/Jun/2013:07:16:56 -0500] "GET /espanol/displayAncient.cgi?ref=isa%2054:3 HTTP/1.1" 500 167 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
199.21.99.83 - - [30/Jun/2013:07:17:00 -0500] "GET /searchme/index.php?f=tribulation HTTP/1.1" 200 7060 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.83 - - [30/Jun/2013:07:17:15 -0500] "GET /searchme/index.php?f=proud HTTP/1.1" 200 7080 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.83 - - [30/Jun/2013:07:17:34 -0500] "GET /searchme/index.php?f=soul HTTP/1.1" 200 7063 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
199.21.99.83 - - [30/Jun/2013:07:17:38 -0500] "GET /searchme/index.php?f=the_flesh_lusteth HTTP/1.1" 200 6951 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.c
The question was a little confusing because of the sample output, but I'm pretty sure that you want something like this:
def unique_ips():
    f = open('logfile','r')
    ips = {}
    # This for loop collects all of the ips with their associated errors
    for line in f:
        ip = line.split()[0]
        try:
            ips[ip].append(line)
        except KeyError:
            ips[ip] = [line]
    # This for loop goes through all the ips that were collected
    # and prints out all errors for those ips
    for ip, errors in ips.iteritems():
        print ip
        errors.sort()
        for e in errors:
            logdate = e.split()[3]
            print "\t", logdate
            info = e.split()[5:11]
            print "\t\t", info
    f.close()
Which produces this output from your sample file:
199.21.99.83
    [30/Jun/2013:07:16:13
        ['"GET', '/searchme/index.php?f=being_fruitful', 'HTTP/1.1"', '200', '7526', '"-"']
    [30/Jun/2013:07:16:45
        ['"GET', '/searchme/index.php?f=comparing_themselves', 'HTTP/1.1"', '200', '7369', '"-"']
    [30/Jun/2013:07:17:00
        ['"GET', '/searchme/index.php?f=tribulation', 'HTTP/1.1"', '200', '7060', '"-"']
    [30/Jun/2013:07:17:15
        ['"GET', '/searchme/index.php?f=proud', 'HTTP/1.1"', '200', '7080', '"-"']
    [30/Jun/2013:07:17:34
        ['"GET', '/searchme/index.php?f=soul', 'HTTP/1.1"', '200', '7063', '"-"']
    [30/Jun/2013:07:17:38
        ['"GET', '/searchme/index.php?f=the_flesh_lusteth', 'HTTP/1.1"', '200', '6951', '"-"']
66.249.73.40
    [30/Jun/2013:07:16:56
        ['"GET', '/espanol/displayAncient.cgi?ref=isa%2054:3', 'HTTP/1.1"', '500', '167', '"-"']
99.21.99.83
    [30/Jun/2013:07:15:50
        ['"GET', '/lenny/index.php?f=13', 'HTTP/1.1"', '200', '11244', '"-"']
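For anyone on Python 3, here is a minimal sketch of the same group-then-sort idea. It assumes the log lines look like the sample in the question. The names group_by_ip and print_report are my own, and parsing the timestamp with datetime.strptime (instead of sorting the raw lines as text) is a substitution I made so the order stays chronological across months; note that %b expects English month abbreviations under the default locale.

```python
from collections import defaultdict
from datetime import datetime


def group_by_ip(lines):
    """Group log lines by client IP, each group sorted by timestamp.

    Hypothetical helper: assumes field 3 holds the timestamp,
    formatted like "[30/Jun/2013:07:15:50".
    """
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        ip = fields[0]
        # Drop the leading "[" and parse the date so sorting is chronological.
        when = datetime.strptime(fields[3].lstrip("["), "%d/%b/%Y:%H:%M:%S")
        groups[ip].append((when, fields[3], fields[5:11]))
    for entries in groups.values():
        entries.sort(key=lambda entry: entry[0])
    return groups


def print_report(groups):
    """Print each IP, then its timestamps and request details, indented."""
    for ip, entries in groups.items():
        print(ip)
        for _, logdate, info in entries:
            print("\t" + logdate)
            print("\t\t" + str(info))
```

Something like print_report(group_by_ip(open('logfile'))) should then give output in the same shape as above, though the order of the IPs follows their first appearance in the file.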
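You have too many loops. You don't need the for entry in logdate loop. You're already looping over each line, and logdate is a string, so looping over it runs the body once per character. That is why the same info list is printed over and over. The for date in ip loop has the same problem.
Remove the for entry in logdate line and outdent the info assignment and print statements.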
(The comments already mentioned this.)
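To make that concrete, here is a minimal Python 3 sketch of the question's loop with the inner loops removed. The name parse_lines is hypothetical, and returning tuples rather than printing is my own choice so the result is easy to inspect; it assumes whitespace-separated lines like the sample log.

```python
def parse_lines(lines):
    """Return one (ip, logdate, info) tuple per log line.

    Hypothetical helper; assumes access-log lines like the
    sample in the question.
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        # One pass per line: no "for date in ip" or "for entry in logdate".
        rows.append((fields[0], fields[3], fields[5:11]))
    return rows


# Why the original repeated itself: a for loop over a string yields
# one character at a time, so the body ran once per character.
assert len(list("199.21.99.83")) == 12
```

Each line now produces exactly one date and one info list, so nothing is repeated.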