


Re: better way to do this in python

References: <BANLkTimnvP0mrvt6dOj_EJ3aC8+0sRfNXg@mail.gmail.com> <BANLkTimxzZXaLvj=L4B33zYtQOBhedbtCQ@mail.gmail.com> <BANLkTinfNzHN+e7B4B01MNbGQ_r9SfZEKQ@mail.gmail.com>
Date: 2011-04-03 08:06 -0400
Subject: Re: better way to do this in python
From: Mag Gam <magawake@gmail.com>
Newsgroups: comp.lang.python
Message-ID: <mailman.167.1301832402.2990.python-list@python.org>



Thanks for the responses.


Basically, I have a large file in this format:

Date INFO username command srcipaddress filename


I would like to do statistics on:
total number of usernames and who they are
username and commands
username and filenames
unique source ip addresses
unique filenames

Then I would like to bucket the findings by day (date).

Overall, I would like to build a log file analyzer.
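
A minimal sketch of such an analyzer, assuming the six fields are separated by whitespace, that no field (date or filename included) contains a space, and that the date field is already a per-day value. The sample lines and the analyze() helper are hypothetical, made up only for illustration:

```python
from collections import Counter, defaultdict

# Hypothetical sample data in the format:
# Date INFO username command srcipaddress filename
SAMPLE = """\
2011-04-01 INFO alice get 10.0.0.1 report.txt
2011-04-01 INFO bob put 10.0.0.2 data.csv
2011-04-02 INFO alice get 10.0.0.1 data.csv
"""


def new_day():
    """Return an empty set of statistics for one day."""
    return {
        "users": Counter(),          # username -> number of lines
        "user_commands": Counter(),  # (username, command) -> count
        "user_files": Counter(),     # (username, filename) -> count
        "ips": set(),                # unique source IP addresses
        "files": set(),              # unique filenames
    }


def analyze(lines):
    """Collect statistics from an iterable of log lines, bucketed by date."""
    stats = defaultdict(new_day)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        date, _level, user, command, ip, filename = fields
        day = stats[date]
        day["users"][user] += 1
        day["user_commands"][(user, command)] += 1
        day["user_files"][(user, filename)] += 1
        day["ips"].add(ip)
        day["files"].add(filename)
    return stats


stats = analyze(SAMPLE.splitlines())
for date in sorted(stats):
    day = stats[date]
    print(date, len(day["users"]), "users:", sorted(day["users"]))
    print("  unique ips:", len(day["ips"]), "unique files:", len(day["files"]))
```

For a real log, an open file object could be passed to analyze() in place of SAMPLE.splitlines(), so the file is read one line at a time.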



On Sat, Apr 2, 2011 at 10:59 PM, Dan Stromberg <drsalists@gmail.com> wrote:
>
> On Sat, Apr 2, 2011 at 5:24 PM, Chris Angelico <rosuav@gmail.com> wrote:
>>
>> On Sun, Apr 3, 2011 at 9:58 AM, Mag Gam <magawake@gmail.com> wrote:
>> > I suppose I can do something like this.
>> > (pseudocode)
>> >
>> > d = {}
>> > try:
>> >     d[key] += 1
>> > except KeyError:
>> >     d[key] = 1
>> >
>> >
>> > I was wondering if there is a Pythonic way of doing this? I plan on
>> > doing this many times for various files. Would the Python collections
>> > module be sufficient?
>>
>> I think you want collections.Counter. From the docs: "Counter objects
>> have a dictionary interface except that they return a zero count for
>> missing items instead of raising a KeyError".
>>
>> ChrisA
>
> I realize you (Mag) asked for a Python solution, but since you mention
> awk... you can also do this with "sort < input | uniq -c" - one line of
> "code".  GNU sort doesn't use as nice an algorithm as a hashing-based
> solution (like you'd probably use with Python), but for a sort, GNU sort's
> quite good.
>
>
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
>
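
For comparison, the try/except counting idiom quoted above could be written with collections.Counter roughly as follows. This is only a sketch: the keys list is hypothetical stand-in data, and the final loop approximates (rather than reproduces exactly) the output layout of "sort < input | uniq -c":

```python
from collections import Counter

# Hypothetical keys; in practice these would be read from the log file.
keys = ["alice", "bob", "alice", "carol", "alice"]

counts = Counter()
for key in keys:
    counts[key] += 1  # a missing key counts as 0, so no KeyError handling

# Counter(keys) builds the same tallies in a single call.

# Print "count item" pairs sorted by item, similar to sort | uniq -c.
for key in sorted(counts):
    print(counts[key], key)
```

Counter is available from Python 2.7 and 3.1 onward; on older versions, collections.defaultdict(int) supports the same counts[key] += 1 pattern.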


