| References | <BANLkTimnvP0mrvt6dOj_EJ3aC8+0sRfNXg@mail.gmail.com> <BANLkTimxzZXaLvj=L4B33zYtQOBhedbtCQ@mail.gmail.com> <BANLkTinfNzHN+e7B4B01MNbGQ_r9SfZEKQ@mail.gmail.com> |
|---|---|
| Date | 2011-04-03 08:06 -0400 |
| Subject | Re: better way to do this in python |
| From | Mag Gam <magawake@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.167.1301832402.2990.python-list@python.org> |
Thanks for the responses.
Basically, I have a large file with this format:
Date INFO username command srcipaddress filename
I would like to do statistics on:
total number of usernames and who they are
username and commands
username and filenames
unique source ip addresses
unique filenames
Then I would like to bucket the findings by day (date).
Overall, I would like to build a log file analyzer.
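
A minimal sketch of such an analyzer, assuming every line has exactly six whitespace-separated fields in the order given above and that no field (including the date and the filename) contains spaces. The function name `analyze`, the dictionary keys, and the sample lines are all hypothetical, made up for illustration; the real log format may need a different split.

```python
from collections import Counter, defaultdict


def analyze(lines):
    """Collect overall and per-day statistics from log lines.

    Assumed format (six whitespace-separated fields):
    date INFO username command srcipaddress filename
    """
    stats = {
        "users": Counter(),              # username -> number of lines
        "user_commands": Counter(),      # (username, command) -> count
        "user_files": Counter(),         # (username, filename) -> count
        "ips": set(),                    # unique source IP addresses
        "files": set(),                  # unique filenames
        "by_day": defaultdict(Counter),  # date -> Counter of usernames
    }
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip lines that do not match the assumed format
        date, _level, user, command, ip, filename = fields
        stats["users"][user] += 1
        stats["user_commands"][(user, command)] += 1
        stats["user_files"][(user, filename)] += 1
        stats["ips"].add(ip)
        stats["files"].add(filename)
        stats["by_day"][date][user] += 1
    return stats


# Hypothetical sample data, not taken from a real log.
sample = [
    "2011-04-01 INFO alice get 10.0.0.1 report.txt",
    "2011-04-01 INFO bob put 10.0.0.2 data.csv",
    "2011-04-02 INFO alice get 10.0.0.1 data.csv",
]
stats = analyze(sample)
print(len(stats["users"]), "users:", sorted(stats["users"]))
print(len(stats["ips"]), "unique source IPs")
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines and never holds the whole file in memory.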
On Sat, Apr 2, 2011 at 10:59 PM, Dan Stromberg <drsalists@gmail.com> wrote:
>
> On Sat, Apr 2, 2011 at 5:24 PM, Chris Angelico <rosuav@gmail.com> wrote:
>>
>> On Sun, Apr 3, 2011 at 9:58 AM, Mag Gam <magawake@gmail.com> wrote:
>> > I suppose I can do something like this.
>> > (pseudocode)
>> >
>> > d = {}
>> > try:
>> >     d[key] += 1
>> > except KeyError:
>> >     d[key] = 1
>> >
>> >
>> > I was wondering if there is a pythonic way of doing this? I plan on
>> > doing this many times for various files. Would the python collections
>> > class be sufficient?
>>
>> I think you want collections.Counter. From the docs: "Counter objects
>> have a dictionary interface except that they return a zero count for
>> missing items instead of raising a KeyError".
>>
>> ChrisA
>
> I realize you (Mag) asked for a Python solution, but since you mention
> awk... you can also do this with "sort < input | uniq -c" - one line of
> "code". GNU sort doesn't use as nice an algorithm as a hashing-based
> solution (like you'd probably use with Python), but for a sort, GNU sort's
> quite good.
>
>
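
For reference, the collections.Counter suggestion quoted above could replace the try/except pattern roughly as follows. This is only a sketch; the list `commands` is made-up sample data. It gives about the same tally as "sort < input | uniq -c", but by hashing rather than sorting.

```python
from collections import Counter

# Hypothetical sample data standing in for one column of the log.
commands = ["get", "put", "get", "ls", "get"]

counts = Counter()
for key in commands:
    counts[key] += 1  # missing keys count as 0, so no KeyError handling

# Counter(commands) builds the same tally in a single call.
assert counts == Counter(commands)

# most_common() lists (item, count) pairs, highest count first.
for item, n in counts.most_common():
    print(n, item)
```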