Pythonic way to count sequences

Started by: CM <cmpython@gmail.com>
First post: 2013-04-24 22:05 -0700
Last post: 2013-04-25 22:40 -0400
Articles: 8 (7 participants)


Contents

  Pythonic way to count sequences CM <cmpython@gmail.com> - 2013-04-24 22:05 -0700
    Re: Pythonic way to count sequences Chris Angelico <rosuav@gmail.com> - 2013-04-25 15:26 +1000
      Re: Pythonic way to count sequences CM <cmpython@gmail.com> - 2013-04-25 19:40 -0700
    Re: Pythonic way to count sequences Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-25 06:14 +0000
    Re: Pythonic way to count sequences Serhiy Storchaka <storchaka@gmail.com> - 2013-04-25 15:36 +0300
    Re: Pythonic way to count sequences Denis McMahon <denismfmcmahon@gmail.com> - 2013-04-25 23:29 +0000
      Re: Pythonic way to count sequences Modulok <modulok@gmail.com> - 2013-04-25 19:16 -0600
      Re: Pythonic way to count sequences Matthew Gilson <m.gilson1@gmail.com> - 2013-04-25 22:40 -0400

#44320 — Pythonic way to count sequences

From: CM <cmpython@gmail.com>
Date: 2013-04-24 22:05 -0700
Subject: Pythonic way to count sequences
Message-ID: <bfc57a5f-bbcb-46a4-b59a-e7fa55527da1@y12g2000yqb.googlegroups.com>
I have to count the number of various two-digit sequences in a list
such as this:

mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence appears 2 times.)

and tally up the results, assigning each to a variable.  The inelegant
first pass at this was something like...

# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...

# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta +=1
# etc... But I actually have more than 10 sequence types.

# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order

I can sense there is very likely an elegant/Pythonic way to do this,
and probably with a dict, or possibly with some Python structure I
don't typically use.  Suggestions sought.  Thanks.



#44321

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-04-25 15:26 +1000
Message-ID: <mailman.1051.1366867611.3114.python-list@python.org>
In reply to: #44320
On Thu, Apr 25, 2013 at 3:05 PM, CM <cmpython@gmail.com> wrote:
> I have to count the number of various two-digit sequences in a list
> such as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4)
> sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable.

You can use a tuple as a dictionary key, just like you would a string.
So you can count them up directly with a dictionary:

count = {}
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] = count.get(sequence_tuple,0) + 1

Also, since this is such a common thing to do, there's a standard
library way of doing it:

import collections
count = collections.Counter(list_of_tuples)

This doesn't depend on knowing ahead of time what your elements will
be. At the end of it, you can simply iterate over 'count' and get all
your counts:

for sequence,number in count.items():
	print("%d of %r" % (number,sequence))
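As a minimal end-to-end sketch of the Counter approach, here it is run on the sample data from the original post. The name list_of_tuples comes from that post's loop; the values are its mylist. The lookups at the end are only an illustration of how a Counter behaves for present and absent keys.

```python
import collections

# Sample data from the original post.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

count = collections.Counter(list_of_tuples)

# A Counter is a dict subclass, so a tuple key indexes it directly.
print(count[(2, 4)])  # 2

# A key that never appeared reads as 0 instead of raising KeyError,
# and the lookup does not add the key to the Counter.
print(count[(1, 2)])  # 0
```

So the tallies for sequences that never occur do not need to be initialised to 0 up front.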

ChrisA



#44377

From: CM <cmpython@gmail.com>
Date: 2013-04-25 19:40 -0700
Message-ID: <2a29a04e-ace9-43d2-98b8-9c45c9b963a4@g9g2000yqh.googlegroups.com>
In reply to: #44321
Thank you, everyone, for the answers.  Very helpful and knowledge-
expanding.



#44325

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-04-25 06:14 +0000
Message-ID: <5178c9b8$0$29977$c3e8da3$5496439d@news.astraweb.com>
In reply to: #44320
On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:

> I have to count the number of various two-digit sequences in a list such
> as this:
> 
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence
> appears 2 times.)
> 
> and tally up the results, assigning each to a variable.  The inelegant
> first pass at this was something like...
> 
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...

Do they absolutely have to be global variables like that? Seems like a 
bad design, especially if you don't know in advance exactly how many 
there are.


> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta +=1
> # etc... But I actually have more than 10 sequence types.

counts = {}
for t in list_of_tuples:
    counts[t] = counts.get(t, 0) + 1


Or, use collections.Counter:

from collections import Counter
counts = Counter(list_of_tuples)


> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order

Dicts are unordered, so getting the results in a specific order will be a 
bit tricky. You could do this:

results = sorted(counts.items(), key=lambda t: t[0])
results = [t[1] for t in results]

if you are lucky enough to have the desired order match the natural order 
of the tuples. Otherwise:

desired_order = [(2, 3), (3, 1), (1, 2), ...]
results = [counts.get(t, 0) for t in desired_order]
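A small worked sketch of the desired_order idea on the sample data from the question. The particular order below is hypothetical: it stands in for whatever sequences alpha, beta, delta, ... are meant to track, and only the first three come from the question's if-tests.

```python
from collections import Counter

# Sample data from the original post.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
counts = Counter(list_of_tuples)

# Hypothetical fixed order; position in this list plays the role
# of the separate alpha, beta, delta, ... names.
desired_order = [(1, 2), (2, 4), (2, 5), (3, 4)]

# counts.get(t, 0) yields 0 for sequences that never occurred.
result_list = [counts.get(t, 0) for t in desired_order]
print(result_list)  # [0, 2, 0, 1]
```

Sequences that occur in the data but are not listed in desired_order, such as (4, 5) here, are simply left out of the result.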



-- 
Steven



#44334

From: Serhiy Storchaka <storchaka@gmail.com>
Date: 2013-04-25 15:36 +0300
Message-ID: <mailman.1057.1366893429.3114.python-list@python.org>
In reply to: #44320
On 25.04.13 08:26, Chris Angelico wrote:
> So you can count them up directly with a dictionary:
>
> count = {}
> for sequence_tuple in list_of_tuples:
>      count[sequence_tuple] = count.get(sequence_tuple,0) + 1

Or alternatives:

count = {}
for sequence_tuple in list_of_tuples:
      if sequence_tuple in count:
           count[sequence_tuple] += 1
      else:
           count[sequence_tuple] = 1

count = {}
for sequence_tuple in list_of_tuples:
      try:
           count[sequence_tuple] += 1
      except KeyError:
           count[sequence_tuple] = 1

import collections
count = collections.defaultdict(int)
for sequence_tuple in list_of_tuples:
      count[sequence_tuple] += 1

But of course collections.Counter is a preferable way now.
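One behavioural difference between the last two options may be worth knowing. The sketch below (sample data from the original post; the key (9, 9) is just an arbitrary absent value) builds the same tally both ways and then reads a missing key from each.

```python
import collections

# Sample data from the original post.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

by_defaultdict = collections.defaultdict(int)
for sequence_tuple in list_of_tuples:
    by_defaultdict[sequence_tuple] += 1

by_counter = collections.Counter(list_of_tuples)

# Both hold the same key/value pairs at this point.
same_tallies = dict(by_defaultdict) == dict(by_counter)
print(same_tallies)  # True

# Reading an absent key returns 0 from both, but only the
# defaultdict stores the key as a side effect of the read.
missing = (9, 9)
by_defaultdict[missing]
by_counter[missing]
print(missing in by_defaultdict)  # True
print(missing in by_counter)      # False
```

If the tally is later iterated or its length matters, the Counter avoids picking up stray zero entries from lookups.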



#44371

From: Denis McMahon <denismfmcmahon@gmail.com>
Date: 2013-04-25 23:29 +0000
Message-ID: <klce8o$2pp$4@dont-email.me>
In reply to: #44320
On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:

> I have to count the number of various two-digit sequences in a list such
> as this:
> 
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence
> appears 2 times.)
> 
> and tally up the results, assigning each to a variable.  The inelegant
> first pass at this was something like...
> 
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...
> 
> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta +=1
> # etc... But I actually have more than 10 sequence types.
> 
> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order
> 
> I can sense there is very likely an elegant/Pythonic way to do this, and
> probably with a dict, or possibly with some Python structure I don't
> typically use.  Suggestions sought.  Thanks.

mylist = [ (3,3), (1,2), "fred", ("peter",1,7), 1, 19, 37, 28.312, 
("monkey"), "fred", "fred", (1,2) ]

bits = {}

for thing in mylist:
	if thing in bits:
		bits[thing] += 1
	else:
		bits[thing] = 1

for thing in bits:
	print thing, " occurs ", bits[thing], " times"

outputs:

(1, 2)  occurs  2  times
1  occurs  1  times
('peter', 1, 7)  occurs  1  times
(3, 3)  occurs  1  times
28.312  occurs  1  times
fred  occurs  3  times
19  occurs  1  times
monkey  occurs  1  times
37  occurs  1  times

If you want to check that thing is a tuple of two ints, use something like:

for thing in mylist:
	if ( isinstance( thing, tuple ) and len( thing ) == 2
			and isinstance( thing[0], ( int, long ) )
			and isinstance( thing[1], ( int, long ) ) ):
		if thing in bits:
			bits[thing] += 1
		else:
			bits[thing] = 1
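The check above is written for Python 2, where long is a separate type. A rough Python 3 equivalent might look like the sketch below; is_int_pair is a hypothetical helper name, not something from the thread, and the sample list is a shortened version of the one above.

```python
def is_int_pair(thing):
    """Return True if thing is a tuple of exactly two ints."""
    # Note: bool is a subclass of int, so (True, 1) would also pass.
    return (
        isinstance(thing, tuple)
        and len(thing) == 2
        and all(isinstance(item, int) for item in thing)
    )


mylist = [(3, 3), (1, 2), "fred", ("peter", 1, 7), 1, 19, 37, 28.312, (1, 2)]

bits = {}
for thing in mylist:
    if is_int_pair(thing):
        bits[thing] = bits.get(thing, 0) + 1

print(bits)  # {(3, 3): 1, (1, 2): 2}
```

Everything that is not a two-int tuple is skipped silently, so only (3, 3) and (1, 2) are tallied.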

-- 
Denis McMahon, denismfmcmahon@gmail.com



#44374

From: Modulok <modulok@gmail.com>
Date: 2013-04-25 19:16 -0600
Message-ID: <mailman.1074.1366938979.3114.python-list@python.org>
In reply to: #44371
On 4/25/13, Denis McMahon <denismfmcmahon@gmail.com> wrote:
> On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
>
>> I have to count the number of various two-digit sequences in a list such
>> as this:
>>
>> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence
>> appears 2 times.)
>>
>> and tally up the results, assigning each to a variable.
...

Consider using the ``collections`` module::


    from collections import Counter

    mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
    count = Counter()
    for k in mylist:
        count[k] += 1

    print(count)

    # Output looks like this:
    # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})


You then have access to methods to return the most common items, etc. See more
examples here:

http://docs.python.org/3.3/library/collections.html#collections.Counter
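For instance, a short sketch of most_common() on the same sample list (only the single top entry is shown, since the order among items with equal counts is not something to rely on):

```python
from collections import Counter

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(mylist)

# most_common(n) returns a list of (item, count) pairs,
# highest count first.
top = count.most_common(1)
print(top)  # [((2, 4), 2)]

# Total number of items tallied.
total = sum(count.values())
print(total)  # 5
```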


Good luck!
-Modulok-



#44378

From: Matthew Gilson <m.gilson1@gmail.com>
Date: 2013-04-25 22:40 -0400
Message-ID: <mailman.1076.1366944043.3114.python-list@python.org>
In reply to: #44371
A Counter is definitely the way to go about this.  Just a little more 
information: the example quoted below can be simplified to:

     from collections import Counter
     count = Counter(mylist)

The explicit loop in that example could also have achieved the same thing 
(and been backward compatible to Python 2.5) with

    from collections import defaultdict
    count = defaultdict(int)
    for k in mylist:
         count[k] += 1



On 4/25/13 9:16 PM, Modulok wrote:
> On 4/25/13, Denis McMahon <denismfmcmahon@gmail.com> wrote:
>> On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
>>
>>> I have to count the number of various two-digit sequences in a list such
>>> as this:
>>>
>>> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # (Here the (2,4) sequence
>>> appears 2 times.)
>>>
>>> and tally up the results, assigning each to a variable.
> ...
>
> Consider using the ``collections`` module::
>
>
>      from collections import Counter
>
>      mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
>      count = Counter()
>      for k in mylist:
>          count[k] += 1
>
>      print(count)
>
>      # Output looks like this:
>      # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})
>
>
> You then have access to methods to return the most common items, etc. See more
> examples here:
>
> http://docs.python.org/3.3/library/collections.html#collections.Counter
>
>
> Good luck!
> -Modulok-
