Groups > comp.lang.python > #44320 > unrolled thread
| Started by | CM <cmpython@gmail.com> |
|---|---|
| First post | 2013-04-24 22:05 -0700 |
| Last post | 2013-04-25 22:40 -0400 |
| Articles | 8 — 7 participants |
Pythonic way to count sequences CM <cmpython@gmail.com> - 2013-04-24 22:05 -0700
Re: Pythonic way to count sequences Chris Angelico <rosuav@gmail.com> - 2013-04-25 15:26 +1000
Re: Pythonic way to count sequences CM <cmpython@gmail.com> - 2013-04-25 19:40 -0700
Re: Pythonic way to count sequences Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-25 06:14 +0000
Re: Pythonic way to count sequences Serhiy Storchaka <storchaka@gmail.com> - 2013-04-25 15:36 +0300
Re: Pythonic way to count sequences Denis McMahon <denismfmcmahon@gmail.com> - 2013-04-25 23:29 +0000
Re: Pythonic way to count sequences Modulok <modulok@gmail.com> - 2013-04-25 19:16 -0600
Re: Pythonic way to count sequences Matthew Gilson <m.gilson1@gmail.com> - 2013-04-25 22:40 -0400
| From | CM <cmpython@gmail.com> |
|---|---|
| Date | 2013-04-24 22:05 -0700 |
| Subject | Pythonic way to count sequences |
| Message-ID | <bfc57a5f-bbcb-46a4-b59a-e7fa55527da1@y12g2000yqb.googlegroups.com> |
I have to count the number of various two-digit sequences in a list
such as this:
mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
and tally up the results, assigning each to a variable. The inelegant
first pass at this was something like...
# Create names and set them all to 0
alpha = 0
beta = 0
delta = 0
gamma = 0
# etc...
# loop over all the tuple sequences and increment appropriately
for sequence_tuple in list_of_tuples:
    if sequence_tuple == (1,2):
        alpha += 1
    if sequence_tuple == (2,4):
        beta += 1
    if sequence_tuple == (2,5):
        delta += 1
    # etc... But I actually have more than 10 sequence types.
# Finally, I need a list created like this:
result_list = [alpha, beta, delta, gamma] #etc...in that order
I can sense there is very likely an elegant/Pythonic way to do this,
and probably with a dict, or possibly with some Python structure I
don't typically use. Suggestions sought. Thanks.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-25 15:26 +1000 |
| Message-ID | <mailman.1051.1366867611.3114.python-list@python.org> |
| In reply to | #44320 |
On Thu, Apr 25, 2013 at 3:05 PM, CM <cmpython@gmail.com> wrote:
> I have to count the number of various two-digit sequences in a list
> such as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable.
You can use a tuple as a dictionary key, just like you would a string.
So you can count them up directly with a dictionary:
count = {}
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] = count.get(sequence_tuple,0) + 1
Also, since this is such a common thing to do, there's a standard
library way of doing it:
import collections
count = collections.Counter(list_of_tuples)
This doesn't depend on knowing ahead of time what your elements will
be. At the end of it, you can simply iterate over 'count' and get all
your counts:
for sequence,number in count.items():
    print("%d of %r" % (number,sequence))
ChrisA
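As a small self-contained illustration of the Counter approach above, the sketch below reuses the sample list from the original question. The names `seen_twice` and `never_seen` are only illustrative. The behaviour worth noting is that a Counter returns 0 for a key it has never seen, rather than raising KeyError, and the lookup does not add the key.

```python
from collections import Counter

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

count = Counter(list_of_tuples)

# Look up individual tallies; a missing key yields 0 instead of KeyError.
seen_twice = count[(2, 4)]
never_seen = count[(1, 2)]

# Iterate over every distinct sequence and its tally.
for sequence, number in count.items():
    print("%d of %r" % (number, sequence))
```

Because missing keys read as 0, there is no need to pre-register every sequence type before counting.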
| From | CM <cmpython@gmail.com> |
|---|---|
| Date | 2013-04-25 19:40 -0700 |
| Message-ID | <2a29a04e-ace9-43d2-98b8-9c45c9b963a4@g9g2000yqh.googlegroups.com> |
| In reply to | #44321 |
Thank you, everyone, for the answers. Very helpful and knowledge-expanding.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-04-25 06:14 +0000 |
| Message-ID | <5178c9b8$0$29977$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #44320 |
On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
> I have to count the number of various two-digit sequences in a list such
> as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable. The inelegant
> first pass at this was something like...
>
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...
Do they absolutely have to be global variables like that? Seems like a
bad design, especially if you don't know in advance exactly how many
there are.
> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta += 1
>     # etc... But I actually have more than 10 sequence types.
counts = {}
for t in list_of_tuples:
    counts[t] = counts.get(t, 0) + 1
Or, use collections.Counter:
from collections import Counter
counts = Counter(list_of_tuples)
> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order
Dicts are unordered, so getting the results in a specific order will be a
bit tricky. You could do this:
results = sorted(counts.items(), key=lambda t: t[0])
results = [t[1] for t in results]
if you are lucky enough to have the desired order match the natural order
of the tuples. Otherwise:
desired_order = [(2, 3), (3, 1), (1, 2), ...]
results = [counts.get(t, 0) for t in desired_order]
--
Steven
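The explicit-ordering idea above can be sketched end to end as follows. The contents of `desired_order` here are hypothetical, chosen only to line up with the (1,2), (2,4), (2,5) comparisons in the original question plus one more entry. Note that on Python 3.7 and later a dict does preserve insertion order, but that is the order of first appearance in the data, not necessarily the order wanted, so an explicit list remains the robust choice.

```python
from collections import Counter

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

# Hypothetical ordering: the position of a tuple in this list decides the
# position of its tally in the result (alpha, beta, delta, ... in the question).
desired_order = [(1, 2), (2, 4), (2, 5), (3, 4)]

counts = Counter(list_of_tuples)

# Sequences absent from the data contribute 0; sequences present in the
# data but not listed in desired_order are simply left out of the result.
result_list = [counts[t] for t in desired_order]
print(result_list)  # [0, 2, 0, 1]
```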
| From | Serhiy Storchaka <storchaka@gmail.com> |
|---|---|
| Date | 2013-04-25 15:36 +0300 |
| Message-ID | <mailman.1057.1366893429.3114.python-list@python.org> |
| In reply to | #44320 |
25.04.13 08:26, Chris Angelico wrote:
> So you can count them up directly with a dictionary:
>
> count = {}
> for sequence_tuple in list_of_tuples:
>     count[sequence_tuple] = count.get(sequence_tuple,0) + 1
Or alternatives:
count = {}
for sequence_tuple in list_of_tuples:
    if sequence_tuple in count:
        count[sequence_tuple] += 1
    else:
        count[sequence_tuple] = 1

count = {}
for sequence_tuple in list_of_tuples:
    try:
        count[sequence_tuple] += 1
    except KeyError:
        count[sequence_tuple] = 1

import collections
count = collections.defaultdict(int)
for sequence_tuple in list_of_tuples:
    count[sequence_tuple] += 1
But of course collections.Counter is a preferable way now.
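The variants discussed in this thread should all produce the same tallies. The following sketch checks that against the sample list from the original question. The helper function names are made up for this example and appear nowhere in the thread.

```python
from collections import Counter, defaultdict

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]


def count_with_get(items):
    """Tally using dict.get with a default of 0."""
    count = {}
    for item in items:
        count[item] = count.get(item, 0) + 1
    return count


def count_with_membership_test(items):
    """Tally using an explicit 'in' check."""
    count = {}
    for item in items:
        if item in count:
            count[item] += 1
        else:
            count[item] = 1
    return count


def count_with_try(items):
    """Tally by catching KeyError on the first occurrence."""
    count = {}
    for item in items:
        try:
            count[item] += 1
        except KeyError:
            count[item] = 1
    return count


def count_with_defaultdict(items):
    """Tally using defaultdict(int), whose missing keys start at 0."""
    count = defaultdict(int)
    for item in items:
        count[item] += 1
    return count


expected = dict(Counter(list_of_tuples))
variants = (count_with_get, count_with_membership_test,
            count_with_try, count_with_defaultdict)

# Convert each result to a plain dict so only keys and values are compared.
all_equal = all(dict(f(list_of_tuples)) == expected for f in variants)
print(all_equal)  # True
```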
| From | Denis McMahon <denismfmcmahon@gmail.com> |
|---|---|
| Date | 2013-04-25 23:29 +0000 |
| Message-ID | <klce8o$2pp$4@dont-email.me> |
| In reply to | #44320 |
On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
> I have to count the number of various two-digit sequences in a list such
> as this:
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
>
> and tally up the results, assigning each to a variable. The inelegant
> first pass at this was something like...
>
> # Create names and set them all to 0
> alpha = 0
> beta = 0
> delta = 0
> gamma = 0
> # etc...
>
> # loop over all the tuple sequences and increment appropriately
> for sequence_tuple in list_of_tuples:
>     if sequence_tuple == (1,2):
>         alpha += 1
>     if sequence_tuple == (2,4):
>         beta += 1
>     if sequence_tuple == (2,5):
>         delta += 1
>     # etc... But I actually have more than 10 sequence types.
>
> # Finally, I need a list created like this:
> result_list = [alpha, beta, delta, gamma] #etc...in that order
>
> I can sense there is very likely an elegant/Pythonic way to do this, and
> probably with a dict, or possibly with some Python structure I don't
> typically use. Suggestions sought. Thanks.
mylist = [ (3,3), (1,2), "fred", ("peter",1,7), 1, 19, 37, 28.312,
("monkey"), "fred", "fred", (1,2) ]
bits = {}
for thing in mylist:
    if thing in bits:
        bits[thing] += 1
    else:
        bits[thing] = 1
for thing in bits:
    print thing, " occurs ", bits[thing], " times"
outputs:
(1, 2) occurs 2 times
1 occurs 1 times
('peter', 1, 7) occurs 1 times
(3, 3) occurs 1 times
28.312 occurs 1 times
fred occurs 3 times
19 occurs 1 times
monkey occurs 1 times
37 occurs 1 times
if you want to check that thing is a 2 int tuple then use something like:
for thing in mylist:
    if (isinstance( thing, tuple ) and len( thing ) == 2 and
            isinstance( thing[0], ( int, long ) ) and
            isinstance( thing[1], ( int, long ) )):
        if thing in bits:
            bits[thing] += 1
        else:
            bits[thing] = 1
--
Denis McMahon, denismfmcmahon@gmail.com
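The code in the post above is written for Python 2 (it uses the `print` statement and the `long` type, neither of which exists in Python 3). A rough Python 3 equivalent of the filtered count might look like the sketch below. The helper name `is_int_pair` is invented for this example, and excluding `bool` values is an added assumption not present in the original: in Python, `bool` is a subclass of `int`, so `(True, 1)` would otherwise pass the check.

```python
from collections import Counter

mylist = [(3, 3), (1, 2), "fred", ("peter", 1, 7), 1, 19, 37, 28.312,
          ("monkey"), "fred", "fred", (1, 2)]


def is_int_pair(thing):
    """Return True for a tuple of exactly two ints (bools excluded)."""
    return (
        isinstance(thing, tuple)
        and len(thing) == 2
        and all(isinstance(x, int) and not isinstance(x, bool) for x in thing)
    )


# Count only the items that pass the check; everything else is skipped.
bits = Counter(thing for thing in mylist if is_int_pair(thing))

for thing, times in bits.items():
    print(thing, "occurs", times, "times")
```

Note that `("monkey")` is just the string "monkey", not a one-element tuple, so it is filtered out along with the other non-tuple items.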
| From | Modulok <modulok@gmail.com> |
|---|---|
| Date | 2013-04-25 19:16 -0600 |
| Message-ID | <mailman.1074.1366938979.3114.python-list@python.org> |
| In reply to | #44371 |
On 4/25/13, Denis McMahon <denismfmcmahon@gmail.com> wrote:
> On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
>
>> I have to count the number of various two-digit sequences in a list such
>> as this:
>>
>> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
>>
>> and tally up the results, assigning each to a variable.
...
Consider using the ``collections`` module::
from collections import Counter
mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
count = Counter()
for k in mylist:
    count[k] += 1
print(count)
# Output looks like this:
# Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})
You then have access to methods to return the most common items, etc. See more
examples here:
http://docs.python.org/3.3/library/collections.html#collections.Counter
Good luck!
-Modulok-
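As one example of the Counter methods mentioned above, `most_common(n)` returns the `n` highest tallies as a list of `(item, count)` pairs. The sketch below applies it to the same sample list. The names `top`, `distinct`, and `total` are illustrative only.

```python
from collections import Counter

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(mylist)

# The single most frequent sequence, as a list of (item, count) pairs.
top = count.most_common(1)
print(top)  # [((2, 4), 2)]

# A Counter is a dict subclass, so the usual dict operations apply.
distinct = len(count)          # number of different sequences seen
total = sum(count.values())    # number of items counted overall
```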
| From | Matthew Gilson <m.gilson1@gmail.com> |
|---|---|
| Date | 2013-04-25 22:40 -0400 |
| Message-ID | <mailman.1076.1366944043.3114.python-list@python.org> |
| In reply to | #44371 |
A Counter is definitely the way to go about this. As a little more
information, the example quoted below can be simplified:
from collections import Counter
count = Counter(mylist)
With the other example, you could have achieved the same thing (and been
backward compatible to Python 2.5) with
from collections import defaultdict
count = defaultdict(int)
for k in mylist:
    count[k] += 1
On 4/25/13 9:16 PM, Modulok wrote:
> On 4/25/13, Denis McMahon <denismfmcmahon@gmail.com> wrote:
>> On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
>>
>>> I have to count the number of various two-digit sequences in a list such
>>> as this:
>>>
>>> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence appears 2 times.)
>>>
>>> and tally up the results, assigning each to a variable.
> ...
>
> Consider using the ``collections`` module::
>
>
> from collections import Counter
>
> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
> count = Counter()
> for k in mylist:
>     count[k] += 1
>
> print(count)
>
> # Output looks like this:
> # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})
>
>
> You then have access to methods to return the most common items, etc. See more
> examples here:
>
> http://docs.python.org/3.3/library/collections.html#collections.Counter
>
>
> Good luck!
> -Modulok-