

Groups > comp.lang.python > #29494 > unrolled thread

sum works in sequences (Python 3)

Started by: Franck Ditter <franck@ditter.org>
First post: 2012-09-19 16:41 +0200
Last post: 2012-09-20 00:25 +0200
Articles: 14 (9 participants)



Contents

  sum works in sequences (Python 3) Franck Ditter <franck@ditter.org> - 2012-09-19 16:41 +0200
    Re: sum works in sequences (Python 3) Joel Goldstick <joel.goldstick@gmail.com> - 2012-09-19 10:57 -0400
    Re: sum works in sequences (Python 3) Neil Cerutti <neilc@norwich.edu> - 2012-09-19 14:57 +0000
    Re: sum works in sequences (Python 3) Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-19 09:03 -0600
      Re: sum works in sequences (Python 3) Neil Cerutti <neilc@norwich.edu> - 2012-09-19 15:06 +0000
        Re: sum works in sequences (Python 3) Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-19 09:24 -0600
        Re: sum works in sequences (Python 3) Steve Howell <showell30@yahoo.com> - 2012-09-19 08:37 -0700
          Re: sum works in sequences (Python 3) Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-19 12:33 -0600
            Re: sum works in sequences (Python 3) Steve Howell <showell30@yahoo.com> - 2012-09-19 11:43 -0700
      Re: sum works in sequences (Python 3) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-19 16:14 +0000
    Re: sum works in sequences (Python 3) Alister <alister.ware@ntlworld.com> - 2012-09-19 15:07 +0000
      Re: sum works in sequences (Python 3) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-19 16:18 +0000
      Re: sum works in sequences (Python 3) Terry Reedy <tjreedy@udel.edu> - 2012-09-19 14:49 -0400
      Re: sum works in sequences (Python 3) Hans Mulder <hansmu@xs4all.nl> - 2012-09-20 00:25 +0200

#29494 — sum works in sequences (Python 3)

From: Franck Ditter <franck@ditter.org>
Date: 2012-09-19 16:41 +0200
Subject: sum works in sequences (Python 3)
Message-ID: <franck-E8B1EB.16412019092012@news.free.fr>
Hello,
I wonder why sum does not work on the string sequence in Python 3:

>>> sum((8,5,9,3))
25
>>> sum([5,8,3,9,2])
27
>>> sum('rtarze')
TypeError: unsupported operand type(s) for +: 'int' and 'str'

I naively thought that sum('abc') would expand to 'a'+'b'+'c' 
And the error message is somewhat cryptic...

    franck



#29495

From: Joel Goldstick <joel.goldstick@gmail.com>
Date: 2012-09-19 10:57 -0400
Message-ID: <mailman.919.1348066629.27098.python-list@python.org>
In reply to: #29494
On Wed, Sep 19, 2012 at 10:41 AM, Franck Ditter <franck@ditter.org> wrote:
> Hello,
> I wonder why sum does not work on the string sequence in Python 3 :
>
>>>> sum((8,5,9,3))
> 25
>>>> sum([5,8,3,9,2])
> 27
>>>> sum('rtarze')
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> I naively thought that sum('abc') would expand to 'a'+'b'+'c'
> And the error message is somewhat cryptic...
>
>     franck
> --
> http://mail.python.org/mailman/listinfo/python-list
Help on built-in function sum in module __builtin__:

sum(...)
    sum(sequence[, start]) -> value

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start' (which defaults to 0).  When the sequence is
    empty, returns start.



-- 
Joel Goldstick



#29496

From: Neil Cerutti <neilc@norwich.edu>
Date: 2012-09-19 14:57 +0000
Message-ID: <abu4rjFblbqU1@mid.individual.net>
In reply to: #29494
On 2012-09-19, Franck Ditter <franck@ditter.org> wrote:
> Hello,
> I wonder why sum does not work on the string sequence in Python 3 :
>
>>>> sum((8,5,9,3))
> 25
>>>> sum([5,8,3,9,2])
> 27
>>>> sum('rtarze')
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> I naively thought that sum('abc') would expand to 'a'+'b'+'c'
> And the error message is somewhat cryptic...

You got that error message because the default value for the
second 'start' argument is 0. The function tried to add 'r' to 0.
That said:

>>> sum('rtarze', '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
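
A minimal sketch of where the two error messages come from, assuming sum() behaves roughly like the loop below. The names `sketch_sum` and `error_message` are hypothetical helpers for illustration; the real built-in is implemented in C.

```python
def sketch_sum(iterable, start=0):
    """Rough pure-Python model of the built-in sum(); illustration only."""
    # The real built-in explicitly rejects a str start value.
    if isinstance(start, str):
        raise TypeError("sum() can't sum strings [use ''.join(seq) instead]")
    total = start
    for item in iterable:
        # With the default start of 0, the first step for 'rtarze' is 0 + 'r'.
        total = total + item
    return total


def error_message(func, *args):
    """Call func(*args) and return the TypeError text, or None on success."""
    try:
        func(*args)
    except TypeError as exc:
        return str(exc)
    return None


default_start_error = error_message(sketch_sum, 'rtarze')
string_start_error = error_message(sketch_sum, 'rtarze', '')
joined = ''.join('rtarze')

print(default_start_error)
print(string_start_error)
print(joined)
```

The first message is simply what `0 + 'r'` raises; the second is the deliberate check on the start value.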

-- 
Neil Cerutti



#29497

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2012-09-19 09:03 -0600
Message-ID: <mailman.920.1348067016.27098.python-list@python.org>
In reply to: #29494
On Wed, Sep 19, 2012 at 8:41 AM, Franck Ditter <franck@ditter.org> wrote:
> Hello,
> I wonder why sum does not work on the string sequence in Python 3 :
>
>>>> sum((8,5,9,3))
> 25
>>>> sum([5,8,3,9,2])
> 27
>>>> sum('rtarze')
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>
> I naively thought that sum('abc') would expand to 'a'+'b'+'c'
> And the error message is somewhat cryptic...

It notes in the doc string that it does not work on strings:

sum(...)
    sum(sequence[, start]) -> value

    Returns the sum of a sequence of numbers (NOT strings) plus the value
    of parameter 'start' (which defaults to 0).  When the sequence is
    empty, returns start.

I think this restriction is mainly for efficiency.  sum(['a', 'b',
'c', 'd', 'e']) would be the equivalent of 'a' + 'b' + 'c' + 'd' +
'e', which is an inefficient way to add together strings.  You should
use ''.join instead:

>>> ''.join('abc')
'abc'
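
One caveat that may be worth sketching: str.join only accepts strings, so numeric items have to be converted first. The sample values below are made up for illustration.

```python
# str.join builds one result string from an iterable of strings.
letters = ''.join(['a', 'b', 'c', 'd', 'e'])

# The separator can be any string, not just the empty one.
dashed = '-'.join('abc')

# Non-string items must be converted explicitly; join raises TypeError otherwise.
digits = ''.join(str(n) for n in (8, 5, 9, 3))

print(letters, dashed, digits)
```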



#29498

From: Neil Cerutti <neilc@norwich.edu>
Date: 2012-09-19 15:06 +0000
Message-ID: <abu5c3FblbqU2@mid.individual.net>
In reply to: #29497
On 2012-09-19, Ian Kelly <ian.g.kelly@gmail.com> wrote:
> It notes in the doc string that it does not work on strings:
>
> sum(...)
>     sum(sequence[, start]) -> value
>
>     Returns the sum of a sequence of numbers (NOT strings) plus
>     the value of parameter 'start' (which defaults to 0).  When
>     the sequence is empty, returns start.
>
> I think this restriction is mainly for efficiency.  sum(['a',
> 'b', 'c', 'd', 'e']) would be the equivalent of 'a' + 'b' + 'c'
> + 'd' + 'e', which is an inefficient way to add together
> strings.  You should use ''.join instead:

While the docstring is still useful, it has diverged from the
documentation a little bit.

  sum(iterable[, start]) 

  Sums start and the items of an iterable from left to right and
  returns the total. start defaults to 0. The iterable's items
  are normally numbers, and the start value is not allowed to be
  a string.

  For some use cases, there are good alternatives to sum(). The
  preferred, fast way to concatenate a sequence of strings is by
  calling ''.join(sequence). To add floating point values with
  extended precision, see math.fsum(). To concatenate a series of
  iterables, consider using itertools.chain().
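
A short sketch of the three alternatives that documentation passage names. The sample data is invented for illustration, and `itertools.chain.from_iterable` is used here as one way of applying chain to a list of lists.

```python
import itertools
import math

# Strings: join instead of sum.
words = ['a', 'b', 'c', 'd', 'e']
joined = ''.join(words)

# Floats: fsum tracks partial sums to avoid accumulated rounding error.
floats = [0.1] * 10
naive_total = sum(floats)
precise_total = math.fsum(floats)

# Iterables: chain them lazily instead of adding lists together.
lists = [[2, 3], [5, 8], [13, 21]]
flattened = list(itertools.chain.from_iterable(lists))

print(joined, naive_total, precise_total, flattened)
```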

Are iterables and sequences different enough to warrant posting a
bug report?

-- 
Neil Cerutti



#29501

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2012-09-19 09:24 -0600
Message-ID: <mailman.922.1348068321.27098.python-list@python.org>
In reply to: #29498
On Wed, Sep 19, 2012 at 9:06 AM, Neil Cerutti <neilc@norwich.edu> wrote:
> Are iterables and sequences different enough to warrant posting a
> bug report?

The glossary is specific about the definitions of both, so I would say yes.

http://docs.python.org/dev/glossary.html#term-iterable
http://docs.python.org/dev/glossary.html#term-sequence



#29502

From: Steve Howell <showell30@yahoo.com>
Date: 2012-09-19 08:37 -0700
Message-ID: <cf126c38-759a-4d56-bd1d-98c54c1f9f16@u15g2000yql.googlegroups.com>
In reply to: #29498
On Sep 19, 8:06 am, Neil Cerutti <ne...@norwich.edu> wrote:
> On 2012-09-19, Ian Kelly <ian.g.ke...@gmail.com> wrote:
>
> > It notes in the doc string that it does not work on strings:
>
> > sum(...)
> >     sum(sequence[, start]) -> value
>
> >     Returns the sum of a sequence of numbers (NOT strings) plus
> >     the value of parameter 'start' (which defaults to 0).  When
> >     the sequence is empty, returns start.
>
> > I think this restriction is mainly for efficiency.  sum(['a',
> > 'b', 'c', 'd', 'e']) would be the equivalent of 'a' + 'b' + 'c'
> > + 'd' + 'e', which is an inefficient way to add together
> > strings.  You should use ''.join instead:
>
> While the docstring is still useful, it has diverged from the
> documentation a little bit.
>
>   sum(iterable[, start])
>
>   Sums start and the items of an iterable from left to right and
>   returns the total. start defaults to 0. The iterable's items
>   are normally numbers, and the start value is not allowed to be
>   a string.
>
>   For some use cases, there are good alternatives to sum(). The
>   preferred, fast way to concatenate a sequence of strings is by
>   calling ''.join(sequence). To add floating point values with
>   extended precision, see math.fsum(). To concatenate a series of
>   iterables, consider using itertools.chain().
>
> Are iterables and sequences different enough to warrant posting a
> bug report?
>

Sequences are iterables, so I'd say the docs are technically correct,
but maybe I'm misunderstanding what you would be trying to clarify.



#29516

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2012-09-19 12:33 -0600
Message-ID: <mailman.935.1348079664.27098.python-list@python.org>
In reply to: #29502
On Wed, Sep 19, 2012 at 9:37 AM, Steve Howell <showell30@yahoo.com> wrote:
> Sequences are iterables, so I'd say the docs are technically correct,
> but maybe I'm misunderstanding what you would be trying to clarify.

The doc string suggests that the argument to sum() must be a sequence,
when in fact any iterable will do.  The restriction in the docs should
be relaxed to match the reality.
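
To illustrate the distinction, here is a small sketch in which sum() consumes a generator, which is an iterable but not a sequence. It assumes a Python version that provides `collections.abc` (3.3 or later); the variable names are arbitrary.

```python
from collections.abc import Iterable, Sequence

# A generator expression supports iteration but not indexing or len().
squares = (n * n for n in range(5))

is_iterable = isinstance(squares, Iterable)
is_sequence = isinstance(squares, Sequence)

# sum() only needs to iterate, so the generator is accepted.
total = sum(squares)

print(is_iterable, is_sequence, total)
```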



#29517

From: Steve Howell <showell30@yahoo.com>
Date: 2012-09-19 11:43 -0700
Message-ID: <22cf6162-f969-46db-832a-b472489245a6@m3g2000vby.googlegroups.com>
In reply to: #29516
On Sep 19, 11:34 am, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> On Wed, Sep 19, 2012 at 9:37 AM, Steve Howell <showel...@yahoo.com> wrote:
> > Sequences are iterables, so I'd say the docs are technically correct,
> > but maybe I'm misunderstanding what you would be trying to clarify.
>
> The doc string suggests that the argument to sum() must be a sequence,
> when in fact any iterable will do.  The restriction in the docs should
> be relaxed to match the reality.

Ah.  The docstring looks to be fixed in 3.1.3, but not in Python 2.


Python 3.1.3 (r313:86834, Mar 13 2011, 00:40:38)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> sum.__doc__
"sum(iterable[, start]) -> value\n\nReturns the sum of an iterable of
numbers (NOT strings) plus the value\nof parameter 'start' (which
defaults to 0).  When the iterable is\nempty, returns start."


Python 2.6.6 (r266:84292, Mar 13 2011, 00:35:19)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> sum.__doc__
"sum(sequence[, start]) -> value\n\nReturns the sum of a sequence of
numbers (NOT strings) plus the value\nof parameter 'start' (which
defaults to 0).  When the sequence is\nempty, returns start."
>>>



#29504

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-09-19 16:14 +0000
Message-ID: <5059ef62$0$29981$c3e8da3$5496439d@news.astraweb.com>
In reply to: #29497
On Wed, 19 Sep 2012 09:03:03 -0600, Ian Kelly wrote:

> I think this restriction is mainly for efficiency.  sum(['a', 'b', 'c',
> 'd', 'e']) would be the equivalent of 'a' + 'b' + 'c' + 'd' + 'e', which
> is an inefficient way to add together strings.

It might not be obvious to some people why repeated addition is so 
inefficient, and in fact if people try it with modern Python (version 2.3 
or better), they may not notice any inefficiency.

But the example given, 'a' + 'b' + 'c' + 'd' + 'e', potentially ends up 
creating four strings, only to immediately throw away three of them:

* first it concats 'a' to 'b', giving the new string 'ab'
* then 'ab' + 'c', creating a new string 'abc'
* then 'abc' + 'd', creating a new string 'abcd'
* then 'abcd' + 'e', creating a new string 'abcde'

Each new string requires a block of memory to be allocated, potentially 
requiring other blocks of memory to be moved out of the way (at least for 
large blocks).

With only five characters in total, you won't really notice any slowdown, 
but with large enough numbers of strings, Python could potentially spend 
a lot of time building, and throwing away, intermediate strings. Pure 
wasted effort.

For another look at this, see:
http://www.joelonsoftware.com/articles/fog0000000319.html

I say "could" because starting in about Python 2.3, there is a nifty 
optimization in Python (CPython only, not Jython or IronPython) that can 
*sometimes* recognise repeated string concatenation and make it less 
inefficient. It depends on the details of the specific strings used, and 
the operating system's memory management. When it works, it can make 
string concatenation almost as efficient as ''.join(). When it doesn't 
work, repeated concatenation is PAINFULLY slow, hundreds or thousands of 
times slower than join.
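
For anyone who wants to measure this on their own machine, a rough timing sketch follows. The function names, the piece count, and the repeat count are arbitrary choices for illustration; the numbers will vary by interpreter and platform, and on CPython the concatenation optimization may hide most of the difference.

```python
import timeit


def concat_loop(pieces):
    """Build the result by repeated + concatenation."""
    result = ''
    for piece in pieces:
        result = result + piece
    return result


def concat_join(pieces):
    """Build the result with a single str.join call."""
    return ''.join(pieces)


pieces = ['x'] * 10000

# Time 20 runs of each approach; only the relative sizes are meaningful.
loop_seconds = timeit.timeit(lambda: concat_loop(pieces), number=20)
join_seconds = timeit.timeit(lambda: concat_join(pieces), number=20)

print('loop: %.4f s, join: %.4f s' % (loop_seconds, join_seconds))
```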


-- 
Steven



#29499

From: Alister <alister.ware@ntlworld.com>
Date: 2012-09-19 15:07 +0000
Message-ID: <scl6s.509699$Ax.328580@fx18.am4>
In reply to: #29494
On Wed, 19 Sep 2012 16:41:20 +0200, Franck Ditter wrote:

> Hello,
> I wonder why sum does not work on the string sequence in Python 3 :
> 
>>>> sum((8,5,9,3))
> 25
>>>> sum([5,8,3,9,2])
> 27
>>>> sum('rtarze')
> TypeError: unsupported operand type(s) for +: 'int' and 'str'
> 
> I naively thought that sum('abc') would expand to 'a'+'b'+'c'
> And the error message is somewhat cryptic...
> 
>     franck

Summation is a mathematical operation that works on numbers.
Concatenation is the process of appending one string to another.

Although they are not related to each other, they share the same
operator (+), which is the cause of the confusion.
Attempting to duck-type this function would cause ambiguity. For example,
what would you expect from

sum(['a', 'b', 3, 4])

'ab34' or 'ab7'?

Even 'A' + 7 would raise this error for the same reason.
 



-- 
It is the nature of extreme self-lovers, as they will set an house on 
fire,
and it were but to roast their eggs.
		-- Francis Bacon



#29505

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-09-19 16:18 +0000
Message-ID: <5059f042$0$29981$c3e8da3$5496439d@news.astraweb.com>
In reply to: #29499
On Wed, 19 Sep 2012 15:07:04 +0000, Alister wrote:

> Summation is a mathematical function that works on numbers Concatenation
> is the process of appending 1 string to another
> 
> although they are not related to each other they do share the same
> operator(+) which is the cause of confusion. attempting to duck type
> this function would cause ambiguity for example what would you expect
> from
> 
> sum ('a','b',3,4)
> 
> 'ab34' or 'ab7' ?

Neither. I would expect sum to do exactly what the + operator does if 
given two incompatible arguments: raise an exception.

And in fact, that's exactly what it does.

py> sum ([1, 2, 'a'])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'str'



-- 
Steven



#29518

From: Terry Reedy <tjreedy@udel.edu>
Date: 2012-09-19 14:49 -0400
Message-ID: <mailman.936.1348080608.27098.python-list@python.org>
In reply to: #29499
On 9/19/2012 11:07 AM, Alister wrote:

> Summation is a mathematical function that works on numbers
> Concatenation is the process of appending 1 string to another
>
> although they are not related to each other they do share the same
> operator(+) which is the cause of confusion.

If one represents counts in unary, as a sequence or tally of 1s (or 
other markers indicating 'successor' or 'increment'), then count 
addition is sequence concatenation. I think Guido got it right.

It happens that when the members of all sequences are identical, there 
is a much more compact exponential place value notation that enables 
more efficient addition and other operations. When not, other tricks are 
needed to avoid so much copying that an inherently O(N) operation 
balloons into an O(N*N) operation.
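
The O(N) versus O(N*N) difference can be made concrete by counting characters copied. This is a simplified cost model, assuming every + builds a brand-new string starting from an empty one and that no in-place optimization applies; both function names are hypothetical.

```python
def chars_copied_by_repeated_concat(pieces):
    """Count characters written when each + allocates a fresh string."""
    copied = 0
    length = 0
    for piece in pieces:
        length += len(piece)
        # The new string holds everything accumulated so far.
        copied += length
    return copied


def chars_copied_by_join(pieces):
    """Count characters written when the result is built once."""
    return sum(len(piece) for piece in pieces)


pieces = ['x'] * 1000
quadratic_cost = chars_copied_by_repeated_concat(pieces)
linear_cost = chars_copied_by_join(pieces)

print(quadratic_cost, linear_cost)
```

For N one-character pieces the first count is N*(N+1)/2, while the second is just N.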

-- 
Terry Jan Reedy



#29531

From: Hans Mulder <hansmu@xs4all.nl>
Date: 2012-09-20 00:25 +0200
Message-ID: <505a4664$0$6961$e4fe514c@news2.news.xs4all.nl>
In reply to: #29499
On 19/09/12 17:07:04, Alister wrote:
> On Wed, 19 Sep 2012 16:41:20 +0200, Franck Ditter wrote:
> 
>> Hello,
>> I wonder why sum does not work on the string sequence in Python 3 :
>>
>>>>> sum((8,5,9,3))
>> 25
>>>>> sum([5,8,3,9,2])
>> 27
>>>>> sum('rtarze')
>> TypeError: unsupported operand type(s) for +: 'int' and 'str'
>>
>> I naively thought that sum('abc') would expand to 'a'+'b'+'c'
>> And the error message is somewhat cryptic...
>>
>>     franck
> 
> Summation is a mathematical function that works on numbers
> Concatenation is the process of appending 1 string to another

Actually, the 'sum' builtin function is quite capable of
concatenating objects, for example lists:

>>> sum(([2,3], [5,8], [13,21]), [])
[2, 3, 5, 8, 13, 21]

But if you pass a string as a starting value, you get an error:

>>> sum([], '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

In fact, you can bamboozle 'sum' into concatenating strings
by tricking it with a non-string starting value:

>>> class not_a_string(object):
...   def __add__(self, other):
...     return other
...
>>> sum("rtarze", not_a_string())
'rtarze'
>>> sum(["Monty ", "Python", "'s Fly", "ing Ci", "rcus"],
...     not_a_string())
"Monty Python's Flying Circus"
>>>


Hope this helps,

-- HansM


