

Groups > comp.lang.python > #104490 > unrolled thread

Simple exercise

Started by: Rodrick Brown <rodrick.brown@gmail.com>
First post: 2016-03-10 04:02 -0500
Last post: 2016-03-17 22:28 +0100
Articles 20 on this page of 35 — 19 participants



Contents

  Simple exercise Rodrick Brown <rodrick.brown@gmail.com> - 2016-03-10 04:02 -0500
    Re: Simple exercise Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2016-03-10 11:30 +0100
      Re: Simple exercise Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2016-03-10 12:07 +0100
      Re: Simple exercise Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2016-03-10 17:05 +0100
        Re: Simple exercise Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2016-03-10 17:08 +0100
    Re: Simple exercise Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2016-03-11 12:24 +1300
      Re: Simple exercise Chris Angelico <rosuav@gmail.com> - 2016-03-11 10:38 +1100
    Re: Simple exercise BartC <bc@freeuk.com> - 2016-03-11 00:05 +0000
      Re: Simple exercise Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-11 01:21 +0000
        Re: Simple exercise BartC <bc@freeuk.com> - 2016-03-11 01:45 +0000
          Re: Simple exercise Larry Martell <larry.martell@gmail.com> - 2016-03-10 20:53 -0500
          Re: Simple exercise "Martin A. Brown" <martin@linux-ip.net> - 2016-03-10 17:56 -0800
          Re: Simple exercise Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-11 02:03 +0000
            Re: Simple exercise BartC <bc@freeuk.com> - 2016-03-11 02:18 +0000
            Re: Simple exercise Rick Johnson <rantingrickjohnson@gmail.com> - 2016-03-14 07:35 -0700
              Re: Simple exercise Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-03-14 15:06 +0000
                Re: Simple exercise Rick Johnson <rantingrickjohnson@gmail.com> - 2016-03-14 09:00 -0700
                Re: Simple exercise Steven D'Aprano <steve@pearwood.info> - 2016-03-15 10:59 +1100
                  Re: Simple exercise Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-03-15 07:26 +0200
                    Re: Simple exercise Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-03-15 19:39 +1100
                      Re: Simple exercise Chris Angelico <rosuav@gmail.com> - 2016-03-15 19:53 +1100
                      Re: Simple exercise Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-03-15 11:04 +0200
                  Re: Simple exercise Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-03-15 11:09 +0000
              Re: Simple exercise Ian Kelly <ian.g.kelly@gmail.com> - 2016-03-14 09:16 -0600
                Re: Simple exercise Rick Johnson <rantingrickjohnson@gmail.com> - 2016-03-14 09:11 -0700
              Re: Simple exercise Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-14 15:23 +0000
              Re: Simple exercise Peter Otten <__peter__@web.de> - 2016-03-14 17:00 +0100
          Re: Simple exercise Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-11 02:05 +0000
        Re: Simple exercise Rick Johnson <rantingrickjohnson@gmail.com> - 2016-03-14 07:07 -0700
          Re: Simple exercise Larry Martell <larry.martell@gmail.com> - 2016-03-14 10:13 -0400
          Re: Simple exercise alister <alister.ware@ntlworld.com> - 2016-03-14 14:18 +0000
            Re: Simple exercise Rick Johnson <rantingrickjohnson@gmail.com> - 2016-03-14 08:22 -0700
              Re: Simple exercise MRAB <python@mrabarnett.plus.com> - 2016-03-14 15:57 +0000
      Re: Simple exercise Chris Kaynor <ckaynor@zindagigames.com> - 2016-03-10 18:14 -0800
    Re: Simple exercise boffi <boffi@casa.sua> - 2016-03-17 22:28 +0100



#104490 — Simple exercise

From: Rodrick Brown <rodrick.brown@gmail.com>
Date: 2016-03-10 04:02 -0500
Subject: Simple exercise
Message-ID: <mailman.117.1457600573.15725.python-list@python.org>
From the following input

9
BANANA FRIES 12
POTATO CHIPS 30
APPLE JUICE 10
CANDY 5
APPLE JUICE 10
CANDY 5
CANDY 5
CANDY 5
POTATO CHIPS 30

I'm expecting the following output
BANANA FRIES 12
POTATO CHIPS 60
APPLE JUICE 20
CANDY 20

However, my code seems to be returning an incorrect value

#!/usr/bin/env python3

import sys
import re
from collections import OrderedDict

if __name__ == '__main__':

  od = OrderedDict()
  recs = int(input())

  for _ in range(recs):
    file_input = sys.stdin.readline().strip()
    m = re.search(r"(\w.+)\s+(\d+)", file_input)

    if m:
      if m.group(1) not in od.keys():
        od[m.group(1)] = int(m.group(2))
      else:
        od[m.group(1)] += int(od.get(m.group(1),0))
  for k,v in od.items():
    print(k,v)

What's really going on here?

$ cat groceries.txt | ./groceries.py
BANANA FRIES 12
POTATO CHIPS 60
APPLE JUICE 20
CANDY 40



#104498

From: Thomas 'PointedEars' Lahn <PointedEars@web.de>
Date: 2016-03-10 11:30 +0100
Message-ID: <15045007.7fdgliee2q@PointedEars.de>
In reply to: #104490
Rodrick Brown wrote:

> […]
>     if m:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))
> […]

This program logic appears to be wrong as you are not adding the value that 
you just read to the dictionary entry for the key that you just read but the 
value that you had in the dictionary for that key before.  Perhaps you were 
looking for this (I also optimized a bit):

        key = m.group(1)
        value = int(m.group(1))

        if key not in od:
          od[key] = value
        else:
          od[key] += value

But there is probably an even more pythonic way to do this.
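
A minimal sketch of why the original else branch doubles the total, and of the fix. It assumes the input lines from the first post are held in a list instead of being read from stdin; the names `lines` and `tally` and the `buggy` flag are made up for illustration:

```python
import re
from collections import OrderedDict

# Sample input from the original post, without the leading count line.
lines = [
    "BANANA FRIES 12", "POTATO CHIPS 30", "APPLE JUICE 10",
    "CANDY 5", "APPLE JUICE 10", "CANDY 5", "CANDY 5",
    "CANDY 5", "POTATO CHIPS 30",
]

def tally(lines, buggy=False):
    """Sum the trailing number per item name, keeping first-seen order."""
    od = OrderedDict()
    for line in lines:
        m = re.search(r"(\w.+)\s+(\d+)", line)
        if not m:
            continue
        key, value = m.group(1), int(m.group(2))
        if key not in od:
            od[key] = value
        elif buggy:
            # Original logic: adds the stored total to itself,
            # so CANDY goes 5 -> 10 -> 20 -> 40.
            od[key] += od.get(key, 0)
        else:
            # Fix: add the value that was just read.
            od[key] += value
    return od

for name, total in tally(lines).items():
    print(name, total)
```

Calling tally(lines, buggy=True) reproduces the CANDY 40 result from the original post.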

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.



#104500

From: Thomas 'PointedEars' Lahn <PointedEars@web.de>
Date: 2016-03-10 12:07 +0100
Message-ID: <42888973.jzIuzHLKNz@PointedEars.de>
In reply to: #104498
Thomas 'PointedEars' Lahn wrote:

>         key = m.group(1)
>         value = int(m.group(1))

        value = int(m.group(2))

 
>         if key not in od:
>           od[key] = value
>         else:
>           od[key] += value

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.



#104529

From: Thomas 'PointedEars' Lahn <PointedEars@web.de>
Date: 2016-03-10 17:05 +0100
Message-ID: <4942558.HO6WSNh2JR@PointedEars.de>
In reply to: #104498
Thomas 'PointedEars' Lahn wrote:

[
>         key = m.group(1)
>         value = int(m.group(2))
> 
>         if key not in od:
>           od[key] = value
>         else:
>           od[key] += value
> 
> But there is probably an even more pythonic way to do this.
]

For example, based on the original code:

    recs = int(input())
    od = OrderedDict()
    items = []

    for _ in range(recs):
        file_input = sys.stdin.readline().strip()
        m = re.search(r"(\w.+)\s+(\d+)", file_input)
        if m: items.append(m.group(1, 2))

    od = OrderedDict(map(lambda item: (item[0], 0), items))
    for item in items: od[item[0]] += int(item[1])  # groups are strings

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.



#104530

From: Thomas 'PointedEars' Lahn <PointedEars@web.de>
Date: 2016-03-10 17:08 +0100
Message-ID: <2968809.0JbxDi5QZk@PointedEars.de>
In reply to: #104529
Thomas 'PointedEars' Lahn wrote:

>     od = OrderedDict()

This is pointless, then.

>     […]
>     od = OrderedDict(map(lambda item: (item[0], 0), items))
>     for item in items: od[item[0]] += int(item[1])

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.



#104560

From: Gregory Ewing <greg.ewing@canterbury.ac.nz>
Date: 2016-03-11 12:24 +1300
Message-ID: <dkee0qF5s1dU1@mid.individual.net>
In reply to: #104490
Rodrick Brown wrote:
>       if m.group(1) not in od.keys():
>         od[m.group(1)] = int(m.group(2))
>       else:
>         od[m.group(1)] += int(od.get(m.group(1),0))

Others have pointed out what's wrong with this, but here's
a general tip: Don't repeat complicated subexpressions
such as m.group(1). Doing so makes the code hard to read
and therefore hard to spot errors in (and less efficient
as well, although that's a secondary consideration).

Instead, pull them out and give them meaningful names.
Doing so with the above code gives:

   name = m.group(1)
   value = m.group(2)
   if name not in od.keys():
     od[name] = int(value)
   else:
     od[name] += int(od.get(name, 0))

Now it's a lot easier to see that you haven't used the
value anywhere in the second case, which should alert
you that something isn't right.

-- 
Greg



#104561

From: Chris Angelico <rosuav@gmail.com>
Date: 2016-03-11 10:38 +1100
Message-ID: <mailman.160.1457653085.15725.python-list@python.org>
In reply to: #104560
On Fri, Mar 11, 2016 at 10:24 AM, Gregory Ewing
<greg.ewing@canterbury.ac.nz> wrote:
> Instead, pull them out and give them meaningful names.
> Doing so with the above code gives:
>
>   name = m.group(1)
>   value = m.group(2)
>   if name not in od.keys():
>     od[name] = int(value)
>   else:
>     od[name] += int(od.get(name, 0))
>
> Now it's a lot easier to see that you haven't used the
> value anywhere in the second case, which should alert
> you that something isn't right.

Although in this case, the code is majorly redundant - and could be
replaced entirely with a defaultdict(int).
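
A sketch of that suggestion, assuming the same input lines as in the original post (the list name `lines` is illustrative). Note that a plain dict, and therefore a defaultdict, only guarantees insertion order from Python 3.7 on; on older versions the output order may differ from the OrderedDict version:

```python
from collections import defaultdict

# Sample input from the original post, without the leading count line.
lines = [
    "BANANA FRIES 12", "POTATO CHIPS 30", "APPLE JUICE 10",
    "CANDY 5", "APPLE JUICE 10", "CANDY 5", "CANDY 5",
    "CANDY 5", "POTATO CHIPS 30",
]

totals = defaultdict(int)          # missing keys start at int() == 0
for line in lines:
    # Split once from the right: everything before the number is the name.
    name, value = line.rsplit(None, 1)
    totals[name] += int(value)     # no "if name not in ..." branch needed

for name, total in totals.items():
    print(name, total)
```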

ChrisA



#104563

From: BartC <bc@freeuk.com>
Date: 2016-03-11 00:05 +0000
Message-ID: <nbt1vd$k00$1@dont-email.me>
In reply to: #104490
On 10/03/2016 09:02, Rodrick Brown wrote:
> From the following input
>
> 9
> BANANA FRIES 12
> POTATO CHIPS 30
> APPLE JUICE 10
> CANDY 5
> APPLE JUICE 10
> CANDY 5
> CANDY 5
> CANDY 5
> POTATO CHIPS 30
>
> I'm expecting the following output
> BANANA FRIES 12
> POTATO CHIPS 60
> APPLE JUICE 20
> CANDY 20


Here's a rather un-Pythonic and clunky version. But it gives the 
expected results. (I've dispensed with file input, but that can easily 
be added back.)

def last(a):
     return a[-1]

def init(a):                 # all except last element
     return a[0:len(a)-1]

data =["BANANA FRIES 12",    # 1+ items/line, last must be numeric
        "POTATO CHIPS 30",
        "APPLE JUICE 10",
        "CANDY 5",
        "APPLE JUICE 10",
        "CANDY 5",
        "CANDY 5",
        "CANDY 5",
        "POTATO CHIPS 30"]

names  = []                        # serve as key/value sets
totals = []

for line in data:                  # banana fries 12
     parts = line.split(" ")        # ['banana','fries','12']
     value = int(last(parts))       # 12
     name  =  " ".join(init(parts)) # 'banana fries'

     try:
         n = names.index(name)      # update existing entry
         totals[n] += value
     except:
         names.append(name)         # new entry
         totals.append(value)

for i in range(len(names)):
     print (names[i],totals[i])


-- 
Bartc



#104570

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2016-03-11 01:21 +0000
Message-ID: <mailman.163.1457659326.15725.python-list@python.org>
In reply to: #104563
On 11/03/2016 00:05, BartC wrote:
> On 10/03/2016 09:02, Rodrick Brown wrote:
>> From the following input
>>
>> 9
>> BANANA FRIES 12
>> POTATO CHIPS 30
>> APPLE JUICE 10
>> CANDY 5
>> APPLE JUICE 10
>> CANDY 5
>> CANDY 5
>> CANDY 5
>> POTATO CHIPS 30
>>
>> I'm expecting the following output
>> BANANA FRIES 12
>> POTATO CHIPS 60
>> APPLE JUICE 20
>> CANDY 20
>
>
> Here's a rather un-Pythonic and clunky version. But it gives the
> expected results. (I've dispensed with file input, but that can easily
> be added back.)
>
> def last(a):
>      return a[-1]
>
> def init(a):                 # all except last element
>      return a[0:len(a)-1]

What is wrong with a[0:1] ?

>
> data =["BANANA FRIES 12",    # 1+ items/line, last must be numeric
>         "POTATO CHIPS 30",
>         "APPLE JUICE 10",
>         "CANDY 5",
>         "APPLE JUICE 10",
>         "CANDY 5",
>         "CANDY 5",
>         "CANDY 5",
>         "POTATO CHIPS 30"]
>
> names  = []                        # serve as key/value sets
> totals = []
>
> for line in data:                  # banana fries 12
>      parts = line.split(" ")        # ['banana','fries','12']
>      value = int(last(parts))       # 12
>      name  =  " ".join(init(parts)) # 'banana fries'
>
>      try:
>          n = names.index(name)      # update existing entry
>          totals[n] += value
>      except:

Never use a bare except.  Better still, use an appropriate collection 
rather than two lists.  Off of the top of my head a counter or a 
defaultdict.

>          names.append(name)         # new entry
>          totals.append(value)
>
> for i in range(len(names)):
>      print (names[i],totals[i])
>

Always a code smell when range() and len() are combined.
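
The suggestions above (one mapping instead of two parallel lists, and no range(len(...)) loop) could look roughly like this. It is only a sketch using collections.Counter on the same hard-coded data; the variable names are illustrative, and the first-seen output order relies on Python 3.7+ dict ordering:

```python
from collections import Counter

data = [
    "BANANA FRIES 12", "POTATO CHIPS 30", "APPLE JUICE 10",
    "CANDY 5", "APPLE JUICE 10", "CANDY 5", "CANDY 5",
    "CANDY 5", "POTATO CHIPS 30",
]

totals = Counter()                 # a missing key counts as 0
for line in data:
    name, value = line.rsplit(" ", 1)
    totals[name] += int(value)     # no try/except needed

# Iterate over the mapping directly instead of indexing two lists.
for name, total in totals.items():
    print(name, total)
```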

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#104572

From: BartC <bc@freeuk.com>
Date: 2016-03-11 01:45 +0000
Message-ID: <nbt7qr$4fa$1@dont-email.me>
In reply to: #104570
On 11/03/2016 01:21, Mark Lawrence wrote:
> On 11/03/2016 00:05, BartC wrote:

>> def last(a):
>>      return a[-1]
>>
>> def init(a):                 # all except last element
>>      return a[0:len(a)-1]
>
> What is wrong with a[0:1] ?

That returns the head of the list. I need everything except the last 
element ('init' is from Haskell).

>> for i in range(len(names)):
>>      print (names[i],totals[i])
>>
>
> Always a code smell when range() and len() are combined.

Any other way of traversing two lists in parallel?

-- 
Bartc



#104573

From: Larry Martell <larry.martell@gmail.com>
Date: 2016-03-10 20:53 -0500
Message-ID: <mailman.165.1457661268.15725.python-list@python.org>
In reply to: #104572
On Thu, Mar 10, 2016 at 8:45 PM, BartC <bc@freeuk.com> wrote:
> Any other way of traversing two lists in parallel?

zip



#104574

From: "Martin A. Brown" <martin@linux-ip.net>
Date: 2016-03-10 17:56 -0800
Message-ID: <mailman.166.1457661399.15725.python-list@python.org>
In reply to: #104572
>>> for i in range(len(names)):
>>>     print (names[i],totals[i])
>>
>> Always a code smell when range() and len() are combined.
>
> Any other way of traversing two lists in parallel?

Yes.  Builtin function called 'zip'.

  https://docs.python.org/3/library/functions.html#zip

Toy example:

  import string
  alpha = string.ascii_lowercase
  nums = range(len(alpha))
  for N, A in zip(nums, alpha):
      print(N, A)
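
Applied to the two parallel lists from the program being discussed, the same idea might look like this (a sketch; the list contents are simply the totals expected in the original post):

```python
names = ["BANANA FRIES", "POTATO CHIPS", "APPLE JUICE", "CANDY"]
totals = [12, 60, 20, 20]

# zip pairs up the i-th element of each list; no index variable needed.
for name, total in zip(names, totals):
    print(name, total)

# The pairs can also be collected, e.g. to build a dict afterwards.
pairs = list(zip(names, totals))
```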

Good luck,

-Martin

-- 
Martin A. Brown
http://linux-ip.net/



#104575

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2016-03-11 02:03 +0000
Message-ID: <mailman.167.1457661831.15725.python-list@python.org>
In reply to: #104572
On 11/03/2016 01:45, BartC wrote:
> On 11/03/2016 01:21, Mark Lawrence wrote:
>> On 11/03/2016 00:05, BartC wrote:
>
>>> def last(a):
>>>      return a[-1]
>>>
>>> def init(a):                 # all except last element
>>>      return a[0:len(a)-1]
>>
>> What is wrong with a[0:1] ?
>
> That returns the head of the list. I need everything except the last
> element ('init' is from Haskell).

I missed out one character, it should of course have been:-

a[0:-1]
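
For comparison, a quick sketch of what each slice returns for one line of the example data (the variable names are illustrative):

```python
parts = "BANANA FRIES 12".split(" ")    # ['BANANA', 'FRIES', '12']

head = parts[0:1]                   # first element only: ['BANANA']
init = parts[0:-1]                  # all but the last: ['BANANA', 'FRIES']
same = parts[:-1]                   # the leading 0 can be omitted
explicit = parts[0:len(parts) - 1]  # the longer spelling, same result

# Splitting once from the right yields name and value in one step.
name, value = "BANANA FRIES 12".rsplit(" ", 1)
```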

>
>>> for i in range(len(names)):
>>>      print (names[i],totals[i])
>>>
>>
>> Always a code smell when range() and len() are combined.
>
> Any other way of traversing two lists in parallel?
>

Use zip(), but as I suggested in my earlier reply there are better data 
structures than two lists in parallel for this problem.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#104578

From: BartC <bc@freeuk.com>
Date: 2016-03-11 02:18 +0000
Message-ID: <nbt9o9$9a1$1@dont-email.me>
In reply to: #104575
On 11/03/2016 02:03, Mark Lawrence wrote:
> On 11/03/2016 01:45, BartC wrote:
>> On 11/03/2016 01:21, Mark Lawrence wrote:
>>> On 11/03/2016 00:05, BartC wrote:
>>
>>>> def last(a):
>>>>      return a[-1]
>>>>
>>>> def init(a):                 # all except last element
>>>>      return a[0:len(a)-1]
>>>
>>> What is wrong with a[0:1] ?
>>
>> That returns the head of the list. I need everything except the last
>> element ('init' is from Haskell).
>
> I missed out one character, it should of course have been:-
>
> a[0:-1]

I tried that, but I must have got something wrong.

-- 
Bartc



#104814

From: Rick Johnson <rantingrickjohnson@gmail.com>
Date: 2016-03-14 07:35 -0700
Message-ID: <88c5b5fa-66a0-461a-8ae4-b3264b32f679@googlegroups.com>
In reply to: #104575
On Thursday, March 10, 2016 at 8:04:04 PM UTC-6, Mark Lawrence wrote:
> On 11/03/2016 01:45, BartC wrote:
> > [...]
> > Any other way of traversing two lists in parallel?
> >
>
> Use zip()

Sure, the zip function is quite handy, but it can produce
subtle bugs when both sequences are not of the same length.
Consider the following:

# BEGIN INTERACTIVE SESSION
>>> a = [1,2,3]
>>> b = list('abcde')
>>> for _ in zip(a, b):
...     print(_)
(1, 'a')
(2, 'b')
(3, 'c')
# END INTERACTIVE SESSION

Hey kids, the letter of the day is "e" , and the noun of the
day is "ether", and the verb of the day is, you guessed it:
"evaporate"!

I would strongly warn anyone against using the zip function
unless they are absolutely, one hundred percent, not
guilty... urm, oops, sorry to steal your line OJ. And BTW,
did you ever find your wife's killer? But i digress.

I meant to say: absolutely, one hundred percent *SURE*, that
both sequences are of the same length, or, absolutely one
hundred percent *SURE*, that dropping values is not going to
matter. For that reason, i avoid the zip function like the
plague. I would much rather get an index error, than let an
error pass silently.

PS: Hmm, why does that last sentence have such a familiar
"ring" to it?



#104819

From: Oscar Benjamin <oscar.j.benjamin@gmail.com>
Date: 2016-03-14 15:06 +0000
Message-ID: <mailman.101.1457968006.12893.python-list@python.org>
In reply to: #104814
On 14 March 2016 at 14:35, Rick Johnson <rantingrickjohnson@gmail.com> wrote:
>
> I would strongly warn anyone against using the zip function
> unless
...
> I meant to say: absolutely, one hundred percent *SURE*, that
> both sequences are of the same length, or, absolutely one
> hundred percent *SURE*, that dropping values is not going to
> matter. For that reason, i avoid the zip function like the
> plague. I would much rather get an index error, than let an
> error pass silently.

I also think it's unfortunate that zip silently discards items. Almost
always when I use zip I would prefer to see an error when the two
iterables are not of the same length. Of course you're not necessarily
safer with len and range:

a = [1, 2, 3]
b = 'abcde'

for n in range(len(a)):
    print(a[n], b[n])
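
To make the asymmetry concrete, a small sketch (the helper name `pairs_by_index` is made up): indexing only raises when the loop is driven by the longer sequence.

```python
def pairs_by_index(first, second):
    """Pair elements by index, driven by the length of `first`."""
    return [(first[n], second[n]) for n in range(len(first))]

a = [1, 2, 3]
b = 'abcde'

# Driven by the shorter sequence: 'd' and 'e' are silently ignored.
short_driven = pairs_by_index(a, b)

# Driven by the longer sequence: a[3] does not exist, so this raises.
try:
    pairs_by_index(b, a)
    raised = False
except IndexError:
    raised = True
```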

--
Oscar



#104831

From: Rick Johnson <rantingrickjohnson@gmail.com>
Date: 2016-03-14 09:00 -0700
Message-ID: <aa78efbd-b89e-45d2-b528-dc9768c43b89@googlegroups.com>
In reply to: #104819
On Monday, March 14, 2016 at 10:06:56 AM UTC-5, Oscar Benjamin wrote:
> On 14 March 2016 at 14:35, Rick Johnson <rantingrickjohnson@gmail.com> wrote:
> >
> > I would strongly warn anyone against using the zip function
> > unless
> ...
> > I meant to say: absolutely, one hundred percent *SURE*, that
> > both sequences are of the same length, or, absolutely one
> > hundred percent *SURE*, that dropping values is not going to
> > matter. For that reason, i avoid the zip function like the
> > plague. I would much rather get an index error, than let an
> > error pass silently.
> 
> I also think it's unfortunate that zip silently discards items. Almost
> always when I use zip I would prefer to see an error when the two
> iterables are not of the same length. 

Yes. zip is no doubt more Pythonic than any indexing will
ever be, but without a way to manage this "discarding
issue", i can't justify using the function. There are three
possible ways to solve this dilemma:

  (1) Add a keyword argument to zip, something like
  "validateLengths" -- which will default to False. 
  
  (2) Create a new zip function called "strictzip" which
  will always throw an error when all of the sequences
  don't share the same length. 
  
  (3) Encourage every programmer to write their own wrapper
  around zip.

And since we've recently learned that Python programmers
have an aversion to typing, number three is out of the
question.
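
For what it's worth, option (3) is only a few lines for sequences that support len(). A sketch, with the made-up name `zip_checked`; it assumes sized sequences and does not work for arbitrary iterators, which have no length:

```python
def zip_checked(*sequences):
    """zip() for sized sequences that refuses mismatched lengths."""
    lengths = {len(s) for s in sequences}
    if len(lengths) > 1:
        # Fail loudly instead of silently dropping the extra items.
        raise ValueError(
            "sequences of different length: %s" % sorted(lengths))
    return zip(*sequences)

for number, letter in zip_checked([1, 2, 3], 'abc'):
    print(number, letter)
```

Option (1) did eventually arrive in the language: since Python 3.10 the built-in zip accepts a strict=True keyword argument that raises ValueError on a length mismatch.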

> Of course you're not necessarily safer with len and range:
> 
> a = [1, 2, 3]
> b = 'abcde'
> 
> for n in range(len(a)):
>     print(a[n], b[n])

You make a valid point here. So i'm not 100% protected using
indexing, however, i am 100% unprotected using zip. Whew...
I'm still right, but *ONLY* because i'm not 100% wrong.

PS: For a second there, i was afraid my impeccable reputation
might have been in jeopardy. O:-)



#104878

From: Steven D'Aprano <steve@pearwood.info>
Date: 2016-03-15 10:59 +1100
Message-ID: <56e75076$0$22142$c3e8da3$5496439d@news.astraweb.com>
In reply to: #104819
On Tue, 15 Mar 2016 02:06 am, Oscar Benjamin wrote:

> On 14 March 2016 at 14:35, Rick Johnson <rantingrickjohnson@gmail.com>
> wrote:
>>
>> I would strongly warn anyone against using the zip function
>> unless
> ...
>> I meant to say: absolutely, one hundred percent *SURE*, that
>> both sequences are of the same length, or, absolutely one
>> hundred percent *SURE*, that dropping values is not going to
>> matter. For that reason, i avoid the zip function like the
>> plague. I would much rather get an index error, than let an
>> error pass silently.
> 
> I also think it's unfortunate that zip silently discards items. 

Are you aware of itertools.zip_longest?

That makes it easy to build a zip_strict:

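import itertools

def zip_strict(*iterables):
    # A unique sentinel that cannot appear in any of the inputs.
    pad = object()
    for t in itertools.zip_longest(*iterables, fillvalue=pad):
        # The sentinel only shows up once a shorter iterable has run out.
        if pad in t:
            raise ValueError("iterables of different length")
        yield t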


Unfortunate or not, it seems to be quite common that "zip" (convolution)
discards items when sequences are of different lengths. I think the usual
intent is so that you can zip an infinite (or near infinite) sequence of
counters 1, 2, 3, 4, ... with the sequence you actually want, to get the
equivalent of Python's enumerate().
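
That equivalence is easy to check in Python, where itertools.count plays the role of the unbounded counter (a sketch; the sample list is arbitrary):

```python
import itertools

items = ['spam', 'eggs', 'ham']

# zip stops when `items` runs out, so the endless counter is harmless.
zipped = list(zip(itertools.count(1), items))

# The built-in spelling of the same thing.
enumerated = list(enumerate(items, 1))
```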

Clojure, Common Lisp, Haskell all halt on the shortest sequence, like
Python; D is configurable with a stopping policy:

(shortest, longest, requiresSameLength)

but the effect of these is not documented well.

http://dlang.org/phobos/std_range.html#zip

Ruby's zip pads missing values with nil, but only relative to the *first*
argument:

irb(main):001:0> a = [1, 2, 3]
=> [1, 2, 3]
irb(main):002:0> b = [10, 20]
=> [10, 20]
irb(main):003:0> a.zip(b)
=> [[1, 10], [2, 20], [3, nil]]
irb(main):004:0> b.zip(a)
=> [[10, 1], [20, 2]]


F# also has a zip function, but I don't know what it does.

Scheme doesn't appear to have a built-in zip function, but it is easily
written using map, giving the halt-on-shortest behaviour:

(define (zip l1 l2)(map list l1 l2))


See https://en.wikipedia.org/wiki/Convolution_%28computer_science%29



-- 
Steven



#104917

From: Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date: 2016-03-15 07:26 +0200
Message-ID: <lf5y49kia6g.fsf@ling.helsinki.fi>
In reply to: #104878
Steven D'Aprano writes:

> Unfortunate or not, it seems to be quite common that "zip"
> (convolution) discards items when sequences are of different lengths.

Citation needed. Where is zip called convolution?

Why should zip be called convolution?

> See https://en.wikipedia.org/wiki/Convolution_%28computer_science%29

"This article possibly contains original research. Please improve it by
verifying the claims made and adding inline citations."

Now it looks like the ultimate source is a PlanetMath stub, which in
turn looks like someone is making stuff up.



#104924

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2016-03-15 19:39 +1100
Message-ID: <56e7ca7f$0$11093$c3e8da3@news.astraweb.com>
In reply to: #104917
On Tuesday 15 March 2016 16:26, Jussi Piitulainen wrote:

> Steven D'Aprano writes:
> 
>> Unfortunate or not, it seems to be quite common that "zip"
>> (convolution) discards items when sequences are of different lengths.
> 
> Citation needed. Where is zip called convolution?

Wikipedia :-)

Unfortunately "convolution" is one of those technical terms with many 
related but slightly different meanings. It's used in calculus, signal 
processing, geology, biology, probability theory, formal languages, and 
more. I don't have a citation for it being used in functional programming, 
but is it so hard to believe? A "convolution" is usually described as 
something being folded over another thing, which sounds rather like zip, 
doesn't it?

Take two pieces of paper, say one white and one black, one on top of the 
other, and fold them in half, then in half again, then again:

zero folds = W B

one fold = W B B W

two folds = W B B W W B B W


which is not that far from what zip would give you:

W B W B W B W B ...


I don't know enough about the lambda calculus and other theoretical computer 
science topics to give a definitive citation for "zip" being a convolution, 
but I do know enough to accept it as plausible.



> Why should zip be called convolution?

Why should anything be called anything?

Don't worry, I'm not suggesting that the zip function be renamed.


>> See https://en.wikipedia.org/wiki/Convolution_%28computer_science%29
> 
> "This article possibly contains original research. Please improve it by
> verifying the claims made and adding inline citations."

Meh, there are Wikipedia editors that seem to flag just about every article 
with that. You could write "water is wet" and technically that's "original 
research" that needs a citation. It is so over-used that it is practically 
meaningless.


-- 
Steve


