Groups > comp.lang.python > #71973 > unrolled thread

Numpy Array of Sets

Started by: Luis José Novoa <luisjosenovoa@gmail.com>
First post: 2014-05-24 15:05 -0700
Last post: 2014-05-25 08:17 -0700
Articles 8 — 5 participants



Contents

  Numpy Array of Sets Luis José Novoa <luisjosenovoa@gmail.com> - 2014-05-24 15:05 -0700
    Re: Numpy Array of Sets Robert Kern <robert.kern@gmail.com> - 2014-05-24 23:14 +0100
    Re: Numpy Array of Sets Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2014-05-25 00:25 +0200
      Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 05:29 -0700
        Re: Numpy Array of Sets Peter Otten <__peter__@web.de> - 2014-05-25 15:26 +0200
          Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 07:14 -0700
            Re: Numpy Array of Sets Peter Otten <__peter__@web.de> - 2014-05-25 17:12 +0200
              Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 08:17 -0700

#71973 — Numpy Array of Sets

From: Luis José Novoa <luisjosenovoa@gmail.com>
Date: 2014-05-24 15:05 -0700
Subject: Numpy Array of Sets
Message-ID: <38836877-cd87-44ce-b9df-1eda702e7164@googlegroups.com>

Hi All, 

Hope you're doing great. One quick question. I am defining an array of sets using numpy as:

a=array([set([])]*3)

Now, if I want to add an element to the set in, let's say, a[0], and I use the .add(4) operation, the result is:

array([set([4]), set([4]), set([4])], dtype=object)

which I do not want. If I use the union operator 

a[0] = a[0] | set([4])

then I obtain what I want:

array([set([4]), set([]), set([])], dtype=object)

Can anyone explain why this happens?

Thank you very much.



#71975

From: Robert Kern <robert.kern@gmail.com>
Date: 2014-05-24 23:14 +0100
Message-ID: <mailman.10272.1400969687.18130.python-list@python.org>
In reply to: #71973

On 2014-05-24 23:05, Luis José Novoa wrote:
> Hi All,
>
> Hope you're doing great. One quick question. I am defining an array of sets using numpy as:
>
> a=array([set([])]*3)
>
> Now, if I want to add an element to the set in, let's say, a[0], and I use the .add(4) operation, the result is:
>
> array([set([4]), set([4]), set([4])], dtype=object)
>
> which I do not want. If I use the union operator
>
> a[0] = a[0] | set([4])
>
> then I obtain what I want:
>
> array([set([4]), set([]), set([])], dtype=object)
>
> Can anyone explain why this happens?

For the same reason that you shouldn't make a list of lists like so: [[]]*3

https://docs.python.org/2/faq/programming.html#how-do-i-create-a-multidimensional-list
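
As a minimal sketch of the aliasing the FAQ entry describes (the names `shared` and `distinct` are made up for illustration and do not appear in the thread): multiplying a list repeats references to one object, whereas a comprehension builds a fresh object per slot.

```python
# Multiplying a list repeats references to the same object; nothing is copied.
shared = [set()] * 3
shared[0].add(4)
# All three slots now show the 4, because they are one and the same set.

# A comprehension evaluates set() once per iteration, giving three objects.
distinct = [set() for _ in range(3)]
distinct[0].add(4)
# Only the first slot contains 4 here.
```

The same reasoning applies whether the outer container is a list or a numpy object array built from that list.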

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco



#71976

From: Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Date: 2014-05-25 00:25 +0200
Message-ID: <mailman.10273.1400970363.18130.python-list@python.org>
In reply to: #71973

On 25.05.2014 00:14, Robert Kern wrote:
> On 2014-05-24 23:05, Luis José Novoa wrote:
>> Hi All,
>>
>> Hope you're doing great. One quick question. I am defining an array of
>> sets using numpy as:
>>
>> a=array([set([])]*3)
>>

This has nothing to do with numpy; the problem lies exclusively in your
innermost expression [set([])]*3.

>> Now, if I want to add an element to the set in, let's say, a[0], and I
>> use the .add(4) operation, the result is:
>>

With .add you are modifying the *existing* set, which all three entries share.

>> array([set([4]), set([4]), set([4])], dtype=object)
>>
>> which I do not want. If I use the union operator
>>
>> a[0] = a[0] | set([4])
>>

Here you are forming a *new* set and putting it in a[0], replacing the old
set at this position.

>> then I obtain what I want:
>>
>> array([set([4]), set([]), set([])], dtype=object)
>>
>> Can anyone explain why this happens?
>
> Same reason why you shouldn't make a list of lists like so: [[]]*3
>
> https://docs.python.org/2/faq/programming.html#how-do-i-create-a-multidimensional-list
>

The above link explains the underlying problem.
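
The distinction between modifying the existing set and forming a new one can be checked with identity tests. This is only a sketch using a plain list (since the effect does not depend on numpy); the names `cells`, `original` and `after_add` are hypothetical.

```python
cells = [set()] * 3          # three references to one set
original = cells[0]          # keep a handle on that shared set

cells[0].add(4)              # mutates the shared object in place
after_add = [sorted(c) for c in cells]   # snapshot: every slot shows the 4

cells[0] = cells[0] | {5}    # builds a new set and rebinds slot 0 only

# Slot 0 now refers to a different object; slots 1 and 2 still share the old one.
slot0_is_new = cells[0] is not original
others_unchanged = cells[1] is original and cells[2] is original
```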

Best,
Wolfgang



#72000

From: LJ <luisjosenovoa@gmail.com>
Date: 2014-05-25 05:29 -0700
Message-ID: <f55843c2-17f3-4551-a1c6-b608c09fd6d8@googlegroups.com>
In reply to: #71976

Wolfgang, thank you very much for your reply.

Following the example in the link, the problem appears:

>>> A = [[0]*2]*3
>>> A
[[0, 0], [0, 0], [0, 0]]
>>> A[0][0] = 5
>>> A
[[5, 0], [5, 0], [5, 0]]

Now, if I use a numpy array:

>>> d=array([[0]*2]*3)
>>> d
array([[0, 0],
       [0, 0],
       [0, 0]])
>>> d[0][0]=5
>>> d
array([[5, 0],
       [0, 0],
       [0, 0]])


What is the difference here?

Thank you,



#72006

From: Peter Otten <__peter__@web.de>
Date: 2014-05-25 15:26 +0200
Message-ID: <mailman.10294.1401024422.18130.python-list@python.org>
In reply to: #72000

LJ wrote:

> Wolfgang, thank you very much for your reply.
> 
> Following the example in the link, the problem appears:
> 
>>>> A = [[0]*2]*3

You can see this as a shortcut for

value = 0
inner = [value, value]
A = [inner, inner, inner]

When the value is mutable (like your original set), a modification of the
value shows in all six entries. Likewise, if you change the `inner` list, the
modification shows in all three rows.

>>>> A
> [[0, 0], [0, 0], [0, 0]]
>>>> A[0][0] = 5
>>>> A
> [[5, 0], [5, 0], [5, 0]]
> 
> Now, if I use a numpy array:
> 
>>>> d=array([[0]*2]*3)
>>>> d
> array([[0, 0],
>        [0, 0],
>        [0, 0]])
>>>> d[0][0]=5
>>>> d
> array([[5, 0],
>        [0, 0],
>        [0, 0]])
> 
> 
> What is the difference here?

Basically, a numpy array doesn't reference the lists; it uses them to
determine the required shape of the array and copies their values into its
own storage. A simplified implementation might be

class Array:
    def __init__(self, data):
        self.shape = (len(data), len(data[0]))
        self._data = []
        for row in data: self._data.extend(row)
    def __getitem__(self, index):
        y, x = index
        return self._data[y * self.shape[1] + x]

With that approach, you will only see simultaneous changes of multiple entries
when the values themselves are mutable.
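
To make that concrete, here is a runnable variant of the toy class above. The `__setitem__` method and the names `ints` and `sets` are additions for illustration; this is a model of the idea, not how numpy is actually implemented.

```python
class Array:
    """Toy 2-D container; a hypothetical model, not numpy's real code."""

    def __init__(self, data):
        self.shape = (len(data), len(data[0]))
        self._data = []
        for row in data:
            # Copies references to the values into flat storage;
            # the row lists themselves are not kept.
            self._data.extend(row)

    def __getitem__(self, index):
        y, x = index
        return self._data[y * self.shape[1] + x]

    def __setitem__(self, index, value):
        y, x = index
        self._data[y * self.shape[1] + x] = value


ints = Array([[0] * 2] * 3)
ints[0, 0] = 5        # replaces one slot; the shared row list no longer matters

sets = Array([[set()] * 2] * 3)
sets[0, 0].add(5)     # mutates the single set that all six slots refer to
```

Replacing a slot affects one entry, while mutating a shared value is visible through every slot that refers to it.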



#72010

From: LJ <luisjosenovoa@gmail.com>
Date: 2014-05-25 07:14 -0700
Message-ID: <9929f123-705c-4656-a400-171e48935244@googlegroups.com>
In reply to: #72006

Thank you for the reply.

So, as long as I access and modify the elements of, for example, 

a=array([[set([])]*4]*3)


as (for example):

a[0][1] = a[0][1] | set([1,2])

or:

a[0][1]=set([1,2])

then I should have no problems?



#72012

From: Peter Otten <__peter__@web.de>
Date: 2014-05-25 17:12 +0200
Message-ID: <mailman.10299.1401030762.18130.python-list@python.org>
In reply to: #72010

LJ wrote:

> Thank you for the reply.
> 
> So, as long as I access and modify the elements of, for example,
> 
> a=array([[set([])]*4]*3)
> 
> 
> as (for example):
> 
> a[0][1] = a[0][1] | set([1,2])
> 
> or:
> 
> a[0][1]=set([1,2])
> 
> then I should have no problems?

As long as you set (i.e. replace) elements you're fine, but modifying them in
place means trouble. You can prevent accidental modification by using
immutable values -- in your case frozenset:

>>> b = numpy.array([[frozenset()]*4]*3)
>>> b[0,0].update("123")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'update'

Or you take the obvious approach and ensure that there are no shared values. 
I don't know if there's a canonical form to do this in numpy, but

>>> a = numpy.array([[set()]*3]*4) 
>>> a |= set()

works:

>>> assert len(set(map(id, a.flat))) == 3*4
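
A more explicit alternative, offered only as a sketch and not as an established numpy idiom, is to allocate an empty object array and fill each cell with its own set. The helper name `empty_set_array` is made up; `numpy.empty` and `numpy.ndindex` are existing numpy functions.

```python
import numpy as np


def empty_set_array(shape):
    """Return an object array in which every cell holds its own empty set."""
    arr = np.empty(shape, dtype=object)
    for index in np.ndindex(*shape):
        arr[index] = set()   # a fresh set per cell, so nothing is shared
    return arr


a = empty_set_array((4, 3))
a[0, 1].add(7)               # in-place mutation now touches one cell only
```

With distinct sets in every cell, both replacing an element and calling methods such as .add() on it affect that element alone.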



#72013

From: LJ <luisjosenovoa@gmail.com>
Date: 2014-05-25 08:17 -0700
Message-ID: <bf08d969-5ff1-4ee4-ad8e-a51935c794c9@googlegroups.com>
In reply to: #72012

Thank you very much!


