| Started by | Luis José Novoa <luisjosenovoa@gmail.com> |
|---|---|
| First post | 2014-05-24 15:05 -0700 |
| Last post | 2014-05-25 08:17 -0700 |
| Articles | 8 — 5 participants |
Numpy Array of Sets Luis José Novoa <luisjosenovoa@gmail.com> - 2014-05-24 15:05 -0700
Re: Numpy Array of Sets Robert Kern <robert.kern@gmail.com> - 2014-05-24 23:14 +0100
Re: Numpy Array of Sets Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2014-05-25 00:25 +0200
Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 05:29 -0700
Re: Numpy Array of Sets Peter Otten <__peter__@web.de> - 2014-05-25 15:26 +0200
Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 07:14 -0700
Re: Numpy Array of Sets Peter Otten <__peter__@web.de> - 2014-05-25 17:12 +0200
Re: Numpy Array of Sets LJ <luisjosenovoa@gmail.com> - 2014-05-25 08:17 -0700
| From | Luis José Novoa <luisjosenovoa@gmail.com> |
|---|---|
| Date | 2014-05-24 15:05 -0700 |
| Subject | Numpy Array of Sets |
| Message-ID | <38836877-cd87-44ce-b9df-1eda702e7164@googlegroups.com> |
Hi All,

Hope you're doing great. One quick question. I am defining an array of sets using numpy as:

a=array([set([])]*3)

Now, if I want to add an element to the set in, let's say, a[0], and I use the .add(4) operation, the result is:

array([set([4]), set([4]), set([4])], dtype=object)

which I do not want. If I use the union operator

a[0] = a[0] | set([4])

then I obtain what I want:

array([set([4]), set([]), set([])], dtype=object)

Can anyone explain why this happens?

Thank you very much.
| From | Robert Kern <robert.kern@gmail.com> |
|---|---|
| Date | 2014-05-24 23:14 +0100 |
| Message-ID | <mailman.10272.1400969687.18130.python-list@python.org> |
| In reply to | #71973 |
On 2014-05-24 23:05, Luis José Novoa wrote:
> Hi All,
>
> Hope you're doing great. One quick question. I am defining an array of sets using numpy as:
>
> a=array([set([])]*3)
>
> Now, if I want to add an element to the set in, lets say, a[0], and I use the .add(4) operation, which results in:
>
> array([set([4]), set([4]), set([4])], dtype=object)
>
> which I do not want. If I use the union operator
>
> a[0] = a[0] | set([4])
>
> then I obtain what I want:
>
> array([set([4]), set([]), set([])], dtype=object)
>
> Can anyone explain why this happens?

Same reason why you shouldn't make a list of lists like so: [[]]*3

https://docs.python.org/2/faq/programming.html#how-do-i-create-a-multidimensional-list

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
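The FAQ entry linked above suggests building each inner object separately instead of multiplying a list. A minimal sketch of that idea applied to sets, using plain lists and Python 3 syntax (the names `shared` and `independent` are illustrative and do not come from the thread):

```python
# [set()] * 3 repeats ONE reference three times; a comprehension
# evaluates set() once per iteration and yields three distinct objects.
shared = [set()] * 3
independent = [set() for _ in range(3)]

shared[0].add(4)
independent[0].add(4)

print(shared)       # [{4}, {4}, {4}]
print(independent)  # [{4}, set(), set()]
```

The same comprehension can be passed to numpy's array constructor in place of the multiplied list, assuming an object array is what is wanted.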
| From | Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> |
|---|---|
| Date | 2014-05-25 00:25 +0200 |
| Message-ID | <mailman.10273.1400970363.18130.python-list@python.org> |
| In reply to | #71973 |
On 25.05.2014 00:14, Robert Kern wrote:
> On 2014-05-24 23:05, Luis José Novoa wrote:
>> Hi All,
>>
>> Hope you're doing great. One quick question. I am defining an array of
>> sets using numpy as:
>>
>> a=array([set([])]*3)
>>

This has nothing to do with numpy; the problem lies exclusively in your innermost expression [set([])]*3.

>> Now, if I want to add an element to the set in, lets say, a[0], and I
>> use the .add(4) operation, which results in:
>>

With .add you are modifying the *existing* set.

>> array([set([4]), set([4]), set([4])], dtype=object)
>>
>> which I do not want. If I use the union operator
>>
>> a[0] = a[0] | set([4])
>>

Here you are forming a *new* set and putting it in a[0], replacing the old set at this position.

>> then I obtain what I want:
>>
>> array([set([4]), set([]), set([])], dtype=object)
>>
>> Can anyone explain why this happens?
>
> Same reason why you shouldn't make a list of lists like so: [[]]*3
>
> https://docs.python.org/2/faq/programming.html#how-do-i-create-a-multidimensional-list
>

The above link explains the underlying problem.

Best,
Wolfgang
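The distinction between modifying the existing set and putting a new set into one slot can be checked with identity tests. The following is only a sketch with a plain list in Python 3 syntax; it assumes nothing about numpy and simply replays the two operations from the question:

```python
a = [set()] * 3            # three references to one and the same set

a[0].add(4)                # mutates that single shared set in place
assert a[0] is a[1] is a[2]
assert a[1] == {4}

a = [set()] * 3            # start over with a fresh shared set
a[0] = a[0] | {4}          # builds a NEW set and rebinds slot 0 only
assert a == [{4}, set(), set()]
assert a[0] is not a[1]    # slot 0 now holds its own object
assert a[1] is a[2]        # the other two slots still share one set
```

Note that after the union, slots 1 and 2 still share one set, so a later `a[1].add(...)` would show up in `a[2]` as well.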
| From | LJ <luisjosenovoa@gmail.com> |
|---|---|
| Date | 2014-05-25 05:29 -0700 |
| Message-ID | <f55843c2-17f3-4551-a1c6-b608c09fd6d8@googlegroups.com> |
| In reply to | #71976 |
Wolfgang, thank you very much for your reply.
Following the example in the link, the problem appears:
>>> A = [[0]*2]*3
>>> A
[[0, 0], [0, 0], [0, 0]]
>>> A[0][0] = 5
>>> A
[[5, 0], [5, 0], [5, 0]]
Now, if I use a numpy array:
>>> d=array([[0]*2]*3)
>>> d
array([[0, 0],
       [0, 0],
       [0, 0]])
>>> d[0][0]=5
>>> d
array([[5, 0],
       [0, 0],
       [0, 0]])
What is the difference here?
Thank you,
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2014-05-25 15:26 +0200 |
| Message-ID | <mailman.10294.1401024422.18130.python-list@python.org> |
| In reply to | #72000 |
LJ wrote:
> Wolfgang, thank you very much for your reply.
>
> Following the example in the link, the problem appears:
>
> >>> A = [[0]*2]*3
You can see this as a shortcut for
value = 0
inner = [value, value]
A = [inner, inner, inner]
When the value is mutable (like your original set) a modification of the
value shows in all six entries. Likewise if you change the `inner` list the
modification shows in all three rows.
> >>> A
> [[0, 0], [0, 0], [0, 0]]
> >>> A[0][0] = 5
> >>> A
> [[5, 0], [5, 0], [5, 0]]
>
> Now, if I use a numpy array:
>
> >>> d=array([[0]*2]*3)
> >>> d
> array([[0, 0],
>        [0, 0],
>        [0, 0]])
> >>> d[0][0]=5
> >>> d
> array([[5, 0],
>        [0, 0],
>        [0, 0]])
>
>
> What is the difference here?
Basically a numpy array doesn't reference the lists, it uses them to
determine the required shape of the array. A simplified implementation might
be
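class Array:
    def __init__(self, data):
        # The nested lists only determine the shape; their items are
        # copied into one flat list, so the row lists are not kept.
        self.shape = (len(data), len(data[0]))
        self._data = []
        for row in data:
            self._data.extend(row)

    def __getitem__(self, index):
        # Map a (row, column) pair to a position in the flat list.
        y, x = index
        return self._data[y * self.shape[1] + x]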
With that approach you may only see simultaneous changes of multiple entries
when using mutable values.
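To make that last point concrete, here is the simplified class again with a `__setitem__` added. That method is a hypothetical addition for illustration and is not part of the original post, and the toy class is not how numpy is actually implemented. The sketch shows that assigning to one cell affects only that cell, while mutating a shared set still shows up everywhere:

```python
class Array:
    """Toy 2-D array that copies items into flat storage (illustration only)."""

    def __init__(self, data):
        self.shape = (len(data), len(data[0]))
        self._data = []
        for row in data:
            self._data.extend(row)  # copies the references held by each row

    def __getitem__(self, index):
        y, x = index
        return self._data[y * self.shape[1] + x]

    def __setitem__(self, index, value):  # hypothetical addition
        y, x = index
        self._data[y * self.shape[1] + x] = value


d = Array([[0] * 2] * 3)
d[0, 0] = 5                       # rebinds one slot of the flat storage
rows = [[d[y, x] for x in range(2)] for y in range(3)]
print(rows)                       # [[5, 0], [0, 0], [0, 0]]

s = Array([[set()] * 2] * 3)
s[0, 0].add(4)                    # mutates the one set all six slots refer to
print(all(s[y, x] == {4} for y in range(3) for x in range(2)))  # True
```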
| From | LJ <luisjosenovoa@gmail.com> |
|---|---|
| Date | 2014-05-25 07:14 -0700 |
| Message-ID | <9929f123-705c-4656-a400-171e48935244@googlegroups.com> |
| In reply to | #72006 |
Thank you for the reply.

So, as long as I access and modify the elements of, for example,

a=array([[set([])]*4]*3)

as (for example):

a[0][1] = a[0][1] | set([1,2])

or:

a[0][1]=set([1,2])

then I should have no problems?
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2014-05-25 17:12 +0200 |
| Message-ID | <mailman.10299.1401030762.18130.python-list@python.org> |
| In reply to | #72010 |
LJ wrote:
> Thank you for the reply.
>
> So, as long as I access and modify the elements of, for example,
>
> a=array([[set([])]*4]*3)
>
>
> as (for example):
>
> a[0][1] = a[0][1] | set([1,2])
>
> or:
>
> a[0][1]=set([1,2])
>
> then I should have no problems?
As long as you set (i.e. replace) elements you're fine, but modifying them
in place means trouble. You can prevent accidental modification by using
immutable values -- in your case frozenset:
>>> b = numpy.array([[frozenset()]*4]*3)
>>> b[0,0].update("123")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'update'
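With frozenset, sharing becomes harmless because replacing a cell is the only way to change it. A small sketch of that workflow, using nested lists instead of a numpy array so it runs without extra packages (the variable names are illustrative, and Python 3 syntax is assumed):

```python
# Cells within a row share one frozenset, which is safe here
# because a frozenset cannot be modified in place.
b = [[frozenset()] * 4 for _ in range(3)]

try:
    b[0][0].add(1)            # frozenset has no mutating methods
except AttributeError as exc:
    error = str(exc)

b[0][1] = b[0][1] | {1, 2}    # replacement is the only way to "change" a cell

assert b[0][1] == frozenset({1, 2})
assert b[0][0] == frozenset()  # the neighbouring cells are untouched
```

The union returns a frozenset because the left operand is one, so the cell keeps its immutable type after the update.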
Or you take the obvious approach and ensure that there are no shared values.
I don't know if there's a canonical form to do this in numpy, but
>>> a = numpy.array([[set()]*3]*4)
>>> a |= set()
works:
>>> assert len(set(map(id, a.flat))) == 3*4
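Another way to avoid shared values is to allocate an empty object array and fill each cell with its own set. This is a sketch of one possible approach, not a canonical numpy idiom; it assumes numpy is installed and imported as `np`, and uses Python 3 syntax:

```python
import numpy as np

a = np.empty((4, 3), dtype=object)   # every cell starts out as None
for index in np.ndindex(a.shape):
    a[index] = set()                 # a fresh set for each cell

a[0, 0].add(4)                       # only this one cell changes

assert len(set(map(id, a.flat))) == 4 * 3
assert a[0, 1] == set()
```

The explicit loop is slower than a vectorised operation, but it makes it obvious that every cell receives a distinct object.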
| From | LJ <luisjosenovoa@gmail.com> |
|---|---|
| Date | 2014-05-25 08:17 -0700 |
| Message-ID | <bf08d969-5ff1-4ee4-ad8e-a51935c794c9@googlegroups.com> |
| In reply to | #72012 |
Thank you very much!