Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #20043 > unrolled thread
| Started by | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| First post | 2012-02-08 22:43 -0500 |
| Last post | 2012-02-09 11:50 -0500 |
| Articles | 18 — 9 participants |
Back to article view | Back to comp.lang.python
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-08 22:43 -0500
Re: frozendict Duncan Booth <duncan.booth@invalid.invalid> - 2012-02-09 10:33 +0000
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-09 09:36 -0500
Re: frozendict Duncan Booth <duncan.booth@invalid.invalid> - 2012-02-09 14:52 +0000
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-09 10:19 -0500
Re: frozendict Duncan Booth <duncan.booth@invalid.invalid> - 2012-02-09 18:47 +0000
Re: frozendict Ian Kelly <ian.g.kelly@gmail.com> - 2012-02-09 09:35 -0700
Re: frozendict Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-02-10 01:24 +0000
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-09 21:30 -0500
Re: frozendict Terry Reedy <tjreedy@udel.edu> - 2012-02-09 22:33 -0500
Re: frozendict Chris Angelico <rosuav@gmail.com> - 2012-02-10 21:08 +1100
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-10 11:53 -0500
Re: frozendict Chris Rebert <clp2@rebertia.com> - 2012-02-10 09:00 -0800
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-10 13:14 -0500
Re: frozendict John Nagle <nagle@animats.com> - 2012-02-10 10:57 -0800
Re: frozendict 88888 Dihedral <dihedral88888@googlemail.com> - 2012-02-10 21:52 -0800
Re: frozendict John Nagle <nagle@animats.com> - 2012-02-13 13:15 -0800
Re: frozendict Nathan Rice <nathan.alexander.rice@gmail.com> - 2012-02-09 11:50 -0500
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-08 22:43 -0500 |
| Subject | Re: frozendict |
| Message-ID | <mailman.5558.1328759017.27778.python-list@python.org> |
> Turn the question around: why should there be? > Python is intentionally parsimonious in adding builtins. For the same reason there are frozensets? I put dicts in sets all the time. I just tuple the items, but that means you have to re-dict it on the way out to do anything useful with it. I am too lazy to write a frozendict or import one, but I would use it if it was a builtin. Nathan
[toc] | [next] | [standalone]
| From | Duncan Booth <duncan.booth@invalid.invalid> |
|---|---|
| Date | 2012-02-09 10:33 +0000 |
| Message-ID | <Xns9FF46B35EDFF9duncanbooth@127.0.0.1> |
| In reply to | #20043 |
Nathan Rice <nathan.alexander.rice@gmail.com> wrote: > I put dicts in sets all the time. I just tuple the items, but that > means you have to re-dict it on the way out to do anything useful with > it. I am too lazy to write a frozendict or import one, but I would > use it if it was a builtin. > I hope you sort the items before putting them in a tuple, otherwise how do you handle two identical dicts that return their items in a different order? -- Duncan Booth http://kupuguy.blogspot.com
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-09 09:36 -0500 |
| Message-ID | <mailman.5588.1328798176.27778.python-list@python.org> |
| In reply to | #20063 |
On Thu, Feb 9, 2012 at 5:33 AM, Duncan Booth <duncan.booth@invalid.invalid> wrote: > Nathan Rice <nathan.alexander.rice@gmail.com> wrote: > >> I put dicts in sets all the time. I just tuple the items, but that >> means you have to re-dict it on the way out to do anything useful with >> it. I am too lazy to write a frozendict or import one, but I would >> use it if it was a builtin. >> > I hope you sort the items before putting them in a tuple, otherwise how do > you handle two identical dicts that return their items in a different > order? Two dicts created from the same inputs will return items in the same arbitrary order. As long as you don't insert or delete a key you're fine. Nathan
[toc] | [prev] | [next] | [standalone]
| From | Duncan Booth <duncan.booth@invalid.invalid> |
|---|---|
| Date | 2012-02-09 14:52 +0000 |
| Message-ID | <Xns9FF497406322duncanbooth@127.0.0.1> |
| In reply to | #20078 |
Nathan Rice <nathan.alexander.rice@gmail.com> wrote:
> On Thu, Feb 9, 2012 at 5:33 AM, Duncan Booth
><duncan.booth@invalid.invalid> wrote:
>> Nathan Rice <nathan.alexander.rice@gmail.com> wrote:
>>
>>> I put dicts in sets all the time. I just tuple the items, but that
>>> means you have to re-dict it on the way out to do anything useful
>>> with it. I am too lazy to write a frozendict or import one, but I
>>> would use it if it was a builtin.
>>>
>> I hope you sort the items before putting them in a tuple, otherwise
>> how d
> o
>> you handle two identical dicts that return their items in a different
>> order?
>
> Two dicts created from the same inputs will return items in the same
> arbitrary order. As long as you don't insert or delete a key you're
> fine.
>
Two dicts that contain the same keys and values may or may not return them
in the same order:
>>> dict.fromkeys('ia')
{'i': None, 'a': None}
>>> dict.fromkeys('ai')
{'a': None, 'i': None}
Would your system count those two dicts as the same?
If the sequence in which the keys were added to the dict (and deleted if
appropriate) is exactly the same then it is likely but still not guaranteed
that they will have the same order. The only ordering guarantee is that
within a single dict keys and values will appear in a consistent order so
long as you don't add/delete keys between calls.
--
Duncan Booth http://kupuguy.blogspot.com
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-09 10:19 -0500 |
| Message-ID | <mailman.5593.1328800790.27778.python-list@python.org> |
| In reply to | #20080 |
>> Two dicts created from the same inputs will return items in the same
>> arbitrary order. As long as you don't insert or delete a key you're
>> fine.
>>
> Two dicts that contain the same keys and values may or may not return them
> in the same order:
>
>>>> dict.fromkeys('ia')
> {'i': None, 'a': None}
>>>> dict.fromkeys('ai')
> {'a': None, 'i': None}
>
> Would your system count those two dicts as the same?
>
> If the sequence in which the keys were added to the dict (and deleted if
> appropriate) is exactly the same then it is likely but still not guaranteed
> that they will have the same order. The only ordering guarantee is that
> within a single dict keys and values will appear in a consistent order so
> long as you don't add/delete keys between calls.
As I said, two dictionaries created from the same input will be the
same... 'ai' != 'ia'. If I need to hash a dict that I don't know was
created in a deterministic order, I'd frozenset(thedict.items()).
Nathan
[toc] | [prev] | [next] | [standalone]
| From | Duncan Booth <duncan.booth@invalid.invalid> |
|---|---|
| Date | 2012-02-09 18:47 +0000 |
| Message-ID | <Xns9FF4BF1DD69ADduncanbooth@127.0.0.1> |
| In reply to | #20081 |
Nathan Rice <nathan.alexander.rice@gmail.com> wrote:
> As I said, two dictionaries created from the same input will be the
> same... 'ai' != 'ia'. If I need to hash a dict that I don't know was
> created in a deterministic order, I'd frozenset(thedict.items()).
>
Fair enough, the idea scares me, but it's your code and your risk.
BTW, make sure that if you ever copy any of those dictionaries they all get
copied the same number of times as simply copying a dict can change the key
order.
>>> dict.fromkeys('me')
{'e': None, 'm': None}
>>> dict(dict.fromkeys('me'))
{'m': None, 'e': None}
>>> dict(dict(dict.fromkeys('me')))
{'e': None, 'm': None}
--
Duncan Booth http://kupuguy.blogspot.com
[toc] | [prev] | [next] | [standalone]
| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Date | 2012-02-09 09:35 -0700 |
| Message-ID | <mailman.5597.1328805384.27778.python-list@python.org> |
| In reply to | #20080 |
On Thu, Feb 9, 2012 at 8:19 AM, Nathan Rice <nathan.alexander.rice@gmail.com> wrote: > As I said, two dictionaries created from the same input will be the > same... That's an implementation detail, not a guarantee. It will hold for current versions of CPython but not necessarily for other Python implementations.
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-02-10 01:24 +0000 |
| Message-ID | <4f3471e9$0$29986$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #20086 |
On Thu, 09 Feb 2012 09:35:52 -0700, Ian Kelly wrote: > On Thu, Feb 9, 2012 at 8:19 AM, Nathan Rice > <nathan.alexander.rice@gmail.com> wrote: >> As I said, two dictionaries created from the same input will be the >> same... > > That's an implementation detail, not a guarantee. It will hold for > current versions of CPython but not necessarily for other Python > implementations. That day may be sooner than you think. It is very likely that in Python 3.3, dict order will be randomized on creation as a side-effect of adding a random salt to hashes to prevent a serious vulnerability in dicts. http://securitytracker.com/id/1026478 http://bugs.python.org/issue13703 If there is anyone still assuming that dicts have a predictable order, they're going to be in for a nasty surprise one of these days. -- Steven
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-09 21:30 -0500 |
| Message-ID | <mailman.5630.1328841051.27778.python-list@python.org> |
| In reply to | #20126 |
On Thu, Feb 9, 2012 at 8:24 PM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote: > On Thu, 09 Feb 2012 09:35:52 -0700, Ian Kelly wrote: > >> On Thu, Feb 9, 2012 at 8:19 AM, Nathan Rice >> <nathan.alexander.rice@gmail.com> wrote: >>> As I said, two dictionaries created from the same input will be the >>> same... >> >> That's an implementation detail, not a guarantee. It will hold for >> current versions of CPython but not necessarily for other Python >> implementations. > > That day may be sooner than you think. It is very likely that in Python > 3.3, dict order will be randomized on creation as a side-effect of adding > a random salt to hashes to prevent a serious vulnerability in dicts. > > http://securitytracker.com/id/1026478 > > http://bugs.python.org/issue13703 > > > If there is anyone still assuming that dicts have a predictable order, > they're going to be in for a nasty surprise one of these days. The only thing needed to avoid the hash collision is that your hash function is not not 100% predictable just by looking at the python source code. I don't see why every dict would have to be created differently. I would think having the most ubiquitous data structure in your language be more predictable would be a priority. Oh well.... Nathan
[toc] | [prev] | [next] | [standalone]
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2012-02-09 22:33 -0500 |
| Message-ID | <mailman.5635.1328844905.27778.python-list@python.org> |
| In reply to | #20126 |
On 2/9/2012 9:30 PM, Nathan Rice wrote: >> That day may be sooner than you think. It is very likely that in Python >> 3.3, dict order will be randomized on creation as a side-effect of adding >> a random salt to hashes to prevent a serious vulnerability in dicts. >> >> http://securitytracker.com/id/1026478 >> >> http://bugs.python.org/issue13703 >> >> >> If there is anyone still assuming that dicts have a predictable order, >> they're going to be in for a nasty surprise one of these days. > > The only thing needed to avoid the hash collision is that your hash > function is not not 100% predictable just by looking at the python > source code. I don't see why every dict would have to be created > differently. I would think having the most ubiquitous data structure > in your language be more predictable would be a priority. Oh well.... I believe 'on creation' means 'on process startup', not on dict creation. There have, however, been several variants suggested, and the focus has been on choosing one first for past and current versions. 3.3 is about 6 months off and hash for it may still be debated. -- Terry Jan Reedy
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-02-10 21:08 +1100 |
| Message-ID | <mailman.5645.1328868491.27778.python-list@python.org> |
| In reply to | #20126 |
On Fri, Feb 10, 2012 at 1:30 PM, Nathan Rice <nathan.alexander.rice@gmail.com> wrote: > The only thing needed to avoid the hash collision is that your hash > function is not not 100% predictable just by looking at the python > source code. I don't see why every dict would have to be created > differently. I would think having the most ubiquitous data structure > in your language be more predictable would be a priority. Oh well.... It's perfectly predictable. If you put a series of keys into it, you get those same keys back. Nobody ever promised anything about order. If your hash function is not 100% predictable, that means it varies on the basis of something that isn't part of either the Python interpreter or the script being run. That means that, from one execution to another of the exact same code, the results could be different. The keys will come out in different orders. ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-10 11:53 -0500 |
| Message-ID | <mailman.5670.1328892792.27778.python-list@python.org> |
| In reply to | #20126 |
On Fri, Feb 10, 2012 at 5:08 AM, Chris Angelico <rosuav@gmail.com> wrote: > On Fri, Feb 10, 2012 at 1:30 PM, Nathan Rice > <nathan.alexander.rice@gmail.com> wrote: >> The only thing needed to avoid the hash collision is that your hash >> function is not not 100% predictable just by looking at the python >> source code. I don't see why every dict would have to be created >> differently. I would think having the most ubiquitous data structure >> in your language be more predictable would be a priority. Oh well.... > > It's perfectly predictable. If you put a series of keys into it, you > get those same keys back. Nobody ever promised anything about order. > > If your hash function is not 100% predictable, that means it varies on > the basis of something that isn't part of either the Python > interpreter or the script being run. That means that, from one > execution to another of the exact same code, the results could be > different. The keys will come out in different orders. I think having a hash function that is not referentially transparent is a bad thing. Basing your language on a non-deterministic function? Yeah... A type factory that produces the dict type on interpreter initialization (or, replaces the hash function, rather), and uses time/system information/etc would solve the problem, while limiting the introduced non-determinism. I don't care if the order of iteration for keys is different from interpreter run to run. I have used frozenset(mydict.items()) when my requirements dictated. It is a minor performance hit. Lets also not forget that knowing an object is immutable lets you do a lot of optimizations; it can be inlined, it is safe to convert to a contiguous block of memory and stuff in cache, etc. If you know the input to a function is guaranteed to be frozen you can just go crazy. Being able to freeze(anyobject) seems like a pretty clear win. Whether or not it is pythonic is debatable. I'd argue if the meaning of pythonic in some context is limiting, we should consider updating the term rather than being dogmatic. Just my 2 cents... Nathan
[toc] | [prev] | [next] | [standalone]
| From | Chris Rebert <clp2@rebertia.com> |
|---|---|
| Date | 2012-02-10 09:00 -0800 |
| Message-ID | <mailman.5671.1328893258.27778.python-list@python.org> |
| In reply to | #20126 |
On Fri, Feb 10, 2012 at 8:53 AM, Nathan Rice <nathan.alexander.rice@gmail.com> wrote: <snip> > Lets also not forget that knowing an object is immutable lets you do a > lot of optimizations; it can be inlined, it is safe to convert to a > contiguous block of memory and stuff in cache, etc. If you know the > input to a function is guaranteed to be frozen you can just go crazy. > Being able to freeze(anyobject) seems like a pretty clear win. > Whether or not it is pythonic is debatable. I'd argue if the meaning > of pythonic in some context is limiting, we should consider updating > the term rather than being dogmatic. It's been proposed previously and rejected: PEP 351 -- The freeze protocol http://www.python.org/dev/peps/pep-0351/ Cheers, Chris
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-10 13:14 -0500 |
| Message-ID | <mailman.5673.1328897669.27778.python-list@python.org> |
| In reply to | #20126 |
>> Lets also not forget that knowing an object is immutable lets you do a >> lot of optimizations; it can be inlined, it is safe to convert to a >> contiguous block of memory and stuff in cache, etc. If you know the >> input to a function is guaranteed to be frozen you can just go crazy. >> Being able to freeze(anyobject) seems like a pretty clear win. >> Whether or not it is pythonic is debatable. I'd argue if the meaning >> of pythonic in some context is limiting, we should consider updating >> the term rather than being dogmatic. Sweet, looking at the reason for rejection: 1. How dicts (and multiply nested objects) should be frozen was not completely obvious 2. "frozen()" implies in place, thus confusing users 3. freezing something like a list is confusing because some list methods would disappear or cause errors 4. Because automatic conversion in the proposal was seen as too involved to be so magical 5. Because frozendicts are the main end user benefit, and using dicts as keys was seen as "suspect" Honestly, as far as #1, we already have copy and deepcopy, the can of worms is already opened (and necessarily so). For 2, choose a better name? For 3, we have abstract base classes now, which make a nice distinction between mutable and immutable sequences; nominal types are a crutch, thinking in terms of structure is much more powerful. I agree with point 4, if magic does anything besides make the code conform to what an informed user would expect, it should be considered heretical. As for #5, I feel using a collection of key value relations as a key in another collection is not inherently bad, it just depends on the context... The mutability is the rub. Also, immutability provides scaffolding to improve performance and concurrency (both of which are top tier language features). I understand that this is one of those cases where Guido has a strong "bad feeling" about something, and I think a consequence of that is people tread lightly. Perhaps I'm a bit of a language communist in that regard (historically a dangerous philosophy :) As an aside, I find it kind of schizophrenic how on one hand Python is billed as a language for consenting adults (see duck typing, no data hiding, etc) and on the other hand users need to be protected from themselves. Better to serve just one flavor of kool-aid imo. Nathan
[toc] | [prev] | [next] | [standalone]
| From | John Nagle <nagle@animats.com> |
|---|---|
| Date | 2012-02-10 10:57 -0800 |
| Message-ID | <4f356898$0$11996$742ec2ed@news.sonic.net> |
| In reply to | #20179 |
On 2/10/2012 10:14 AM, Nathan Rice wrote:
>>> Lets also not forget that knowing an object is immutable lets you do a
>>> lot of optimizations; it can be inlined, it is safe to convert to a
>>> contiguous block of memory and stuff in cache, etc. If you know the
>>> input to a function is guaranteed to be frozen you can just go crazy.
>>> Being able to freeze(anyobject) seems like a pretty clear win.
>>> Whether or not it is pythonic is debatable. I'd argue if the meaning
>>> of pythonic in some context is limiting, we should consider updating
>>> the term rather than being dogmatic.
A real justification for the ability to make anything immutable is
to make it safely shareable between threads. If it's immutable, it
doesn't have to be locked for access. Mozilla's new "Rust"
language takes advantage of this. Take a look at Rust's concurrency
semantics. They've made some progress.
John Nagle
[toc] | [prev] | [next] | [standalone]
| From | 88888 Dihedral <dihedral88888@googlemail.com> |
|---|---|
| Date | 2012-02-10 21:52 -0800 |
| Message-ID | <11277623.419.1328939579823.JavaMail.geo-discussion-forums@pbcql8> |
| In reply to | #20180 |
在 2012年2月11日星期六UTC+8上午2时57分34秒,John Nagle写道: > On 2/10/2012 10:14 AM, Nathan Rice wrote: > >>> Lets also not forget that knowing an object is immutable lets you do a > >>> lot of optimizations; it can be inlined, it is safe to convert to a > >>> contiguous block of memory and stuff in cache, etc. If you know the > >>> input to a function is guaranteed to be frozen you can just go crazy. > >>> Being able to freeze(anyobject) seems like a pretty clear win. > >>> Whether or not it is pythonic is debatable. I'd argue if the meaning > >>> of pythonic in some context is limiting, we should consider updating > >>> the term rather than being dogmatic. > > A real justification for the ability to make anything immutable is > to make it safely shareable between threads. If it's immutable, it > doesn't have to be locked for access. Mozilla's new "Rust" > language takes advantage of this. Take a look at Rust's concurrency > semantics. They've made some progress. > > John Nagl Lets model the system as an asynchronous set of objects with multiple threads performing operatons on objects as in the above. This reminds me the old problem solved before in the digital hardware.
[toc] | [prev] | [next] | [standalone]
| From | John Nagle <nagle@animats.com> |
|---|---|
| Date | 2012-02-13 13:15 -0800 |
| Message-ID | <4f397d7a$0$12009$742ec2ed@news.sonic.net> |
| In reply to | #20212 |
On 2/10/2012 9:52 PM, 88888 Dihedral wrote:
> 在 2012年2月11日星期六UTC+8上午2时57分34秒,John Nagle写道:
>> On 2/10/2012 10:14 AM, Nathan Rice wrote:
>>>>> Lets also not forget that knowing an object is immutable lets you do a
>>>>> lot of optimizations; it can be inlined, it is safe to convert to a
>>>>> contiguous block of memory and stuff in cache, etc. If you know the
>>>>> input to a function is guaranteed to be frozen you can just go crazy.
>>>>> Being able to freeze(anyobject) seems like a pretty clear win.
>>>>> Whether or not it is pythonic is debatable. I'd argue if the meaning
>>>>> of pythonic in some context is limiting, we should consider updating
>>>>> the term rather than being dogmatic.
>>
>> A real justification for the ability to make anything immutable is
>> to make it safely shareable between threads. If it's immutable, it
>> doesn't have to be locked for access. Mozilla's new "Rust"
>> language takes advantage of this. Take a look at Rust's concurrency
>> semantics. They've made some progress.
>>
>> John Nagl
>
>
> Lets model the system as an asynchronous set of objects with multiple threads
> performing operatons on objects as in the above.
I'd argue for a concurrency system where everything is either
immutable, unshared, synchronized, or owned by a synchronized object.
This eliminates almost all explicit locking.
Python's use of immutability has potential in that direction, but
Python doesn't do anything with that concept.
John Nagle
[toc] | [prev] | [next] | [standalone]
| From | Nathan Rice <nathan.alexander.rice@gmail.com> |
|---|---|
| Date | 2012-02-09 11:50 -0500 |
| Message-ID | <mailman.5599.1328806255.27778.python-list@python.org> |
| In reply to | #20080 |
On Thu, Feb 9, 2012 at 11:35 AM, Ian Kelly <ian.g.kelly@gmail.com> wrote: > On Thu, Feb 9, 2012 at 8:19 AM, Nathan Rice > <nathan.alexander.rice@gmail.com> wrote: >> As I said, two dictionaries created from the same input will be the >> same... > > That's an implementation detail, not a guarantee. It will hold for > current versions of CPython but not necessarily for other Python > implementations. That is true. The implications of the function that creates dictionaries being non-deterministic are a little scary though, so I suspect that it will hold :) Nathan
[toc] | [prev] | [standalone]
Back to top | Article view | comp.lang.python
csiph-web