| Started by | Franck Ditter <franck@ditter.org> |
|---|---|
| First post | 2012-09-05 08:30 +0200 |
| Last post | 2012-09-05 14:40 -0400 |
| Articles | 20 on this page of 40 — 15 participants |
is implemented with id ? Franck Ditter <franck@ditter.org> - 2012-09-05 08:30 +0200
Re: is implemented with id ? Benjamin Kaplan <benjamin.kaplan@case.edu> - 2012-09-04 23:40 -0700
Re: is implemented with id ? Franck Ditter <franck@ditter.org> - 2012-09-05 15:19 +0200
Re: is implemented with id ? Hans Mulder <hansmu@xs4all.nl> - 2012-09-05 15:48 +0200
Re: is implemented with id ? aahz@pythoncraft.com (Aahz) - 2012-11-03 12:41 -0700
Re: is implemented with id ? Hans Mulder <hansmu@xs4all.nl> - 2012-11-03 22:49 +0100
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-11-03 22:18 +0000
Re: is implemented with id ? Chris Angelico <rosuav@gmail.com> - 2012-11-04 09:50 +1100
Re: is implemented with id ? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-11-04 01:14 +0000
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-11-04 03:10 +0000
Re: is implemented with id ? Chris Angelico <rosuav@gmail.com> - 2012-11-04 14:19 +1100
Re: is implemented with id ? aahz@pythoncraft.com (Aahz) - 2012-11-03 22:09 -0700
Re: is implemented with id ? Hans Mulder <hansmu@xs4all.nl> - 2012-11-04 11:13 +0100
Re: is implemented with id ? Chris Angelico <rosuav@gmail.com> - 2012-11-04 12:22 +1100
Re: is implemented with id ? aahz@pythoncraft.com (Aahz) - 2012-11-03 22:08 -0700
Re: is implemented with id ? Roy Smith <roy@panix.com> - 2012-11-03 18:41 -0400
Re: is implemented with id ? aahz@pythoncraft.com (Aahz) - 2012-11-03 22:12 -0700
Re: is implemented with id ? Dave Angel <d@davea.name> - 2012-09-05 10:00 -0400
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-05 14:41 +0000
Re: is implemented with id ? Dave Angel <d@davea.name> - 2012-09-05 11:09 -0400
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-05 15:36 +0000
Re: is implemented with id ? Hans Mulder <hansmu@xs4all.nl> - 2012-09-05 18:47 +0200
Re: is implemented with id ? Dave Angel <d@davea.name> - 2012-09-05 13:19 -0400
Re: is implemented with id ? Terry Reedy <tjreedy@udel.edu> - 2012-09-05 14:31 -0400
Re: is implemented with id ? Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-09-05 22:08 -0400
Re: is implemented with id ? Duncan Booth <duncan.booth@invalid.invalid> - 2012-09-06 09:34 +0000
Re: is implemented with id ? Chris Angelico <rosuav@gmail.com> - 2012-09-06 19:50 +1000
Re: is implemented with id ? 88888 Dihedral <dihedral88888@googlemail.com> - 2012-11-04 01:33 -0700
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-05 09:14 +0000
Re: is implemented with id ? Ramchandra Apte <maniandram01@gmail.com> - 2012-09-05 05:48 -0700
Re: is implemented with id ? Dave Angel <d@davea.name> - 2012-09-05 09:46 -0400
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-05 14:13 +0000
Re: is implemented with id ? Ian Kelly <ian.g.kelly@gmail.com> - 2012-09-05 11:08 -0600
Re: is implemented with id ? Chris Angelico <rosuav@gmail.com> - 2012-09-06 19:07 +1000
Re: is implemented with id ? Terry Reedy <tjreedy@udel.edu> - 2012-09-05 14:27 -0400
Re: is implemented with id ? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-06 06:44 +0000
Re: is implemented with id ? Ramchandra Apte <maniandram01@gmail.com> - 2012-09-06 01:24 -0700
Re: is implemented with id ? Roy Smith <roy@panix.com> - 2012-09-06 08:16 -0400
Re: is implemented with id ? Ramchandra Apte <maniandram01@gmail.com> - 2012-09-06 06:30 -0700
Re: is implemented with id ? Dave Angel <d@davea.name> - 2012-09-05 14:40 -0400
| From | Franck Ditter <franck@ditter.org> |
|---|---|
| Date | 2012-09-05 08:30 +0200 |
| Subject | is implemented with id ? |
| Message-ID | <franck-9EED34.08303005092012@news.free.fr> |
Hi !
a is b <==> id(a) == id(b) in builtin classes.
Is that true ?
Thanks,
franck
[toc] | [next] | [standalone]
| From | Benjamin Kaplan <benjamin.kaplan@case.edu> |
|---|---|
| Date | 2012-09-04 23:40 -0700 |
| Message-ID | <mailman.213.1346827305.27098.python-list@python.org> |
| In reply to | #28456 |
On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter <franck@ditter.org> wrote:
> Hi !
> a is b <==> id(a) == id(b) in builtin classes.
> Is that true ?
> Thanks,
>
> franck
No. It is true that if a is b then id(a) == id(b) but the reverse is
not necessarily true. id is only guaranteed to be unique among objects
alive at the same time. If objects are discarded, their ids may be
reused even though the objects are not the same.
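The lifetime rule can be sketched as follows. This is only an illustration: `Thing` is a hypothetical placeholder class, and whether an id is actually reused after an object is discarded depends on the implementation, so the last comparison may come out either way.

```python
class Thing:
    """Hypothetical placeholder class, used only for this illustration."""


a = Thing()
b = a          # a second name for the same object
c = Thing()    # a distinct object, alive at the same time as a

# Identity implies equal ids.
same_object = (a is b) and (id(a) == id(b))

# Two objects that are alive at the same time never share an id.
distinct_alive = (a is not c) and (id(a) != id(c))

# Each temporary Thing() below is discarded right after id() returns,
# so the second one may or may not be given the id of the first.
old_id = id(Thing())
maybe_reused = old_id == id(Thing())

print(same_object, distinct_alive, maybe_reused)
```

Only the first two results are guaranteed by the language; the third is whatever the running interpreter happens to do.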
[toc] | [prev] | [next] | [standalone]
| From | Franck Ditter <franck@ditter.org> |
|---|---|
| Date | 2012-09-05 15:19 +0200 |
| Message-ID | <franck-053A38.15194605092012@news.free.fr> |
| In reply to | #28457 |
Thanks to all, but :
- I should have said that I work with Python 3. Does that matter ?
- May I reformulate the question : "a is b" and "id(a) == id(b)"
both mean : "a and b share the same physical address". Is that True ?
Thanks,
franck
In article <mailman.213.1346827305.27098.python-list@python.org>,
Benjamin Kaplan <benjamin.kaplan@case.edu> wrote:
> On Tue, Sep 4, 2012 at 11:30 PM, Franck Ditter <franck@ditter.org> wrote:
> > Hi !
> > a is b <==> id(a) == id(b) in builtin classes.
> > Is that true ?
> > Thanks,
> >
> > franck
>
> No. It is true that if a is b then id(a) == id(b) but the reverse is
> not necessarily true. id is only guaranteed to be unique among objects
> alive at the same time. If objects are discarded, their ids may be
> reused even though the objects are not the same.
[toc] | [prev] | [next] | [standalone]
| From | Hans Mulder <hansmu@xs4all.nl> |
|---|---|
| Date | 2012-09-05 15:48 +0200 |
| Message-ID | <50475822$0$6867$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #28492 |
On 5/09/12 15:19:47, Franck Ditter wrote:
> Thanks to all, but :
> - I should have said that I work with Python 3. Does that matter ?
> - May I reformulate the question : "a is b" and "id(a) == id(b)"
> both mean : "a and b share the same physical address". Is that True ?
Yes.
Keep in mind, though, that in some implementation (e.g. Jython), the
physical address may change during the life time of an object.
It's usually phrased as "a and b are the same object". If the object
is mutable, then changing a will also change b. If a and b aren't
mutable, then it doesn't really matter whether they share a physical
address.
Keep in mind that physical addresses can be reused when an object is
destroyed. For example, in my Python3,
id(math.sqrt(17)) == id(math.cos(17))
returns True, even though the floats involved are different, because
the floats have non-overlapping lifetimes and the physical address
happens to be reused.
Hope this helps,
-- HansM
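The remark about mutable objects can be illustrated with a short sketch. The names `a`, `b` and `c` are chosen for the illustration only.

```python
a = [1, 2]
b = a                     # b is bound to the same list object as a
a.append(3)               # a change made through one name...
seen_through_b = list(b)  # ...is visible through the other: [1, 2, 3]

c = list(a)               # equal contents, but a separate object
a.append(4)               # does not affect the copy

print(a is b, seen_through_b)   # True [1, 2, 3]
print(c is a, c)                # False [1, 2, 3]
```

For immutable values such as floats or strings there is no operation like `append`, which is why sharing or not sharing the object cannot be observed through mutation.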
[toc] | [prev] | [next] | [standalone]
| From | aahz@pythoncraft.com (Aahz) |
|---|---|
| Date | 2012-11-03 12:41 -0700 |
| Message-ID | <k73s18$5b4$1@panix5.panix.com> |
| In reply to | #28495 |
[got some free time, catching up to threads two months old]
In article <50475822$0$6867$e4fe514c@news2.news.xs4all.nl>,
Hans Mulder <hansmu@xs4all.nl> wrote:
>On 5/09/12 15:19:47, Franck Ditter wrote:
>>
>> - I should have said that I work with Python 3. Does that matter ?
>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>> both mean : "a and b share the same physical address". Is that True ?
>
>Yes.
>
>Keep in mind, though, that in some implementation (e.g. Jython), the
>physical address may change during the life time of an object.
>
>It's usually phrased as "a and b are the same object". If the object
>is mutable, then changing a will also change b. If a and b aren't
>mutable, then it doesn't really matter whether they share a physical
>address.
That last sentence is not quite true. intern() is used to ensure that
strings share a physical address to save memory.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
[toc] | [prev] | [next] | [standalone]
| From | Hans Mulder <hansmu@xs4all.nl> |
|---|---|
| Date | 2012-11-03 22:49 +0100 |
| Message-ID | <50959154$0$6880$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #32705 |
On 3/11/12 20:41:28, Aahz wrote:
> [got some free time, catching up to threads two months old]
>
> In article <50475822$0$6867$e4fe514c@news2.news.xs4all.nl>,
> Hans Mulder <hansmu@xs4all.nl> wrote:
>> On 5/09/12 15:19:47, Franck Ditter wrote:
>>>
>>> - I should have said that I work with Python 3. Does that matter ?
>>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>>> both mean : "a and b share the same physical address". Is that True ?
>>
>> Yes.
>>
>> Keep in mind, though, that in some implementation (e.g. Jython), the
>> physical address may change during the life time of an object.
>>
>> It's usually phrased as "a and b are the same object". If the object
>> is mutable, then changing a will also change b. If a and b aren't
>> mutable, then it doesn't really matter whether they share a physical
>> address.
>
> That last sentence is not quite true. intern() is used to ensure that
> strings share a physical address to save memory.
That's a matter of perspective: in my book, the primary advantage of
working with interned strings is that I can use 'is' rather than '=='
to test for equality if I know my strings are interned. The space
savings are minor; the time savings may be significant.
-- HansM
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-11-03 22:18 +0000 |
| Message-ID | <50959827$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32706 |
On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
> On 3/11/12 20:41:28, Aahz wrote:
>> [got some free time, catching up to threads two months old]
>>
>> In article <50475822$0$6867$e4fe514c@news2.news.xs4all.nl>, Hans Mulder
>> <hansmu@xs4all.nl> wrote:
>>> On 5/09/12 15:19:47, Franck Ditter wrote:
>>>>
>>>> - I should have said that I work with Python 3. Does that matter ?
>>>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>>>> both mean : "a and b share the same physical address". Is that True ?
>>>
>>> Yes.
>>>
>>> Keep in mind, though, that in some implementation (e.g. Jython), the
>>> physical address may change during the life time of an object.
>>>
>>> It's usually phrased as "a and b are the same object". If the object
>>> is mutable, then changing a will also change b. If a and b aren't
>>> mutable, then it doesn't really matter whether they share a physical
>>> address.
>>
>> That last sentence is not quite true. intern() is used to ensure that
>> strings share a physical address to save memory.
>
> That's a matter of perspective: in my book, the primary advantage of
> working with interned strings is that I can use 'is' rather than '==' to
> test for equality if I know my strings are interned. The space savings
> are minor; the time savings may be significant.
Actually, for many applications, the space "savings" may actually be
*costs*, since interning forces Python to hold onto strings even after
they would normally be garbage collected. CPython interns strings that
look like identifiers. It really wouldn't be a good idea for it to
automatically intern every string.
You can make your own intern system with a simple dict:
interned_strings = {}
Then, for every string you care about, do:
s = interned_strings.setdefault(s, s)
to ensure you are always working with a single string object for each
unique value. In some applications that will save time at the expense of
space.
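A minimal runnable sketch of that dict-based scheme, using the dict method `setdefault`. The helper name `intern_str` and the sample strings are made up for the illustration.

```python
interned_strings = {}


def intern_str(s):
    """Return the one stored object for the value of s (hypothetical helper)."""
    # setdefault stores s the first time a value is seen and returns
    # the previously stored object on every later call.
    return interned_strings.setdefault(s, s)


# Build two equal strings at run time, so that they are not
# guaranteed to start out as the same object.
first = "".join(["spam", " ", "eggs"])
second = "".join(["spam", " ", "eggs"])

first = intern_str(first)
second = intern_str(second)

print(first is second)         # True
print(len(interned_strings))   # 1
```

Note that the dict keeps every string it has seen alive for as long as the dict itself exists.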
And there is no need to write "is" instead of "==", because string
equality already optimizes the "strings are identical" case. By using ==,
you don't get into bad habits, you defend against the odd un-interned
string sneaking in, and you still have high speed equality tests.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-11-04 09:50 +1100 |
| Message-ID | <mailman.3247.1351983031.27098.python-list@python.org> |
| In reply to | #32707 |
On Sun, Nov 4, 2012 at 9:18 AM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sat, 03 Nov 2012 22:49:07 +0100, Hans Mulder wrote:
> Actually, for many applications, the space "savings" may actually be
> *costs*, since interning forces Python to hold onto strings even after
> they would normally be garbage collected. CPython interns strings that
> look like identifiers. It really wouldn't be a good idea for it to
> automatically intern every string.
I don't know about that.
/* This dictionary holds all interned unicode strings. Note that references
to strings in this dictionary are *not* counted in the string's ob_refcnt.
When the interned string reaches a refcnt of 0 the string deallocation
function will delete the reference from this dictionary.
Another way to look at this is that to say that the actual reference
count of a string is: s->ob_refcnt + (s->state ? 2 : 0)
*/
static PyObject *interned;
Empirical testing (on a Python 3.3a0 build on Linux that I had lying around) showed
the process's memory usage drop, but I closed the terminal before
copying and pasting (oops). Attempting to recreate in IDLE on 3.2 on
Windows.
>>> a="$"*1024*1024*256 # Make $$$....$$$ fast!
>>> import sys
>>> sys.getsizeof(a) # Clearly this is a narrow build
536870942
>>> a="$"*1024*1024*256
--> MemoryError. Blah. This is what I get for only having a gig and a
half in this laptop. And I was working with 1024*1024*1024 on the
other box. Start over...
>>> import sys
>>> a="$"*1024*1024*128
>>> b="$"*1024*1024*128
>>> a is b
False
>>> a=sys.intern(a)
>>> b=sys.intern(b)
>>> c="$"*1024*1024*128
>>> c=sys.intern(c)
Memory usage (according to Task Mangler) goes up to ~512MB when I
create a new string (like c), then back down to ~256MB when I intern
it. So far so good.
>>> del a,b,c
Memory usage has dropped to 12MB. Unnecessarily-interned strings don't
cost anything. (The source does refer to immortal interned strings,
but AFAIK you can't create them in user-level code. At least, I didn't
find it in help(sys.intern) which is the obvious place to look.)
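A smaller version of the same experiment, as a sketch only: it checks identity after `sys.intern` and does not measure memory. The string length of 1000 is an arbitrary choice.

```python
import sys

# Two equal strings built at run time.
a = "".join(["$"] * 1000)
b = "".join(["$"] * 1000)
equal_before = a == b      # equal values, whatever their identity

# sys.intern returns the single interned object for that value.
a = sys.intern(a)
b = sys.intern(b)
same_after = a is b

print(equal_before, same_after)   # True True
```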
> You can make your own intern system with a simple dict:
>
> interned_strings = {}
>
> Then, for every string you care about, do:
>
> s = interned_strings.setdefault(s, s)
>
> to ensure you are always working with a single string object for each
> unique value. In some applications that will save time at the expense of
> space.
Doing it manually like this _will_ leak like that, though, unless you
periodically check sys.getrefcount and dispose of unreferenced
entries.
> And there is no need to write "is" instead of "==", because string
> equality already optimizes the "strings are identical" case. By using ==,
> you don't get into bad habits, you defend against the odd un-interned
> string sneaking in, and you still have high speed equality tests.
This one I haven't checked the source for, but ISTR discussions on
this list about comparison of two unequal interned strings not being
optimized, so they'll end up being compared char-for-char. Using 'is'
guarantees that the check stops with identity. This may or may not be
significant, and as you say, defending against an uninterned string
slipping through is potentially critical.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | Oscar Benjamin <oscar.j.benjamin@gmail.com> |
|---|---|
| Date | 2012-11-04 01:14 +0000 |
| Message-ID | <mailman.3248.1351991672.27098.python-list@python.org> |
| In reply to | #32707 |
On 3 November 2012 22:50, Chris Angelico <rosuav@gmail.com> wrote:
> This one I haven't checked the source for, but ISTR discussions on
> this list about comparison of two unequal interned strings not being
> optimized, so they'll end up being compared char-for-char. Using 'is'
> guarantees that the check stops with identity. This may or may not be
> significant, and as you say, defending against an uninterned string
> slipping through is potentially critical.
The source is here (and it shows what you suggest):
http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128
Comparing strings char for char is really not that big a deal though.
This has been discussed before: you don't need to compare very many
characters to conclude that strings are unequal (if I remember
correctly you were part of that discussion).
I can imagine cases where I might consider using intern on lots of
strings to speed up comparisons but I would have to be involved in
some seriously heavy and obscure string processing problem before I
considered using 'is' to compare those interned strings. That is
confusing to anyone who reads the code, prone to bugs and unlikely to
achieve the desired outcome of speeding things up (noticeably).
Oscar
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-11-04 03:10 +0000 |
| Message-ID | <5095dca0$0$29967$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #32710 |
On Sun, 04 Nov 2012 01:14:29 +0000, Oscar Benjamin wrote:
> On 3 November 2012 22:50, Chris Angelico <rosuav@gmail.com> wrote:
>> This one I haven't checked the source for, but ISTR discussions on this
>> list about comparison of two unequal interned strings not being
>> optimized, so they'll end up being compared char-for-char. Using 'is'
>> guarantees that the check stops with identity. This may or may not be
>> significant, and as you say, defending against an uninterned string
>> slipping through is potentially critical.
>
> The source is here (and it shows what you suggest):
> http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128
I don't think it does, although I could be wrong; I find reading C to be
quite difficult.
The unicode_compare function compares character by character, true, but
it doesn't get called directly. The public interface is
PyUnicode_Compare, which includes this test before calling
unicode_compare:
    /* Shortcut for empty or interned objects */
    if (v == u) {
        Py_DECREF(u);
        Py_DECREF(v);
        return 0;
    }
    result = unicode_compare(u, v);
where v and u are pointers to the unicode object.
So it appears that the test for strings being of equal length has been
dropped, but the identity test is still present.
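The effect of such an identity shortcut can be modelled in Python. This is a toy model only, not CPython's code: `model_equal` is a hypothetical function that also reports how many character pairs it had to look at.

```python
def model_equal(u, v):
    """Toy model of string equality with an identity shortcut.

    Returns (equal, number_of_character_pairs_compared).
    """
    if u is v:
        # Same object: equal without looking at any character.
        return True, 0
    compared = 0
    for cu, cv in zip(u, v):
        compared += 1
        if cu != cv:
            # Stop at the first difference.
            return False, compared
    # All shared positions matched; equal only if the lengths match too.
    return len(u) == len(v), compared


s = "x" * 1000
print(model_equal(s, s))           # (True, 0)
print(model_equal("abc", "abd"))   # (False, 3)
print(model_equal("xbc", "abc"))   # (False, 1)
```

In this model, two distinct strings that differ only in their last character cost a full scan, which is the worst case.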
> Comparing strings char for char is really not that big a deal though.
Depends on how big the string and where the first difference is.
> This has been discussed before: you don't need to compare very many
> characters to conclude that strings are unequal (if I remember correctly
> you were part of that discussion).
On average. Worst case, you have to look at every character.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-11-04 14:19 +1100 |
| Message-ID | <mailman.3250.1351999198.27098.python-list@python.org> |
| In reply to | #32713 |
On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> /* Shortcut for empty or interned objects */
> if (v == u) {
> Py_DECREF(u);
> Py_DECREF(v);
> return 0;
> }
> result = unicode_compare(u, v);
>
> where v and u are pointers to the unicode object.
There's a shortcut if they're the same. There's no shortcut if they're
both interned and have different pointers, which is a guarantee that
they're distinct strings. They'll still be compared char-for-char
until there's a difference.
But it probably isn't enough of a performance penalty to be concerned
with. It's enough to technically prove the point that 'is' is faster
than '==' and is still safe if both strings are interned; it's not
enough to make 'is' better than '==', except in very specific
situations.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | aahz@pythoncraft.com (Aahz) |
|---|---|
| Date | 2012-11-03 22:09 -0700 |
| Message-ID | <k74ta4$rr5$1@panix5.panix.com> |
| In reply to | #32715 |
In article <mailman.3250.1351999198.27098.python-list@python.org>,
Chris Angelico <rosuav@gmail.com> wrote:
>On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
><steve+comp.lang.python@pearwood.info> wrote:
>>
>> /* Shortcut for empty or interned objects */
>> if (v == u) {
>> Py_DECREF(u);
>> Py_DECREF(v);
>> return 0;
>> }
>> result = unicode_compare(u, v);
>>
>> where v and u are pointers to the unicode object.
>
>There's a shortcut if they're the same. There's no shortcut if they're
>both interned and have different pointers, which is a guarantee that
>they're distinct strings. They'll still be compared char-for-char
>until there's a difference.
Without looking at the code, I'm pretty sure there's a hash check first.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
[toc] | [prev] | [next] | [standalone]
| From | Hans Mulder <hansmu@xs4all.nl> |
|---|---|
| Date | 2012-11-04 11:13 +0100 |
| Message-ID | <50963fb4$0$6947$e4fe514c@news2.news.xs4all.nl> |
| In reply to | #32718 |
On 4/11/12 06:09:24, Aahz wrote:
> In article <mailman.3250.1351999198.27098.python-list@python.org>,
> Chris Angelico <rosuav@gmail.com> wrote:
>> On Sun, Nov 4, 2012 at 2:10 PM, Steven D'Aprano
>> <steve+comp.lang.python@pearwood.info> wrote:
>>>
>>> /* Shortcut for empty or interned objects */
>>> if (v == u) {
>>> Py_DECREF(u);
>>> Py_DECREF(v);
>>> return 0;
>>> }
>>> result = unicode_compare(u, v);
>>>
>>> where v and u are pointers to the unicode object.
>>
>> There's a shortcut if they're the same. There's no shortcut if they're
>> both interned and have different pointers, which is a guarantee that
>> they're distinct strings. They'll still be compared char-for-char
>> until there's a difference.
>
> Without looking at the code, I'm pretty sure there's a hash check first.
In 3.3, there is no such check.
It was recently proposed on python-dev to add such a check,
but AFAIK, no action was taken.
-- HansM
[toc] | [prev] | [next] | [standalone]
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-11-04 12:22 +1100 |
| Message-ID | <mailman.3249.1351992140.27098.python-list@python.org> |
| In reply to | #32707 |
On Sun, Nov 4, 2012 at 12:14 PM, Oscar Benjamin
<oscar.j.benjamin@gmail.com> wrote:
> On 3 November 2012 22:50, Chris Angelico <rosuav@gmail.com> wrote:
>> This one I haven't checked the source for, but ISTR discussions on
>> this list about comparison of two unequal interned strings not being
>> optimized, so they'll end up being compared char-for-char. Using 'is'
>> guarantees that the check stops with identity. This may or may not be
>> significant, and as you say, defending against an uninterned string
>> slipping through is potentially critical.
>
> The source is here (and it shows what you suggest):
> http://hg.python.org/cpython/file/6c639a1ff53d/Objects/unicodeobject.c#l6128
>
> Comparing strings char for char is really not that big a deal though.
> This has been discussed before: you don't need to compare very many
> characters to conclude that strings are unequal (if I remember
> correctly you were part of that discussion).
Yes, and a quite wide-ranging discussion it was too! What color did we
end up whitewashing that bikeshed? *whistles innocently*
> I can imagine cases where I might consider using intern on lots of
> strings to speed up comparisons but I would have to be involved in
> some seriously heavy and obscure string processing problem before I
> considered using 'is' to compare those interned strings. That is
> confusing to anyone who reads the code, prone to bugs and unlikely to
> achieve the desired outcome of speeding things up (noticeably).
Good point. It's still true that 'is' will be faster, it's just not
worth it.
ChrisA
[toc] | [prev] | [next] | [standalone]
| From | aahz@pythoncraft.com (Aahz) |
|---|---|
| Date | 2012-11-03 22:08 -0700 |
| Message-ID | <k74t7o$ins$1@panix5.panix.com> |
| In reply to | #32707 |
In article <50959827$0$29967$c3e8da3$5496439d@news.astraweb.com>,
Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:
>
>Actually, for many applications, the space "savings" may actually be
>*costs*, since interning forces Python to hold onto strings even after
>they would normally be garbage collected.
That's old news, fixed in 2.5 or 2.6 IIRC -- interned strings now get
collected by refcounting like everything else.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
[toc] | [prev] | [next] | [standalone]
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2012-11-03 18:41 -0400 |
| Message-ID | <roy-FA15F8.18415003112012@news.panix.com> |
| In reply to | #32706 |
In article <50959154$0$6880$e4fe514c@news2.news.xs4all.nl>,
Hans Mulder <hansmu@xs4all.nl> wrote:
> That's a matter of perspective: in my book, the primary advantage of
> working with interned strings is that I can use 'is' rather than '=='
> to test for equality if I know my strings are interned. The space
> savings are minor; the time savings may be significant.
Depending on your problem domain, the space savings may be considerable.
[toc] | [prev] | [next] | [standalone]
| From | aahz@pythoncraft.com (Aahz) |
|---|---|
| Date | 2012-11-03 22:12 -0700 |
| Message-ID | <k74tfa$o81$1@panix5.panix.com> |
| In reply to | #32706 |
In article <50959154$0$6880$e4fe514c@news2.news.xs4all.nl>,
Hans Mulder <hansmu@xs4all.nl> wrote:
>On 3/11/12 20:41:28, Aahz wrote:
>> In article <50475822$0$6867$e4fe514c@news2.news.xs4all.nl>,
>> Hans Mulder <hansmu@xs4all.nl> wrote:
>>> On 5/09/12 15:19:47, Franck Ditter wrote:
>>>>
>>>> - I should have said that I work with Python 3. Does that matter ?
>>>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>>>> both mean : "a and b share the same physical address". Is that True ?
>>>
>>> Yes.
>>>
>>> Keep in mind, though, that in some implementation (e.g. Jython), the
>>> physical address may change during the life time of an object.
>>>
>>> It's usually phrased as "a and b are the same object". If the object
>>> is mutable, then changing a will also change b. If a and b aren't
>>> mutable, then it doesn't really matter whether they share a physical
>>> address.
>>
>> That last sentence is not quite true. intern() is used to ensure that
>> strings share a physical address to save memory.
>
>That's a matter of perspective: in my book, the primary advantage of
>working with interned strings is that I can use 'is' rather than '=='
>to test for equality if I know my strings are interned. The space
>savings are minor; the time savings may be significant.
As others have pointed out, using ``is`` with strings is a Bad Habit
likely leading to nasty, hard-to-find bugs. intern() costs time, but
saves considerable space in any application with lots of duplicate
computed strings (hundreds of megabytes in some cases).
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
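The space argument can be sketched by counting objects rather than bytes. The `user-N` strings below are invented sample data. The sketch shows that after `sys.intern` there is one object per distinct value, however many duplicates were computed.

```python
import sys

# Nine computed strings with only three distinct values.
words = ["".join(["user-", str(i % 3)]) for i in range(9)]

# Replace every string by the interned object for its value.
interned = [sys.intern(w) for w in words]

distinct_values = len(set(interned))
# All objects are alive in the list, so distinct ids mean distinct objects.
distinct_objects = len({id(w) for w in interned})

print(len(interned), distinct_values, distinct_objects)   # 9 3 3
```

How many separate objects the nine strings occupy before interning is left unstated on purpose, since that depends on the implementation.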
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <d@davea.name> |
|---|---|
| Date | 2012-09-05 10:00 -0400 |
| Message-ID | <mailman.239.1346853638.27098.python-list@python.org> |
| In reply to | #28492 |
Please don't top-post. Now your message is out of order, and I have
to delete the part Benjamin said.
On 09/05/2012 09:19 AM, Franck Ditter wrote:
> Thanks to all, but :
> - I should have said that I work with Python 3. Does that matter ?
> - May I reformulate the question : "a is b" and "id(a) == id(b)"
> both mean : "a and b share the same physical address". Is that True ?
> Thanks,
No, id() has nothing to do with physical address. The Python language
does not specify anything about physical addresses. Some
implementations may happen to use physical addresses, others arbitrary
integers. And they may reuse such integers, or not. Up to the
implementation.
And as others have pointed out, when you compare two id's, you're
risking that one of them may no longer be valid. For example, the
following expression:
flag = id(func1()) == id(func2())
could very well evaluate to True, even if func1() always returns a
string, and func2() always returns an int. On the other hand, the 'is'
expression makes sure the two expressions are bound to the same object.
If a and b are simple names, and not placeholders for arbitrary
expressions, then I THINK the following would be true:
"a is b" and "id(a) == id(b)" both mean that the names a and b are
bound to the same object at the time the statement is executed.
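The difference between the two forms can be sketched as follows. The bodies of `func1` and `func2` are hypothetical stand-ins that return a string and an int; the result of the first comparison is deliberately left unstated because it depends on the implementation.

```python
def func1():
    # Hypothetical stand-in: always returns a str built at run time.
    return "".join(["a", "b"])


def func2():
    # Hypothetical stand-in: always returns an int.
    return 10 ** 6


# Risky: each result may be discarded before the other id() is taken,
# so the two ids are not guaranteed to differ.
flag = id(func1()) == id(func2())

# Safer: bind both results to names so both objects stay alive.
r1 = func1()
r2 = func2()
same = r1 is r2                 # False: a str and an int are different objects
ids_differ = id(r1) != id(r2)   # True while both objects are alive

print(same, ids_differ)   # False True
```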
--
DaveA
[toc] | [prev] | [next] | [standalone]
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-09-05 14:41 +0000 |
| Message-ID | <5047648f$0$29981$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #28497 |
On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:
> On 09/05/2012 09:19 AM, Franck Ditter wrote:
>> Thanks to all, but :
>> - I should have said that I work with Python 3. Does that matter ?
>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>> both mean : "a and b share the same physical address". Is that True ?
>> Thanks,
>
> No, id() has nothing to do with physical address. The Python language
> does not specify anything about physical addresses. Some
> implementations may happen to use physical addresses, others arbitrary
> integers. And they may reuse such integers, or not. Up to the
> implementation.
True. In principle, some day there might be a version of Python that runs
on some exotic quantum computer where the very concept of "physical
address" is meaningless. Or some sort of peptide or DNA computer, where
the calculations are performed via molecular interactions rather than by
flipping bits in fixed memory locations.
But less exotically, Frank isn't entirely wrong. With current day
computers, it is reasonable to say that any object has exactly one
physical location at any time. In Jython, objects can move around; in
CPython, they can't. But at any moment, any object has a specific
location, and no other object can have that same location. Two objects
cannot both be at the same memory address at the same time.
So, for current day computers at least, it is reasonable to say that
"a is b" implies that a and b are the same object at a single location.
The second half of the question is more complex:
"id(a) == id(b)" *only* implies that a and b are the same object at the
same location if they exist at the same time. If they don't exist at the
same time, then you can't conclude anything.
--
Steven
[toc] | [prev] | [next] | [standalone]
| From | Dave Angel <d@davea.name> |
|---|---|
| Date | 2012-09-05 11:09 -0400 |
| Message-ID | <mailman.244.1346857794.27098.python-list@python.org> |
| In reply to | #28504 |
On 09/05/2012 10:41 AM, Steven D'Aprano wrote:
> On Wed, 05 Sep 2012 10:00:09 -0400, Dave Angel wrote:
>
>> On 09/05/2012 09:19 AM, Franck Ditter wrote:
>>> Thanks to all, but :
>>> - I should have said that I work with Python 3. Does that matter ?
>>> - May I reformulate the question : "a is b" and "id(a) == id(b)"
>>> both mean : "a and b share the same physical address". Is that True ?
>>> Thanks,
>> No, id() has nothing to do with physical address. The Python language
>> does not specify anything about physical addresses. Some
>> implementations may happen to use physical addresses, others arbitrary
>> integers. And they may reuse such integers, or not. Up to the
>> implementation.
> True. In principle, some day there might be a version of Python that runs
> on some exotic quantum computer where the very concept of "physical
> address" is meaningless. Or some sort of peptide or DNA computer, where
> the calculations are performed via molecular interactions rather than by
> flipping bits in fixed memory locations.
>
> But less exotically, Frank isn't entirely wrong. With current day
> computers, it is reasonable to say that any object has exactly one
> physical location at any time. In Jython, objects can move around; in
> CPython, they can't. But at any moment, any object has a specific
> location, and no other object can have that same location. Two objects
> cannot both be at the same memory address at the same time.
>
> So, for current day computers at least, it is reasonable to say that
> "a is b" implies that a and b are the same object at a single location.
You're arguing against something I didn't say. I only said that id()
doesn't promise to be a memory address. I said nothing about what it
might mean if the "is" operator considers them the same.
> The second half of the question is more complex:
>
> "id(a) == id(b)" *only* implies that a and b are the same object at the
> same location if they exist at the same time. If they don't exist at the
> same time, then you can't conclude anything.
>
>
But if id() really means address, and those addresses might move during
the lifetime of an object, then the fact that the two id() calls are
not made simultaneously implies that one object might move to where the
other one used to be before the "move."
I don't claim to know the jython implementation. But you're claiming
that id() means the address of the object, even in jython. So if a
garbage collection can occur during the evaluation of the expression
id(a) == id(b)
then the comparing of id()'s would be useless in jython. Two distinct
objects could each be moved during evaluation, (very) coincidentally
causing the two to have the same addresses at the two times of
evaluation. Or more likely, a single object could move to a new
location, rendering the comparison false. Thus both false positives
and false negatives are possible.
I think it much more likely that jython uses integer values for the id()
function, and not physical addresses. I doubt they'd want a race condition.
--
DaveA
[toc] | [prev] | [next] | [standalone]