Re: String interning in Python 3 - missing or moved?

Date Tue, 24 Jan 2012 15:47:56 +1100
Subject Re: String interning in Python 3 - missing or moved?
From Chris Angelico <rosuav@gmail.com>
To python-list@python.org
Newsgroups comp.lang.python
Message-ID <mailman.5007.1327380479.27778.python-list@python.org>


On Tue, Jan 24, 2012 at 3:18 PM, Terry Reedy <tjreedy@udel.edu> wrote:
> I think that the devs decided that interning is a minor internal
> optimization that users generally should not fiddle with (especially now
> that so much is done automatically anyway*), while having it a builtin made
> it look like something they should pay attention to.
>
> *I am not sure but what hashes for strings either are or in 3.3 will always
> be cached.

I'm of the opinion that hash() shouldn't be relied upon, but
apparently there's code "out there" that would break if hash()
changed (and, quite reasonably, the devs don't want to make a sudden
change in a bug-fix release). String interning basically turns every
string into a completely opaque hash: two interned strings are equal
exactly when they are the same object, so you can use 'is' to test
them for equality. Having intern() as a builtin cannot encourage any
worse behavior than relying on hash(), imho - neither makes any
promise of constancy across runs.
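
To make the 'is' point concrete, here is a minimal sketch. It assumes
Python 3, where the old intern() builtin is available as sys.intern();
the build() helper and the sample strings are made up for illustration.

```python
import sys


def build(parts):
    # Hypothetical helper: builds the string at run time, so each call
    # produces a fresh str object rather than a shared literal.
    return "".join(parts)


a = build(["client", "-", "key"])
b = build(["client", "-", "key"])

# Equal by value, as you'd expect.
assert a == b

# After interning, both names refer to one canonical object, so an
# identity check is enough to establish equality.
ia = sys.intern(a)
ib = sys.intern(b)
assert ia is ib
```

Whether 'a is b' holds before interning is an implementation detail, so
the sketch only relies on identity after sys.intern().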

Lua and Pike both quite happily solved hash collision attacks in their
interning of strings by randomizing the hash used, because no code
can rely on that hash's value. Presumably (based on the intern() docs)
Python can do the same, if you explicitly intern your strings first.
Is it worth recommending that people do this with anything that is
client-provided, and then simply randomize the intern() hash? This
would allow hash() to be unchanged, intern() to still do exactly what
it's always done, and hash collision attacks to be eliminated.
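
A rough sketch of what "randomize the intern() hash" could look like,
as a toy intern table in pure Python. Everything here is hypothetical
(the SaltedInternTable name, the bucket count, the choice of a keyed
BLAKE2 hash); it is not how CPython, Lua or Pike actually implement
interning, just the shape of the idea.

```python
import hashlib
import os


class SaltedInternTable:
    """Toy intern table (hypothetical, for illustration only)."""

    def __init__(self, nbuckets=64, salt=None):
        # A fresh random salt per table (think: per process) makes bucket
        # placement unpredictable to whoever supplies the strings.
        self._salt = os.urandom(16) if salt is None else salt
        self._buckets = [[] for _ in range(nbuckets)]

    def _index(self, s):
        # Keyed hash of the string; the salt never leaves the table.
        digest = hashlib.blake2b(
            s.encode("utf-8"), key=self._salt, digest_size=8
        ).digest()
        return int.from_bytes(digest, "big") % len(self._buckets)

    def intern(self, s):
        # Return the canonical object for this value, storing s if it
        # is the first string with that value to be seen.
        bucket = self._buckets[self._index(s)]
        for existing in bucket:
            if existing == s:
                return existing
        bucket.append(s)
        return s


table = SaltedInternTable()
x = table.intern("".join(["user", "-", "agent"]))
y = table.intern("".join(["user", "-", "agent"]))
assert x is y  # same canonical object, whatever the salt was
```

The caller only ever sees the canonical string, never the salted hash,
so the salt can change on every run without breaking anything.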

ChrisA
