| Subject | Re: chunking a long string? |
|---|---|
| From | Chris Angelico <rosuav@gmail.com> |
| To | python-list@python.org |
| Date | Sun, 10 Nov 2013 02:02:52 +1100 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.2298.1384009376.18130.python-list@python.org> |
On Sun, Nov 10, 2013 at 1:37 AM, Roy Smith <roy@panix.com> wrote:
> In article <mailman.2283.1383985583.18130.python-list@python.org>,
> Chris Angelico <rosuav@gmail.com> wrote:
>
>> Some languages [intern] automatically for all strings, others
>> (like Python) only when you ask for it.
>
> What does "only when you ask for it" mean?
You can explicitly intern a Python string with the sys.intern()
function, which returns either the string itself or an
indistinguishable "interned" string. Interning two equal strings
gives you back the same object both times:
>>> import sys
>>> foo = "asdf"
>>> bar = "as"
>>> bar += "df"
>>> foo is bar
False
Note that the Python interpreter is free to answer True there, but
there's no mandate for it.
>>> foo = sys.intern(foo)
>>> bar = sys.intern(bar)
>>> foo is bar
True
Now it's mandated. The two strings must be the same object. When you
know both strings are interned, string equality comes down to an 'is'
check, which is potentially a lot faster than comparing the actual
contents.
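A rough way to see the difference, as a sketch rather than a benchmark: the helper name build and the sizes are made up for illustration, and the timings will vary by machine and interpreter. It compares a content comparison of two equal but separately built strings against an identity check on their interned versions.

```python
import sys
import timeit

def build(size):
    # Join at run time so every call produces a freshly built string
    # instead of a constant the compiler could share.
    return "".join(["x"] * size)

a = build(1_000_000)
b = build(1_000_000)

# Equal contents, separately built: == may have to walk the characters.
t_equal = timeit.timeit(lambda: a == b, number=200)

ia = sys.intern(a)
ib = sys.intern(b)

# Interned equal strings are the same object, so `is` is enough.
t_identity = timeit.timeit(lambda: ia is ib, number=200)

print("== on separate strings:", t_equal)
print("is on interned strings:", t_identity)
```

Note that the 'is' shortcut is only safe when both sides are known to be interned; for anything else, stick with ==.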
Some languages (e.g. Pike) do this automatically with all strings - the
construction of any string includes checking to see if it's a
duplicate of any other string. This adds cost to string manipulation
but speeds up string comparisons; since the engine knows that all
strings are interned, it can do the equivalent of an 'is' check for
any string equality.
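To make that trade-off concrete, here is a toy model of an always-interning runtime. It is only a sketch: InternTable and make are hypothetical names, and this is not how Pike or any real engine is implemented (a real one would, among other things, let unused strings be freed). Every string construction pays for a hash and a table lookup, and in return equal strings are always one object.

```python
class InternTable:
    """Toy model of a runtime that interns every string it constructs."""

    def __init__(self):
        self._table = {}

    def make(self, text):
        # setdefault returns the object already stored under an equal
        # key if there is one; otherwise it stores `text` and returns it.
        return self._table.setdefault(text, text)

table = InternTable()

# Two equal strings built in different ways at run time.
first = table.make("".join(["as", "df"]))
second = table.make("".join(["a", "sdf"]))

print(first is second)  # True: both names refer to the one stored object
```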
So what I meant, in terms of storage/representation efficiency, is
that you can store duplicate strings very efficiently if you simply
increment the reference counts of the same few objects. Python won't
necessarily do that for you; check memory usage of something like
this:
strings = [open("some_big_file").read() for _ in range(10000)]
And compare against this:
strings = [sys.intern(open("some_big_file").read()) for _ in range(10000)]
In a language that guarantees string interning, the syntax of the
former would have the memory consumption of the latter. Whether that
memory saving and improved equality comparison is worth the effort of
dictionarification is one of those eternally-debatable points.
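If you don't have a big file handy, something like the following should show the same effect. It's a sketch under assumptions: make_text stands in for open("some_big_file").read(), SIZE and COUNT are arbitrary, and tracemalloc only sees allocations made through Python's allocators, so treat the numbers as indicative.

```python
import sys
import tracemalloc

SIZE = 100_000   # characters per string (arbitrary)
COUNT = 50       # number of copies (arbitrary)

def make_text():
    # Stand-in for reading the same file again: builds an equal
    # string at run time on every call.
    return "".join(["x"] * SIZE)

def measure(build):
    # Return (peak traced bytes, number of distinct string objects).
    tracemalloc.start()
    strings = build()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    distinct = len({id(s) for s in strings})
    return peak, distinct

plain_peak, plain_distinct = measure(
    lambda: [make_text() for _ in range(COUNT)])
interned_peak, interned_distinct = measure(
    lambda: [sys.intern(make_text()) for _ in range(COUNT)])

print("plain:   ", plain_peak, "bytes peak,", plain_distinct, "objects")
print("interned:", interned_peak, "bytes peak,", interned_distinct, "objects")
```

In the interned case each freshly built copy is dropped as soon as sys.intern() hands back the existing object, so the list ends up holding many references to a single string.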
ChrisA
chunking a long string? Roy Smith <roy@panix.com> - 2013-11-08 12:48 -0500
Re: chunking a long string? wxjmfauth@gmail.com - 2013-11-08 12:43 -0800
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-09 07:53 +1100
Re: chunking a long string? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-08 20:57 +0000
Re: chunking a long string? Tim Chase <python.list@tim.thechases.com> - 2013-11-08 15:04 -0600
Re: chunking a long string? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-08 21:06 +0000
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-09 08:04 +1100
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-09 08:17 +1100
Re: chunking a long string? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-09 00:46 +0000
Re: chunking a long string? wxjmfauth@gmail.com - 2013-11-09 00:14 -0800
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-09 19:26 +1100
Re: chunking a long string? Roy Smith <roy@panix.com> - 2013-11-09 09:37 -0500
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-10 02:02 +1100
Re: chunking a long string? Roy Smith <roy@panix.com> - 2013-11-09 10:21 -0500
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-10 02:30 +1100
Re: chunking a long string? Roy Smith <roy@panix.com> - 2013-11-09 10:35 -0500
Re: chunking a long string? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-09 15:37 +0000
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-10 09:14 +1100
Re: chunking a long string? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-10 06:39 +0000
Re: chunking a long string? Chris Angelico <rosuav@gmail.com> - 2013-11-10 19:46 +1100
Re: chunking a long string? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-09 10:13 +0000
Re: chunking a long string? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-09 00:54 +0000