From: Christian Heimes
Newsgroups: comp.lang.python
Subject: Re: Surrogate pairs in new flexible string representation
Date: Fri, 29 Mar 2013 23:05:12 +0100

On 29.03.2013 07:22, Ian Kelly wrote:
> Since the PEP specifically mentions ParseTuple string conversion, I am
> thinking that this is probably the motivation for caching it. A
> string that is passed into a C function (that uses one of the various
> UTF-8 char* format specifiers) is perhaps likely to be passed into
> that function again at some point, so the UTF-8 representation is kept
> around to avoid the need to recompose it on each call.

It's not just about caching but also about memory management. The
additional utf8 member is required for backward compatibility. The
existing APIs hand out a pointer to an existing block of memory that is
shared with the string object. Callers don't take ownership of that
block and therefore don't free() it, so the string object itself has to
keep the UTF-8 data alive.

Christian
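
The ownership point can be observed from Python itself. What follows is only a rough sketch, assuming CPython 3.3 or later, where ctypes.pythonapi can reach the C function PyUnicode_AsUTF8. The helper name utf8_pointer is made up for illustration. Asking twice for the UTF-8 data of the same string should give the same address, because the buffer belongs to the string object and the caller never frees it.

```python
import ctypes

# CPython-only assumption: ctypes.pythonapi exposes the interpreter's C API.
_as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8
_as_utf8.argtypes = [ctypes.py_object]
# Ask for the raw address; c_char_p would copy the data into a bytes object.
_as_utf8.restype = ctypes.c_void_p


def utf8_pointer(s):
    """Hypothetical helper: address of the UTF-8 buffer owned by the str s."""
    return _as_utf8(s)


text = "caf\u00e9 \u20ac"  # non-ASCII, so a separate UTF-8 buffer is needed

first = utf8_pointer(text)   # first call creates the buffer inside the str
second = utf8_pointer(text)  # later calls hand out the same buffer again
same_buffer = first == second

# The pointer is borrowed: we only read from it while `text` is alive,
# and we never free it ourselves.
round_trip = ctypes.string_at(first).decode("utf-8")

print(same_buffer, round_trip == text)
```

The buffer is released together with the string object, which is why a C caller must not keep the pointer around after dropping its reference to the string.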