From: Terry Reedy
Newsgroups: comp.lang.python
Subject: Re: RE Module Performance
Date: Wed, 24 Jul 2013 13:52:39 -0400

On 7/24/2013 11:00 AM, Michael Torrie wrote:
> On 07/24/2013 08:34 AM, Chris Angelico wrote:
>> Frankly, Python's strings are a *terrible* internal representation
>> for an editor widget - not because of PEP 393, but simply because
>> they are immutable, and every keypress would result in a rebuilding
>> of the string. On the flip side, I could quite plausibly imagine
>> using a list of strings;

I used exactly this, a list of strings, for a Python-coded text-only
mock editor to replace the tk Text widget in IDLE tests. It works fine
for the purpose. For small test texts, the inefficiency of immutable
strings is not relevant.

Tk apparently uses a C-coded btree rather than a Python list. All
details are hidden, unless one finds and reads the source ;-), but it
uses C arrays rather than Python strings.

>> In this usage, the FSR is beneficial, as it's possible to have
>> different strings at different widths.

For my purpose, the mock Text works the same in 2.7 and 3.3+.

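As a rough illustration of the list-of-strings idea (a minimal sketch only, not the actual IDLE test mock; the class name `MockText` and its methods are hypothetical), each line is kept as its own immutable string, so an edit rebuilds just the affected line rather than the whole text:

```python
class MockText:
    """Hypothetical line-based text buffer; NOT the real IDLE mock."""

    def __init__(self, text=""):
        # One immutable str per line; the enclosing list is mutable.
        self.lines = text.split("\n")

    def insert(self, row, col, chars):
        """Insert chars (may contain newlines) at 0-based (row, col)."""
        line = self.lines[row]
        # Only this one line is rebuilt, not the whole document.
        new_lines = (line[:col] + chars + line[col:]).split("\n")
        self.lines[row:row + 1] = new_lines

    def delete(self, row, col, count=1):
        """Delete count characters within a single line."""
        line = self.lines[row]
        self.lines[row] = line[:col] + line[col + count:]

    def get(self):
        """Return the whole text as one string."""
        return "\n".join(self.lines)


buf = MockText("hello\nworld")
buf.insert(0, 5, ", there")   # edit inside the first line
buf.insert(1, 0, "big\n")     # inserting a newline splits a line
print(buf.get())
```

Nothing here depends on how strings are stored internally, so a sketch like this should behave the same on 2.7 and 3.3+.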
> Maybe, but simply thinking logically, FSR and UCS-4 are equivalent in
> pros and cons,

They both have the pro that indexing is direct *and correct*. The cons
are different.

> and the cons of using UCS-2 (the old narrow builds) are
> well known. UCS-2 simply cannot represent all of unicode correctly.

Python's narrow builds, at least for several releases, were in between
UCS-2 and UTF-16 in that they used surrogates to represent all of
Unicode but did not correct indexing for the presence of astral chars.
This is a nuisance for those who do use astral chars, such as emotes
and CJK name chars, on an everyday basis.

-- 
Terry Jan Reedy
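The narrow-build indexing problem described above can be approximated on a current interpreter. This is a sketch under stated assumptions: it runs on Python 3.3+ and only emulates narrow-build storage by listing UTF-16 code units; `utf16_units` is a hypothetical helper, not a standard function.

```python
import struct


def utf16_units(s):
    """Return the UTF-16 code units of s (emulates narrow-build storage)."""
    data = s.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


s = "a\U0001F600b"        # 'a', one astral char (U+1F600), 'b'
units = utf16_units(s)

print(len(s))             # code points, as counted on 3.3+
print(len(units))         # code units, one more because of the surrogate pair
print([hex(u) for u in units])
```

On 3.3+, `s[2]` is `'b'`; in the code-unit view, index 2 is the second half of the surrogate pair, which is the kind of indexing mismatch narrow builds exposed.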