From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: How to turn a string into a list of integers?
Date: Sat, 6 Sep 2014 22:23:20 +1000

On Sat, Sep 6, 2014 at 10:15 PM, Kurt Mueller wrote:
> I understand: narrow build is UCS2, wide build is UCS4
> - In a UCS2 build each character of an Unicode string uses 16 Bits and has
>   code points from U-0000..U-FFFF (0..65535)
> - In a UCS4 build each character of an Unicode string uses 32 Bits and has
>   code points from U-00000000..U-0010FFFF (0..1114111)

Pretty much. Narrow builds are buggy (a character beyond U+FFFF is stored as two surrogate code units, so len() and indexing miscount it), so as much as possible, you want to avoid using them. Ideally, use Python 3.3 or newer, where the distinction doesn't exist - all builds are functionally like wide builds, with memory usage even better than narrow builds (they'll use 8 bits per character when every character in the string fits in that).

As a general rule, precompiled Python for Windows is usually a narrow build, and Python distributions for Linux are usually wide builds. (I don't know about Mac OS builds.) You can test any Python by checking sys.maxunicode - it'll be 65535 on a narrow build, or 1114111 on wide builds (because that's the maximum codepoint defined by Unicode - U+10FFFF - as it's the highest number that can be represented in UTF-16).

ChrisA
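
For anyone who wants to try the sys.maxunicode check, here is a minimal sketch. The helper name describe_build and the sample character are just illustrations, not anything from the standard library. It also redoes the UTF-16 arithmetic behind the U+10FFFF limit, using the standard surrogate ranges (high D800..DBFF, low DC00..DFFF).

```python
import sys


def describe_build(maxunicode=sys.maxunicode):
    """Classify an interpreter by its sys.maxunicode value (hypothetical helper)."""
    if maxunicode == 0xFFFF:
        return "narrow"
    if maxunicode == 0x10FFFF:
        return "wide"
    return "unknown"


# The largest code point a UTF-16 surrogate pair can encode:
# each surrogate carries 10 payload bits, offset by 0x10000.
high_payload = 0xDBFF - 0xD800  # 0x3FF
low_payload = 0xDFFF - 0xDC00   # 0x3FF
utf16_max = 0x10000 + (high_payload << 10) + low_payload

# A character outside the Basic Multilingual Plane. On a narrow build
# it takes two code units; on a wide build (or any 3.3+) it takes one.
astral = u"\U0001F600"

print(sys.maxunicode, describe_build())
print(hex(utf16_max))
print(len(astral))
```

On Python 3.3 or newer this should always report a wide build and a length of 1; a length of 2 for the sample character is the telltale sign of a narrow build.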