How is unicode implemented behind the scenes?

Date Sat, 8 Mar 2014 18:08:38 -0800
Subject How is unicode implemented behind the scenes?
From Dan Stromberg <drsalists@gmail.com>
To Python List <python-list@python.org>
Newsgroups comp.lang.python
Message-ID <mailman.7942.1394330927.18130.python-list@python.org>


OK, I know that Unicode data is stored in an encoding on disk.

But how is it stored in RAM?

I realize I shouldn't write code that depends on any relevant
implementation details, but knowing some of the more common
implementation options would probably help build an intuition for
what's going on internally.

I've heard that characters are no longer all a fixed c bytes wide
internally, so is the in-memory form sometimes UTF-8?
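
One rough way to probe this from Python itself is to measure how much a
str object grows per extra character and compare that with the length of
the UTF-8 encoding. This is only a sketch: the helper name bytes_per_char
and the sample characters are made up for illustration, and the numbers
are an implementation detail of whichever interpreter runs it (the
assumption here is CPython 3.3 or later), not something to rely on.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate in-memory bytes per character.

    Builds two strings of n and 2*n copies of ch and divides the
    difference in object size by n, so the fixed object header
    cancels out. Hypothetical helper, for illustration only.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# One sample character from each code point range of interest.
samples = {
    "ASCII U+0061": "a",
    "Latin-1 U+00E9": "\u00e9",
    "BMP U+20AC": "\u20ac",
    "astral U+1F600": "\U0001F600",
}

for label, ch in samples.items():
    in_ram = bytes_per_char(ch)
    as_utf8 = len(ch.encode("utf-8"))
    print(label, "| bytes per char in RAM:", in_ram, "| bytes as UTF-8:", as_utf8)
```

If the interpreter is CPython 3.3 or later, the in-RAM column should
come out as 1, 1, 2 and 4 while the UTF-8 column is 1, 2, 3 and 4, which
suggests a fixed width chosen per string from its widest character
rather than UTF-8. Other implementations may well do something else.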

Thanks.


Thread

How is unicode implemented behind the scenes? Dan Stromberg <drsalists@gmail.com> - 2014-03-08 18:08 -0800
  Re: How is unicode implemented behind the scenes? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-09 02:50 +0000
    Re: How is unicode implemented behind the scenes? Roy Smith <roy@panix.com> - 2014-03-08 22:01 -0500
      Re: How is unicode implemented behind the scenes? Chris Angelico <rosuav@gmail.com> - 2014-03-09 14:19 +1100
    Re: How is unicode implemented behind the scenes? Rustom Mody <rustompmody@gmail.com> - 2014-03-08 19:12 -0800
    Re: How is unicode implemented behind the scenes? Dan Sommers <dan@tombstonezero.net> - 2014-03-09 05:46 +0000