| Header | Value |
|---|---|
| Date | Sat, 8 Mar 2014 18:08:38 -0800 |
| Subject | How is unicode implemented behind the scenes? |
| From | Dan Stromberg <drsalists@gmail.com> |
| To | Python List <python-list@python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.7942.1394330927.18130.python-list@python.org> |
OK, I know that Unicode data is stored in an encoding on disk.

But how is it stored in RAM?

I realize I shouldn't write code that depends on any relevant implementation details, but knowing some of the more common implementation options would probably help build an intuition for what's going on internally.

I've heard that characters are no longer all c bytes wide internally, so is it sometimes utf-8?

Thanks.
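
One way to probe the question empirically is the rough sketch below. It is not part of the original post, and it assumes CPython 3.3 or later, where PEP 393 stores each string with 1, 2, or 4 bytes per code point depending on the widest character it contains. The helper name `bytes_per_char` is made up for illustration. It compares `sys.getsizeof` for two strings of different lengths so that the fixed object header cancels out.

```python
import sys

def bytes_per_char(sample: str, n: int = 1000) -> float:
    """Estimate in-memory bytes per character for strings made of `sample`.

    Hypothetical helper: measures two strings of different length and
    divides the size difference by the number of extra characters, so the
    fixed per-object overhead drops out of the estimate.
    """
    base = sample * n
    doubled = sample * (2 * n)
    extra_chars = len(doubled) - len(base)
    return (sys.getsizeof(doubled) - sys.getsizeof(base)) / extra_chars

if __name__ == "__main__":
    samples = [
        ("ASCII", "a"),
        ("Latin-1", "\xe9"),          # e with acute accent
        ("BMP", "\u20ac"),            # euro sign
        ("astral", "\U0001F600"),     # emoji outside the BMP
    ]
    for label, ch in samples:
        print(label, bytes_per_char(ch))
```

On CPython 3.3+ this should report 1, 1, 2 and 4 bytes per character for the four samples, which suggests a fixed width per string rather than UTF-8. Other implementations are free to choose differently, so the numbers are an implementation detail and not something to depend on.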
How is unicode implemented behind the scenes? Dan Stromberg <drsalists@gmail.com> - 2014-03-08 18:08 -0800
Re: How is unicode implemented behind the scenes? Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2014-03-09 02:50 +0000
Re: How is unicode implemented behind the scenes? Roy Smith <roy@panix.com> - 2014-03-08 22:01 -0500
Re: How is unicode implemented behind the scenes? Chris Angelico <rosuav@gmail.com> - 2014-03-09 14:19 +1100
Re: How is unicode implemented behind the scenes? Rustom Mody <rustompmody@gmail.com> - 2014-03-08 19:12 -0800
Re: How is unicode implemented behind the scenes? Dan Sommers <dan@tombstonezero.net> - 2014-03-09 05:46 +0000