From: Dave Angel
Newsgroups: comp.lang.python
Subject: Re: A few questiosn about encoding
Date: Wed, 12 Jun 2013 08:43:05 -0400

On 06/12/2013 05:24 AM, Steven D'Aprano wrote:
> On Wed, 12 Jun 2013 09:09:05 +0000, Νικόλαος Κούρας wrote:
>
>> Isn't 14 bits way to many to store a character ?
>
> No.
>
> There are 1114111 possible characters in Unicode. (And in Japan, they
> sometimes use TRON instead of Unicode, which has even more.)
>
> If you list out all the combinations of 14 bits:
>
> 0000 0000 0000 00
> 0000 0000 0000 01
> 0000 0000 0000 10
> 0000 0000 0000 11
> [...]
> 1111 1111 1111 10
> 1111 1111 1111 11
>
> you will see that there are only 32767 (2**15-1) such values. You can't
> fit 1114111 characters with just 32767 values.
>

Actually, it's worse. There are 16384 such values (2**14), assuming you
include null, which you did in your list.

-- 
DaveA
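
The counts in this thread are easy to check in an interpreter. What follows is only a minimal sketch, assuming Python 3.3 or later (where sys.maxunicode is always 0x10FFFF); the variable names are made up for illustration and are not from the thread.

```python
import sys

# Number of distinct bit patterns in 14 bits, counting the all-zero one.
bits = 14
values_14 = 2 ** bits

# sys.maxunicode is the highest code point (0x10FFFF on Python 3.3+),
# so the number of code points, counting U+0000, is one more than that.
code_points = sys.maxunicode + 1

# Smallest number of bits that can represent the highest code point.
bits_needed = sys.maxunicode.bit_length()

print(values_14)     # 16384
print(code_points)   # 1114112
print(bits_needed)   # 21
```

Note that 1114111 is the highest code point rather than the count; counting from zero gives 1114112. Either way, 14 bits falls far short, and it takes 21 bits to cover the whole range.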