From: Random832
Newsgroups: comp.lang.python
Subject: Re: How to waste computer memory?
Date: Sun, 20 Mar 2016 14:17:22 -0400

On Sun, Mar 20, 2016, at 10:55, Ben Bacarisse wrote:
> It's 21. The reason being (or at least part of the reason being) that
> 21 bits can be UTF-8 encoded in 4 bytes: 11110xxx 10xxxxxx 10xxxxxx
> 10xxxxxx (3 + 3*6).

The reason is the UTF-16 limit. Prior to that, UTF-8 had no such limit (it could encode up to 31 bits, in six bytes). The four-byte explanation also doesn't account for the fact that four bytes can encode up to U+1FFFFF rather than U+10FFFF.
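
The arithmetic behind both limits can be checked with a short Python sketch. The helper names `payload_bits` and `max_code_point` are made up for this illustration; the bit layout assumed is that of the original UTF-8 design (sequences of up to six bytes), and the UTF-16 figure is derived from the surrogate-pair range.

```python
def payload_bits(nbytes):
    """Payload bits in an nbytes-long sequence of the original UTF-8 design."""
    if not 1 <= nbytes <= 6:
        raise ValueError("original UTF-8 sequences are 1 to 6 bytes long")
    if nbytes == 1:
        return 7  # 0xxxxxxx
    # Lead byte: nbytes one-bits, a zero, then (7 - nbytes) payload bits.
    # Each continuation byte (10xxxxxx) carries 6 more bits.
    return (7 - nbytes) + 6 * (nbytes - 1)


def max_code_point(nbytes):
    """Largest value that fits in nbytes under that layout."""
    return (1 << payload_bits(nbytes)) - 1


# UTF-16 limit: a surrogate pair carries 10 + 10 bits, offset by 0x10000.
UTF16_MAX = 0x10000 + (1 << 20) - 1

for n in range(1, 7):
    print(n, payload_bits(n), "U+%X" % max_code_point(n))
print("UTF-16 limit: U+%X" % UTF16_MAX)
```

Four bytes give 21 bits, so the layout alone would reach U+1FFFFF, and six bytes give 31 bits. The U+10FFFF ceiling only falls out of the surrogate-pair computation.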