From: random832@fastmail.us
Newsgroups: comp.lang.python
Subject: Re: Newbie question about text encoding
Date: Mon, 09 Mar 2015 00:27:56 -0400

On Sun, Mar 8, 2015, at 22:09, Ben Finney wrote:
> Steven D'Aprano writes:
>
> > '\udd00' should be a SyntaxError.
>
> I find your argument convincing, that attempting to construct a Unicode
> string of a lone surrogate should be an error.
>
> Shouldn't the error type be a ValueError, though? The statement is not,
> to my mind, erroneous syntax.

In this hypothetical, it's a problem with evaluating a literal - in the
same way that '\U12345', or '\U00110000', is.
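
For anyone who wants to check the comparison, here is a rough sketch, assuming CPython 3. The helper name literal_error is made up for illustration. It compiles each literal from source text and reports which exception class, if any, the compiler raises. Raw strings are used so the backslash escapes reach compile() untouched.

```python
def literal_error(source):
    """Compile `source` as an expression.

    Returns the name of the exception class raised, or None if it compiles.
    (Hypothetical helper, not part of any library.)
    """
    try:
        compile(source, "<literal>", "eval")
    except Exception as exc:
        return type(exc).__name__
    return None


# Raw strings: the escape sequences are interpreted by compile(), not here.
truncated = literal_error(r"'\U12345'")        # fewer than 8 hex digits
out_of_range = literal_error(r"'\U00110000'")  # beyond U+10FFFF
lone_surrogate = literal_error(r"'\udd00'")    # the hypothetical case

print("truncated escape:  ", truncated)
print("out of range:      ", out_of_range)
print("lone surrogate:    ", lone_surrogate)

# For contrast, the runtime failure for a lone surrogate is a ValueError
# subclass: encoding it to UTF-8 raises UnicodeEncodeError.
try:
    "\udd00".encode("utf-8")
except ValueError as exc:
    print("encode at runtime: ", type(exc).__name__)
```

On the interpreters I would expect this to run on, the first two literals are rejected at compile time, while the lone surrogate is currently accepted and only fails later, for example when encoded. The runtime failure being a ValueError subclass is what Ben's suggestion mirrors; the compile-time ones are the precedent for SyntaxError.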