From: Joel Goldstick
To: Nick the Gr33k
Cc: "python-list@python.org"
Newsgroups: comp.lang.python
Subject: Re: A few questiosn about encoding
Date: Fri, 14 Jun 2013 11:21:39 -0400

Let's cut to the chase and start by telling us what you DO know, Nick. That would take less typing.


On Fri, Jun 14, 2013 at 9:58 AM, Nick the Gr33k <support@superhost.gr> wrote:
> On 14/6/2013 1:14 PM, Cameron Simpson wrote:
>> Normally a character in a b'...' item represents the byte value
>> matching the character's Unicode ordinal value.
>
> The only thing that I didn't understand is this line.
> First, please tell me what a byte value is.
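For what it's worth, here is a small sketch of what "byte value" means, assuming Python 3 (the variable names are mine, purely for illustration): a byte value is an integer from 0 to 255, and indexing a bytes object hands that integer back.

```python
# A bytes object is a sequence of integers, each in range(256).
data = b'A\x1b'

first = data[0]    # the byte value written as the ASCII letter 'A'
second = data[1]   # the byte value written as the escape \x1b

print(first, second)   # 65 27
print(list(data))      # [65, 27]

# For characters in the ASCII range, the byte value equals the code point.
assert first == ord('A')
assert second == ord('\x1b')

# Values outside 0..255 cannot be stored in a single byte.
try:
    bytes([256])
except ValueError as exc:
    print("not a byte value:", exc)
```

So in b'...' notation the letters and \x escapes are only a way of writing those integers down.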


>> \x1b is a sequence you find inside strings (and "byte" strings, the
>> b'...' format).
>
> \x1b is a character (ESC) represented in hex format.
>
> b'\x1b' is a bytes object that represents what?
>
>
> >>> chr(27).encode('utf-8')
> b'\x1b'
>
> >>> b'\x1b'.decode('utf-8')
> '\x1b'
>
> After decoding, it gives the ESC character in hex format.
> Shouldn't it result in the value 27, which is the ordinal of ESC?
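A sketch of the distinction being asked about, assuming Python 3: decode() returns a str, and the '\x1b' you see is only how the interpreter displays the unprintable ESC character. The ordinal is a separate integer, and ord() is the call that returns it. The names raw, text and code_point are mine.

```python
raw = b'\x1b'                 # one byte whose value is 27
text = raw.decode('utf-8')    # a str holding one character, ESC

print(type(text).__name__)    # str
print(repr(text))             # '\x1b'  (how Python displays ESC)
print(len(text))              # 1       (one character, not four)

code_point = ord(text)        # the character's ordinal, an int
print(code_point)             # 27
print(hex(code_point))        # 0x1b

# Round trip: ordinal -> character -> bytes -> first byte value
assert chr(code_point) == text
assert chr(code_point).encode('utf-8')[0] == 27
```

In other words, decoding takes you from bytes to characters; getting from a character to its number is a second, separate step.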

> > No, I mean conceptually, there is no difference between a code-point
> > and its ordinal value. They are the same thing.
>
> Why doesn't the Unicode charset just contain characters, instead of
> a mapping of (characters <--> ordinals)?
>
> I mean, what we do is encode a character, like chr(65).encode('utf-8').
>
> What's the reason for the existence of its corresponding ordinal value,
> since it doesn't get involved in the encoding process?
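One way to see that the ordinal is involved, sketched under the assumption of Python 3 with three characters I picked for illustration: the code point is the number an encoding starts from, and the bytes it produces are computed from that number. For ASCII the two happen to coincide, which is probably why the step looks invisible.

```python
# Each character is identified by its code point (ordinal); UTF-8 then
# turns that number into one to four bytes.
for ch in ('A', '\u03bc', '\u20ac'):   # A, Greek mu, euro sign
    code_point = ord(ch)
    encoded = ch.encode('utf-8')
    print(f"U+{code_point:04X} -> {list(encoded)}")

# U+0041 -> [65]
# U+03BC -> [206, 188]
# U+20AC -> [226, 130, 172]

# In the ASCII range the single byte equals the code point...
assert 'A'.encode('utf-8') == bytes([ord('A')])

# ...but the same code point yields different bytes under other encodings.
assert '\u03bc'.encode('utf-8') == b'\xce\xbc'
assert '\u03bc'.encode('utf-16-le') == b'\xbc\x03'
```

The character-to-ordinal mapping is what stays fixed; the bytes depend on which encoding you ask for.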

> Thank you very much for taking the time to explain.
>
> --
> What is now proved was at first only imagined!



--
Joel Goldstick
http://joelgoldstick.com