From: "Prasad, Ramit"
To: "python-list@python.org"
Subject: RE: "convert" string to bytes without changing data (encoding)
Date: Thu, 29 Mar 2012 17:36:34 +0000
Newsgroups: comp.lang.python

> > Technically, ASCII goes up to 256 but they are not A-z letters.
> >
> Technically, ASCII is 7-bit, so it goes up to 127.

> No, ASCII only defines 0-127. Values >=128 are not ASCII.
>
> From https://en.wikipedia.org/wiki/ASCII:
>
> ASCII includes definitions for 128 characters: 33 are non-printing
> control characters (now mostly obsolete) that affect how text and
> space is processed and 95 printable characters, including the space
> (which is considered an invisible graphic).


Doh!
I was mistaking extended ASCII for ASCII. Thanks for the
correction.

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

--


> -----Original Message-----
> From: python-list-bounces+ramit.prasad=jpmorgan.com@python.org
> [mailto:python-list-bounces+ramit.prasad=jpmorgan.com@python.org] On
> Behalf Of MRAB
> Sent: Wednesday, March 28, 2012 2:50 PM
> To: python-list@python.org
> Subject: Re: "convert" string to bytes without changing data (encoding)
>
> On 28/03/2012 20:02, Prasad, Ramit wrote:
> >> >The right way to convert bytes to strings, and vice versa, is via
> >> >encoding and decoding operations.
> >>
> >> If you want to dictate to the original poster the correct way to do
> >> things then you don't need to do anything more than that.  You don't
> >> need to pretend like Chris Angelico that there isn't a direct mapping
> >> from his Python 3 implementation's internal representation of strings
> >> to bytes in order to label what he's asking for as being "silly".
> >
> > It might be technically possible to recreate the internal implementation,
> > or get the byte data. That does not mean it will make any sense or
> > be understood in a meaningful manner. I think Ian summarized it
> > very well:
> >
> >>You can't generally just "deal with the ascii portions" without
> >>knowing something about the encoding.  Say you encounter a byte
> >>greater than 127.  Is it a single non-ASCII character, or is it the
> >>leading byte of a multi-byte character?  If the next character is less
> >>than 127, is it an ASCII character, or a continuation of the previous
> >>character?
> >>For UTF-8 you could safely assume ASCII, but without
> >>knowing the encoding, there is no way to be sure.  If you just assume
> >>it's ASCII and manipulate it as such, you could be messing up
> >>non-ASCII characters.
> >
> --
> http://mail.python.org/mailman/listinfo/python-list
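
To make the 0-127 point and Ian's "no way to be sure" point concrete, here
is a minimal Python 3 sketch. The helper name is_ascii and the sample text
are illustrative assumptions, not anything posted in the thread. It shows
that the same bytes decode to different strings depending on which encoding
you assume, and that a strict ASCII decode refuses any byte >= 128.

```python
def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the range ASCII defines (0-127)."""
    return all(b < 128 for b in data)


# Sample text with one non-ASCII character (U+00E9), encoded as UTF-8.
raw = "caf\u00e9".encode("utf-8")      # the bytes b'caf\xc3\xa9'

print(is_ascii(b"cafe"))               # True
print(is_ascii(raw))                   # False

# The same five bytes give different text under different assumptions.
as_utf8 = raw.decode("utf-8")          # 4 characters: 0xC3 0xA9 is one character
as_latin1 = raw.decode("latin-1")      # 5 characters: each byte is one character
print(len(as_utf8), len(as_latin1))    # 4 5

# A strict ASCII decode rejects bytes >= 128 instead of guessing.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)                        # False
```

In UTF-8 every byte below 128 really is an ASCII character, which is why
the ASCII portions can be handled safely there; for an unknown encoding the
check above only tells you that non-ASCII bytes are present, not what they
mean.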