From: Heiko Wundram
Newsgroups: comp.lang.python
Subject: Re: "convert" string to bytes without changing data (encoding)
Date: Wed, 28 Mar 2012 12:42:43 +0200

On 28.03.2012 11:43, Peter Daum wrote:
> ... in my example, the variable s points to a "string", i.e. a series of
> bytes, (0x61,0x62 ...) interpreted as ascii/unicode characters.

No; a string contains a series of codepoints from the Unicode plane,
representing natural language characters (at least in the simplistic
view, I'm not talking about surrogates). These can be encoded to
different binary storage representations, of which ASCII is (a common)
one.

> What I am looking for is a general way to just copy the raw data
> from a "string" object to a "byte" object without any attempt to
> "decode" or "encode" anything ...

There is "logically" no raw data in the string, just a series of
codepoints, as stated above. You'll have to specify the encoding to use
to get at "raw" data, and from what I gather you're interested in the
latin-1 (or iso-8859-15) encoding, as you're specifically referencing
chars >= 0x80 (which hints at your mindset being in LATIN-land, so to
speak).

-- 
--- Heiko.
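
To illustrate the point above about codepoints versus encoded bytes, here
is a minimal sketch, assuming Python 3 (where str and bytes are separate
types). The sample string and the variable names are made up for the
example; they are not from the original question.

```python
# A str is a sequence of Unicode codepoints, not a sequence of bytes.
s = "ab\u00e9"  # 'a', 'b', and U+00E9 (e with acute accent)

# ord() exposes the codepoint of each character.
code_points = [ord(ch) for ch in s]

# latin-1 maps codepoints 0..255 one-to-one onto byte values 0..255,
# so the resulting bytes have the same numeric values as the codepoints.
raw_latin1 = s.encode("latin-1")

# The same string gives different bytes under a different encoding:
# UTF-8 needs two bytes for U+00E9.
raw_utf8 = s.encode("utf-8")

# Codepoints above 255 have no latin-1 representation at all.
try:
    "\u20ac".encode("latin-1")  # EURO SIGN, U+20AC
    euro_fits = True
except UnicodeEncodeError:
    euro_fits = False

print(code_points, raw_latin1, raw_utf8, euro_fits)
```

Choosing latin-1 here is what makes the "copy without changing the data"
idea work for chars below 0x100; it is still an explicit encoding step,
and it fails for anything outside that range.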