From: Terry Reedy
To: python-list@python.org
Subject: Re: "convert" string to bytes without changing data (encoding)
Date: Wed, 28 Mar 2012 14:11:28 -0400
Newsgroups: comp.lang.python

On 3/28/2012 11:36 AM, Ross Ridge wrote:
> Chris Angelico wrote:
>> What is a string? It's not a series of bytes.
>
> Of course it is. Conceptually you're not supposed to think of it that
> way, but a string is stored in memory as a series of bytes.

*If* it is stored in byte memory. If you execute a 3.x program mentally
or on paper, then there are no bytes. If you execute a 3.3 program on a
byte-oriented computer, then the 'a' in the string might be represented
by 1, 2, or 4 bytes, depending on the other characters in the string.
The actual logical bit pattern will depend on the big versus little
endianness of the system. My impression is that if you go down to the
physical bit level, then again there are, possibly, no 'bytes' as a
physical construct, as the bits, possibly, are stored in parallel on
multiple RAM chips.

> What he's asking for may not be very useful or practical, but if that's
> your problem here then that's what you should be addressing, not
> pretending that it's fundamentally impossible.

The Python-level way to get the bytes of an object that supports the
buffer interface is memoryview(). 3.x strings intentionally do not
support the buffer interface, as there is no particular correspondence
between characters (code points) and bytes.

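To illustrate the memoryview() point, here is a small sketch. It assumes
any Python 3 interpreter; the names data, view, raw and
str_supports_buffer are made up for the example.

```python
# A bytes object supports the buffer interface, so memoryview() accepts it.
data = b"abc"
view = memoryview(data)
raw = list(view)  # each element is one byte of the buffer, as an int

# A 3.x str does not support the buffer interface, so this raises TypeError.
try:
    memoryview("abc")
    str_supports_buffer = True
except TypeError:
    str_supports_buffer = False

print(raw, str_supports_buffer)
```
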
The OP could get the ordinal for each character and decide how *he*
wants to convert them to bytes.

    ba = bytearray()
    for c in s:  # s is the str to convert
        i = ord(c)
        # convert i to bytes however you choose and append them to ba

To get the particular bytes used for a particular string on a particular
system, the OP should use the C API, possibly through ctypes.

--
Terry Jan Reedy
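
One possible way to fill in the ord() loop, as a sketch only: it assumes
the OP wants every code point written as a fixed-width integer, here 4
bytes, little-endian. The function name codepoints_to_bytes and its
width and byteorder parameters are hypothetical, and this is just one of
many conversions the OP could pick.

```python
def codepoints_to_bytes(s, width=4, byteorder="little"):
    """Write each code point of s as a fixed-width integer (an assumed layout)."""
    ba = bytearray()
    for c in s:
        i = ord(c)
        # int.to_bytes raises OverflowError if i does not fit in `width` bytes.
        ba.extend(i.to_bytes(width, byteorder))
    return bytes(ba)

# 'a' is U+0061 and the euro sign is U+20AC.
result = codepoints_to_bytes("a\u20ac")
print(result)
```

With these defaults the result is the same as s.encode("utf-32-le"),
which is another way of saying that the OP is choosing an encoding
after all.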