From: Terry Reedy
Newsgroups: comp.lang.python
Subject: Re: Python3.3 str() bug?
Date: Fri, 09 Nov 2012 18:35:46 -0500

On 11/9/2012 8:13 AM, Helmut Jarausch wrote:

> Just for the record.
> I first discovered a real bug with Python3 when using os.walk on a file system
> containing non-ascii characters in file names.
>
> I encountered a very strange behavior (I still would call it a bug) when trying
> to put non-ascii characters in email headers.
> This has only been solved satisfactorily in Python3.3.

Most bugs, such as the above, are in library modules. There have been many related to unicode. In my opinion, 3.3 is the first version to handle unicode decently well.

>>> How can I convert a data structure of arbitrarily complex nature, which contains
>>> bytestrings somewhere, to a string?

> Thanks, but in my case the (complex) object is returned via ctypes from the
> aspell library.
> I still think that a standard function in Python3 which is able to 'stringify'
> objects should take an encoding parameter.

This is an interesting idea, which I have not seen before. It is more sensible in Python 3 than in Python 2. (For py2, unicode(str(object), encoding='xxx') does what you want.)
Try presenting it here or on python-ideas as an enhancement request, rather than as a bug report ;-).

In the meanwhile, if you cannot have the object constructed with strings rather than bytes, I suggest you write a custom converter function that understands the structure and replaces bytes with strings.

--
Terry Jan Reedy
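
A custom converter of the kind suggested above might look roughly like the sketch below. It is only an illustration: the name bytes_to_str is made up, the UTF-8 default is an assumption (the encoding aspell actually returns may differ), and it only understands plain dicts, lists, tuples and sets, not ctypes structures or namedtuples.

```python
def bytes_to_str(obj, encoding="utf-8", errors="strict"):
    """Return a copy of obj in which every bytes value is decoded to str.

    Hypothetical helper: the name, the default encoding and the set of
    supported containers are assumptions, not part of any library.
    """
    if isinstance(obj, (bytes, bytearray)):
        return bytes(obj).decode(encoding, errors)
    if isinstance(obj, dict):
        # Convert keys as well as values, since either may be bytes.
        return {
            bytes_to_str(key, encoding, errors): bytes_to_str(value, encoding, errors)
            for key, value in obj.items()
        }
    if isinstance(obj, (list, tuple, set, frozenset)):
        # Rebuild the same container type from the converted items.
        # (This does not work for namedtuples, which need positional arguments.)
        return type(obj)(bytes_to_str(item, encoding, errors) for item in obj)
    # Anything else (int, str, None, ...) is returned unchanged.
    return obj


# Example structure with UTF-8 encoded bytes nested at several levels.
data = {b"word": [b"caf\xc3\xa9", (b"na\xc3\xafve", 3)]}
converted = bytes_to_str(data)
print(str(converted))
```

Once the bytes have been replaced, the ordinary str() works on the result without needing an encoding parameter. For objects coming from ctypes, the converter would need extra branches that know the fields of the specific structures involved.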