From: Michael Torrie
Date: Tue, 28 May 2013 08:34:25 -0600
Subject: Re: Encodign issue in Python 3.3.1 (once again)
Newsgroups: comp.lang.python

On 05/27/2013 02:17 PM, Νίκος Γκρ33κ wrote:
> I have checked the database through phpMyAdmin and it is indeed UTF-8.
>
> I have no idea why python 3.3.1 chooses to work with latin-iso only....

It's not python that is doing this here... If you look at the source code to pymysql, I'm sure you will identify the problem. In fact I'll even walk you through things. The traceback you posted tells you what's going on and why (superfluous details removed for clarity):

    File "/opt/python3/lib/python3.3/site-packages/pymysql/cursors.py", line 108, in execute:
        query = query.encode(charset)
    UnicodeEncodeError: 'latin-1' codec can't encode characters in position 46-52: ordinal not in range(256)

So there we have it. pymysql is explicitly calling .encode() on a string, using whatever character set is specified by the local variable charset. If you look in cursors.py at that line, and then a few lines above, you will find that charset is assigned from conn.charset. This means that the charset is actually defined in the connection object.

So go look at connections.py. Sure enough, it shows that charset is defined by default as "latin-1". That's no good for you. So take a look at the __init__ method in connections.py.
In there you should find the necessary keyword argument you need to use when creating the mysql connection to make sure the charset is utf-8.
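
To make the failure concrete, here is a small sketch that reproduces the same kind of error without pymysql at all, using only str.encode(). The query text, the try_encode helper, and the connection settings are made up for illustration. The last part assumes that the keyword argument in question is named charset and that MySQL spells UTF-8 as "utf8"; check the __init__ signature in your installed connections.py before relying on either.

```python
# Sample query containing Greek text; the table and column names are made up.
query = "SELECT * FROM visitors WHERE name = 'Νίκος'"


def try_encode(text, charset):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError as exc:
        return str(exc)


# Latin-1 only covers code points 0-255, so the Greek letters cannot be encoded.
latin1_result = try_encode(query, "latin-1")
# UTF-8 can represent every Unicode code point, so this succeeds.
utf8_result = try_encode(query, "utf-8")

print(latin1_result)
print(utf8_result)

# Hypothetical connection settings; only the charset entry matters here.
# Assumption: the connect call accepts a `charset` keyword and MySQL's
# name for UTF-8 is "utf8".
connect_kwargs = {
    "host": "localhost",
    "user": "someuser",
    "charset": "utf8",
}
# With pymysql installed and a reachable server, this would be used roughly as:
#     import pymysql
#     conn = pymysql.connect(**connect_kwargs)
```

The first print shows a 'latin-1' codec error like the one in your traceback; the second shows the query as UTF-8 bytes, which is what you want the cursor to send.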