From: Benjamin Kaplan
Newsgroups: comp.lang.python
To: python-list@python.org
Subject: Re: sqlalchemy and Unicode strings: errormessage
Date: Tue, 31 May 2011 09:42:45 -0700

On Tue, May 31, 2011 at 8:45 AM, Wolfgang Meiners wrote:
> On 31.05.11 13:32, Daniel Kluev wrote:
>> On Tue, May 31, 2011 at 8:40 AM, Wolfgang Meiners
>> wrote:
>>> metadata = MetaData('sqlite://')
>>> a_table = Table('tf_lehrer', metadata,
>>>     Column('id', Integer, primary_key=True),
>>>     Column('Kuerzel', Text),
>>>     Column('Name', Text))
>>
>> Use UnicodeText instead of Text.
>>
>>> A_record = A_class('BUM', 'Bäumer')
>>
>> If this is python2.x, use u'Bäumer' instead.
>>
>>
>
> Thank you Daniel.
> So i came a little bit closer to the solution. Actually i dont write the
> strings in a python program but i read them from a file, which is
> utf8-encoded.
>
> So i changed the lines
>
>     for line in open(file,'r'):
>         line = line.strip()
>
> first to
>
>     for line in open(file,'r'):
>         line = unicode(line.strip())
>
> and finally to
>
>     for line in open(file,'r'):
>         line = unicode(line.strip(),'utf8')
>
> and now i get really utf8-strings. It does work but i dont know why it
> works. For me it looks like i change an utf8-string to an utf8-string.
>

There's no such thing as a UTF-8 string. You have a list of bytes (byte
string) and you have a list of characters (unicode). UTF-8 is a function
that can convert bytes into characters (and the reverse).
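You may recognize that the list of bytes was encoded using UTF-8, but the
computer does not unless you tell it explicitly. That is what the 'utf8'
argument in unicode(line.strip(), 'utf8') does: it names the encoding to
use when turning the bytes read from the file into characters. Without
it, Python 2 falls back to its default ASCII codec, which cannot decode
the bytes that stand for 'ä'. Does that help clear it up?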
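
To make the bytes/characters distinction concrete, here is a small sketch.
It is only an illustration, not code from the thread: the temporary file
stands in for your UTF-8 input file, and I use io.open() with an explicit
encoding instead of unicode(line, 'utf8') so that the same snippet should
run on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# The name as characters (a text string): B, a-umlaut, u, m, e, r.
name = u'B\u00e4umer'

# Encoding turns characters into bytes; the result depends on the codec.
as_utf8 = name.encode('utf-8')      # a-umlaut becomes two bytes
as_latin1 = name.encode('latin-1')  # a-umlaut becomes one byte

# Decoding turns bytes back into characters, given the matching codec.
assert as_utf8.decode('utf-8') == name
assert as_latin1.decode('latin-1') == name

# Same bytes, wrong codec: no error here, just the wrong characters.
mojibake = as_utf8.decode('latin-1')
assert mojibake != name

# Reading a file: the bytes on disk carry no label saying "UTF-8",
# so the encoding is passed explicitly when opening it.
fd, path = tempfile.mkstemp()  # hypothetical stand-in for the input file
os.close(fd)
with io.open(path, 'wb') as f:
    f.write(as_utf8 + b'\n')

with io.open(path, 'r', encoding='utf-8') as f:
    lines = [line.strip() for line in f]  # each line is already text

os.remove(path)
```

The two encode() calls start from the same six characters and produce
different byte sequences, which is why the bytes alone cannot tell Python
how to read them.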