From: Dave Angel
To: Charles Jensen
Cc: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: How do I display unicode value stored in a string variable using ord()
Date: Thu, 16 Aug 2012 18:47:17 -0400

On 08/16/2012 06:09 PM, Charles Jensen wrote:
> Everyone knows that the python command
>
> ord(u'…')
>
> will output the number 8230 which is the unicode character for the horizontal ellipsis.
>
> How would I use ord() to find the unicode value of a string stored in a variable?
>
> So the following 2 lines of code will give me the ascii value of the variable a. How do I specify ord to give me the unicode value of a?
>
> a = '…'
> ord(a)

You omitted the print statement. You also didn't specify what version of Python you're using; I'll assume Python 2.x because in Python 3.x before 3.3, the u"xx" notation is a syntax error.
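To make the difference between the two cases in the question concrete, here is a rough sketch. The variable names are mine, and I'm assuming the byte string came from either a UTF-8 or a windows-1252 source file, since the original post doesn't say. It is written to run under both Python 2.6+ and Python 3.3+.

```python
# The horizontal ellipsis as a unicode string, written as an escape so the
# example does not depend on the encoding of the source file.
ellipsis = u"\u2026"

# One unicode character, so ord() returns its code point.
print(ord(ellipsis))

# The same character as bytes looks different depending on the encoding.
utf8_bytes = ellipsis.encode("utf-8")      # three bytes
cp1252_bytes = ellipsis.encode("cp1252")   # one byte

print(len(utf8_bytes))
print(len(cp1252_bytes))

# The numeric value of the single cp1252 byte; bytearray() gives integers
# in both Python 2 and Python 3.
print(bytearray(cp1252_bytes)[0])
```

So ord() on a byte string can only ever report a byte value (and only if there is exactly one byte), never the code point 8230.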
To get the ord of a unicode variable, you do it the same as a unicode literal:

    a = u"j"    # note: for a non-ASCII literal to work reliably, you probably need
                # the correct encoding declaration on line 1 or 2 of the file
    print ord(a)

But if you have a byte string containing some binary bits, and you want to get a unicode character value out of it, you'll need to explicitly convert it to unicode. First, decide which encoding was used to produce the byte string. If you specify the wrong encoding, you'll likely get an exception, or maybe just a nonsense answer.

    a = "\xe2\x80\xa6"    # the UTF-8 bytes for the horizontal ellipsis (one character)
    b = a.decode("utf-8")
    print ord(b)

-- 
DaveA
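The point about picking the wrong encoding can be sketched as follows. The sample bytes are my own choice (the UTF-8 form of the horizontal ellipsis), and "ascii" and "latin-1" are just two example wrong guesses; the code should run under Python 2.6+ and Python 3.

```python
# Assumed sample data: the UTF-8 encoding of U+2026 (horizontal ellipsis).
raw = b"\xe2\x80\xa6"

# Correct guess: one character comes back, and ord() gives its code point.
right = raw.decode("utf-8")
print(ord(right))

# Wrong guess that fails loudly: these bytes are outside the ASCII range.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print("ascii failed: %s" % exc)

# Wrong guess that fails silently: latin-1 accepts any byte, so you get
# three unrelated characters instead of one ellipsis.
wrong = raw.decode("latin-1")
print([ord(ch) for ch in wrong])
```

The silent case is the nasty one, since ord() on the result would complain about the length rather than about the encoding.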