Date: Thu, 16 Aug 2012 18:47:17 -0400
From: Dave Angel <d@davea.name>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120714 Thunderbird/14.0
MIME-Version: 1.0
To: Charles Jensen <hopefullycharles@gmail.com>
Subject: Re: How do I display unicode value stored in a string variable using ord()
References: <f801e06f-f7b2-4aca-b352-66856a939746@googlegroups.com>
In-Reply-To: <f801e06f-f7b2-4aca-b352-66856a939746@googlegroups.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 8bit
Cc: python-list@python.org
Precedence: list
Reply-To: d@davea.name
Newsgroups: comp.lang.python
Message-ID: <mailman.3401.1345157258.4697.python-list@python.org>

On 08/16/2012 06:09 PM, Charles Jensen wrote:
> Everyone knows that the python command
>
>      ord(u'…')
>
> will output the number 8230 which is the unicode character for the horizontal ellipsis.
>
> How would I use ord() to find the unicode value of a string stored in a variable?  
>
> So the following 2 lines of code will give me the ascii value of the variable a.  How do I specify ord to give me the unicode value of a?
>
>      a = '…'
>      ord(a)

You omitted the print statement.  You also didn't specify what version
of Python you're using;  I'll assume Python 2.x because in Python 3.0
through 3.2 the u"xx" notation is a syntax error (3.3 accepts it again).

To get the ord of a unicode variable, you do it the same as a unicode
literal:

       # Note: for a non-ASCII literal to work reliably, you probably
       # need the correct encoding declaration in line 1 or 2 of the file.
       a = u"j"
       print ord(a)
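
As a small sketch of the same idea (the variable names are made up for illustration, and the \u2026 escape is used so the result doesn't depend on the file's encoding declaration; it is written with print(...) so it should run under both Python 2 and Python 3.3+):

```python
# Hypothetical names; \u2026 is the horizontal ellipsis from the question.
a = u"\u2026"
code_point = ord(a)        # ord() works the same on a variable as on a literal
print(code_point)          # 8230

# ord() accepts exactly one character, so a longer string has to be
# handled one character at a time.
s = u"a\u2026"
code_points = [ord(ch) for ch in s]
print(code_points)         # [97, 8230]
```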

But if you have a byte string containing some binary bits, and you want
to get a unicode character value out of it, you'll need to explicitly
convert it to unicode.

First, decide which encoding was used to produce the byte string.  If you
specify the wrong encoding, you're likely to get an exception, or maybe
just a nonsense answer.

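       # These three bytes are the UTF-8 encoding of the horizontal
       # ellipsis, U+2026.  (A made-up value such as "\xc1\xc1" is not
       # valid utf8 and would raise UnicodeDecodeError here.)
       a = "\xe2\x80\xa6"
       b = a.decode("utf-8")     # b is now a one-character unicode string
       print ord(b)              # prints 8230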
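
To illustrate what happens when the encoding is right, wrong, or the bytes are simply invalid, here is a rough sketch.  The names data, right, wrong and failed are just for illustration, and latin-1 is picked only as an example of a wrong guess.  The b"..." prefix and print(...) are used so it should run under Python 2.6+ as well as Python 3:

```python
data = b"\xe2\x80\xa6"            # UTF-8 bytes for the ellipsis, U+2026

# Correct encoding: three bytes decode to one character.
right = data.decode("utf-8")
print(ord(right))                 # 8230

# Wrong encoding: no exception, but a nonsense answer. latin-1 maps
# every byte to its own character, so we get three of them.
wrong = data.decode("latin-1")
print([ord(ch) for ch in wrong])  # [226, 128, 166]

# Bytes that are not valid UTF-8 at all: decoding raises an exception.
try:
    b"\xc1\xc1".decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)                     # True
```

The latin-1 case is the dangerous one, since nothing tells you the answer is wrong.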



-- 

DaveA