Date: Mon, 6 Jan 2014 05:49:28 -0600
From: Tim Chase <python.list@tim.thechases.com>
To: python-list@python.org
Subject: Re: "More About Unicode in Python 2 and 3"
Newsgroups: comp.lang.python

On 2014-01-06 15:51, Chris Angelico wrote:
> >>> data = b"\x43\x6c\x67\x75\x62\x61" # is there an easier way to
> >>> # turn a hex dump into a bytes literal?

Depends on how you source them:


# space separated:
>>> s1 = "43 6c 67 75 62 61"
>>> ''.join(chr(int(pair, 16)) for pair in s1.split())
'Clguba'

# all smooshed together:
>>> s2 = s1.replace(' ','')
>>> s2
'436c67756261'
>>> ''.join(chr(int(s2[i*2:(i+1)*2], 16)) for i in range(len(s2)//2))
'Clguba'

# as \xHH escaped:
>>> s3 = ''.join('\\x'+s2[i*2:(i+1)*2] for i in range(len(s2)//2))
>>> print(s3)
\x43\x6c\x67\x75\x62\x61
>>> b3 = s3.encode('ascii')
>>> print(b3)
b'\\x43\\x6c\\x67\\x75\\x62\\x61'
>>> b3.decode('unicode_escape')
'Clguba'
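
The examples above all end up with a str. If the goal is an actual
bytes object, as in the original question, the standard library can
probably do most of the work. A minimal sketch, assuming Python 3
(the names b1, b2 and b2_alt are only for illustration):

```python
import binascii

s1 = "43 6c 67 75 62 61"      # space separated
s2 = s1.replace(" ", "")      # all smooshed together

# bytes.fromhex() skips the spaces between pairs
b1 = bytes.fromhex(s1)
b2 = bytes.fromhex(s2)

# binascii.unhexlify() wants the digits with no separators
b2_alt = binascii.unhexlify(s2)

print(b1)   # b'Clguba'
```

All three results compare equal, so which one to use is mostly a
matter of taste.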

It might get more complex if you're not just dealing with bytes, or
if you have some other encoding scheme, but "s1" (space-separated, or
some other delimiter such as colon-separated that can be passed
to the .split() call) and "s2" (all smooshed together) are the two I
encounter most frequently.
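
For the delimiter case, something along these lines should cope with
spaces, colons, or whatever else gets passed to .split(). This is
only a sketch, assuming Python 3; hex_to_bytes is a made-up helper
name, not anything from the standard library:

```python
def hex_to_bytes(text, sep=None):
    """Convert delimiter-separated hex pairs into a bytes object.

    sep=None splits on any run of whitespace, like str.split().
    """
    # int(pair, 16) raises ValueError on anything that is not hex,
    # and bytes() raises ValueError for values outside 0..255.
    return bytes(int(pair, 16) for pair in text.split(sep))


print(hex_to_bytes("43 6c 67 75 62 61"))           # b'Clguba'
print(hex_to_bytes("43:6c:67:75:62:61", sep=":"))  # b'Clguba'
```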

-tkc