

Groups > comp.lang.python > #59663 > unrolled thread

inconsistency in converting from/to hex

Started by: Laszlo Nagy <gandalf@shopzeus.com>
First post: 2013-11-16 23:16 +0100
Last post: 2013-11-17 18:22 +0200
Articles: 5 (5 participants)



Contents

  inconsistency in converting from/to hex Laszlo Nagy <gandalf@shopzeus.com> - 2013-11-16 23:16 +0100
    Re: inconsistency in converting from/to hex Ned Batchelder <ned@nedbatchelder.com> - 2013-11-16 15:35 -0800
    Re: inconsistency in converting from/to hex Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-11-17 06:31 +0000
      Re: inconsistency in converting from/to hex Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-11-17 12:38 +0000
      Re: inconsistency in converting from/to hex Serhiy Storchaka <storchaka@gmail.com> - 2013-11-17 18:22 +0200

#59663 — inconsistency in converting from/to hex

From: Laszlo Nagy <gandalf@shopzeus.com>
Date: 2013-11-16 23:16 +0100
Subject: inconsistency in converting from/to hex
Message-ID: <mailman.2742.1384640221.18130.python-list@python.org>

We can convert from a hex str to bytes with the bytes.fromhex class method:

 >>> b = bytes.fromhex("ff")

But we cannot convert from hex binary:

 >>> b = bytes.fromhex(b"ff")
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes

We don't have a bytes_instance.tohex() instance method.
We do have binascii.hexlify, but it does not return a str.
It returns a bytes instance instead.

 >>> import binascii
 >>> binascii.hexlify(b)
b'ff'

Its reverse function binascii.unhexlify can be used on str and bytes too:

 >>> binascii.unhexlify(b'ff')
b'\xff'
 >>> binascii.unhexlify('ff')
b'\xff'

Questions:

* if we have bytes.fromhex() then why don't we have bytes_instance.tohex()?
* if the purpose of binascii.unhexlify and bytes.fromhex is the same, 
then why allow binary arguments for the former, and not for the latter?
* in this case, should there be "one obvious way to do it" or not?





#59671

From: Ned Batchelder <ned@nedbatchelder.com>
Date: 2013-11-16 15:35 -0800
Message-ID: <76786fb5-0108-43fa-a1a5-5f86119cab61@googlegroups.com>
In reply to: #59663

On Saturday, November 16, 2013 5:16:58 PM UTC-5, Laszlo Nagy wrote:
> We can convert from hex str to bytes with bytes.fromhex class method:
> 
>  >>> b = bytes.fromhex("ff")
> 
> But we cannot convert from hex binary:
> 
>  >>> b = bytes.fromhex(b"ff")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> TypeError: must be str, not bytes
> 
> We don't have bytes_instance.tohex() instance method.
> But we have binascii.hexlify. But binascii.hexlify does not return an 
> str. It returns a bytes instance instead.
> 
>  >>> import binascii
>  >>> binascii.hexlify(b)
> b'ff'
> 
> Its reverse function binascii.unhexlify can be used on str and bytes too:
> 
>  >>> binascii.unhexlify(b'ff')
> b'\xff'
>  >>> binascii.unhexlify('ff')
> b'\xff'
> 
> Questions:
> 
> * if we have bytes.fromhex() then why don't we have bytes_instance.tohex() ?
> * if the purpose of binascii.unhexlify and bytes.fromhex is the same, 
> then why allow binary arguments for the former, and not for the latter?
> * in this case, should there be "one obvious way to do it" or not?

The standard library is not always as consistent as we might like.  I don't think there is a better answer than that.

This will work if you want to use fromhex with bytes:

    b = bytes.fromhex(b"ff".decode("ascii"))
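
If this comes up often, the decode step can be wrapped in a small helper. This is only a minimal sketch: `fromhex_any` is a hypothetical name, not something in the standard library, and it assumes the input consists of ASCII hex digits.

```python
def fromhex_any(data):
    """Convert hex digits given as str, bytes, or bytearray into bytes.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if isinstance(data, (bytes, bytearray)):
        # bytes.fromhex wants str here, so decode the ASCII hex digits first
        data = data.decode("ascii")
    return bytes.fromhex(data)


print(fromhex_any("ff"))    # b'\xff'
print(fromhex_any(b"ff"))   # b'\xff'
```

Both calls give the same result, so callers no longer need to care which type they were handed.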


--Ned.



#59699

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-11-17 06:31 +0000
Message-ID: <528862b2$0$29975$c3e8da3$5496439d@news.astraweb.com>
In reply to: #59663

On Sat, 16 Nov 2013 23:16:58 +0100, Laszlo Nagy wrote:

> Questions:
> 
> * if we have bytes.fromhex() then why don't we have
> bytes_instance.tohex() ? 

The Python core developers are quite conservative about adding new 
methods, particularly when there is already a solution to the given 
problem. bytes.fromhex is very useful, because when working with binary 
data it is common to give data as strings of hex values, and so it is 
good to have a built-in method for it:

image = bytes.fromhex('ffd8ffe000104a464946000101 ...')

On the other hand, converting bytes to hexadecimal values is less common. 
There are already at least two ways to do it in Python 2:

py> import binascii
py> binascii.hexlify('Python')
'507974686f6e'

py> import codecs
py> codecs.encode('Python', 'hex')
'507974686f6e'

[Aside: in Python 3, the codecs were (mistakenly) removed, but they'll 
be added back in 3.4 or 3.5.]

So I can only imagine that had somebody proposed a bytes.tohex() method, 
they would have been told "there's already a way to do that, this isn't 
important enough to justify being built-in".
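
For readers on newer interpreters: a `bytes.hex()` method was eventually added in Python 3.5, giving a str-returning inverse of `bytes.fromhex`. A short sketch, assuming Python 3.5 or later:

```python
data = b"Python"

# bytes -> str of lowercase hex digits (bytes.hex() exists from Python 3.5 on)
hex_text = data.hex()
print(hex_text)                  # 507974686f6e

# str -> bytes, the inverse direction
assert bytes.fromhex(hex_text) == data
```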


> * if the purpose of binascii.unhexlify and bytes.fromhex is the same,
> then why allow binary arguments for the former, and not for the latter?

I would argue that the purpose is *not* the same. binascii is for working 
with binary files, hence it accepts bytes and produces bytes. 
bytes.fromhex is for producing bytes from strings.

It's an exceedingly narrow distinction, and I can understand anyone who 
is not convinced by my argument. I'm only half-convinced myself.


> * in this case, should there be "one obvious way to do it" or not?

Define "it". Do you mean "convert bytes to bytes", "bytes to str", "str 
to bytes", or "str to str"?
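
To make the four cases concrete, here is a sketch in Python 3 that uses only `binascii` plus explicit ASCII encode/decode steps. The variable names are just for illustration, and it assumes the text being hex-encoded is plain ASCII.

```python
import binascii

raw = b"Python"

# bytes -> bytes: hexlify returns the hex digits as a bytes object
hex_bytes = binascii.hexlify(raw)            # b'507974686f6e'

# bytes -> str: decode the ASCII hex digits
hex_text = hex_bytes.decode("ascii")         # '507974686f6e'

# str -> bytes: bytes.fromhex takes text and returns the raw bytes
assert bytes.fromhex(hex_text) == raw

# str -> str: pick an encoding on the way in, decode the digits on the way out
text_hex = binascii.hexlify("Python".encode("ascii")).decode("ascii")
assert text_hex == hex_text
```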

Besides, one *obvious* way is not the same as *only one* way.

I agree that it's a bit of a mess. But only a little bit, and it will be 
less messy by 3.5 when the codecs solution is re-introduced. Then the 
codecs.encode and decode functions will be the one obvious way.



-- 
Steven



#59715

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2013-11-17 12:38 +0000
Message-ID: <mailman.2770.1384691918.18130.python-list@python.org>
In reply to: #59699

On 17/11/2013 06:31, Steven D'Aprano wrote:
>
> I agree that it's a bit of a mess. But only a little bit, and it will be
> less messy by 3.5 when the codecs solution is re-introduced. Then the
> codecs.encode and decode functions will be the one obvious way.
>

For anyone who's interested in the codecs issues see 
http://bugs.python.org/issue7475 and http://bugs.python.org/issue19543

-- 
Python is the second best programming language in the world.
But the best has yet to be invented.  Christian Tismer

Mark Lawrence



#59756

From: Serhiy Storchaka <storchaka@gmail.com>
Date: 2013-11-17 18:22 +0200
Message-ID: <mailman.2786.1384705376.18130.python-list@python.org>
In reply to: #59699

17.11.13 08:31, Steven D'Aprano wrote:
> There are already at least two ways to do it in Python 2:
>
> py> import binascii
> py> binascii.hexlify('Python')
> '507974686f6e'
>
> py> import codecs
> py> codecs.encode('Python', 'hex')
> '507974686f6e'

Third:

 >>> import base64
 >>> base64.b16encode(b'Python')
b'507974686F6E'

Fourth:

 >>> '%0*x' % (2*len(b'Python'), int.from_bytes(b'Python', byteorder='big'))
'507974686f6e'

Fifth:

 >>> ''.join('%02x' % b for b in b'Python')
'507974686f6e'

> [Aside: in Python 3, the codecs were (mistakenly) removed, but they'll
> be added back in 3.4 or 3.5.]

Only renamed.

 >>> import codecs
 >>> codecs.encode(b'Python', 'hex_codec')
b'507974686f6e'
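
A quick cross-check of the variants above, as a sketch for Python 3. It assumes the `hex_codec` name shown in the session is available, and it only compares results without claiming which spelling is preferable.

```python
import base64
import binascii
import codecs

raw = b"Python"

# codecs.encode with hex_codec: bytes in, bytes out, same as binascii.hexlify
encoded = codecs.encode(raw, "hex_codec")
assert encoded == binascii.hexlify(raw)

# codecs.decode reverses it
assert codecs.decode(encoded, "hex_codec") == raw

# base64.b16encode yields the same digits, but in upper case
assert base64.b16encode(raw) == encoded.upper()

# the two %-formatting expressions return str, not bytes
as_text = encoded.decode("ascii")
assert "".join("%02x" % b for b in raw) == as_text
assert "%0*x" % (2 * len(raw), int.from_bytes(raw, byteorder="big")) == as_text
```

The differences between the approaches come down to return type (bytes or str) and letter case, not to the digits themselves.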


