Groups > comp.lang.python > #62462 > unrolled thread

Re: bytearray inconsistencies?

Started by	Ned Batchelder <ned@nedbatchelder.com>
First post	2013-12-20 20:58 -0500
Last post	2013-12-20 20:58 -0500
Articles	1 — 1 participant


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


#62462 — Re: bytearray inconsistencies?

From	Ned Batchelder <ned@nedbatchelder.com>
Date	2013-12-20 20:58 -0500
Subject	Re: bytearray inconsistencies?
Message-ID	<mailman.4448.1387591129.18130.python-list@python.org>

On 12/20/13 8:06 PM, Mark Lawrence wrote:
> Quoting from http://docs.python.org/3/library/functions.html#bytearray
>
> "The bytearray type is a mutable sequence of integers in the range 0 <=
> x < 256."
>
> Quoting from http://docs.python.org/3/library/stdtypes.html#bytes-methods
>
> "Whenever a bytes or bytearray method needs to interpret the bytes as
> characters (e.g. the is...() methods, split(), strip()), the ASCII
> character set is assumed (text strings use Unicode semantics).
>
> Note - Using these ASCII based methods to manipulate binary data that is
> not stored in an ASCII based format may lead to data corruption.
>
> The search operations (in, count(), find(), index(), rfind() and
> rindex()) all accept both integers in the range 0 to 255 (inclusive) as
> well as bytes and byte array sequences.
>
> Changed in version 3.3: All of the search methods also accept an integer
> in the range 0 to 255 (inclusive) as their first argument."
>
> I don't understand why the docs talk about "a mutable sequence of
> integers" but then discuss "needs to interpret the bytes as characters".

The split and strip methods work with whitespace when given no
arguments.  A byte is just an integer, and an integer isn't whitespace.
A character can be, so the bytes need to be interpreted as characters
(ASCII, per the docs you quoted).  Likewise, the is* methods
(isalnum, isalpha, isdigit, islower, isspace, istitle, isupper) all
ask questions that only make sense for characters, so the bytes must be
interpreted.
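
A small sketch of what that interpretation looks like in practice. The
sample data and variable names are made up for illustration; the methods
are the standard bytes/bytearray ones:

```python
# Illustrative sample data (not from the original thread).
data = bytearray(b"  spam\teggs\n")

# With no arguments, split() and strip() treat ASCII whitespace bytes
# (space, tab, newline, ...) as the separators or padding to remove.
parts = data.split()
stripped = data.strip()

# The is* methods classify each byte by its ASCII meaning.
all_digits = b"123".isdigit()
is_space = bytes([0x20]).isspace()

# 0xE9 is not an ASCII letter, so isalpha() is False for it, even though
# it would be a letter in Latin-1.
high_byte_alpha = bytes([0xE9]).isalpha()

print(parts, stripped, all_digits, is_space, high_byte_alpha)
```

The last line is the "data corruption" warning in miniature: the methods
only know ASCII, so bytes outside that range never count as letters or
whitespace.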

>   Further I don't understand why the changes done in 3.3 referred to
> above haven't also been applied to (say) the split method.  If I can
> call find to look for a zero, why can't I split on it?
>

I don't know the reason, but I would guess either no one considered it, 
or it was deemed unlikely to be useful.

If you have a zero, you can split on it with
bytestring.split(bytes([0])), but that doesn't explain why find can take
a plain integer zero while split has to take a bytes object with a zero
in it.
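
To make the asymmetry concrete, here is a minimal sketch, assuming
Python 3.3 or later. The sample data and names are hypothetical:

```python
# Illustrative sample data (not from the original thread).
data = b"one\x00two\x00three"

# Search operations accept a bare integer in the range 0..255.
idx = data.find(0)
has_zero = 0 in data

# split() does not accept an integer separator; it raises TypeError.
split_error = None
try:
    data.split(0)
except TypeError as exc:
    split_error = exc

# Wrapping the integer in a one-byte bytes object works.
parts = data.split(bytes([0]))

print(idx, has_zero, type(split_error).__name__, parts)
```

Writing the separator as b"\x00" is equivalent to bytes([0]) and may read
more naturally when the value is a constant.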

-- 
Ned Batchelder, http://nedbatchelder.com
