


Re: bytearray inconsistencies?

From Ned Batchelder <ned@nedbatchelder.com>
Subject Re: bytearray inconsistencies?
Date Fri, 20 Dec 2013 20:58:33 -0500
Newsgroups comp.lang.python



On 12/20/13 8:06 PM, Mark Lawrence wrote:
> Quoting from http://docs.python.org/3/library/functions.html#bytearray
>
> "The bytearray type is a mutable sequence of integers in the range 0 <=
> x < 256."
>
> Quoting from http://docs.python.org/3/library/stdtypes.html#bytes-methods
>
> "Whenever a bytes or bytearray method needs to interpret the bytes as
> characters (e.g. the is...() methods, split(), strip()), the ASCII
> character set is assumed (text strings use Unicode semantics).
>
> Note - Using these ASCII based methods to manipulate binary data that is
> not stored in an ASCII based format may lead to data corruption.
>
> The search operations (in, count(), find(), index(), rfind() and
> rindex()) all accept both integers in the range 0 to 255 (inclusive) as
> well as bytes and byte array sequences.
>
> Changed in version 3.3: All of the search methods also accept an integer
> in the range 0 to 255 (inclusive) as their first argument."
>
> I don't understand why the docs talk about "a mutable sequence of
> integers" but then discuss "needs to interpret the bytes as characters".

The split and strip methods work with whitespace when given no 
arguments.  Bytes aren't whitespace.  Characters can be, so the bytes 
need to be interpreted as characters.  Likewise, the is* methods 
(isalnum, isalpha, isdigit, islower, isspace, istitle, isupper) all 
require characters, so the bytes must be interpreted.
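
To make that concrete, here is a small sketch of the ASCII interpretation at work. The variable names and sample values are my own, chosen only for illustration:

```python
# Sample data with ASCII whitespace on both ends (illustrative values).
data = bytearray(b"  spam eggs\t\n")

# With no arguments, strip() and split() treat ASCII whitespace bytes
# (space, tab, newline, and so on) as the characters to act on.
stripped = data.strip()
words = data.split()

# The is* methods also apply ASCII rules to each byte.
ascii_digits = bytes([0x31, 0x32, 0x33]).isdigit()   # b"123"

# 0xE9 is a letter in Latin-1, and 0xA0 is a no-break space there,
# but neither is an ASCII letter or ASCII whitespace.
latin1_letter = bytes([0xE9]).isalpha()
latin1_nbsp = bytes([0xA0]).isspace()

print(stripped, words, ascii_digits, latin1_letter, latin1_nbsp)
```

The last two results are False, which is the data-corruption risk the docs warn about: these methods only know ASCII, whatever encoding the bytes actually came from.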

>   Further I don't understand why the changes done in 3.3 referred to
> above haven't also been applied to (say) the split method.  If I can
> call find to look for a zero, why can't I split on it?
>

I don't know the reason, but I would guess either no one considered it, 
or it was deemed unlikely to be useful.

If you have a zero, you can split on it with: 
bytestring.split(bytes([0])), but that doesn't explain why find can take 
a simple zero, and split has to take a bytestring with a zero in it.
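
A short sketch of the difference, assuming Python 3.3 or later (the sample bytes are made up for the example):

```python
# Illustrative data: three runs of bytes separated by zero bytes.
data = bytes([1, 2, 0, 3, 4, 0, 5])

# The search methods accept a plain integer (3.3+).
first_zero = data.find(0)
has_zero = 0 in data

# split() needs a bytes-like separator, so wrap the integer in a list.
# Note bytes([0]) is one zero byte, while bytes(0) is the empty bytes object.
parts = data.split(bytes([0]))

# Passing the integer directly is rejected.
try:
    data.split(0)
    split_accepts_int = True
except TypeError:
    split_accepts_int = False

print(first_zero, has_zero, parts, split_accepts_int)
```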

-- 
Ned Batchelder, http://nedbatchelder.com
