Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?

From Random832 <random832@fastmail.com>
To Tim Peters <tim.peters@gmail.com>
Cc "Python-List" <python-list@python.org>, "datetime-sig" <datetime-sig@python.org>
Subject Re: [Datetime-SIG] Are there any "correct" implementations of tzinfo?
Date Mon, 14 Sep 2015 21:19:56 -0400
Newsgroups comp.lang.python
Message-ID <mailman.579.1442279999.8327.python-list@python.org>


On Mon, Sep 14, 2015, at 18:09, Tim Peters wrote:
> Sorry, I'm not arguing about this any more.  Pickle doesn't work at
> all at the level of "count of bytes followed by a string". 

The SHORT_BINBYTES opcode consists of the byte b'C', followed by *yes
indeed* "count of bytes followed by a string".

> If you
> want to make a pickle argument that makes sense, I'm afraid you'll
> need to become familiar with how pickle works first.  This is not the
> place for a pickle tutorial.
> 
> Start by learning what a datetime pickle actually is.
> pickletools.dis() will be very helpful.

    0: \x80 PROTO      3
    2: c    GLOBAL     'datetime datetime'
   21: q    BINPUT     0
   23: C    SHORT_BINBYTES b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'
   35: q    BINPUT     1
   37: \x85 TUPLE1
   38: q    BINPUT     2
   40: R    REDUCE
   41: q    BINPUT     3
   43: .    STOP

The payload is ten bytes, and the byte immediately before it is in fact
0x0a. If I pickle any byte string under 256 bytes long by itself, the
byte immediately before the data is the length. This is how I initially
came to the conclusion that "count of bytes followed by a string" was
valid.
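
The length-prefix observation can be checked directly. This is a minimal sketch, assuming CPython 3 and pickle protocol 3; the variable names are illustrative, and the fixed slice offsets assume the pickle starts with the two-byte PROTO opcode.

```python
import pickle
from datetime import datetime

payload = b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'

# Pickle the datetime whose 10-byte state appears in the dump above.
dt_pickle = pickle.dumps(datetime(2015, 9, 14, 21, 6, 42), protocol=3)
start = dt_pickle.index(payload)
opcode_byte = dt_pickle[start - 2:start - 1]  # the SHORT_BINBYTES opcode
count_byte = dt_pickle[start - 1]             # the one-byte length

# A bare byte string: PROTO 3 occupies two bytes, so the opcode is at index 2.
short = pickle.dumps(b'x' * 255, protocol=3)  # length fits in one byte
long_ = pickle.dumps(b'x' * 256, protocol=3)  # needs BINBYTES, 4-byte length

print(opcode_byte, count_byte)
print(short[2:4], long_[2:3], int.from_bytes(long_[3:7], 'little'))
```

At 256 bytes and above the opcode changes to BINBYTES (b'B') with a four-byte little-endian count, but the shape is still "count of bytes followed by a string".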

I did, before writing my earlier post, look into the high-level aspects
of how datetime pickle works - it uses __reduce__ to create up to two
arguments, one of which is a 10-byte string, and the other is the
tzinfo. Those arguments are passed into the datetime constructor and
detected by that constructor - for example, I can call it directly with
datetime(b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00') and get the same result
as unpickling.
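
A short sketch of that high-level path, assuming CPython 3 and a naive datetime (no tzinfo). Passing the state bytes to the constructor is an undocumented implementation detail, so treat this as an illustration and not as supported API.

```python
from datetime import datetime

payload = b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'
dt = datetime(2015, 9, 14, 21, 6, 42)

# __reduce__ returns (callable, args); with no tzinfo, args is a 1-tuple
# holding only the 10-byte state string.
cls, args = dt.__reduce__()

# Hand the state bytes straight to the constructor, as unpickling would.
rebuilt = datetime(payload)

print(cls, args)
print(rebuilt)
```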

At the low level, the part that represents that first argument does
indeed appear to be "count of bytes followed by a string". I can add to
the count, add more bytes, and it will call the constructor with the
longer string. If I use pickletools.dis on my modified value the output
looks the same except for, as expected, the offsets and the value of the
argument to the SHORT_BINBYTES opcode.
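
The modification described above can be reproduced along these lines. This is a sketch assuming CPython 3 and protocol 3; the two extra bytes and the helper name disassemble are made up for illustration. It only disassembles the modified pickle and does not load it.

```python
import pickle
import pickletools
from datetime import datetime

payload = b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'
extra = b'\xff\xff'  # arbitrary filler bytes, purely illustrative

original = pickle.dumps(datetime(2015, 9, 14, 21, 6, 42), protocol=3)
start = original.index(payload)

# Bump the one-byte count and append the extra bytes after the payload.
modified = (original[:start - 1]
            + bytes([len(payload) + len(extra)])
            + payload + extra
            + original[start + len(payload):])

def disassemble(data):
    """Return (opcode name, argument, offset) for every opcode in data."""
    return [(op.name, arg, pos) for op, arg, pos in pickletools.genops(data)]

for name, arg, pos in disassemble(modified):
    print(pos, name, arg)
```

The opcode sequence is unchanged; only the SHORT_BINBYTES argument and the offsets after it differ.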

So, it appears that, as I was saying, "wasted space" would not have been
an obstacle to having the "payload" accepted by the constructor (and
produced by __reduce__, ultimately by _getstate) consist of "a byte string
of >= 10 bytes, the first 10 of which are used and the rest of which are
ignored by Python <= 3.5" instead of "a byte string of exactly 10
bytes". Such a constructor would have accepted and produced exactly the
same pickle values, but been prepared to accept larger arguments pickled
by future versions.
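
To make the proposal concrete, here is a hypothetical decoder that uses the first 10 bytes and ignores the rest. The function name decode_state_lenient is invented for this sketch and is not part of the datetime module. The byte layout (two bytes of year, then month, day, hour, minute, second, and three bytes of microseconds) is the one visible in the payload above.

```python
from datetime import datetime

def decode_state_lenient(state):
    """Hypothetical: decode the first 10 bytes of state, ignore any others."""
    if len(state) < 10:
        raise ValueError('state must be at least 10 bytes long')
    year = (state[0] << 8) | state[1]                # big-endian year
    month, day, hour, minute, second = state[2:7]    # one byte each
    microsecond = (state[7] << 16) | (state[8] << 8) | state[9]
    return datetime(year, month, day, hour, minute, second, microsecond)

payload = b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'
print(decode_state_lenient(payload))
print(decode_state_lenient(payload + b'future data'))
```

Both calls decode to the same value, which is the point: a longer string from a later version would still be readable.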

For completeness: Protocol versions 2 and 1 use BINUNICODE on a
latin1-to-utf8 version of the byte string, with a similar "count of
bytes followed by a string" (though the count of bytes is of UTF-8
bytes). Protocol version 0 uses UNICODE, terminated by \n, and a literal
\n is represented by \\u000a. In all cases some extra data around the
value sets it up to call "codecs.encode(..., 'latin1')" upon unpickling.
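
The older protocols can be inspected the same way. This is a sketch assuming CPython 3; the helper name ops is illustrative. It reads the four-byte little-endian count that follows the BINUNICODE opcode and compares it with the UTF-8 length of the latin1-decoded state.

```python
import pickle
import pickletools
from datetime import datetime

payload = b'\x07\xdf\t\x0e\x15\x06*\x00\x00\x00'
dt = datetime(2015, 9, 14, 21, 6, 42)

def ops(data):
    """Return (opcode name, argument, offset) for every opcode in data."""
    return [(op.name, arg, pos) for op, arg, pos in pickletools.genops(data)]

p2 = pickle.dumps(dt, protocol=2)
p2_ops = ops(p2)

# The first BINUNICODE op carries the state, decoded as latin1.
_, text, pos = next(o for o in p2_ops if o[0] == 'BINUNICODE')
utf8_count = int.from_bytes(p2[pos + 1:pos + 5], 'little')

# Protocol 0 uses the newline-terminated UNICODE opcode instead.
p0_names = [op_name for op_name, _, _ in ops(pickle.dumps(dt, protocol=0))]

print(repr(text), utf8_count, len(text.encode('utf-8')))
print('UNICODE' in p0_names)
```

The count exceeds 10 here because the byte 0xdf takes two bytes in UTF-8.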

So have I shown you that I know enough about the pickle format to know
that permitting a longer string (and ignoring the extra bytes) would
have had zero impact on the pickle representation of values that did not
contain a longer string? I'd already figured out half of this before
writing my earlier post; I just assumed *you* knew enough that I
wouldn't have to show my work.

Extra credit:
    0: \x80 PROTO      3
    2: c    GLOBAL     'datetime datetime'
   21: q    BINPUT     0
   23: (    MARK
   24: M        BININT2    2014
   27: K        BININT1    9
   29: K        BININT1    14
   31: K        BININT1    21
   33: K        BININT1    6
   35: K        BININT1    42
   37: t        TUPLE      (MARK at 23)
   38: q    BINPUT     1
   40: R    REDUCE
   41: q    BINPUT     2
   43: .    STOP
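
The dump above corresponds to a pickle that can be assembled by hand, byte by byte. This is a sketch with the bytes derived from the opcodes and offsets listed; it calls the constructor with six integers instead of a state string.

```python
import pickle
from datetime import datetime

hand_built = (
    b'\x80\x03'               # PROTO 3
    b'cdatetime\ndatetime\n'  # GLOBAL 'datetime datetime'
    b'q\x00'                  # BINPUT 0
    b'('                      # MARK
    b'M\xde\x07'              # BININT2 2014 (little-endian)
    b'K\x09'                  # BININT1 9
    b'K\x0e'                  # BININT1 14
    b'K\x15'                  # BININT1 21
    b'K\x06'                  # BININT1 6
    b'K\x2a'                  # BININT1 42
    b't'                      # TUPLE (back to the MARK)
    b'q\x01'                  # BINPUT 1
    b'R'                      # REDUCE: call datetime(*tuple)
    b'q\x02'                  # BINPUT 2
    b'.'                      # STOP
)

result = pickle.loads(hand_built)
print(result)
```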
