From: Robert Kern
Newsgroups: comp.lang.python
Subject: Re: enhancement request: make py3 read/write py2 pickle format
Date: Wed, 10 Jun 2015 14:52:29 +0100

On 2015-06-10 13:08, Marko Rauhamaa wrote:
> Robert Kern:
>
>> By the very nature of the stated problem: serializing all language
>> objects. Being able to construct any object, including instances of
>> arbitrary classes, means that arbitrary code can be executed. All I
>> have to do is make a pickle file for an object that claims that its
>> constructor is shutil.rmtree().
>
> You can't serialize/migrate arbitrary objects. Consider open TCP
> connections, open files and other objects that extend outside the Python
> VM.

Yes, yes, but that's really beside the point. Yes, there are some objects for
which it doesn't even make sense to serialize.
But my point is that even in this slightly smaller set of objects that *can* be
serialized (and pickle currently does serialize), being able to serialize all of
them entails arbitrary code execution to deserialize them. To allow people to
write their own types that can be serialized, you have to let them specify
arbitrary callables that will do the reconstruction. If you whitelist the
possible reconstruction callables, you have greatly restricted the types that
can participate in the serialization system.

> Also objects hold references to each other, leading to a huge
> reference mesh.
>
> For example:
>
>    a.buddy = b
>    b.buddy = a
>    with open("a", "wb") as f: f.write(serialize(a))
>    with open("b", "wb") as f: f.write(serialize(b))
>
>    with open("a", "rb") as f: aa = deserialize(f.read())
>    with open("b", "rb") as f: bb = deserialize(f.read())
>    assert aa.buddy is bb

Yeah, no one expects that to work. For example, if I deserialize the same string
twice, you can't expect to get identical returned objects (as in,
"deserialize(pickle) is deserialize(pickle)"). However, pickle does correctly
handle fairly arbitrary reference graphs within the context of a single
serialization, which is the most that can be asked of a serialization system.
That isn't really a concern here.

>>> class A(object):
...     pass
...
>>> a = A()
>>> b = A()
>>> a.buddy = b
>>> b.buddy = a
>>> data = [a, b]
>>> data[0].buddy is data[1]
True
>>> data[1].buddy is data[0]
True
>>> import cPickle
>>> unpickled = cPickle.loads(cPickle.dumps(data))
>>> unpickled[0].buddy is unpickled[1]
True
>>> unpickled[1].buddy is unpickled[0]
True

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco
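
P.S. To make the point about reconstruction callables concrete, here is a
minimal sketch. It assumes Python 3's pickle module rather than the cPickle
used in the session above. The names Claim, WhitelistUnpickler and
restricted_loads are made up for illustration, and operator.mul stands in for
a dangerous callable such as shutil.rmtree. It shows that loading a pickle
calls whatever callable the pickle names, and that a whitelist applied in
find_class() refuses it.

```python
import io
import operator
import pickle


class Claim(object):
    """Hypothetical type that names its own reconstruction callable."""

    def __reduce__(self):
        # Tell pickle: "to rebuild me, call operator.mul(6, 7)".
        # operator.mul is a harmless stand-in; any importable callable
        # could be named here instead.
        return (operator.mul, (6, 7))


payload = pickle.dumps(Claim())

# Loading runs the named callable; the result is whatever it returns,
# not a Claim instance.
result = pickle.loads(payload)


class WhitelistUnpickler(pickle.Unpickler):
    """Unpickler that only resolves globals listed in `allowed`."""

    # Empty whitelist: only built-in data handled by pickle opcodes
    # (ints, strings, lists, dicts, ...) can be loaded.
    allowed = set()

    def find_class(self, module, name):
        if (module, name) in self.allowed:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(
            "global %s.%s is not whitelisted" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but with the whitelist applied."""
    return WhitelistUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

# Plain data needs no global lookups, so it still loads.
plain = restricted_loads(pickle.dumps([1, 2, 3]))
```

The whitelist makes loading safe exactly to the extent that it shuts
user-defined types out: every type that should round-trip has to be added to
`allowed` by hand, which is the restriction described above.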