Groups > comp.lang.python > #18873 > unrolled thread

stable object serialization to text file

Started by: Máté Koch <koch.mate@me.com>
First post: 2012-01-11 21:16 +0100
Last post: 2012-01-12 17:08 -0800
Articles: 2 (2 participants)

#18873 — stable object serialization to text file

From: Máté Koch <koch.mate@me.com>
Date: 2012-01-11 21:16 +0100
Subject: stable object serialization to text file
Message-ID: <mailman.4680.1326369724.27778.python-list@python.org>

Hello All,

I'm developing an app that stores its data in a file-system database. The data consists of large Python objects, mostly dicts, containing text and numbers. The easiest way to dump and load them would be pickle, but there is a problem: I want to keep the data in version control, and I would like to do so as efficiently as possible. Is it possible to force pickle to store otherwise unordered data (e.g. dictionaries) in a stable order, so that if I dump a large dict, change one tiny thing in it, and dump again, the diff between the old and the new file is minimal?

If pickle is not the best choice for me, can you suggest anything else? (If there isn't any solution for it so far, I will write the module of course, but first I'd like to look around and make sure it hasn't been created yet.)

Thanks,

Mate

#18888

From: K Richard Pixley <rich@noir.com>
Date: 2012-01-12 17:08 -0800
Message-ID: <UtLPq.61497$Ee3.49339@newsfe04.iad>
In reply to: #18873

On 1/11/12 12:16, Máté Koch wrote:
> Hello All,
>
> I'm developing an app that stores its data in a file-system database. The data consists of large Python objects, mostly dicts, containing text and numbers. The easiest way to dump and load them would be pickle, but there is a problem: I want to keep the data in version control, and I would like to do so as efficiently as possible. Is it possible to force pickle to store otherwise unordered data (e.g. dictionaries) in a stable order, so that if I dump a large dict, change one tiny thing in it, and dump again, the diff between the old and the new file is minimal?
>
> If pickle is not the best choice for me, can you suggest anything else? (If there isn't any solution for it so far, I will write the module of course, but first I'd like to look around and make sure it hasn't been created yet.)

JSON kinda sucks.  Try YAML.

If your data is simple enough, you can just write and read your own 
format.  Sort it first and you're golden.
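
A minimal sketch of such a hand-rolled format, assuming a flat dict whose keys and values are plain strings and numbers. The function names dumps_sorted and loads_sorted are made up for illustration. Each entry goes on its own line, sorted by key, so changing one value should touch only one line of the file:

```python
import ast


def dumps_sorted(data):
    """Serialize a flat dict to text, one (key, value) tuple per line, sorted by key."""
    # repr() escapes newlines inside strings, so every entry stays on one line.
    lines = [repr((key, data[key])) for key in sorted(data)]
    return "\n".join(lines) + "\n"


def loads_sorted(text):
    """Parse text produced by dumps_sorted back into a dict."""
    # ast.literal_eval only accepts Python literals, so it does not run arbitrary code.
    return dict(ast.literal_eval(line) for line in text.split("\n") if line)


if __name__ == "__main__":
    record = {"title": "first draft", "pages": 12, "price": 9.5}
    text = dumps_sorted(record)
    print(text, end="")
    print(loads_sorted(text) == record)
```

Nested dicts would need extra work, since repr() of an inner dict follows that dict's own insertion order rather than a sorted one.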

You might also try sorted dicts.  I don't know if those will come out of 
pickle any differently than regular dicts, but it's worth trying.
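
One way to try this out, as a rough sketch: it assumes Python 3.7 or later, where plain dicts keep insertion order, and the helper sorted_copy is a hypothetical name. It rebuilds every dict with its keys in sorted order before pickling and then compares the resulting byte strings. pickle does not promise canonical output, so treat this as an experiment rather than a guarantee, and keep in mind that most pickle protocols are binary and may diff poorly in version control even when the bytes are stable:

```python
import pickle


def sorted_copy(obj):
    """Recursively rebuild dicts so their keys are inserted in sorted order."""
    if isinstance(obj, dict):
        return {key: sorted_copy(obj[key]) for key in sorted(obj)}
    if isinstance(obj, list):
        return [sorted_copy(item) for item in obj]
    return obj


# Same contents, different insertion order.
a = {"b": 2, "a": 1}
b = {"a": 1, "b": 2}

if __name__ == "__main__":
    # Compare the raw pickles, then the pickles of the sorted copies.
    print(pickle.dumps(a) == pickle.dumps(b))
    print(pickle.dumps(sorted_copy(a)) == pickle.dumps(sorted_copy(b)))
```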

You can also write your own serialization routine on top of any of the
previously mentioned serializers.  If you sort during serialization, the
data will be sorted in the file on disk.
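
As a small illustration of sorting during serialization, here is a sketch using the standard-library json module, picked only because it needs no third-party package. The wrapper name dump_stable is hypothetical. sort_keys=True orders the keys of every dict, including nested ones, and indent=2 puts each entry on its own line so that a small change gives a small diff:

```python
import json


def dump_stable(obj):
    """Serialize obj to JSON text with sorted keys and one entry per line."""
    return json.dumps(obj, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


if __name__ == "__main__":
    before = dump_stable({"name": "Máté", "count": 3, "tags": {"b": 1, "a": 2}})
    after = dump_stable({"tags": {"a": 2, "b": 1}, "count": 4, "name": "Máté"})
    # Show which lines differ between the two dumps.
    for old, new in zip(before.splitlines(), after.splitlines()):
        if old != new:
            print("-", old)
            print("+", new)
```

The data has to be JSON-compatible for this to work: dict keys are written as strings, and tuples come back as lists.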

--rich
