| Newsgroups | comp.lang.python |
|---|---|
| Date | 2012-03-12 20:38 -0700 |
| References | <4F5E50F6.9070309@it.uu.se> <mailman.592.1331584145.3037.python-list@python.org> |
| Subject | Re: Fast file data retrieval? |
| From | Jon Clements <joncle@googlemail.com> |
| Message-ID | <mailman.599.1331609909.3037.python-list@python.org> (permalink) |
On Monday, 12 March 2012 20:31:35 UTC, MRAB wrote:
> On 12/03/2012 19:39, Virgil Stokes wrote:
> > I have a rather large ASCII file that is structured as follows
> >
> > header line
> > 9 nonblank lines with alphanumeric data
> > header line
> > 9 nonblank lines with alphanumeric data
> > ...
> > ...
> > ...
> > header line
> > 9 nonblank lines with alphanumeric data
> > EOF
> >
> > where, a data set contains 10 lines (header + 9 nonblank) and there can
> > be several thousand
> > data sets in a single file. In addition, *each header has a* *unique ID
> > code*.
> >
> > Is there a fast method for the retrieval of a data set from this large
> > file given its ID code?
> >
> Probably the best solution is to put it into a database. Have a look at
> the sqlite3 module.
>
> Alternatively, you could scan the file, recording the ID and the file
> offset in a dict so that, given an ID, you can seek directly to that
> file position.

I would have a look at either bsddb, Tokyo (or Kyoto) Cabinet or hamsterdb. If it's really going to get large and needs a full blown server, maybe MongoDB/redis/hadoop...
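For reference, the dict-of-offsets idea MRAB describes could look roughly like the sketch below. It is only an illustration under assumptions the thread does not confirm: the header format is never shown, so `header_id` (a hypothetical helper) simply takes the first whitespace-separated token of the header as the ID, and every data set is assumed to be exactly 10 lines with no blank lines in between. The function names `build_index` and `fetch` are made up for this example.

```python
LINES_PER_SET = 10  # header + 9 data lines, as described in the original question


def header_id(header_line):
    """Hypothetical parser: assume the ID is the first token of the header line."""
    return header_line.split()[0]


def build_index(path):
    """Scan the file once, mapping each ID to the byte offset of its header line."""
    index = {}
    # Binary mode keeps tell()/seek() offsets simple and exact.
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            header = f.readline()
            if not header.strip():
                break  # end of file (or a trailing blank line)
            index[header_id(header.decode("ascii"))] = offset
            # Skip the 9 data lines that follow the header.
            for _ in range(LINES_PER_SET - 1):
                f.readline()
    return index


def fetch(path, index, set_id):
    """Seek straight to the data set for set_id and return its 10 lines."""
    with open(path, "rb") as f:
        f.seek(index[set_id])
        return [
            f.readline().decode("ascii").rstrip("\r\n")
            for _ in range(LINES_PER_SET)
        ]
```

The index costs one full pass over the file, after which each lookup is a single seek plus ten `readline()` calls. If the index has to survive between runs, that is where sqlite3 or one of the key-value stores mentioned above would come in.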
Re: Fast file data retrieval? MRAB <python@mrabarnett.plus.com> - 2012-03-12 20:31 +0000
Re: Fast file data retrieval? Jon Clements <joncle@googlemail.com> - 2012-03-12 20:38 -0700
Re: Fast file data retrieval? Jorgen Grahn <grahn+nntp@snipabacken.se> - 2012-03-13 20:44 +0000
Re: Fast file data retrieval? Stefan Behnel <stefan_ml@behnel.de> - 2012-03-21 17:32 +0100