| Header | Value |
|---|---|
| From | "Prasad, Ramit" <ramit.prasad@jpmorgan.com> |
| To | "python-list@python.org" <python-list@python.org> |
| Subject | RE: Fast file data retrieval? |
| Date | Mon, 12 Mar 2012 21:09:05 +0000 |
| References | <4F5E50F6.9070309@it.uu.se> <4F5E5D27.4010403@mrabarnett.plus.com> |
| In-Reply-To | <4F5E5D27.4010403@mrabarnett.plus.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.596.1331586564.3037.python-list@python.org> |
> > header line
> > 9 nonblank lines with alphanumeric data
> > header line
> > 9 nonblank lines with alphanumeric data
> > ...
> > ...
> > ...
> > header line
> > 9 nonblank lines with alphanumeric data
> > EOF
> >
> > where, a data set contains 10 lines (header + 9 nonblank) and there can
> > be several thousand data sets in a single file. In addition, *each
> > header has a unique ID code*.
>
> Alternatively, you could scan the file, recording the ID and the file
> offset in a dict so that, given an ID, you can seek directly to that
> file position.

If you can grep for the header lines, you can retrieve the headers and their line numbers for seeking. grep is (probably) faster than Python, so I would do it in two steps:

1. grep > temp.txt
2. python: check whether the ID is in temp.txt, then process that data set

Ramit
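
For reference, the dict-of-offsets idea quoted above could look roughly like the sketch below. It is only an illustration under assumptions the thread does not confirm: every data set is exactly 10 lines with no blank lines in between, and the ID is the first whitespace-separated token of the header line (`parse_id` is a hypothetical helper, so adjust it to the real header layout). The file is opened in binary mode so that `tell()` returns plain byte offsets. Note that `seek()` needs byte offsets rather than line numbers; if you go the grep route, GNU grep's `-b` option prints byte offsets, while `-n` prints line numbers.

```python
RECORD_LINES = 10  # header + 9 data lines, as described in the original question


def parse_id(header_line: bytes) -> bytes:
    # ASSUMPTION: the ID is the first whitespace-separated token of the header.
    return header_line.split()[0]


def build_index(f) -> dict:
    """Scan the file once, mapping each header ID to its byte offset."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()          # position of the header about to be read
        header = f.readline()
        if not header.strip():     # EOF (or a trailing blank line): stop
            break
        index[parse_id(header)] = offset
        for _ in range(RECORD_LINES - 1):
            f.readline()           # skip the 9 data lines
    return index


def read_record(f, index: dict, record_id: bytes) -> list:
    """Seek straight to a data set and return its 10 lines."""
    f.seek(index[record_id])
    return [f.readline() for _ in range(RECORD_LINES)]
```

Typical use would be `with open(path, "rb") as f: index = build_index(f)` followed by `read_record(f, index, b"SOME_ID")`, where `path` and `SOME_ID` are placeholders. The index costs one pass over the file; after that, each lookup is a single seek plus ten reads.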