| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!npeer.de.kpn-eurorings.net!npeer-ng0.de.kpn-eurorings.net!news.tele.dk!news.tele.dk!small.news.tele.dk!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <rosuav@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| MIME-Version | 1.0 |
| X-Received | by 10.58.80.7 with SMTP id n7mr3175253vex.23.1376691260904; Fri, 16 Aug 2013 15:14:20 -0700 (PDT) |
| In-Reply-To | <1efhl8i0dmr9b.15q8opn6p0cj3.dlg@40tude.net> |
| References | <1efhl8i0dmr9b.15q8opn6p0cj3.dlg@40tude.net> |
| Date | Fri, 16 Aug 2013 23:14:20 +0100 |
| Subject | Re: Proper use of the codecs module. |
| From | Chris Angelico <rosuav@gmail.com> |
| To | python-list@python.org |
| Content-Type | text/plain; charset=ISO-8859-1 |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.7.1376691263.23369.python-list@python.org> (permalink) |
| Lines | 13 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1376691263 news.xs4all.nl 15867 [2001:888:2000:d::a6]:48979 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:52610 |

On Fri, Aug 16, 2013 at 3:02 PM, Andrew <andrew@invalid.invalid> wrote:
> I have a mixed binary/text file[0], and the text portions use a radically
> nonstandard character set. I want to read them easily given information
> about the character encoding and an offset for the beginning of a string.

To add to all the information already given: Is the file small enough
to comfortably fit into memory? If so, you'll find it a LOT easier to
play with strings in RAM than files on disk. Even if not, you may find
a lot of tasks simplified by just reading a kay or a meg in and then
working within that. That spares you the fiddliness of read(1) all the
time, at the expense of potentially reading more than you need.

ChrisA
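
A rough sketch of the suggestion above, not anything from the original thread: slice the string out of an in-memory bytes object, or seek to the offset and read one chunk instead of calling read(1) in a loop. The function names, the NUL terminator, the 1024-byte chunk size, and the use of cp037 as a stand-in for the nonstandard character set are all assumptions made for illustration.

```python
def read_string_at(data, offset, encoding, terminator=b"\x00"):
    """Decode the string starting at `offset` in the bytes object `data`.

    Assumes (hypothetically) that strings end at `terminator`; if none is
    found, everything up to the end of `data` is decoded.
    """
    end = data.find(terminator, offset)
    if end == -1:
        end = len(data)
    return data[offset:end].decode(encoding)


def read_string_from_file(path, offset, encoding, chunk_size=1024,
                          terminator=b"\x00"):
    """Same idea for a file too big to load whole: seek, read one chunk,
    then work within that chunk.

    A string longer than `chunk_size` is cut short, so pick a chunk size
    comfortably larger than the longest string you expect.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read(chunk_size)
    return read_string_at(chunk, 0, encoding, terminator)
```

If the file does fit in memory, the first function is all you need: read everything once with `open(path, "rb").read()` and call `read_string_at(data, offset, "cp037")` for each offset. For a truly custom character set, the encoding name would have to refer to a codec you have registered yourself.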
Proper use of the codecs module. Andrew <andrew@invalid.invalid> - 2013-08-16 10:02 -0400
Re: Proper use of the codecs module. Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-08-16 19:12 +0000
Re: Proper use of the codecs module. Andrew <andrew@invalid.invalid> - 2013-08-16 16:16 -0400
Re: Proper use of the codecs module. Chris Angelico <rosuav@gmail.com> - 2013-08-16 23:14 +0100