


Re: Help with python-list archives

Date 2012-01-06 01:27 +0000
From MRAB <python@mrabarnett.plus.com>
Subject Re: Help with python-list archives
References <78484055-dd01-4237-9217-9eb038fc744f@p16g2000yqd.googlegroups.com> <14749754.624.1325806776674.JavaMail.geo-discussion-forums@vbgw2> <8f3b98e1-3b21-4f06-8456-0a555a7ee523@u32g2000yqe.googlegroups.com> <CALwzidn5zJVYPsViSx-SV25Ef5qSsY0d8bX=6rcdvPBu8UydZw@mail.gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.4464.1325813187.27778.python-list@python.org>



On 06/01/2012 00:10, Ian Kelly wrote:
> On Thu, Jan 5, 2012 at 4:52 PM, random joe<pywin32@gmail.com>  wrote:
>>  Sure. Take the most recent file as example. "2012 - January.txt.gz".
>>  If you use the python doc example this is the result. If i use "r" or
>>  "rb" the result is the same.
>>
>>>>>  import gzip
>>>>>  f1 = gzip.open('C:\\2012-January.txt.gz', 'rb')
>>>>>  data = f1.read()
>>>>>  data[:100]
>>  '\x1f\x8b\x08\x08x\n\x05O\x02\xff/srv/mailman/archives/private/python-
>>  list/2012-January.txt\x00\xec\xbdy\x7f\xdb\xc6\xb50\xfcw\xf0)\xa6z|+
>>  \xaa!!l\xdc\x14[\x8b-;V\xe2-\x92\x12'
>>>>>  f2 = gzip.open('C:\\2012-January.txt.gz', 'r')
>>>>>  data = f2.read()
>>>>>  data[:100]
>>  '\x1f\x8b\x08\x08x\n\x05O\x02\xff/srv/mailman/archives/private/python-
>>  list/2012-January.txt\x00\xec\xbdy\x7f\xdb\xc6\xb50\xfcw\xf0)\xa6z|+
>>  \xaa!!l\xdc\x14[\x8b-;V\xe2-\x92\x12'
>>
>>  The docs and google provide no clear answer. I even tried 7zip and
>>  ended up with nothing but gibberish characters. There must be levels
>>  of compression or something. Why could they not simply use the tar
>>  format? Is there anywhere else one can download the archives?
>
> Interesting.  I tried this on a Linux system using both gunzip and
> your code, and both worked fine to extract that file.  I also tried
> your code on a Windows system, and I get the same result that you do.
> This appears to be a bug in the gzip module under Windows.
>
> I think there may be something peculiar about the archive files that
> the module is not handling correctly.  If I gunzip the file locally
> and then gzip it again before trying to open it in Python, then
> everything seems to be fine.

I've found that if I gunzip it twice (gunzip it and then gunzip the
result) using the gzip module I get the text file.
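
The output quoted above is consistent with this: the bytes returned by the first read() begin with '\x1f\x8b', which is the gzip magic number, so the result of the first pass looks like another gzip stream. Below is a minimal sketch of the double gunzip, assuming Python 3.2 or later (for gzip.decompress). The helper name gunzip_fully and the max_layers cap are illustrative and not from this thread.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def gunzip_fully(data, max_layers=5):
    """Decompress repeatedly while the data still looks like gzip.

    Returns (decompressed_bytes, number_of_layers_removed).
    max_layers is an arbitrary safety cap (an assumption, not a gzip rule).
    """
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers


# In-memory stand-in for an archive that was gzipped twice.
sample = gzip.compress(gzip.compress(b"From python-list archive\n"))
text, layers = gunzip_fully(sample)
print(layers, text)
```

For the real archive, the bytes would come from reading the file in plain binary mode, e.g. open(r'C:\2012-January.txt.gz', 'rb').read(), and then be passed to gunzip_fully.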



