| Path | csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!newsfeed.xs4all.nl!newsfeed5.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <andipersti@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| X-Spam-Status | OK 0.001 |
| Date | Mon, 27 Aug 2012 15:52:12 +0200 |
| From | Andreas Perstinger <andipersti@gmail.com> |
| User-Agent | Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0 |
| MIME-Version | 1.0 |
| To | python-list@python.org |
| Subject | Re: Python list archives double-gzipped? |
| References | <503AD027.7000004@tim.thechases.com> |
| In-Reply-To | <503AD027.7000004@tim.thechases.com> |
| Content-Type | text/plain; charset=ISO-8859-1; format=flowed |
| Content-Transfer-Encoding | 7bit |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.12 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3875.1346075537.4697.python-list@python.org> (permalink) |
| Lines | 56 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1346075537 news.xs4all.nl 6949 [2001:888:2000:d::a6]:60872 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:27983 |
On 27.08.2012 03:40, Tim Chase wrote:
> So it looks like some python-list@ archiving process is double
> gzip'ing the archives. Can anybody else confirm this and get the
> info the right people?
In January, "random joe" noticed the same problem[1].
I think Anssi Saari[2] was right in saying that there is something
wrong in the browser or server setup, because I see the same
behaviour with Firefox, Chromium, wget and curl.
$ ll *July*
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 chromium_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 27 13:41 curl_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 firefox_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 2 03:27 wget_2012-July.txt.gz
The browsers get a double-gzipped file (size 747850) whereas the
download utilities get a normal gzipped file (size 748041).
After looking at the HTTP request and response headers, I noticed that
the browsers accept compressed data ("Accept-Encoding: gzip, deflate")
whereas wget/curl by default don't. After adding that header to
wget/curl, they get the same double-gzipped file as the browsers do:
$ ll *July*
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 chromium_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 27 13:41 curl_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:40 curl_encoding_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 27 13:48 firefox_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 748041 Aug 2 03:27 wget_2012-July.txt.gz
-rw-rw-r-- 1 andreas andreas 747850 Aug 2 03:27 wget_encoding_2012-July.txt.gz
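For anyone who would rather repeat the comparison from Python than from wget/curl, here is a minimal sketch using only the standard library. The helper names (build_request, fetch_raw) are made up for illustration, and the archive URL is an assumption pieced together from the List-Archive header and the file name above. urllib.request does not undo a Content-Encoding on its own, so the bytes returned are what the server actually sent.

```python
import urllib.request

# Assumed URL: List-Archive base plus the monthly file name used above.
ARCHIVE_URL = "http://mail.python.org/pipermail/python-list/2012-July.txt.gz"


def build_request(url, accept_gzip):
    """Build a GET request, optionally advertising gzip/deflate support."""
    headers = {}
    if accept_gzip:
        # The same header the browsers send.
        headers["Accept-Encoding"] = "gzip, deflate"
    return urllib.request.Request(url, headers=headers)


def fetch_raw(url, accept_gzip):
    """Return (Content-Encoding, Content-Type, body) exactly as received."""
    with urllib.request.urlopen(build_request(url, accept_gzip)) as resp:
        return (
            resp.headers.get("Content-Encoding"),
            resp.headers.get("Content-Type"),
            resp.read(),
        )
```

Calling fetch_raw(ARCHIVE_URL, False) and fetch_raw(ARCHIVE_URL, True) and comparing the body lengths and the two header values should show whether the server still behaves as described; I have not checked what it returns today.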
I think the following is happening:
If you send the "Accept-Encoding: gzip, deflate" header, the server will
gzip the file a second time (which is arguably unnecessary) and respond
with "Content-Encoding: gzip" and "Content-Type: application/x-gzip"
(which is IMHO correct according to RFC 2616, sections 14.11 and 14.17[3]).
But because many servers apparently don't set correct headers, the
default behaviour of most browsers nowadays is to ignore the
content-encoding for gzip files (application/x-gzip - see the bug reports
for Firefox[4] and Chromium[5]) and not to uncompress the outer layer,
leading to a double-gzipped file in this case.
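If you already have one of the affected downloads, a small sketch like the following can tell you how many gzip layers it has and recover the plain text. The function name strip_gzip_layers is hypothetical. It relies on the gzip magic bytes (0x1f 0x8b) at the start of the data and assumes the innermost payload is plain text that does not itself start with those two bytes.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def strip_gzip_layers(data):
    """Decompress repeatedly while the data still looks like gzip.

    Returns (number_of_layers_removed, innermost_bytes).
    """
    layers = 0
    while data[:2] == GZIP_MAGIC:
        try:
            data = gzip.decompress(data)
        except (OSError, EOFError, zlib.error):
            # Magic bytes matched but this is not a valid gzip stream:
            # stop and return what we have so far.
            break
        layers += 1
    return layers, data
```

For example, reading firefox_2012-July.txt.gz in binary mode and passing its contents to strip_gzip_layers should report two layers if the explanation above is right, and one layer for the plain wget/curl downloads.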
Bye, Andreas
[1] http://mail.python.org/pipermail/python-list/2012-January/617983.html
[2] http://mail.python.org/pipermail/python-list/2012-January/618211.html
[3] http://www.ietf.org/rfc/rfc2616
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=610679#c5
[5] http://code.google.com/p/chromium/issues/detail?id=47951#c9