Subject: CRC-checksum failed in gzip
From: andrea crotti
Date: Wed, 1 Aug 2012 11:39:57 +0100
To: python-list
Newsgroups: comp.lang.python

We're having some really obscure problems with gzip.

There is a program running with python2.7 on a 2.6.18-128.el5xen (Red Hat, I think) kernel.

The program does the following:

    import gzip

    if filename == 'out2.txt':
        out2 = open('out2.txt')
    elif filename == 'out2.txt.gz':
        # the .gz case has to go through gzip for a CRC check to happen at all
        out2 = gzip.open('out2.txt.gz')

    text = out2.read()
    out2.close()

Very simple, right? But sometimes we get a checksum error.

Reading the code, I found the following:

- The CRC is computed against the whole file and is stored at the end of it (in the last 8 bytes).
- After the CRC there is the \0000 marker for the EOF.
- readline() doesn't trigger the checksum generation in the beginning, but only when the EOF is reached.
- Until a file is flushed or closed, you can't read the new content in it.

The problem is that we can't reproduce it: doing it manually on the same files works perfectly, and the same files sometimes work and sometimes don't.

The files are on a shared NFS drive. I'm starting to think that it's a network/filesystem problem, which might truncate the file, adding an EOF before the end and thus making the checksum fail.

But is that possible? Or what else could it be?
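One way to test the truncation theory is to compare what the gzip trailer says with what can actually be decompressed. The following is only a sketch under some assumptions: the helper names `read_gzip_trailer` and `check_gzip` are made up for illustration, the file is assumed to contain a single gzip member, and the code is written for Python 3 (on python2.7 a CRC failure should surface as IOError, which the except clause also covers).

```python
import gzip
import os
import struct
import tempfile
import zlib


def read_gzip_trailer(path):
    """Return (crc32, isize) from the last 8 bytes of a single-member gzip file.

    Hypothetical helper: the trailer is two little-endian 32-bit integers,
    the CRC32 of the uncompressed data and its length modulo 2**32.
    """
    with open(path, "rb") as f:
        f.seek(-8, os.SEEK_END)
        crc32, isize = struct.unpack("<II", f.read(8))
    return crc32, isize


def check_gzip(path):
    """Decompress the whole file and return (ok, detail).

    Hypothetical helper: a short or corrupted file is reported
    instead of raising, so it can be called in a loop and logged.
    """
    try:
        with gzip.open(path, "rb") as f:
            data = f.read()
    except (IOError, EOFError, zlib.error) as exc:
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, "%d bytes" % len(data)


if __name__ == "__main__":
    # Build a small gzip file, then simulate a short read by cutting it in half.
    payload = b"".join(b"line %d\n" % i for i in range(1000))
    tmpdir = tempfile.mkdtemp()
    good = os.path.join(tmpdir, "out2.txt.gz")
    with gzip.open(good, "wb") as f:
        f.write(payload)
    with open(good, "rb") as f:
        raw = f.read()
    cut = os.path.join(tmpdir, "truncated.gz")
    with open(cut, "wb") as f:
        f.write(raw[: len(raw) // 2])

    print(read_gzip_trailer(good))
    print(check_gzip(good))
    print(check_gzip(cut))
```

Calling `check_gzip` on the NFS path at the moment of a failure, and logging `os.path.getsize` alongside it, would show whether the reader saw a shorter file than the writer produced. That would point at the file still being written or at NFS caching rather than at gzip itself.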