


Re: program to generate data helpful in finding duplicate large files

From Ian Kelly <ian.g.kelly@gmail.com>
Date Fri, 19 Sep 2014 11:20:14 -0600
Subject Re: program to generate data helpful in finding duplicate large files
To Python <python-list@python.org>
Newsgroups comp.lang.python
Message-ID <mailman.14150.1411147263.18130.python-list@python.org> (permalink)



On Fri, Sep 19, 2014 at 12:45 AM, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Sep 19, 2014 at 3:45 PM, Steven D'Aprano wrote:
>>     s = '\0'.join([thishost, md5sum, dev, ino, nlink, size, file_path])
>>     print s
>
> That won't work on its own; several of the values are integers. So
> either they need to be str()'d or something in the output system needs
> to know to convert them to strings. I'm inclined to the latter option,
> which simply means importing print_function from __future__ and
> setting sep=chr(0).

Personally, I lean toward converting them with map in this case:

    s = '\0'.join(map(str, [thishost, md5sum, dev, ino, nlink, size,
                            file_path]))
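
For comparison, here is a rough sketch of the two options side by side. It assumes Python 3, where print is already a function (on Python 2 you would need the __future__ import Chris mentions). The field values are made up for illustration and are not from the original script; the only point is that some are strings and some are integers.

```python
import io

# Hypothetical sample values standing in for
# thishost, md5sum, dev, ino, nlink, size, file_path.
fields = ['myhost', 'd41d8cd98f00b204e9800998ecf8427e',
          2049, 131073, 1, 0, '/tmp/empty']

# Option 1: convert every field with map(str, ...), then join on NUL.
joined = '\0'.join(map(str, fields))

# Option 2: let print() do the conversion and use NUL as the separator.
# Output goes to an in-memory buffer here so the two can be compared.
buf = io.StringIO()
print(*fields, sep=chr(0), file=buf)
printed = buf.getvalue()

# Both produce the same record; print() just appends a newline.
assert printed == joined + '\n'
```

Joining the raw list without the str conversion raises TypeError as soon as it reaches the first integer, which is the problem Chris pointed out.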


