| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.mixmin.net!aioe.org!news.stack.nl!newsfeed.xs4all.nl!newsfeed1.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| In-Reply-To | <CALDD_=n0mftV0TFHAFiACgyyDSijE-fR7PiO3O94QYhk+5f6Ew@mail.gmail.com> |
| References | <CALDD_=n0mftV0TFHAFiACgyyDSijE-fR7PiO3O94QYhk+5f6Ew@mail.gmail.com> |
| Date | Fri, 19 Sep 2014 23:59:34 +1000 |
| Subject | Re: program to generate data helpful in finding duplicate large files |
| From | Chris Angelico <rosuav@gmail.com> |
| Cc | "python-list@python.org" <python-list@python.org> |
| Content-Type | text/plain; charset=UTF-8 |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.14147.1411135178.18130.python-list@python.org> (permalink) |
On Fri, Sep 19, 2014 at 11:32 PM, David Alban <extasia@extasia.org> wrote:
> thanks for the responses. i'm having quite a good time learning python.
Awesome! But while you're at it, you may want to consider learning
English on the side; capitalization does make your prose more
readable. Also, it makes you look careless - you appear not to care
about your English, so it's logical to expect that you may not care
about your Python either. That may be completely false, but it's still
the impression you're creating.
> On Thu, Sep 18, 2014 at 11:45 AM, Chris Kaynor <ckaynor@zindagigames.com>
> wrote:
>>
>> Additionally, you may want to specify binary mode by using open(file_path,
>> 'rb') to ensure platform-independence ('r' uses Universal newlines, which
>> means on Windows, Python will convert "\r\n" to "\n" while reading the
>> file). Additionally, some platforms will treat binary files differently.
>
> would it be good to use 'rb' all the time?
Only if you're reading binary files. In the program you're writing
here, yes; you want binary mode, since a checksum has to be computed
over the file's exact bytes.
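To make the difference concrete, here is a minimal sketch, assuming Python 3 and a throwaway temporary file (the file contents are made up for illustration). It reads the same bytes once in binary mode and once in text mode:

```python
import os
import tempfile

# Write raw bytes containing one Windows-style line ending (illustrative data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"line one\r\nline two\n")
    path = tmp.name

try:
    # Binary mode: the bytes come back exactly as stored on disk.
    with open(path, "rb") as f:
        raw = f.read()
    # Text mode: universal newlines translates "\r\n" to "\n" while reading.
    with open(path, "r", encoding="ascii") as f:
        text = f.read()
finally:
    os.remove(path)

print(repr(raw))   # b'line one\r\nline two\n'
print(repr(text))  # 'line one\nline two\n'
```

The text-mode read is one character shorter than the file, which is exactly the kind of silent change you don't want when hashing file contents.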
> if you omit the exit statement it in this example, and
> $report_mode is not set, your shell program will give a non-zero return code
> and appear to have terminated with an error. in shell the last expression
> evaluated determines the return code to the os.
IMO that's a terrible misfeature. If you actually want the return
value to be propagated, you should have to say so - something like:
    #!/bin/sh
    run_program
    exit $?
Fortunately, Python isn't like that.
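A small sketch of that behaviour, assuming Python 3.5+ for subprocess.run (the report_mode name just mirrors the shell variable quoted above and is otherwise hypothetical). The first child script ends on an expression that evaluates to False; the second asks for a status explicitly:

```python
import subprocess
import sys

# Child script whose last expression is falsy. Python does not turn that
# into an exit status; a script that runs to completion exits with 0.
implicit = subprocess.run(
    [sys.executable, "-c", "report_mode = False\nreport_mode and print('report')"]
)

# To propagate a status, you have to say so with sys.exit().
explicit = subprocess.run(
    [sys.executable, "-c", "import sys\nsys.exit(3)"]
)

print(implicit.returncode, explicit.returncode)  # 0 3
```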
> style question: if there is only one, possibly short statement in a block,
> do folks usually move it up to the line starting the block?
>
> if not S_ISREG( mode ) or S_ISLNK( mode ):
>     return
>
> vs.
>
> if not S_ISREG( mode ) or S_ISLNK( mode ): return
>
> or even:
>
> with open( file_path, 'rb' ) as f: md5sum = md5_for_file( file_path )
Only if it's really short AND it makes very good sense that way. Some
people would say "never". In the first case, I might do it, but not
the second. (The with block isn't needed there at all, though:
md5_for_file opens and closes the file itself, so you don't need to
open it redundantly before calling.)
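For illustration, a function of that shape might look like the sketch below. The original md5_for_file is not shown in this message, so the body and the block_size parameter are assumptions; only the name and the "opens and closes the file itself" behaviour come from the thread.

```python
import hashlib


def md5_for_file(path, block_size=1 << 20):
    """Return the hex MD5 digest of the file at `path` (hypothetical sketch)."""
    digest = hashlib.md5()
    # The function owns the file handle: opened in binary mode here,
    # closed automatically when the with block ends.
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns b"" at end of file,
        # so large files never have to fit in memory at once.
        for chunk in iter(lambda: f.read(block_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With something like this, the caller just writes md5sum = md5_for_file(file_path) and never opens the file on its own.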
ChrisA