Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.python > #102642
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: A sets algorithm |
| Date | 2016-02-07 18:20 -0600 |
| Message-ID | <mailman.82.1454892109.2317.python-list@python.org> (permalink) |
| References | <n98e0f$15lj$1@gioia.aioe.org> <mailman.81.1454885914.2317.python-list@python.org> <n98m3o$1h8k$1@gioia.aioe.org> |
On 2016-02-08 00:05, Paulo da Silva wrote: > Às 22:17 de 07-02-2016, Tim Chase escreveu: >> all_files = list(generate_MyFile_objects()) >> interesting = [ >> (my_file1, my_file2) >> for i, my_file1 >> in enumerate(all_files, 1) >> for my_file2 >> in all_files[i:] >> if my_file1 == my_file2 >> ] > > "my_file1 == my_file2" can be implemented into MyFile class taking > advantage of caching sizes (if different files are different), > hashes or even content (for small files) or file headers (first n > bytes). However this seems to have a problem: > all_files: a b c d e ... > If a==b then comparing b with c,d,e is useless. Depends on what the OP wants to have happen if more than one input file is equal. I.e., a == b == c. Does one just want "a has duplicates" (and optionally "and here's one of them"), or does one want "a == b", "a == c" and "b == c" in the output? > Another solution I thought of, could be defining some methods (I > still don't know which ones) in MyFile so that I could use sets > intersection. Would this one be a faster solution? Adding __hash__ would allow for the set operations, but would require (as ChrisA points out) knowing how to create a hash function that encompasses the information you want to compare. -tkc
Back to comp.lang.python | Previous | Next — Previous in thread | Next in thread | Find similar | Unroll thread
A sets algorithm Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-07 21:46 +0000
Re: A sets algorithm Chris Angelico <rosuav@gmail.com> - 2016-02-08 08:58 +1100
Re: A sets algorithm Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2016-02-07 22:03 +0000
Re: A sets algorithm Tim Chase <python.list@tim.thechases.com> - 2016-02-07 16:17 -0600
Re: A sets algorithm Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-08 00:05 +0000
Re: A sets algorithm Tim Chase <python.list@tim.thechases.com> - 2016-02-07 18:20 -0600
Re: A sets algorithm Cem Karan <cfkaran2@gmail.com> - 2016-02-07 20:07 -0500
Re: A sets algorithm Paulo da Silva <p_s_d_a_s_i_l_v_a_ns@netcabo.pt> - 2016-02-08 02:22 +0000
Re: A sets algorithm Random832 <random832@fastmail.com> - 2016-02-08 09:49 -0500
Re: A sets algorithm Chris Angelico <rosuav@gmail.com> - 2016-02-09 02:11 +1100
Re: A sets algorithm Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-09 15:13 +1100
Re: A sets algorithm Chris Angelico <rosuav@gmail.com> - 2016-02-09 15:27 +1100
Re: A sets algorithm Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2016-02-09 17:48 +1300
csiph-web