| Path | csiph.com!weretis.net!feeder9.news.weretis.net!panix!.POSTED.panix5.panix.com!qz!not-for-mail |
|---|---|
| From | Eli the Bearded <*@eli.users.panix.com> |
| Newsgroups | comp.os.linux.misc |
| Subject | the "same image" problem |
| Date | Sun, 20 Jul 2025 19:58:30 -0000 (UTC) |
| Organization | Some absurd concept |
| Message-ID | <eli$2507201536@qaz.wtf> |
| Injection-Date | Sun, 20 Jul 2025 19:58:30 -0000 (UTC) |
| Injection-Info | reader1.panix.com; posting-host="panix5.panix.com:166.84.1.5"; logging-data="6397"; mail-complaints-to="abuse@panix.com" |
| User-Agent | Vectrex rn 2.1 (beta) |
| X-Liz | It's actually happened, the entire Internet is a massive game of Redcode |
| X-Motto | "Erosion of rights never seems to reverse itself." -- kenny@panix |
| X-US-Congress | Moronic Fucks. |
| X-Attribution | EtB |
| XFrom | is a real address |
| Encrypted | double rot-13 |
| Xref | csiph.com comp.os.linux.misc:69793 |
I have tens of thousands of photos, mostly mine, spanning decades. During that time there has been a lot of opportunity for images to get into my collection in different ways. Like, I take a photo, resize it to post on a website, then download an archive of my activity from that website a few years later and now I maybe have three copies of the image, each with a different MD5/SHA1/whatever hash:

1. My original
2. My resized version
3. The re-encoded version from the website archive

Or I have an image from a backup of my phone, which I then later changed the tags on, so the exif data differs. (These I can _usually_ identify by filename matches. But some have filenames too generic for that to work.)

Or I have a physical photo that I have scanned from both a print and from the negative at different times.

Or I have a photo I shared with family and then they sent it back a few years later as a reminder, with it getting re-encoded each time.

I would like a tool that can scan my collection and easily help me find visually similar images which may not be exactly pixel for pixel identical, and for 100% sure are not byte for byte identical on disk.

It's been about ten years since I last looked for such a tool and I wasn't really happy with the ones for Linux back then. Best I remember was "Perceptual Hash" ( https://www.phash.org/ -- last release 2013 ). The output was a number, but it could only compare images pairwise, which doesn't scale well.

Anything people like these days?

Elijah
------
has not tried using phash in a long time
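To illustrate the scaling concern with pairwise comparison, here is a minimal sketch of the general "hash each image once, then bucket the hashes" idea. It is not how pHash or any particular tool works: it uses a simple difference hash (dHash) instead of a DCT-based perceptual hash, and it assumes each image has already been decoded into rows of 0-255 grayscale integers (decoding is left out). The names `shrink`, `dhash`, `hamming`, and `similar_pairs` are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def shrink(gray, out_w, out_h):
    """Downscale a grayscale image (rows of ints) by averaging pixel blocks."""
    h, w = len(gray), len(gray[0])
    out = []
    for j in range(out_h):
        y0 = j * h // out_h
        y1 = max((j + 1) * h // out_h, y0 + 1)
        row = []
        for i in range(out_w):
            x0 = i * w // out_w
            x1 = max((i + 1) * w // out_w, x0 + 1)
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(gray, size=8):
    """Difference hash: one bit per 'is this cell brighter than its right neighbour'.

    With size=8 the image is shrunk to 9x8 cells, giving a 64-bit integer.
    """
    bits = 0
    for row in shrink(gray, size + 1, size):
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | int(left > right)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def similar_pairs(hashes, max_dist=3, bits=64):
    """Return sorted (name, name) pairs whose hashes differ by <= max_dist bits.

    Each hash is cut into max_dist + 1 bands. Two hashes that differ in at
    most max_dist bits must agree exactly on at least one band (pigeonhole),
    so only images sharing a band value are compared, not every pair.
    """
    bands = max_dist + 1
    width = bits // bands
    mask = (1 << width) - 1
    buckets = defaultdict(set)
    for name, h in hashes.items():
        for b in range(bands):
            buckets[(b, (h >> (b * width)) & mask)].add(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            if hamming(hashes[a], hashes[b]) <= max_dist:
                pairs.add((a, b))
    return sorted(pairs)
```

In use, one would compute `dhash` once per file, keep the results in a dict keyed by filename, and pass that dict to `similar_pairs`. The `max_dist` threshold of 3 is an arbitrary starting point, and heavily cropped or rotated copies would likely need a different kind of hash.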
the "same image" problem Eli the Bearded <*@eli.users.panix.com> - 2025-07-20 19:58 +0000
Re: the "same image" problem Lawrence D'Oliveiro <ldo@nz.invalid> - 2025-07-20 23:38 +0000