


Re: Generate unique ID for URL

References <0692e6a2-343c-4eb0-be57-fe5c815efb99@googlegroups.com> <roy-862116.20390413112012@news.panix.com> <1ce88f36-bfc7-4a55-89f8-70d1645d27ad@googlegroups.com>
Date 2012-11-14 15:06 +1100
Subject Re: Generate unique ID for URL
From Chris Angelico <rosuav@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.3661.1352865978.27098.python-list@python.org>



On Wed, Nov 14, 2012 at 2:25 PM, Richard <richardbp@gmail.com> wrote:
> So the use case - I'm storing webpages on disk and want a quick retrieval system based on URL.
> I can't store the files in a single directory because of OS limitations, so I have been using a subfolder structure.
> For example to store data at URL "abc": a/b/c/index.html
> This data is also viewed locally through a web app.
>
> If you can suggest a better approach I would welcome it.

The cost of a crypto hash on the URL will be completely dwarfed by the
cost of storing/retrieving on disk. You could probably do some
arithmetic and figure out exactly how many URLs (at an average length
of, say, 100 bytes) you can hash in the time of one disk seek.
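That arithmetic can be sketched roughly as below. This is only an illustration under stated assumptions: SHA-1 stands in for "a crypto hash", the 10 ms seek time is an assumed figure for a spinning disk, and the helper names url_to_path and hashes_per_seek are hypothetical, not anything from the thread. The first function also shows one possible way to turn the hash into the kind of subfolder layout described in the quoted message.

```python
import hashlib
import os
import timeit


def url_to_path(url, depth=3):
    """Map a URL to a sharded relative path built from its SHA-1 hex digest.

    The first `depth` hex characters each become one directory level,
    and the rest of the digest names the final directory.
    """
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    parts = list(digest[:depth]) + [digest[depth:], "index.html"]
    return os.path.join(*parts)


def hashes_per_seek(url_length=100, seek_seconds=0.010, runs=100_000):
    """Estimate how many URLs of `url_length` bytes can be hashed
    in the time of one disk seek.

    seek_seconds=0.010 is an assumed seek time, not a measurement.
    """
    url = b"x" * url_length
    elapsed = timeit.timeit(lambda: hashlib.sha1(url).hexdigest(), number=runs)
    per_hash = elapsed / runs
    return seek_seconds / per_hash


if __name__ == "__main__":
    print(url_to_path("http://example.com/some/page"))
    print("hashes per assumed 10 ms seek: %.0f" % hashes_per_seek())
```

The exact ratio depends on the machine and the storage device, so the number printed is only meaningful for the system it was run on.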

ChrisA



Thread

Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 15:20 -0800
  Re: Generate unique ID for URL John Gordon <gordon@panix.com> - 2012-11-13 23:34 +0000
    Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 15:56 -0800
      Re: Generate unique ID for URL Chris Kaynor <ckaynor@zindagigames.com> - 2012-11-13 16:26 -0800
      Re: Generate unique ID for URL Richard Baron Penman <richardbp@gmail.com> - 2012-11-14 11:41 +1100
        Re: Generate unique ID for URL Johannes Bauer <dfnsonfsduifb@gmx.de> - 2012-11-14 10:44 +0100
          Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-14 03:14 -0800
      Re: Generate unique ID for URL Christian Heimes <christian@python.org> - 2012-11-14 01:43 +0100
        Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 16:50 -0800
          Re: Generate unique ID for URL Christian Heimes <christian@python.org> - 2012-11-14 02:05 +0100
      Re: Generate unique ID for URL Christian Heimes <christian@python.org> - 2012-11-14 01:59 +0100
        Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 17:18 -0800
  Re: Generate unique ID for URL Miki Tebeka <miki.tebeka@gmail.com> - 2012-11-13 16:13 -0800
    Re: Generate unique ID for URL Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-11-14 02:04 +0000
      Re: Generate unique ID for URL Steve Howell <showell30@yahoo.com> - 2012-11-13 18:32 -0800
        Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 19:12 -0800
  Re: Generate unique ID for URL Roy Smith <roy@panix.com> - 2012-11-13 20:39 -0500
    Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 19:25 -0800
      Re: Generate unique ID for URL Roy Smith <roy@panix.com> - 2012-11-13 22:38 -0500
        Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 19:56 -0800
      Re: Generate unique ID for URL Chris Angelico <rosuav@gmail.com> - 2012-11-14 15:06 +1100
        Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 20:14 -0800
    Re: Generate unique ID for URL Richard <richardbp@gmail.com> - 2012-11-13 19:27 -0800
    Re: Generate unique ID for URL Johannes Bauer <dfnsonfsduifb@gmx.de> - 2012-11-14 12:29 +0100
      Re: Generate unique ID for URL Dave Angel <d@davea.name> - 2012-11-14 07:33 -0500
        Re: Generate unique ID for URL Johannes Bauer <dfnsonfsduifb@gmx.de> - 2012-11-14 14:00 +0100
