To: python-list@python.org
From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Subject: Re: Using filepath method to identify an .html page
Date: Tue, 22 Jan 2013 17:01:26 -0500
Organization: > Bestiaria Support Staff <
References: <mailman.785.1358858844.2939.python-list@python.org> <50fe8e69$0$30003$c3e8da3$5496439d@news.astraweb.com> <0459659d-4ec2-4c7d-bee3-b4e363c916dd@googlegroups.com> <mailman.790.1358865192.2939.python-list@python.org> <ec8f1a56-d0f7-46a6-a8a3-9425d3aabf8e@googlegroups.com> <mailman.796.1358868351.2939.python-list@python.org> <mailman.801.1358870401.2939.python-list@python.org> <kdmg96$gl8$1@reader1.panix.com> <4847a0e3-aefa-4330-9252-db08f2e993df@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.841.1358892099.2939.python-list@python.org>
Lines: 35
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:37355

On Tue, 22 Jan 2013 10:07:21 -0800 (PST), Ferrous Cranus
<nikos.gr33k@gmail.com> declaimed the following in
gmane.comp.python.general:


> 
> No, because i DO NOT WANT to store LOTS OF BIGS absolute paths in the database.
>
	Why not? What counts as "BIG"?

	10,000 paths of 255 characters each (presuming ASCII at 1 byte per
character) come to 2,550,000 characters -- that's LESS THAN THREE MB
for all the file paths. Add a 2-byte short integer ID and you've got
20,000 bytes of IDs. Creating unique indices (the ID should already be
a unique auto-increment column) doubles the data usage, plus maybe
160,000 bytes for the pointers from the index to the data record.

	2,550,000 + 20,000 => 2,570,000		raw data
	2,570,000 + 160,000 => 2,730,000	indices

	2,570,000 + 2,730,000 => 5,300,000	about 5.3 MB maximum
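
	For anyone who wants to rerun the arithmetic above with their own
numbers, here is a rough sketch in Python. The function name and
parameter names are made up for illustration, and the 160,000-byte
pointer figure is the same guess used above, not a measured value for
any particular database engine.

```python
def estimate_storage(n_paths=10_000, path_bytes=255, id_bytes=2,
                     index_pointer_bytes=160_000):
    """Back-of-envelope storage estimate, assuming 1-byte ASCII characters."""
    # Raw data: the paths themselves plus one small integer ID per path.
    raw_data = n_paths * path_bytes + n_paths * id_bytes
    # Unique indices roughly duplicate the data and add index-to-record pointers.
    indices = raw_data + index_pointer_bytes
    return {"raw_data": raw_data, "indices": indices,
            "total": raw_data + indices}


if __name__ == "__main__":
    for name, size in estimate_storage().items():
        print(f"{name:>8}: {size:>10,} bytes")
```

	Changing n_paths or path_bytes shows how slowly the total grows;
even ten times as many paths stays in the tens of megabytes.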

	I could store all that on my ancient PDA!

	We've probably generated that much text in the two discussion
threads alone!


	The safest way to generate your four-digit integer, without running
the risk of collision from hashing, is a simple database table with a
unique ID column and a unique filepath column.
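
	A minimal sketch of such a table, assuming SQLite through Python's
standard sqlite3 module (use whatever engine you already have). The
table name "pages" and the helper names open_db and page_id are
hypothetical, chosen only for this example.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the lookup table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # unique, auto-increment ID
        " filepath TEXT NOT NULL UNIQUE)"         # each path stored only once
    )
    return conn


def page_id(conn, filepath):
    """Return the ID for filepath, inserting a new row on first sight."""
    row = conn.execute(
        "SELECT id FROM pages WHERE filepath = ?", (filepath,)
    ).fetchone()
    if row is not None:
        return row[0]
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO pages (filepath) VALUES (?)", (filepath,)
        )
    return cur.lastrowid


if __name__ == "__main__":
    conn = open_db()
    first = page_id(conn, "/home/user/public_html/index.html")
    again = page_id(conn, "/home/user/public_html/index.html")
    print(f"{first:04d}", first == again)
```

	The same path always maps back to the same ID, and two different
paths can never share one, which is exactly what a hash cannot promise.
Formatting with "{:04d}" pads the ID to four digits for display.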
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/