From: Ferrous Cranus
Newsgroups: comp.lang.python
Subject: Re: Uniquely identifying each & every html template
Date: Sun, 20 Jan 2013 23:08:00 -0800 (PST)

On Saturday, 19 January 2013 11:00:15 AM UTC+2, Dave Angel wrote:
> On 01/19/2013 03:39 AM, Ferrous Cranus wrote:
> > On Saturday, 19 January 2013 12:09:28 AM UTC+2, Dave Angel wrote:
> >
> >> I don't understand the problem.  A trivial Python script could scan
> >> through all the files in the directory, checking which ones are missing
> >> the identifier, and rewriting the file with the identifier added.
> >
> >> So, since you didn't come to that conclusion, there must be some other
> >> reason you don't want to edit the files.  Is it that the real sources
> >> are elsewhere (e.g. Dreamweaver), and whenever one recompiles those
> >> sources, these files get replaced (without identifiers)?
> >
> > Exactly. Files get modified/updated, thus the embedded identifier will be missing each time. So relying on embedding code in the html template content is not practical.
> >
> >> If that's the case, then I figure you have about 3 choices:
> >> 1) use the file path as your key, instead of requiring a number
> >
> > No, I cannot, because it would mess things up later when I, for example:
> >
> > 1. mv name.html othername.html (document's filename altered)
> > 2. mv name.html /subfolder/name.html (document's filepath altered)
> >
> > Hence, new database counters will be created for each of the above actions; therefore I will have 2 counters for the same file, and the latter one will start from zero.
> >
> > Pros: If the file's contents get updated, that won't affect the counter.
> > Cons: If the filepath is altered, then duplication will happen.
> >
> >> 2) use a hash of the page (eg. md5) as your key.  of course this could
> >> mean that you get a new value whenever the page is updated.  That's good
> >> in many situations, but you don't give enough information to know if
> >> that's desirable for you or not.
> >
> > That sounds nice! A hash is a mathematical algorithm that produces a unique number after analyzing each file's contents? But then again, what if the html template gets updated? That update will create a new hash for the file, hence another counter will be created for the same file: the same end result as solution (1).
> >
> > Pros: If the filepath is altered, that won't affect the counter.
> > Cons: If the file's contents get updated, then duplication will happen.
> >
> >> 3) Keep an external list of filenames, and their associated id numbers.
> >> The database would be a good place to store such a list, in a separate table.
> >
> > I did not understand that solution.
> >
> > We need to find a way so that even IF:
> >
> > (filepath gets modified && file contents get modified) simultaneously, the counter will STILL retain its value.
> >
>
> You don't yet have a programming problem, you have a specification
> problem.  Somehow, you want a file to be considered "the same" even when
> it's moved, renamed and/or modified.  So all files are the same, and you
> only need one id.
>
> Don't pick a mechanism until you have a self-consistent spec.

I do have the specification.
An .html page must retain its database counter value even if it is:

(renamed && moved && contents altered)

[original attributes of the file]:

filename: index.html
filepath: /home/nikos/public_html/
contents: Hello

[get modified to]:

filename: index2.html
filepath: /home/nikos/public_html/folder/subfolder/
contents: Hello, people

The file is still the same, even though its attributes got modified.
We want the counter.py script to still be able to "identify" the .html page, and hence its counter value, so that the counter gets increased properly.
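
For illustration only, here is a minimal sketch of how the example above plays out against choices 1, 2 and 3 from the quoted text. It is not what counter.py actually does. The table name `pages`, its columns, and the helpers `hit`, `record_move` and `page_id` are all hypothetical, and the in-memory SQLite database stands in for whatever database the real script uses. The key assumption is that every rename or move is reported to the database through `record_move`.

```python
import hashlib
import sqlite3

# The two states of the page from the example above.
old_path, old_body = "/home/nikos/public_html/index.html", b"Hello"
new_path, new_body = "/home/nikos/public_html/folder/subfolder/index2.html", b"Hello, people"

def content_key(body: bytes) -> str:
    """Choice 2: a key derived from the contents (md5 hex digest)."""
    return hashlib.md5(body).hexdigest()

# Choices 1 and 2: both candidate keys change, so neither one can
# identify the page on its own after the rename, move and edit.
path_key_survives = old_path == new_path
hash_key_survives = content_key(old_body) == content_key(new_body)

# Choice 3: an external table that maps each path to a stable id.
# Table and column names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE pages ("
    " id INTEGER PRIMARY KEY,"
    " path TEXT UNIQUE NOT NULL,"
    " hits INTEGER NOT NULL DEFAULT 0)"
)

def hit(path: str) -> int:
    """Register the page on first sight, bump its counter, return the new value."""
    db.execute("INSERT OR IGNORE INTO pages (path) VALUES (?)", (path,))
    db.execute("UPDATE pages SET hits = hits + 1 WHERE path = ?", (path,))
    return db.execute("SELECT hits FROM pages WHERE path = ?", (path,)).fetchone()[0]

def record_move(old: str, new: str) -> None:
    """Assumption: this is called whenever a page is renamed or moved."""
    db.execute("UPDATE pages SET path = ? WHERE path = ?", (new, old))

def page_id(path: str):
    """Return the stable id for a path, or None if the path is unknown."""
    row = db.execute("SELECT id FROM pages WHERE path = ?", (path,)).fetchone()
    return row[0] if row else None

hit(old_path)
hit(old_path)
id_before = page_id(old_path)

# The contents are not part of the key, so they may change freely.
record_move(old_path, new_path)
hits_after = hit(new_path)
id_after = page_id(new_path)

print(path_key_survives, hash_key_survives, id_before == id_after, hits_after)
# prints: False False True 3
```

The counter survives here only because the move was recorded explicitly. If the file is renamed and edited behind the script's back, nothing in this sketch can tell that index2.html used to be index.html.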