Date: Fri, 09 Sep 2011 09:40:42 +1000
From: Simon Cropper
To: python-list@python.org
Subject: Re: Create an index from a webpage [RANT, DNFTT]
Newsgroups: comp.lang.python

On 09/09/11 01:11, Steven D'Aprano wrote:
> [SNIP]
> It's no harder to put the search terms into a google URL, which still gets
> the point across without being a dick about it:
> [SNIP]

[RANT]

OK, I was not going to say anything, but...

1. Being told to google it, when I explicitly stated in my initial post that I had been doing this and had not been able to find anything, is just plain rude. It is unconstructive and irritating.

2. I presume that python-list is a mailing list for Python users: beginners, intermediate and advanced. If it is not, then tell me and I will go somewhere else.

3. Some searches, particularly for common terms, throw millions of hits.
'Python' returns 147,000,000 results on Google; 'Sitemap' returns 1,410,000,000 results. Even 'Python AND Sitemap' still returns 5,020 results. Working through these links takes you round and round with no clear solutions. Asking for help on the primary Python mailing list, after conducting a preliminary investigation for tools, libraries and code snippets, seemed legitimate.

4. AND YES, I could write a program, but why recreate code when there is a strong likelihood that the code already exists? One of the advantages of Python is that a lot of code is redistributed under licences that promote reuse. So why reinvent the wheel when there is a library full of code? Sometimes you just need help finding the door.

5. If someone is willing to help me, rather than lecture me (or poke me to see if they get a response), I would appreciate it.

[END RANT]

For people who are willing to help, my original request was:

I am after a way of pointing a Python routine at my website and having it create a tree, represented as a hierarchical HTML list in a webpage, of all the pages in that website (a recursive list of internal links to HTML documents; ignore images, etc.).

In subsequent notes to Thomas 'PointedEars'... I pointed to an example of the desired output here:
http://lxml.de/sitemap.html

--
Cheers Simon

   Simon Cropper - Open Content Creator / Website Administrator

   Free and Open Source Software Workflow Guides
   ------------------------------------------------------------
   Introduction      http://www.fossworkflowguides.com
   GIS Packages      http://gis.fossworkflowguides.com
   bash / Python     http://scripting.fossworkflowguides.com
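
For reference, the routine requested above could be sketched roughly as follows, using only the standard library. This is an illustration under assumptions, not an existing tool: the names LinkParser, fetch, internal_links, crawl, render and sitemap_html are all hypothetical, each page is listed once under the first page that links to it, and URL normalisation (trailing slashes, query strings), robots.txt and rate limiting are ignored.

```python
from collections import deque
from html import escape
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def fetch(url):
    """Return the page text, or None if it is not HTML or cannot be read."""
    try:
        with urlopen(url, timeout=10) as resp:
            if resp.headers.get_content_type() != "text/html":
                return None  # image, PDF, etc.
            charset = resp.headers.get_content_charset() or "utf-8"
            return resp.read().decode(charset, errors="replace")
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return None


def internal_links(page_url, text, host):
    """Absolute, fragment-free links on the page that stay on `host`."""
    parser = LinkParser()
    parser.feed(text)
    links = []
    for href in parser.hrefs:
        url = urldefrag(urljoin(page_url, href)).url
        parts = urlparse(url)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            if url not in links:
                links.append(url)
    return links


def crawl(start_url, fetch=fetch, max_pages=200):
    """Breadth-first crawl; returns {page_url: [child_urls]}."""
    host = urlparse(start_url).netloc
    children = {start_url: []}
    seen = {start_url}
    queue = deque([(start_url, fetch(start_url) or "")])
    while queue:
        page, text = queue.popleft()
        for link in internal_links(page, text, host):
            if link in seen or len(children) >= max_pages:
                continue
            seen.add(link)
            body = fetch(link)
            if body is None:  # skip non-HTML or unreadable targets
                continue
            children[link] = []
            children[page].append(link)
            queue.append((link, body))
    return children


def render(children, url, indent=0):
    """Return the nested <li>/<ul> lines for `url` and its descendants."""
    pad = "  " * indent
    lines = [f'{pad}<li><a href="{escape(url)}">{escape(url)}</a>']
    if children[url]:
        lines.append(f"{pad}  <ul>")
        for child in children[url]:
            lines.extend(render(children, child, indent + 2))
        lines.append(f"{pad}  </ul>")
    lines.append(f"{pad}</li>")
    return lines


def sitemap_html(start_url, fetch=fetch):
    """Crawl the site and return the tree as one nested HTML list."""
    tree = crawl(start_url, fetch)
    return "\n".join(["<ul>"] + render(tree, start_url, 1) + ["</ul>"])
```

Something like print(sitemap_html("http://www.fossworkflowguides.com")) would then write the list to standard output. The fetch argument is only there so the crawl logic can be tried against a dictionary of canned pages instead of a live site.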