Re: Create an index from a webpage [RANT, DNFTT]

Date Fri, 09 Sep 2011 13:20:01 +1000
From Simon Cropper <simoncropper@fossworkflowguides.com>
Subject Re: Create an index from a webpage [RANT, DNFTT]
Newsgroups comp.lang.python


On 09/09/11 12:59, Chris Angelico wrote:
> On Fri, Sep 9, 2011 at 12:43 PM, Simon Cropper
> <simoncropper@fossworkflowguides.com>  wrote:
>> At present I am definitely getting the impression that my assumption that
>> something like this 'must be out there' is wrong.
>>
>> I have found an XML-Sitemaps Generator at http://www.xml-sitemaps.com;
>> this page allows you to create the XML files that can be uploaded to Google.
>> But as stated, I don't actually want what people now call 'sitemaps'. I want an
>> automatically updated 'index / contents page' for my website. For example, if
>> I add a tutorial or update any of my links, I want the 'global contents page'
>> to be updated when the Python script is run.
>
> What you're looking at may be closer to autogenerated documentation
> than to a classic site map. There are a variety of tools that generate
> HTML pages on the basis of *certain information found in* all the
> files in a directory (as opposed to the entire content of those
> files). What you're trying to do may be sufficiently specific that it
> doesn't already exist, but it might be worth having a quick look at
> autodoc/doxygen - at least for some ideas.
>
> Chris Angelico

Chris,

Your assessment is correct. Searching PyPI, I am having better
luck with terms other than the old term 'sitemap'.

I have found funnelweb, which uses the transmogrify library
(yeah, as if I would have typed this term into Google!) and is
described as "Crawl and parse static sites and import to Plone".

http://pypi.python.org/pypi/funnelweb/1.0

As funnelweb is modular, using a variety of the transmogrify tools,
maybe I could modify it to create a 'non-Plone' version.
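
For comparison, here is a minimal sketch of the 'global contents page' idea using only the standard library. It is not funnelweb or transmogrify, and it assumes the site is a directory of static HTML files on disk rather than something that has to be crawled. The names TitleParser, page_title, build_index and contents.html are hypothetical, chosen only for illustration.

```python
import html
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def page_title(path):
    """Return the page's <title> text, or the file name if there is none."""
    parser = TitleParser()
    parser.feed(path.read_text(encoding="utf-8", errors="replace"))
    title = " ".join("".join(parser.parts).split())  # collapse whitespace
    return title or path.name


def build_index(site_root, index_name="contents.html"):
    """Write a contents page listing every *.html file under site_root.

    Assumption: index_name lives in site_root and is skipped on re-runs.
    Returns the (relative path, title) pairs that were listed.
    """
    root = Path(site_root)
    entries = []
    for page in sorted(root.rglob("*.html")):
        rel = page.relative_to(root).as_posix()
        if rel == index_name:
            continue  # do not list the contents page in itself
        entries.append((rel, page_title(page)))

    items = "\n".join(
        f'  <li><a href="{html.escape(rel, quote=True)}">{html.escape(title)}</a></li>'
        for rel, title in entries
    )
    page = (
        "<html><head><title>Contents</title></head><body>\n"
        f"<ul>\n{items}\n</ul>\n</body></html>\n"
    )
    (root / index_name).write_text(page, encoding="utf-8")
    return entries
```

Running build_index("/path/to/site") again after adding a tutorial would regenerate contents.html with the new page included. The sketch only escapes the link text and href for HTML; it does not URL-encode unusual file names or follow links between pages.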

-- 
Cheers Simon

    Simon Cropper - Open Content Creator / Website Administrator

    Free and Open Source Software Workflow Guides
    ------------------------------------------------------------
    Introduction               http://www.fossworkflowguides.com
    GIS Packages               http://gis.fossworkflowguides.com
    bash / Python        http://scripting.fossworkflowguides.com
