From: Mark Lawrence
Subject: Re: What do I do to read html files on my pc?
Date: Mon, 27 Aug 2012 13:05:55 +0100
Newsgroups: comp.lang.python

On 27/08/2012 11:59, mikcec82 wrote:
> Hallo,
>
> I have an html file on my pc and I want to read it to extract some text.
> Can you help on which libs I have to use and how can I do it?
>
> thank you so much.
>
> Michele
>

Type something like "python html parsing" into the box of your favourite
search engine, hit return and follow the links it comes back with. Write
some code. If you have problems give us the smallest code snippet that
reproduces the issue together with the complete traceback and we'll help.

--
Cheers.

Mark Lawrence.
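
For anyone landing on this thread with the same question, here is a minimal sketch of the kind of first attempt the reply asks for, using only the standard library's html.parser module (Python 3). The names TextExtractor, extract_text and extract_text_from_file are made up for illustration, and the UTF-8 default is an assumption about how the file was saved. It is one possible starting point, not the approach recommended anywhere in the thread.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIP = {"script", "style"}  # tags whose contents are not page text

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text chunks that are outside skipped tags.
        text = data.strip()
        if text and not self._skip_depth:
            self.parts.append(text)


def extract_text(html):
    """Return the text chunks of an HTML string joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.parts)


def extract_text_from_file(path, encoding="utf-8"):
    """Read a local HTML file (encoding is an assumption) and extract its text."""
    with open(path, encoding=encoding) as f:
        return extract_text(f.read())
```

Calling extract_text_from_file("page.html"), where page.html stands in for whatever file is on your pc, returns the page text as one string. If the file is saved in a different encoding, pass it explicitly via the encoding argument.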