From: Joel Goldstick
To: mikcec82
Cc: python-list@python.org
Newsgroups: comp.lang.python
Subject: Re: What do I do to read html files on my pc?
Date: Mon, 27 Aug 2012 10:21:31 -0400
In-Reply-To: <858c2da2-6936-4bd7-8944-f45446fbd3be@googlegroups.com>
References: <1c7cd833-b6ad-4a17-8ffe-a0ce20c8f400@googlegroups.com> <858c2da2-6936-4bd7-8944-f45446fbd3be@googlegroups.com>

On Mon, Aug 27, 2012 at 9:51 AM, mikcec82 wrote:
> On Monday, 27 August 2012 12:59:02 UTC+2, mikcec82 wrote:
>> Hello,
>>
>> I have an html file on my pc and I want to read it to extract some text.
>>
>> Can you tell me which libs I have to use and how I can do it?
>>
>> Thank you so much.
>>
>> Michele
>
> Hi ChrisA, Hi Mark.
> Thanks a lot.
>
> I have this html data and I want to check whether it contains the string "XXXX" and/or the string "NOT PASSED":
>
> XXXX
> .
> .
> .
> CODE CHECK
> : NOT PASSED
>
> Depending on this check I have to fill a cell in an excel file with the answer: NOK (if NOT PASSED or XXXX is present), or OK (if neither NOT PASSED nor XXXX is present).
>
> Thanks again for your help (and sorry for my english)
> --
> http://mail.python.org/mailman/listinfo/python-list

From your example there doesn't seem to be enough information to know
where in the html your strings will be.

If you just read the whole file into a string you can do this:

>>> s = "this is a string"
>>> if 'this' in s:
...     print('yes')
...
yes
>>>

Of course you will be testing for 'XXXX' or 'NOT PASSED'.

-- 
Joel Goldstick
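
Applied to the question above, the same substring test could look roughly like the sketch below. The function names `check_report` and `check_report_file`, the default markers tuple, and the UTF-8 encoding are assumptions made for illustration; none of them come from the thread. It only decides between OK and NOK and does not touch the Excel side.

```python
def check_report(html_text, markers=("XXXX", "NOT PASSED")):
    """Return "NOK" if any marker occurs in html_text, otherwise "OK".

    Hypothetical helper: a plain substring test on the raw text,
    with no HTML parsing at all.
    """
    return "NOK" if any(marker in html_text for marker in markers) else "OK"


def check_report_file(path, encoding="utf-8"):
    """Read the whole file into one string and run the check on it.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising an error.
    """
    with open(path, encoding=encoding, errors="replace") as handle:
        return check_report(handle.read())


if __name__ == "__main__":
    print(check_report("<td>CODE CHECK</td><td>: NOT PASSED</td>"))
    print(check_report("<td>CODE CHECK</td><td>: PASSED</td>"))
```

Because this looks at the raw text, it will miss a marker that is split by a tag, a line break, or a non-breaking space (for example "NOT" and "PASSED" in separate cells). If that can happen in the real files, the text would need to be extracted with an HTML parser first.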