| Newsgroups | comp.lang.python |
|---|---|
| Date | 2012-08-29 03:22 -0700 |
| References | <1c7cd833-b6ad-4a17-8ffe-a0ce20c8f400@googlegroups.com> |
| Message-ID | <c92e1fdc-ff14-4484-9e03-e322de9ba82b@googlegroups.com> |
| Subject | Re: What do I do to read html files on my pc? |
| From | mikcec82 <michele.cecere@gmail.com> |
On Monday, 27 August 2012 12:59:02 UTC+2, mikcec82 wrote:
> Hello,
>
> I have an html file on my pc and I want to read it to extract some text.
>
> Can you tell me which libraries I should use and how to do it?
>
> Thank you so much.
>
> Michele
Hi Peter, and thank you for your valuable help.
Fortunately, there aren't runs of "X" with repeats other than 2 or 4.
Starting from your code, I wrote the following (I'm posting it in case it helps other people):
# fileorig holds the path to the HTML file; it is set earlier in the script.
with open(fileorig, 'r') as f:
    nomefile = f.read()

start = nomefile.find("XX")
start2 = nomefile.find("NOT PASSED")
c0 = 0  # runs of "XXXX"
c1 = 0  # runs of "XX"
c2 = 0  # occurrences of "NOT PASSED"

while start != -1 or start2 != -1:
    if start != -1:
        if nomefile[start:start+4] == "XXXX":
            print("XXXX found at location %d" % start)
            start += 4
            c0 += 1
        else:
            # find() matched "XX" here and it is not "XXXX", so it is a run of two
            print("XX found at location %d" % start)
            start += 2
            c1 += 1
        start = nomefile.find("XX", start)
    if start2 != -1:
        print("NOT PASSED found at location %d" % start2)
        start2 += 10
        c2 += 1
        start2 = nomefile.find("NOT PASSED", start2)

print("XXXX found %d times\nXX found %d times\nNOT PASSED found %d times" % (c0, c1, c2))
Now I'm able to find all occurrences of the strings "XXXX", "XX" and "NOT PASSED".
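For anyone who prefers something shorter, the same counts can probably be obtained with the standard `re` module. This is only a sketch, not what I actually ran: the function name `count_markers` and the `sample` string are made up for illustration, and it assumes (as above) that runs of "X" only come in lengths 2 or 4.

```python
import re

def count_markers(text):
    """Count runs of "XXXX", runs of "XX", and "NOT PASSED" in text."""
    # Alternation order matters: "XXXX" is tried before "XX" at each position,
    # so a run of four is never counted as two runs of two.
    pattern = re.compile(r"XXXX|XX|NOT PASSED")
    counts = {"XXXX": 0, "XX": 0, "NOT PASSED": 0}
    for match in pattern.finditer(text):
        counts[match.group()] += 1
    return counts

# Hypothetical snippet standing in for the contents of the HTML file.
sample = "<td>XXXX</td><td>XX</td><td>NOT PASSED</td><td>XX</td>"
print(count_markers(sample))
```

If the positions are needed as well, `match.start()` gives the same location that `find()` returns in the loop version.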
Thank you so much.
What do I do to read html files on my pc? mikcec82 <michele.cecere@gmail.com> - 2012-08-27 03:59 -0700
Re: What do I do to read html files on my pc? Chris Angelico <rosuav@gmail.com> - 2012-08-27 21:58 +1000
Re: What do I do to read html files on my pc? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-08-27 13:05 +0100
Re: What do I do to read html files on my pc? mikcec82 <michele.cecere@gmail.com> - 2012-08-27 06:51 -0700
Re: What do I do to read html files on my pc? Joel Goldstick <joel.goldstick@gmail.com> - 2012-08-27 10:21 -0400
Re: What do I do to read html files on my pc? Chris Angelico <rosuav@gmail.com> - 2012-08-28 00:41 +1000
Re: What do I do to read html files on my pc? Jean-Michel Pichavant <jeanmichel@sequans.com> - 2012-08-27 18:57 +0200
Re: What do I do to read html files on my pc? mikcec82 <michele.cecere@gmail.com> - 2012-08-28 03:09 -0700
Re: What do I do to read html files on my pc? Oscar Benjamin <oscar.j.benjamin@gmail.com> - 2012-08-28 13:31 +0100
Re: What do I do to read html files on my pc? Peter Otten <__peter__@web.de> - 2012-08-28 17:38 +0200
Re: What do I do to read html files on my pc? mikcec82 <michele.cecere@gmail.com> - 2012-08-29 03:22 -0700
Re: What do I do to read html files on my pc? Umesh Sharma <usharma01@gmail.com> - 2012-08-29 05:00 -0700