| From | Peter Otten <__peter__@web.de> |
|---|---|
| Subject | Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? |
| Date | 2015-07-12 11:51 +0200 |
| Organization | None |
| References | <cbcc6d3f-1fc7-4caf-b6d9-3a7ff9d8f1d5@googlegroups.com> <f0b23331-69f6-4503-b9f5-52024fb78609@googlegroups.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.446.1436694733.3674.python-list@python.org> |
Simon Evans wrote:
> Dear Mark Lawrence, thank you for your advice.
> I take it that I use the input you suggest for the line :
>
> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html",lxml")
>
> seeing as I have to give the file's full address I therefore have to
> modify your :
>
> soup = BeautifulSoup(ecological_pyramid,"lxml")
>
> to :
>
> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid," "lxml")
>
> otherwise I get :
>
>
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as
>>>> ecological_pyramid: soup = BeautifulSoup(ecological_pyramid,"lxml")
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> NameError: name 'ecological_pyramid' is not defined
>
>
> so anyway with the input therefore as:
>
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as
>>>> ecological_pyramid: soup = BeautifulSoup("C:\Beautiful
>>>> Soup\ecological_pyramid,","lxml") producer_entries = soup.find("ul")
>>>> print(producer_entries.li.div.string)
No. If you pass the filename, Beautiful Soup will mistake it for the HTML
itself. You can verify that in the interactive interpreter:
>>> soup = BeautifulSoup("C:\Beautiful Soup\ecologicalpyramid.html","lxml")
>>> soup
<html><body><p>C:\Beautiful Soup\ecologicalpyramid.html</p></body></html>
You have to pass an open file to BeautifulSoup, not a filename:
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as f:
...     soup = BeautifulSoup(f, "lxml")
...
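The difference between the two calls can be reproduced without the book's file. This is only a minimal sketch, assuming bs4 is installed: the temporary file name and its one-line content are made up for illustration, and the built-in "html.parser" is used instead of "lxml" so that no extra parser is required.

```python
import os
import tempfile
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for the content of ecologicalpyramid.html
html = ('<ul id="producers"><li class="producerlist">'
        '<div class="name">plants</div></li></ul>')

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "pyramid.html")  # made-up file name
    with open(path, "w") as f:
        f.write(html)

    # Passing the filename: the path string itself is parsed as markup.
    with warnings.catch_warnings():
        # bs4 may warn that the markup looks like a filename
        warnings.simplefilter("ignore")
        from_name = BeautifulSoup(path, "html.parser")

    # Passing the open file: the file's contents are parsed.
    with open(path, "r") as f:
        from_file = BeautifulSoup(f, "html.parser")

print(from_name.find("ul"))                 # no <ul> in a path string
print(from_file.find("ul").li.div.string)   # text of the first <div>
```

Only the second soup contains a <ul> node, because only there was the file actually read.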
However, if you look at the data returned by soup.find("ul") you'll see:
>>> producer_entries = soup.find("ul")
>>> producer_entries
<ul id="producers">
<li class="producers">
</li><li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
The first <li>...</li> node does not contain a div:
>>> producer_entries.li
<li class="producers">
</li>
and thus
>>> producer_entries.li.div is None
True
and the AttributeError raised by producer_entries.li.div.string is expected
with the given data.
Returning None is Beautiful Soup's way of indicating that the
<li> node has no <div> child at all. If you want to
process the first li that does have a <div> child, a straightforward
way is to iterate over the li nodes:
>>> for li in producer_entries.find_all("li"):
...     if li.div is not None:
...         print(li.div.string)
...         break  # remove if you want all, not just the first
...
plants
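The failure and the guarded loop can be put side by side in a script. A sketch only: the markup is copied from the soup.find("ul") output above, the helper name first_div_text is made up for illustration, and "html.parser" stands in for "lxml".

```python
from bs4 import BeautifulSoup

# Markup copied from the soup.find("ul") output shown above
html = """<ul id="producers">
<li class="producers">
</li><li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")
producer_entries = soup.find("ul")

# The first <li> has no <div>, so .div is None and .string fails on it.
raised = False
try:
    producer_entries.li.div.string
except AttributeError:
    raised = True
print(raised)


def first_div_text(ul):
    """Return the text of the first <div> found in any <li>, or None.

    Hypothetical helper; it wraps the loop from the interpreter session.
    """
    for li in ul.find_all("li"):
        if li.div is not None:
            return li.div.string
    return None


print(first_div_text(producer_entries))
```

Returning None from the helper mirrors what Beautiful Soup does itself, so the caller still has to check the result before using it.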
Taking a second look at the data you probably want the li nodes with
class="producerlist":
>>> for li in soup.find_all("li", attrs={"class": "producerlist"}):
...     print(li.div.string)
...
plants
algae
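If both the name and the number of each producer are wanted, the same class filter can be applied to the inner div nodes. This is a sketch under the assumption that every "producerlist" entry has exactly one "name" and one "number" div, as in the data above; the dictionary name counts is made up, and "html.parser" again replaces "lxml".

```python
from bs4 import BeautifulSoup

# Markup copied from the soup.find("ul") output shown above
html = """<ul id="producers">
<li class="producers">
</li><li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")

counts = {}
for li in soup.find_all("li", attrs={"class": "producerlist"}):
    # Assumes both divs are present; find() would return None otherwise.
    name = li.find("div", attrs={"class": "name"}).string
    number = li.find("div", attrs={"class": "number"}).string
    counts[str(name)] = int(number)

print(counts)
```

Filtering on the class first means the empty "producers" entry never reaches the inner lookups.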