

Groups > comp.lang.python > #93704 > unrolled thread

Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

Started by: Simon Evans <musicalhacksaw@yahoo.co.uk>
First post: 2015-07-11 15:17 -0700
Last post: 2015-07-13 03:58 +1000
Articles: 20 (9 participants)



Contents

  Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-11 15:17 -0700
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-07-12 00:06 +0100
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-12 01:59 -0700
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Peter Otten <__peter__@web.de> - 2015-07-12 11:51 +0200
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-12 04:48 -0700
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Peter Otten <__peter__@web.de> - 2015-07-12 14:26 +0200
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-12 05:36 -0700
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-12 05:48 -0700
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Peter Otten <__peter__@web.de> - 2015-07-12 15:12 +0200
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Larry Hudson <orgnut@yahoo.com> - 2015-07-12 13:06 -0700
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Simon Evans <musicalhacksaw@yahoo.co.uk> - 2015-07-12 10:33 -0700
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? MRAB <python@mrabarnett.plus.com> - 2015-07-12 19:05 +0100
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-07-12 19:23 +0100
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? "Albert Visser" <albert.visser@gmail.com> - 2015-07-12 20:34 +0200
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Laura Creighton <lac@openend.se> - 2015-07-12 21:47 +0200
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-07-12 21:09 +0100
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Laura Creighton <lac@openend.se> - 2015-07-12 22:29 +0200
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2015-07-12 21:48 +0100
    Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Laurent Pointal <laurent.pointal@free.fr> - 2015-07-12 19:54 +0200
      Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ? Chris Angelico <rosuav@gmail.com> - 2015-07-13 03:58 +1000

#93704 — Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-11 15:17 -0700
Subject: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?
Message-ID: <cbcc6d3f-1fc7-4caf-b6d9-3a7ff9d8f1d5@googlegroups.com>

Dear Programmers, 
Thank you for your advice regarding giving the console a current address in the code for it to access the html file. 

The console seems to accept the code to that extent, but when I input the two lines of code intended to access the location of a required word, the console rejects it re :

AttributeError:'NoneType' object has no attribute 'li' 

However the document 'EcologicalPyramid.html' does contain the words 'li' and 'ul', in its text. I am not sure as to how the input is arranged to output 'plants' which is also in the documents text, but that is the word the code is meant to elicit. 

I enclose the pertinent code as input and output from the console, and the html code for the document 'EcologicalPyramid.html'

Thank you in advance for your help. 

---------------------------------------------------------------------
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as ecological_pyramid:
soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html","lxml")
... producer_entries = soup.find("ul")
  File "<stdin>", line 2
    producer_entries = soup.find("ul")
                   ^
SyntaxError: invalid syntax
>>> producer_entries = soup.find("ul")
>>> print (producer_entries.li.div.string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'li'
----------------------------------------------------------------------
<html>
<body>
<div class="ecopyramid">
<ul id= "producers">
<li class="producers">
<li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
<ul id ="primaryconsumers">
<li class="primaryconsumerlist">
<div class="name">deer</div>
<div class-"number">1000</div>
</li
<li class="primaryconsumerlist">
<div class="name">deer</div>
<div class="number">1000</div>
</li>
<li class="primaryconsumerlist">
<div class="name">rabbit</div>
<div class="number">2000</div>
</li>
</ul>
<ul id="secondaryconsumers">
<li class="secondary consumerlist">
<div class="name">fox</div>
<div class="number">100</div>
</li>
<li class=secondaryconsumerlist">
<div class="name">bear</div>
<div class="number">100</div>
</li>
</ul>
<ul id="tertiaryconsumers">
<li class="tertiaryconsumerslist">
<div class="name">lion</div>
<div class="number">80</div>
</li>
<li class="tertiaryconsumerlist">
<div class="name">tiger</div>
<div class="number">50</div>
</li>
</ul>
</body>
</html>



#93708

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-07-12 00:06 +0100
Message-ID: <mailman.436.1436656022.3674.python-list@python.org>
In reply to: #93704

On 11/07/2015 23:17, Simon Evans wrote:
> Dear Programmers,
> Thank you for your advice regarding giving the console a current address in the code for it to access the html file.
>
> The console seems to accept the code to that extent, but when I input the two lines of code intended to access the location of a required word, the console rejects it re :
>
> AttributeError:'NoneType' object has no attribute 'li'
>
> However the document 'EcologicalPyramid.html' does contain the words 'li' and 'ul', in its text. I am not sure as to how the input is arranged to output 'plants' which is also in the documents text, but that is the word the code is meant to elicit.
>
> I enclose the pertinent code as input and output from the console, and the html code for the document 'EcologicalPyramid.html'
>
> Thank you in advance for your help.
>
> ---------------------------------------------------------------------
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as ecological_pyramid:
> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html","lxml")

Beautiful Soup takes a string or a file handle, so, as it's good practice
to use the "with open" construct, this should do it:-

soup = BeautifulSoup(ecological_pyramid,"lxml")

But do you actually need the "lxml"? With the simple parsing I've done
in the past I've never used it.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
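
The difference between passing a path and passing an open file can be sketched as follows. This is only an illustration under stated assumptions: bs4 is installed, Python's built-in "html.parser" is used in place of lxml so that no extra parser is needed, and the temporary file and its one-entry markup are made up to stand in for ecologicalpyramid.html.

```python
import os
import tempfile
import warnings

from bs4 import BeautifulSoup

# Hypothetical stand-in for ecologicalpyramid.html (not the real file).
html = '<ul id="producers"><li class="producerlist"><div class="name">plants</div></li></ul>'

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "ecologicalpyramid.html")
    with open(path, "w") as out:
        out.write(html)

    # Passing the path: the string itself is parsed as markup, so there is no <ul>.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # newer bs4 versions warn about filename-like input
        from_path = BeautifulSoup(path, "html.parser")

    # Passing the open file: the contents of the file are parsed.
    with open(path, "r") as f:
        from_file = BeautifulSoup(f, "html.parser")

print(from_path.find("ul"))                # None
print(from_file.find("ul").li.div.string)  # plants
```

When find() returns None, any attribute lookup on the result fails with the AttributeError quoted in the original post.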



#93726

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-12 01:59 -0700
Message-ID: <f0b23331-69f6-4503-b9f5-52024fb78609@googlegroups.com>
In reply to: #93704

Dear Mark Lawrence, thank you for your advice. 
I take it that I use the input you suggest for the line :

soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html",lxml")

seeing as I have to give the file's full address I therefore have to modify your :

soup = BeautifulSoup(ecological_pyramid,"lxml")

to :

soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid," "lxml")

otherwise I get :


>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as ecological_pyramid:
>>> soup = BeautifulSoup(ecological_pyramid,"lxml")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'ecological_pyramid' is not defined


so anyway with the input therefore as:

>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as ecological_pyramid: 
>>> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid,","lxml")
>>> producer_entries = soup.find("ul")
>>> print(producer_entries.li.div.string)

I still get the following output from the console:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'li'
>>>

As is probably evident, what is the problem Python has with finding the required html code within the 'ecologicalpyramid' html file, or more specifically why does it respond that the html file has no such attribute as 'li' ?
Incidentally I have installed all the xml, lxml, html, and html5 TreeBuilders/ Parsers. I am using lxml as that is the format specified in the text. 

I may as well quote the text on the page in question in 'Getting Started with Beautiful Soup':

'Since producers come as the first entry for the <ul> tag, we can use the find() method, which normally searches for only the first occurrence of a particular tag in a BeautifulSoup object. We store this in producer_entries. The next line prints the name of the first producer. From the previous HTML diagram we can understand that the first producer is stored inside the first <div> tag of the first <li> tag that immediately follows the first <ul> tag, as shown in the following code:

<ul id = "producers">
<li class= "producerlist">
<div class= "name">plants</div>
<div class="name">100000</div>
</li>
</ul>

So after running the preceding code, we will get plants, which is the first producer, as the output.'

(page 30)



#93728

From: Peter Otten <__peter__@web.de>
Date: 2015-07-12 11:51 +0200
Message-ID: <mailman.446.1436694733.3674.python-list@python.org>
In reply to: #93726

Simon Evans wrote:

> Dear Mark Lawrence, thank you for your advice.
> I take it that I use the input you suggest for the line :
> 
> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html",lxml")
> 
> seeing as I have to give the file's full address I therefore have to
> modify your :
> 
> soup = BeautifulSoup(ecological_pyramid,"lxml")
> 
> to :
> 
> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid," "lxml")
> 
> otherwise I get :
> 
> 
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as
>>>> ecological_pyramid: soup = BeautifulSoup(ecological_pyramid,"lxml")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'ecological_pyramid' is not defined
> 
> 
> so anyway with the input therefore as:
> 
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as
>>>> ecological_pyramid: soup = BeautifulSoup("C:\Beautiful
>>>> Soup\ecological_pyramid,","lxml") producer_entries = soup.find("ul")
>>>> print(producer_entries.li.div.string)

No. If you pass the filename beautiful soup will mistake it as the HTML. You
can verify that in the interactive interpreter:

>>> soup = BeautifulSoup("C:\Beautiful Soup\ecologicalpyramid.html","lxml")
>>> soup
<html><body><p>C:\Beautiful Soup\ecologicalpyramid.html</p></body></html>

You have to pass an open file to BeautifulSoup, not a filename:

>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as f:
...     soup = BeautifulSoup(f, "lxml")
... 

However, if you look at the data returned by soup.find("ul") you'll see

>>> producer_entries = soup.find("ul")
>>> producer_entries
<ul id="producers">
<li class="producers">
</li><li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>

The first <li>...</li> node does not contain a div

>>> producer_entries.li
<li class="producers">
</li>

and thus

>>> producer_entries.li.div is None
True

and the following error is expected with the given data. 
Returning None is beautiful soup's way of indicating that the
<li> node has no <div> child at all. If you want to 
process the first li that does have a <div> child a straight-forward 
way is to iterate over the children:

>>> for li in producer_entries.find_all("li"):
...     if li.div is not None:
...         print(li.div.string)
...         break # remove if you want all, not just the first
... 
plants

Taking a second look at the data you probably want the li nodes with
class="producerlist":

>>> for li in soup.find_all("li", attrs={"class": "producerlist"}):
...     print(li.div.string)
... 
plants
algae
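
The None check and the class filter described above can be tried without the file on disk. This is a rough sketch, assuming bs4 is installed; it parses a cut-down copy of the posted markup from a string with the built-in "html.parser". The stray <li class="producers"> is closed explicitly here (an assumption made for the example) because parsers differ in how they repair an unclosed <li>.

```python
from bs4 import BeautifulSoup

# Cut-down copy of the posted markup; the stray <li class="producers"> is closed
# explicitly so that every parser builds the same tree.
html = """
<ul id="producers">
<li class="producers"></li>
<li class="producerlist"><div class="name">plants</div><div class="number">100000</div></li>
<li class="producerlist"><div class="name">algae</div><div class="number">100000</div></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
producer_entries = soup.find("ul")

# The first <li> has no <div>, so attribute access yields None rather than raising.
first_div = producer_entries.li.div
print(first_div)  # None

# Selecting by class skips the empty entry.
names = [str(li.div.string) for li in soup.find_all("li", class_="producerlist")]
print(names)  # ['plants', 'algae']
```

Calling .string on first_div would raise the same kind of AttributeError as in the original post, which is why the check for None comes first.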



#93732

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-12 04:48 -0700
Message-ID: <c7fead8b-58c0-4981-8eb6-961615d512a1@googlegroups.com>
In reply to: #93704

Dear Peter Otten, thank you for your reply that I have not gone very far into the detail of which, as it seems Python console cannot recognise the name 'f' as given it, re output below :


Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.

>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
>>> soup = BeautifulSoup(f, "lxml")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'f' is not defined
>>>



#93733

From: Peter Otten <__peter__@web.de>
Date: 2015-07-12 14:26 +0200
Message-ID: <mailman.449.1436704006.3674.python-list@python.org>
In reply to: #93732

Simon Evans wrote:

> Dear Peter Otten, thank you for your reply that I have not gone very far
> into the detail of which, as it seems Python console cannot recognise the
> name 'f' as given it, re output below :
> 
> 
> Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
> on win 32
> Type "help", "copyright", "credits" or "license" for more information.
> 
>>>> from bs4 import BeautifulSoup
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
>>>> soup = BeautifulSoup(f, "lxml")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'f' is not defined
>>>>

Is that copy-and-paste? When I try to reproduce that I get

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
... soup = BeautifulSoup(f, "lxml")
  File "<stdin>", line 2
    soup = BeautifulSoup(f, "lxml")
       ^
IndentationError: expected an indented block

and when I indent the soup = ... line properly I don't get an error:

$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
...     soup = BeautifulSoup(f, "lxml")
... 
>>> 



#93734

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-12 05:36 -0700
Message-ID: <98fa51f3-ea5b-40eb-b9d9-aa54594d1299@googlegroups.com>
In reply to: #93704

Dear Peter Otten,
Incidentally, you have discovered a fault in that there is an erroneous difference in my code of 'ecologicalpyramid.html' and that given in the text, in the first few lines re: 
----------------------------------------------------------------------------
<html> 
<body> 
<div class="ecopyramid"> 
<ul id= "producers"> 
<li class="producers"> 
<li class="producerlist"> 
<div class="name">plants</div> 
<div class="number">100000</div> 
</li> 
<li class="producerlist"> 
<div class="name">algae</div> 
<div class="number">100000</div> 
</li> 
</ul> 
----------------------------------------------------------------------------
<html>
<body>
<div class="ecopyramid">
<ul id= "producers">
<li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
----------------------------------------------------------------------------
I have removed the line <li class="producers"> to give the correct html code, shown in the lower version. Now there is a string ("plants") between the <li class="producerlist"> and </li>.
Sorry about that.
However, as you said, the input code as quoted in the text still won't return 'plants', re:
----------------------------------------------------------------------------
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid,","lxml")
>>> producer_entries = soup.find("ul")
>>> print(producer_entries.li.div.string)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'li'
>>> 
----------------------------------------------------------------------------



#93735

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-12 05:48 -0700
Message-ID: <8ae715b3-91b1-4db2-9e69-cfc8c078939e@googlegroups.com>
In reply to: #93704

Dear Peter Otten, 
Yes, I have been copying and pasting, as it saves typing. I do get 'indented block' error responses as a small price to pay for the time and energy thus saved. Also Console seems to reject for 'indented block' reasons better known to itself, copy and pasted lines that it accepts and  are exactly the same on the following line of input. Maybe it is an inbuilt feature of Python's to discourage copy and pasting.



#93736

From: Peter Otten <__peter__@web.de>
Date: 2015-07-12 15:12 +0200
Message-ID: <mailman.450.1436706744.3674.python-list@python.org>
In reply to: #93735

Simon Evans wrote:

> Dear Peter Otten,
> Yes, I have been copying and pasting, as it saves typing. I do get
> 'indented block' error responses as a small price to pay for the time and
> energy thus saved. Also Console seems to reject for 'indented block'
> reasons better known to itself, copy and pasted lines that it accepts and 
> are exactly the same on the following line of input. Maybe it is an
> inbuilt feature of Python's to discourage copy and pasting.

You should not ignore the error. It means that the respective code was not 
executed.



#93745

From: Larry Hudson <orgnut@yahoo.com>
Date: 2015-07-12 13:06 -0700
Message-ID: <AtmdnZHIzrokVT_InZ2dnUU7-amdnZ2d@giganews.com>
In reply to: #93735

On 07/12/2015 05:48 AM, Simon Evans wrote:
> Dear Peter Otten,
> Yes, I have been copying and pasting, as it saves typing. I do get 'indented block' error responses as a small price
 > to pay for the time and energy thus saved.
 >
You CANNOT ignore indenting.
Indenting is NOT optional, it is REQUIRED by Python syntax.
Changing indenting TOTALLY changes the meaning.

 > Also Console seems to reject for 'indented block' reasons better known to itself, copy and
 > pasted lines that it accepts and  are exactly the same on the following line of input.
 >
As above, that is Python syntax.

 > Maybe it is an inbuilt feature of Python's to discourage copy and pasting.
 >
Absolutely not.  As noted above, this is the way Python is defined to work.

      -=- Larry -=-



#93737

From: Simon Evans <musicalhacksaw@yahoo.co.uk>
Date: 2015-07-12 10:33 -0700
Message-ID: <1e75fe53-743b-4432-8b4e-6897b3d54956@googlegroups.com>
In reply to: #93704

Dear Peter Otten, 
I typed in (and did not copy and paste) the code as you suggested just now (6.28 pm, Sunday 12th July 2015), this is the result I got: 
----------------------------------------------------------------------------
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
... soup = BeautifulSoup(f,"lxml")
  File "<stdin>", line 2
    soup = BeautifulSoup(f,"lxml")
       ^
IndentationError: expected an indented block
>>> soup = BeautifulSoup(f,"lxml")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'f' is not defined
>>>
----------------------------------------------------------------------------
The first time I typed in the second line, I got the 
"Indentation error" 
the second time I typed in exactly the same code, I got the: 
"NameError:name 'f' is not defined"



#93740

From: MRAB <python@mrabarnett.plus.com>
Date: 2015-07-12 19:05 +0100
Message-ID: <mailman.452.1436724512.3674.python-list@python.org>
In reply to: #93737

On 2015-07-12 18:33, Simon Evans wrote:
>
> Dear Peter Otten,
> I typed in (and did not copy and paste) the code as you suggested just now (6.28 pm, Sunday 12th July 2015), this is the result I got:
> ----------------------------------------------------------------------------
> Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
> 32
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from bs4 import BeautifulSoup
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
> ... soup = BeautifulSoup(f,"lxml")
>    File "<stdin>", line 2
>      soup = BeautifulSoup(f,"lxml")
>         ^
> IndentationError: expected an indented block
>>>> soup = BeautifulSoup(f,"lxml")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> NameError: name 'f' is not defined
>>>>
> ----------------------------------------------------------------------------
> The first time I typed in the second line, I got the
> "Indentation error"
> the second time I typed in exactly the same code, I got the:
> "NameError:name 'f' is not defined"
>
Python parses the lines, and, if there are no syntax errors, it then
executes them.

That block of code (the 2 lines) contained a syntax (indentation)
error in the second line, so _none_ of the block was executed, and,
therefore, 'f' isn't defined.
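
The parse-then-execute order can be observed directly with the built-in compile() and exec(). The following is a small sketch rather than a description of how the console is implemented: the two lines are held in a string, "example.html" is a placeholder name that is never opened, and the namespace dictionary stands in for the interpreter session.

```python
# The two lines as typed at the prompt: the body of the "with" is not indented.
# "example.html" is a placeholder name; the file is never opened.
block = 'with open("example.html") as f:\nsoup = f.read()\n'

namespace = {}
rejected = False
try:
    code = compile(block, "<stdin>", "exec")  # parsing happens first
    exec(code, namespace)                     # never reached for this block
except IndentationError as error:
    rejected = True
    print("rejected before execution:", error.msg)

print("f" in namespace)  # False: the "with" line never ran
```

Since nothing in the block ran, typing the second line again on its own can only produce a NameError for 'f'.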



#93741

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-07-12 19:23 +0100
Message-ID: <mailman.453.1436725403.3674.python-list@python.org>
In reply to: #93737

On 12/07/2015 18:33, Simon Evans wrote:
>
> Dear Peter Otten,
> I typed in (and did not copy and paste) the code as you suggested just now (6.28 pm, Sunday 12th July 2015), this is the result I got:
> ----------------------------------------------------------------------------
> Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
> 32
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from bs4 import BeautifulSoup
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
> ... soup = BeautifulSoup(f,"lxml")
>    File "<stdin>", line 2
>      soup = BeautifulSoup(f,"lxml")
>         ^
> IndentationError: expected an indented block
>>>> soup = BeautifulSoup(f,"lxml")
> Traceback (most recent call last):
>    File "<stdin>", line 1, in <module>
> NameError: name 'f' is not defined
>>>>
> ----------------------------------------------------------------------------
> The first time I typed in the second line, I got the
> "Indentation error"
> the second time I typed in exactly the same code, I got the:
> "NameError:name 'f' is not defined"
>

You can tell that to the marines :)

 >>> with open("ecologicalpyramid.html","r")as f:
...     soup = BeautifulSoup(f)
...     producer_entries = soup.find("ul")
...     producer_entries
...
<ul id="producers">
<li class="producers">
</li><li class="producerlist">
<div class="name">plants</div>
<div class="number">100000</div>
</li>
<li class="producerlist">
<div class="name">algae</div>
<div class="number">100000</div>
</li>
</ul>
 >>>

Can I suggest that you slow down.  It strikes me that you're trying to 
run a marathon a day for a year before you can even walk.  For example 
is the file path in your call to open() correct?  Frankly I very much 
doubt it, although it is possible.

Perhaps you'd be more comfortable on the tutor mailing list?  If so see 
https://mail.python.org/mailman/listinfo/tutor or gmane.comp.python.tutor.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#93742

From: "Albert Visser" <albert.visser@gmail.com>
Date: 2015-07-12 20:34 +0200
Message-ID: <mailman.454.1436726068.3674.python-list@python.org>
In reply to: #93737

On Sun, 12 Jul 2015 19:33:17 +0200, Simon Evans  
<musicalhacksaw@yahoo.co.uk> wrote:

>
> Dear Peter Otten,
> I typed in (and did not copy and paste) the code as you suggested just  
> now (6.28 pm, Sunday 12th July 2015), this is the result I got:
> ----------------------------------------------------------------------------
> Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit  
> (Intel)] on win
> 32
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from bs4 import BeautifulSoup
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
> ... soup = BeautifulSoup(f,"lxml")
>   File "<stdin>", line 2
>     soup = BeautifulSoup(f,"lxml")
>        ^
> IndentationError: expected an indented block
>>>> soup = BeautifulSoup(f,"lxml")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> NameError: name 'f' is not defined
>>>>
> ----------------------------------------------------------------------------
> The first time I typed in the second line, I got the
> "Indentation error"
> the second time I typed in exactly the same code, I got the:
> "NameError:name 'f' is not defined"

"Expected an indented block" means that the indicated line should have  
started with at least one whitespace character more than the preceding  
line.

  >>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
  ... soup = BeautifulSoup(f,"lxml")

should have been something like

  >>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
  ...     soup = BeautifulSoup(f,"lxml")
-- 
Vriendelijke groeten / Kind regards,

Albert Visser

Using Opera's mail client: http://www.opera.com/mail/



#93744

From: Laura Creighton <lac@openend.se>
Date: 2015-07-12 21:47 +0200
Message-ID: <mailman.456.1436730478.3674.python-list@python.org>
In reply to: #93737

Simon Evans -- what editor are you using to write your Python code with?

Laura Creighton



#93746

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-07-12 21:09 +0100
Message-ID: <mailman.457.1436731780.3674.python-list@python.org>
In reply to: #93737

On 12/07/2015 20:47, Laura Creighton wrote:
> Simon Evans -- what editor are you using to write your Python code with?
>
> Laura Creighton
>

Editor?  His earlier posts clearly show he's using the 2.7.6 32 bit 
interactive interpreter on Windows.

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#93747

From: Laura Creighton <lac@openend.se>
Date: 2015-07-12 22:29 +0200
Message-ID: <mailman.458.1436732966.3674.python-list@python.org>
In reply to: #93737

In a message of Sun, 12 Jul 2015 21:09:22 +0100, Mark Lawrence writes:
>On 12/07/2015 20:47, Laura Creighton wrote:
>> Simon Evans -- what editor are you using to write your Python code with?
>>
>> Laura Creighton
>>
>
>Editor?  His earlier posts clearly show he's using the 2.7.6 32 bit 
>interactive interpreter on Windows.

He's sending us that stuff, but he may be writing it someplace else
first.  And that someplace else may be spitting out combined
spaces and tabs.  If he is pasting that into the interpreter,
that will cause tons of problems like he is seeing.

Laura
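
What mixed tabs and spaces can do is easy to reproduce from a string. This sketch assumes Python 3, which rejects indentation whose meaning depends on the tab width; Python 2.7, as used in the thread, is more lenient by default, so the symptoms there can differ. The source text is invented for the example.

```python
# Line 2 is indented with one tab, line 3 with eight spaces.
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<pasted>", "exec")
    outcome = "accepted"
except TabError as error:
    outcome = "TabError: " + error.msg

print(outcome)
```

On screen the two indented lines can look identical, which is why an editor that silently mixes the two characters is worth ruling out.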



#93748

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2015-07-12 21:48 +0100
Message-ID: <mailman.459.1436734120.3674.python-list@python.org>
In reply to: #93737

On 12/07/2015 21:29, Laura Creighton wrote:
> In a message of Sun, 12 Jul 2015 21:09:22 +0100, Mark Lawrence writes:
>> On 12/07/2015 20:47, Laura Creighton wrote:
>>> Simon Evans -- what editor are you using to write your Python code with?
>>>
>>> Laura Creighton
>>>
>>
>> Editor?  His earlier posts clearly show he's using the 2.7.6 32 bit
>> interactive interpreter on Windows.
>
> He's sending us that stuff, but he may be writing it someplace else
> first.  And that someplace else may be spitting out combined
> spaces and tabs.  If he is pasting that into the interpreter,
> that will cause tons of problems like he is seeing.
>
> Laura
>

At 18:33 BST Simon stated "I typed in (and did not copy and paste) the 
code...".  I started my reply with "You can tell that to the marines :)".

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#93738

From: Laurent Pointal <laurent.pointal@free.fr>
Date: 2015-07-12 19:54 +0200
Message-ID: <55a2a9b9$0$2970$426a74cc@news.free.fr>
In reply to: #93704

Simon Evans wrote:

<zip>
> ---------------------------------------------------------------------
>>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as

You seem to run Python under Windows.

You have to take care of the escape mechanism behind the \ character in
string literals (see the Python docs).

By chance, \B and \e are not recognized as escape sequences, so they are
left as is. But \a \b \f \n \r \t \v \x will be processed differently.

Double \ in your string:
	"C:\\Beautiful Soup\\ecologicalpyramid.html"

Or use a raw string by prepending an r to disable escape sequences:
	r"C:\Beautiful Soup\ecologicalpyramid.html"


<zip>

A+
Laurent.



#93739

From: Chris Angelico <rosuav@gmail.com>
Date: 2015-07-13 03:58 +1000
Message-ID: <mailman.451.1436723920.3674.python-list@python.org>
In reply to: #93738

On Mon, Jul 13, 2015 at 3:54 AM, Laurent Pointal
<laurent.pointal@free.fr> wrote:
> Double \ in your string:
>         "C:\\Beautiful Soup\\ecologicalpyramid.html"
>
> Or use a raw string by prepending a r to disable escape sequences:
>         r"C:\Beautiful Soup\ecologicalpyramid.html"

Or use forward slashes:
    "C:/Beautiful Soup/ecologicalpyramid.html"

ChrisA
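
The spellings suggested above can be compared side by side. In this short sketch the path "C:\temp\new.html" is made up for the example, chosen because \t and \n are real escape sequences (unlike \B and \e in the path from the thread).

```python
# "C:\temp\new.html" is a made-up path, chosen because \t and \n are real escapes.
plain = "C:\temp\new.html"       # \t becomes a tab, \n a newline
doubled = "C:\\temp\\new.html"   # each \\ is one literal backslash
raw = r"C:\temp\new.html"        # raw string: backslashes are kept as typed
forward = "C:/temp/new.html"     # forward slashes need no escaping

print(len(plain), len(raw))          # 14 16
print(doubled == raw)                # True
print("\t" in plain, "\n" in plain)  # True True
```

The plain literal is two characters shorter because each backslash pair collapsed into a single control character, so it no longer names the intended file.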


