Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'

From Stefan Behnel <stefan_ml@behnel.de>
Subject Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
Date 2012-08-18 19:56 +0200
References <cb6f13e3-f189-44e6-8aac-f11d3e7fa7ba@googlegroups.com>
Newsgroups comp.lang.python
Message-ID <mailman.3465.1345312619.4697.python-list@python.org>


Dmitry Arsentiev, 15.08.2012 14:49:
> Has anybody already met a problem like this?
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> When I run scrapy, I get
> 
>   File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py", line 14, in <module>
>     libxml2.HTML_PARSE_NOERROR + \
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> 
> When I run
>  python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'
> 
> I get
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
> 
> How can I cure it?
> 
> Python 2.7
> libxml2-python 2.6.9
> 2.6.11-gentoo-r6

That version of libxml2 is way too old and doesn't support parsing
real-world HTML. IIRC, that support started with 2.6.21 and got improved
a bit after that.

Get a 2.8.0 installation, as someone pointed out already.
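
To confirm whether an installed binding is recent enough before running scrapy, something like the following rough sketch may help. The helper names (`missing_flags`, `version_tuple`, `check_libxml2`) are made up for illustration and are not part of libxml2 or scrapy; the only assumption about the real module is that it can be imported as `libxml2` and exposes the two constants named in the traceback.

```python
import importlib

# Constants that appear in the traceback above; scrapy's selector
# module expects the libxml2 binding to define them.
REQUIRED_FLAGS = ("HTML_PARSE_RECOVER", "HTML_PARSE_NOERROR")


def missing_flags(module, names=REQUIRED_FLAGS):
    """Return the names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


def version_tuple(text):
    """Turn a dotted version string such as '2.6.21' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def check_libxml2():
    """Describe whether the importable libxml2 binding has the flags."""
    try:
        libxml2 = importlib.import_module("libxml2")
    except ImportError:
        return "libxml2 Python bindings are not installed"
    missing = missing_flags(libxml2)
    if missing:
        return "libxml2 bindings lack: " + ", ".join(missing)
    return "libxml2 bindings expose the HTML parser flags"


if __name__ == "__main__":
    print(check_libxml2())
    # Compare versions numerically, not as strings.
    print(version_tuple("2.6.9") >= version_tuple("2.6.21"))
```

The versions are compared as integer tuples because a plain string comparison would rank "2.6.9" above "2.6.21".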

Stefan


Thread

python+libxml2+scrapy  AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER' Dmitry Arsentiev <dmarsentev@gmail.com> - 2012-08-15 05:49 -0700
  Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER' Dieter Maurer <dieter@handshake.de> - 2012-08-16 07:19 +0200
  Re: python+libxml2+scrapy  AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER' personificator@gmail.com - 2012-08-16 18:57 -0700
  Re: python+libxml2+scrapy  AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER' Stefan Behnel <stefan_ml@behnel.de> - 2012-08-18 19:56 +0200
