| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Subject | Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER' |
| Date | 2012-08-18 19:56 +0200 |
| References | <cb6f13e3-f189-44e6-8aac-f11d3e7fa7ba@googlegroups.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3465.1345312619.4697.python-list@python.org> |

Dmitry Arsentiev, 15.08.2012 14:49:
> Has anybody already meet the problem like this? -
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
>
> When I run scrapy, I get
>
> File "/usr/local/lib/python2.7/site-packages/scrapy/selector/factories.py",
> line 14, in <module>
> libxml2.HTML_PARSE_NOERROR + \
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
>
>
> When I run
> python -c 'import libxml2; libxml2.HTML_PARSE_RECOVER'
>
> I get
> Traceback (most recent call last):
> File "<string>", line 1, in <module>
> AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
>
> How can I cure it?
>
> Python 2.7
> libxml2-python 2.6.9
> 2.6.11-gentoo-r6

That version of libxml2 is way too old and doesn't support parsing
real-world HTML. IIRC, that started with 2.6.21 and got improved a bit
after that.

Get a 2.8.0 installation, as someone pointed out already.

Stefan
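
For anyone hitting the same error, a quick way to confirm the diagnosis is to compare the installed version against the threshold mentioned above and to probe the module for the missing constant. The following is only a rough sketch: the 2.6.21 cut-off is the "IIRC" figure from this reply rather than a verified number, and the helper names (`version_tuple`, `is_new_enough`, `has_recover_flag`) are made up for illustration.

```python
import re

# Assumption: threshold taken from the reply above ("IIRC ... 2.6.21").
# Treat it as approximate, not as a documented boundary.
MIN_VERSION = (2, 6, 21)


def version_tuple(text):
    """Turn a string such as '2.6.11-gentoo-r6' into (2, 6, 11)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (text,))
    return tuple(int(part) for part in match.groups())


def is_new_enough(text, minimum=MIN_VERSION):
    """Compare a version string against the assumed minimum."""
    return version_tuple(text) >= minimum


def has_recover_flag():
    """True/False if the bindings import, None if they are not installed."""
    try:
        import libxml2
    except ImportError:
        return None
    # Same lookup that fails in the traceback, but without raising.
    return hasattr(libxml2, "HTML_PARSE_RECOVER")


if __name__ == "__main__":
    for reported in ("2.6.9", "2.6.11-gentoo-r6", "2.8.0"):
        print(reported, is_new_enough(reported))
    print("HTML_PARSE_RECOVER present:", has_recover_flag())
```

Both version strings quoted in the original post fall below the assumed threshold, while 2.8.0 passes, which matches the advice to upgrade.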

python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    Dmitry Arsentiev <dmarsentev@gmail.com> - 2012-08-15 05:49 -0700
Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    Dieter Maurer <dieter@handshake.de> - 2012-08-16 07:19 +0200
Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    personificator@gmail.com - 2012-08-16 18:57 -0700
Re: python+libxml2+scrapy AttributeError: 'module' object has no attribute 'HTML_PARSE_RECOVER'
    Stefan Behnel <stefan_ml@behnel.de> - 2012-08-18 19:56 +0200