Re: Scrapy/XPath help

Date Fri, 21 Dec 2012 15:34:11 -0600
Subject Re: Scrapy/XPath help
From Grant Rettke <grettke@acm.org>
To Always Learning <cbrowning@ou.edu>
Newsgroups comp.lang.python
Message-ID <mailman.1169.1356125659.29569.python-list@python.org>

You might have better luck if you share your Python implementation and
version, OS, the full error message, and some unit tests demonstrating
what you expect.

On Fri, Dec 21, 2012 at 3:21 PM, Always Learning <cbrowning@ou.edu> wrote:
> Hello all. I'm new to Python, but have been playing around with it for a few weeks now, following tutorials, etc. I've spun off on my own and am trying to do some basic web scraping. I've used Firebug/View XPath in Firefox for some help with the XPaths; however, I am still receiving errors when I try to run this script. If you could help, it would be greatly appreciated!
>
> from scrapy.spider import BaseSpider
> from scrapy.selector import HtmlXPathSelector
> from cbb_info.items import CbbInfoItem, Field
>
> class GameInfoSpider(BaseSpider):
>     name = "game_info"
>     allowed_domains = ["www.sbrforum.com"]
>     start_urls = [
>         'http://www.sbrforum.com/betting-odds/ncaa-basketball/',
>         ]
>
>     def parse(self, response):
>         hxs = HtmlXPathSelector(response)
>         toplevels = hxs.select("//div[@class='eventLine-value']")
>         items = []
>         for toplevels in toplevels:
>             item = CbbInfoItem()
>             item ["teams"] = toplevels.select("/span[@class='team-name'/text()").extract()
>             item ["lines"] = toplevels.select("/div[@rel='19']").extract()
>             item.append(item)
>         return items
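
Reading the quoted script without the traceback, a few things stand out: the first XPath is missing the closing `]` of its predicate; both inner paths start with `/`, which XPath treats as absolute from the document root rather than relative to the selected `div`; the loop variable reuses the name `toplevels`; and `item.append(item)` never adds anything to `items`. The sketch below illustrates those fixes using only the standard library's `xml.etree.ElementTree` in place of Scrapy's selectors, so it runs without Scrapy installed. The `SAMPLE` markup, the team names, and the `parse_events` helper are assumptions for illustration; the real page structure is not shown in the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for the real page,
# whose actual structure is not shown in the post.
SAMPLE = """
<html><body>
  <div class="eventLine-value">
    <span class="team-name">Team A</span>
    <div rel="19">-3.5</div>
  </div>
  <div class="eventLine-value">
    <span class="team-name">Team B</span>
    <div rel="19">+3.5</div>
  </div>
</body></html>
""".strip()


def parse_events(markup):
    """Return one dict per 'eventLine-value' div (illustrative helper)."""
    root = ET.fromstring(markup)
    items = []
    # Use a distinct loop variable so the collection is not shadowed.
    for toplevel in root.findall(".//div[@class='eventLine-value']"):
        item = {
            # Predicate bracket is closed, and the path is relative
            # to the current div ("./...") rather than absolute ("/...").
            "teams": [s.text for s in toplevel.findall("./span[@class='team-name']")],
            "lines": [d.text for d in toplevel.findall("./div[@rel='19']")],
        }
        # Append to the result list, not to the item itself.
        items.append(item)
    return items


if __name__ == "__main__":
    for event in parse_events(SAMPLE):
        print(event)
```

With Scrapy's selectors the same ideas would presumably carry over: a relative path such as `span[@class='team-name']/text()` on each selected node, assuming the spans really are children of that `div`. ElementTree supports only a subset of XPath and has no `text()` step, which is why the sketch reads `.text` instead.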



-- 
Grant Rettke | ACM, AMA, COG, IEEE
grettke@acm.org | http://www.wisdomandwonder.com/
Wisdom begins in wonder.
((λ (x) (x x)) (λ (x) (x x)))
