From: Laura Creighton
Newsgroups: comp.lang.python
Subject: Re: Does Python allow variables to be passed into function for dynamic screen scraping?
Date: Sat, 28 Nov 2015 23:44:21 +0100
References: <48f7bb74-93f0-4bf8-b781-e7f4b2daf032@googlegroups.com>

In a message of Sat, 28 Nov 2015 14:37:26 -0800, ryguy7272 writes:
>On Saturday, November 28, 2015 at 5:28:55 PM UTC-5, Laura Creighton wrote:
>> In a message of Sat, 28 Nov 2015 14:03:10 -0800, ryguy7272 writes:
>> >I'm looking at this URL.
>> >https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names
>> >
>> >If I hit F12 I can see tags such as these:
>>
>>
>> >And so on and so forth.
>> >
>> >I'm wondering if someone can share a script, or a function, that will
>> >allow me to pass in variables and download (or simply print) the
>> >results. I saw a sample online that I thought would work, and I made
>> >a few modifications but now I keep getting a message that says:
>> >ValueError: All objects passed were None
>> >
>> >Here's the script that I'm playing around with.
>> >
>> >import requests
>> >import pandas as pd
>> >from bs4 import BeautifulSoup
>> >
>> >#Get the relevant webpage set the data up for parsing
>> >url = "https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names"
>> >r = requests.get(url)
>> >soup=BeautifulSoup(r.content,"lxml")
>> >
>> >#set up a function to parse the "soup" for each category of information and put it in a DataFrame
>> >def get_match_info(soup,tag,class_name):
>> >    info_array=[]
>> >    for info in soup.find_all('%s'%tag,attrs={'class':'%s'%class_name}):
>> >        return pd.DataFrame(info_array)
>> >
>> >#for each category pass the above function the relevant information i.e. tag names
>> >tag1 = get_match_info(soup,"td","title")
>> >tag2 = get_match_info(soup,"td","class")
>> >
>> >#Concatenate the DataFrames to present a final table of all the above info
>> >match_info = pd.concat([tag1,tag2],ignore_index=False,axis=1)
>> >
>> >print match_info
>> >
>> >I'd greatly appreciate any help with this.
>>
>> Post your error traceback. If you are getting Value Errors about None,
>> then probably something you expect to return a match, isn't. But without
>> the actual error, we cannot help much.
>>
>> Laura
>
>
>Ok. How do I post the error traceback? I'm using Spyder Python 2.7.

You cut and paste it out of wherever you are reading it, and paste it
into the email, along with your code, also cut and pasted from somewhere
(like an editor). That way we get the exact code that caused the exact
traceback you are getting.

Laura
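
If selecting the text in the console is awkward, one option is to have Python hand you the traceback as a string, which you can then print and copy in full. What follows is only a minimal sketch using the standard library's traceback module. The names run_and_capture and failing_example are made up for illustration, and failing_example merely stands in for the real script by raising the error message quoted above.

```python
import traceback


def run_and_capture(func, *args, **kwargs):
    """Call func; return (result, traceback_text).

    traceback_text is None when func succeeds. (Hypothetical helper,
    not part of any library.)
    """
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() returns the text the interpreter would normally
        # print, beginning with "Traceback (most recent call last):".
        return None, traceback.format_exc()


def failing_example():
    # Stand-in for the real script (an assumption for this sketch):
    # it raises the same exception type and message as in the report.
    raise ValueError("All objects passed were None")


result, tb_text = run_and_capture(failing_example)
if tb_text is not None:
    # Copy everything printed here into the email, unedited.
    print(tb_text)
```

The point of capturing the whole string rather than only the last line is that the earlier frames show which call in the script actually failed.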