| Newsgroups | comp.lang.python |
|---|---|
| Date | 2015-11-28 14:37 -0800 |
| References | <e13afc4b-ac4e-4a75-bca6-1c7be9399cb6@googlegroups.com> <mailman.1.1448749716.14615.python-list@python.org> |
| Message-ID | <48f7bb74-93f0-4bf8-b781-e7f4b2daf032@googlegroups.com> |
| Subject | Re: Does Python allow variables to be passed into function for dynamic screen scraping? |
| From | ryguy7272 <ryanshuell@gmail.com> |
On Saturday, November 28, 2015 at 5:28:55 PM UTC-5, Laura Creighton wrote:
> In a message of Sat, 28 Nov 2015 14:03:10 -0800, ryguy7272 writes:
> >I'm looking at this URL.
> >https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names
> >
> >If I hit F12 I can see tags such as these:
> ><a title=
> ><a class=
> >And so on and so forth.
> >
> >I'm wondering if someone can share a script, or a function, that will allow me to pass in variables and download (or simply print) the results. I saw a sample online that I thought would work, and I made a few modifications but now I keep getting a message that says: ValueError: All objects passed were None
> >
> >Here's the script that I'm playing around with.
> >
> >import requests
> >import pandas as pd
> >from bs4 import BeautifulSoup
> >
> >#Get the relevant webpage set the data up for parsing
> >url = "https://en.wikipedia.org/wiki/Wikipedia:Unusual_place_names"
> >r = requests.get(url)
> >soup=BeautifulSoup(r.content,"lxml")
> >
> >#set up a function to parse the "soup" for each category of information and put it in a DataFrame
> >def get_match_info(soup,tag,class_name):
> >    info_array=[]
> >    for info in soup.find_all('%s'%tag,attrs={'class':'%s'%class_name}):
> >        return pd.DataFrame(info_array)
> >
> >#for each category pass the above function the relevant information i.e. tag names
> >tag1 = get_match_info(soup,"td","title")
> >tag2 = get_match_info(soup,"td","class")
> >
> >#Concatenate the DataFrames to present a final table of all the above info
> >match_info = pd.concat([tag1,tag2],ignore_index=False,axis=1)
> >
> >print match_info
> >
> >I'd greatly appreciate any help with this.
>
> Post your error traceback. If you are getting ValueErrors about None,
> then something you expect to return a match probably isn't returning one.
> But without the actual error, we cannot help much.
>
> Laura
OK. How do I post the error traceback? I'm using Spyder with Python 2.7.
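
For reference, a traceback is the block of text Python prints when an exception goes uncaught, beginning with `Traceback (most recent call last):`. In Spyder it should show up in the console pane, and copying that whole block into a reply is usually all that is being asked for. If the text has to be captured from inside a script instead, the standard `traceback` module can return it as a string. What follows is only a minimal sketch: `first_match`, `use_result`, and `capture_traceback` are hypothetical names made up for illustration, not part of the script quoted above, and the helper deliberately imitates a function that returns None when its loop never runs.

```python
import traceback


def first_match(items, wanted):
    # Hypothetical stand-in for a scraping helper: the return sits inside
    # the loop, so an empty input falls through and the result is None.
    for item in items:
        if item == wanted:
            return item


def use_result():
    result = first_match([], "td")  # nothing to match, so result is None
    return result.upper()           # raises AttributeError on None


def capture_traceback(func, *args):
    """Run func(*args); return the traceback text if it raises, else None."""
    try:
        func(*args)
    except Exception:
        # format_exc() returns the same text Python would print to the console
        return traceback.format_exc()
    return None


report = capture_traceback(use_result)
print(report)
```

The printed text names the file, line number, and function for each step that led to the error, which is what lets someone else see which call returned None.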
Does Python allow variables to be passed into function for dynamic screen scraping? ryguy7272 <ryanshuell@gmail.com> - 2015-11-28 14:03 -0800
Re: Does Python allow variables to be passed into function for dynamic screen scraping? Laura Creighton <lac@openend.se> - 2015-11-28 23:28 +0100
Re: Does Python allow variables to be passed into function for dynamic screen scraping? ryguy7272 <ryanshuell@gmail.com> - 2015-11-28 14:37 -0800
Re: Does Python allow variables to be passed into function for dynamic screen scraping? Laura Creighton <lac@openend.se> - 2015-11-28 23:44 +0100
Re: Does Python allow variables to be passed into function for dynamic screen scraping? Steven D'Aprano <steve@pearwood.info> - 2015-11-29 12:58 +1100
Re: Does Python allow variables to be passed into function for dynamic screen scraping? ryguy7272 <ryanshuell@gmail.com> - 2015-11-28 20:52 -0800