


web scraping

Started by: Ronald Routt <ronroutt@gmail.com>
First post: 2013-10-12 10:12 -0400
Last post: 2013-10-13 11:02 -0700
Articles: 3 (3 participants)



Contents

  web scraping Ronald Routt <ronroutt@gmail.com> - 2013-10-12 10:12 -0400
    Re: web scraping dvghana@gmail.com - 2013-10-12 13:35 -0700
      Re: web scraping John Nagle <nagle@animats.com> - 2013-10-13 11:02 -0700

#56745 — web scraping

From: Ronald Routt <ronroutt@gmail.com>
Date: 2013-10-12 10:12 -0400
Subject: web scraping
Message-ID: <mailman.1039.1381592008.18130.python-list@python.org>

I am new to programming and trying to figure out Python.

I am trying to learn which tools I need to use, along with some good beginner tutorials on scraping the web.  The end result I am trying to come up with is scraping auto dealership sites for the following:

1.  Name of dealership
2.  State where dealership is located
3.  Name of Owner, President or General Manager
4.  Email address of number 3 above
5.  Phone number of dealership

Note:  Many times the Owner, President or General Manager and their email address are listed under a tab on the website such as "Meet our team" or "Support".  Sometimes this information is not available on the website.

I sure would appreciate any help I can get to get me on the right track.  From what I have read so far, I believe I have to use urllib, but I know nothing about how to use it.

Thanks
ronroutt@gmail.com 



#56756

From: dvghana@gmail.com
Date: 2013-10-12 13:35 -0700
Message-ID: <2eee66fb-3c9b-4ebd-969f-55f3e6bae558@googlegroups.com>
In reply to: #56745

On Saturday, October 12, 2013 7:12:38 AM UTC-7, Ronald Routt wrote:
> I am new to programming and trying to figure out Python.
>
> I am trying to learn which tools I need to use, along with some good beginner tutorials on scraping the web.  The end result I am trying to come up with is scraping auto dealership sites for the following:
>
> 1.  Name of dealership
> 2.  State where dealership is located
> 3.  Name of Owner, President or General Manager
> 4.  Email address of number 3 above
> 5.  Phone number of dealership
>
> Note:  Many times the Owner, President or General Manager and their email address are listed under a tab on the website such as "Meet our team" or "Support".  Sometimes this information is not available on the website.
>
> I sure would appreciate any help I can get to get me on the right track.  From what I have read so far, I believe I have to use urllib, but I know nothing about how to use it.
>
> Thanks
> ronroutt@gmail.com

If you are really new to Python, I suggest you go through the tutorial at www.learnpythonthehardway.org. When you are done, search either Google or YouTube for how to use "Beautiful Soup". I believe you should be fine after that.
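
To give a rough idea of what the fetch-and-parse step looks like, here is a minimal sketch. It makes some assumptions: it uses only the standard library (urllib.request to download, html.parser in place of Beautiful Soup, which is a separate install), it treats the page's <title> as the dealership name, and it guesses at US-style phone numbers with a loose regular expression. The names TitleParser, PHONE_RE, fetch, extract and the sample page are made up for illustration; real dealership sites will each need their own handling.

```python
import re
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collect the text found inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


# Assumption: a loose pattern for numbers like (555) 010-1234 or 555-010-1234.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]\d{4}")


def fetch(url):
    """Download a page and return it as text (needs network access)."""
    with urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def extract(html):
    """Return the page title and any phone-like strings found in the HTML."""
    parser = TitleParser()
    parser.feed(html)
    return {"name": parser.title.strip(), "phones": PHONE_RE.findall(html)}


# Hypothetical page, so the example runs without touching the network.
SAMPLE = """<html><head><title> Example Motors </title></head>
<body><p>Call us at (555) 010-1234.</p></body></html>"""

print(extract(SAMPLE))
```

On a real site you would call extract(fetch(url)) instead of using SAMPLE. Beautiful Soup makes the parsing half much shorter once you have it installed; the fetching half stays about the same.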



#56774

From: John Nagle <nagle@animats.com>
Date: 2013-10-13 11:02 -0700
Message-ID: <l3en80$td8$1@dont-email.me>
In reply to: #56756

On 10/12/2013 1:35 PM, dvghana@gmail.com wrote:
> On Saturday, October 12, 2013 7:12:38 AM UTC-7, Ronald Routt wrote:
>> I am new to programming and trying to figure out python.
>> 
>> 
>> 
>> I am trying to learn which tools I need to use, along
>> with some good beginner tutorials on scraping the web.  The end
>> result I am trying to come up with is scraping auto dealership
>> sites for the following:
>> 
>> 1.  Name of dealership
>> 2.  State where dealership is located 
>> 3.  Name of Owner, President or General Manager 
>> 4.  Email address of number 3 above
>> 5.  Phone number of dealership

   If you really want that data, and aren't just hacking, buy it.
There are data brokers that will sell it to you. D&B, FindTheCompany,
Infot, etc.

   Sounds like you want to spam. Don't.

				John Nagle




