| Started by | Ronald Routt <ronroutt@gmail.com> |
|---|---|
| First post | 2013-10-12 10:12 -0400 |
| Last post | 2013-10-13 11:02 -0700 |
| Articles | 3 — 3 participants |
web scraping Ronald Routt <ronroutt@gmail.com> - 2013-10-12 10:12 -0400
Re: web scraping dvghana@gmail.com - 2013-10-12 13:35 -0700
Re: web scraping John Nagle <nagle@animats.com> - 2013-10-13 11:02 -0700
| From | Ronald Routt <ronroutt@gmail.com> |
|---|---|
| Date | 2013-10-12 10:12 -0400 |
| Subject | web scraping |
| Message-ID | <mailman.1039.1381592008.18130.python-list@python.org> |
I am new to programming and trying to figure out Python.

I am trying to learn which tools and tutorials I need to use, along with some good beginner tutorials on scraping the web. The end result I am trying to come up with is scraping auto dealership sites for the following:

1. Name of dealership
2. State where dealership is located
3. Name of Owner, President or General Manager
4. Email address of number 3 above
5. Phone number of dealership

Note: Many times the Owner, President or General Manager and their email address are under a tab on the website such as "Meet our team" or "Support". Sometimes this information is not available on the website.

I sure would appreciate any help I can get to get me on the right track. From what I have read so far, I believe I have to use urllib, but I know nothing about how to use it.

Thanks
ronroutt@gmail.com
| From | dvghana@gmail.com |
|---|---|
| Date | 2013-10-12 13:35 -0700 |
| Message-ID | <2eee66fb-3c9b-4ebd-969f-55f3e6bae558@googlegroups.com> |
| In reply to | #56745 |
On Saturday, October 12, 2013 7:12:38 AM UTC-7, Ronald Routt wrote:
> I am new to programming and trying to figure out Python.
>
> I am trying to learn which tools and tutorials I need to use, along with some good beginner tutorials on scraping the web. The end result I am trying to come up with is scraping auto dealership sites for the following:
>
> 1. Name of dealership
> 2. State where dealership is located
> 3. Name of Owner, President or General Manager
> 4. Email address of number 3 above
> 5. Phone number of dealership
>
> Note: Many times the Owner, President or General Manager and their email address are under a tab on the website such as "Meet our team" or "Support". Sometimes this information is not available on the website.
>
> I sure would appreciate any help I can get to get me on the right track. From what I have read so far, I believe I have to use urllib, but I know nothing about how to use it.
>
> Thanks
> ronroutt@gmail.com

If you are really new to Python, I suggest you go through the tutorial at www.learnpythonthehardway.org. When you are done, search either Google or YouTube for how to use "Beautiful Soup". I believe you should be fine.
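For readers following along, here is a minimal sketch of the fetch-then-parse idea the posts above point toward. It is not from the thread: it assumes Python 3, uses the standard library's `urllib.request` and `html.parser` in place of Beautiful Soup so that it runs without installing anything, and works on a made-up page. The names `fetch_html`, `TitleAndTextParser`, `extract_dealership_info`, `PHONE_RE`, and the sample HTML are all hypothetical. Real dealership sites vary widely, so treat it as a starting point only.

```python
import re
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a page and return its text. Defined for illustration;
    it is not called below, so the sketch runs without network access."""
    request = Request(url, headers={"User-Agent": "learning-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TitleAndTextParser(HTMLParser):
    """Collect the <title> text and every other non-empty text node.
    Note: this simple version also collects <script>/<style> contents."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        elif data.strip():
            self.text_parts.append(data.strip())


# Assumed US-style phone format, e.g. (555) 010-0123 or 555-010-0123.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def extract_dealership_info(html):
    """Return the page title as the name and the first phone-like string."""
    parser = TitleAndTextParser()
    parser.feed(html)
    text = " ".join(parser.text_parts)
    match = PHONE_RE.search(text)
    return {
        "name": parser.title.strip(),
        "phone": match.group(0) if match else None,
    }


# Made-up page standing in for a real site.
SAMPLE_HTML = """
<html><head><title>Example Motors</title></head>
<body><h1>Welcome</h1><p>Call us at (555) 010-0123 today.</p></body></html>
"""

info = extract_dealership_info(SAMPLE_HTML)
print(info)
```

To try it on a live page, pass the result of `fetch_html(url)` to `extract_dealership_info`. Beautiful Soup, as suggested above, would replace the hand-written parser class with a few lookup calls and copes better with messy markup.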
| From | John Nagle <nagle@animats.com> |
|---|---|
| Date | 2013-10-13 11:02 -0700 |
| Message-ID | <l3en80$td8$1@dont-email.me> |
| In reply to | #56756 |
On 10/12/2013 1:35 PM, dvghana@gmail.com wrote:
> On Saturday, October 12, 2013 7:12:38 AM UTC-7, Ronald Routt wrote:
>> I am new to programming and trying to figure out Python.
>>
>> I am trying to learn which tools and tutorials I need to use, along
>> with some good beginner tutorials on scraping the web. The end
>> result I am trying to come up with is scraping auto dealership
>> sites for the following:
>>
>> 1. Name of dealership
>> 2. State where dealership is located
>> 3. Name of Owner, President or General Manager
>> 4. Email address of number 3 above
>> 5. Phone number of dealership

If you really want that data, and aren't just hacking, buy it. There are data brokers that will sell it to you. D&B, FindTheCompany, Infot, etc.

Sounds like you want to spam. Don't.

John Nagle