| From | Cem Karan <cfkaran2@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: How can I count word frequency in a web site? |
| Date | 2015-11-29 21:31 -0500 |
| Message-ID | <mailman.14.1448850720.14615.python-list@python.org> |
| References | <6851e3b8-0d46-4808-9f7f-372b71bf327c@googlegroups.com> |
You might want to look into Beautiful Soup (https://pypi.python.org/pypi/beautifulsoup4), which is an HTML screen-scraping tool. I've never used it, but I've heard good things about it.

Good luck,
Cem Karan

On Nov 29, 2015, at 7:49 PM, ryguy7272 <ryanshuell@gmail.com> wrote:

> I'm trying to figure out how to count words in a web site. Here is a sample of the link I want to scrape data from and count specific words.
> http://finance.yahoo.com/q/h?s=STRP+Headlines
>
> I only want to count certain words, like 'fraud', 'lawsuit', etc. I want to have a way to control for specific words. I have a couple Python scripts that do this for a text file, but not for a web site. I can post that, if that's helpful.
>
> --
> https://mail.python.org/mailman/listinfo/python-list
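For anyone landing on this thread, here is a rough sketch of the kind of thing being asked about: pull the visible text out of a page's HTML and count only a chosen set of words. To keep it dependency-free it uses the standard library's `html.parser` rather than Beautiful Soup, so it is a stand-in for the suggestion above, not an example of that library. The names `TextExtractor`, `count_words`, and `sample_html` are made up for illustration, and the HTML is an inline sample rather than the Yahoo page from the question.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html, targets):
    """Return {word: count} for each target word, matched case-insensitively."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Crude tokenizer (an assumption): runs of ASCII letters and apostrophes.
    counts = Counter(re.findall(r"[a-z']+", text))
    return {word: counts[word.lower()] for word in targets}


# Inline sample page (hypothetical); a real script would download the HTML
# first, for example with urllib.request.urlopen(url).read().decode(...).
sample_html = """
<html><head><style>p { color: red; }</style></head>
<body><h1>Fraud probe widens</h1>
<p>A lawsuit alleges fraud; a second lawsuit is expected.</p>
<script>var fraud = 1;</script></body></html>
"""

result = count_words(sample_html, ["fraud", "lawsuit", "recall"])
print(result)
```

With Beautiful Soup installed, the extractor class could likely be replaced by `BeautifulSoup(html, "html.parser").get_text()`, though that call also returns script and style text unless those tags are removed first. Only whole-word matches are counted here, so "lawsuits" would not count toward "lawsuit".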
How can I count word frequency in a web site? ryguy7272 <ryanshuell@gmail.com> - 2015-11-29 16:49 -0800
Re: How can I count word frequency in a web site? Cem Karan <cfkaran2@gmail.com> - 2015-11-29 21:31 -0500
Re: How can I count word frequency in a web site? ryguy7272 <ryanshuell@gmail.com> - 2015-11-29 18:54 -0800
Re: How can I count word frequency in a web site? Michiel Overtoom <motoom@xs4all.nl> - 2015-11-30 08:56 +0100
Re: How can I count word frequency in a web site? Laura Creighton <lac@openend.se> - 2015-11-30 03:51 +0100
Re: How can I count word frequency in a web site? ryguy7272 <ryanshuell@gmail.com> - 2015-11-30 07:04 -0800