Path: csiph.com!usenet.pasdenom.info!gegeweb.org!usenet-fr.net!nerim.net!novso.com!newsfeed.xs4all.nl!newsfeed3.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <02caf0a8-1506-4746-9136-3452cbdea14b@googlegroups.com>
References: <a50210f8-8959-46da-a386-2d9a7a17a79e@googlegroups.com> <mailman.81.1377099024.19984.python-list@python.org> <bfd5cc17-8901-47b4-944f-7841c8d7cc15@googlegroups.com> <mailman.83.1377100719.19984.python-list@python.org> <02caf0a8-1506-4746-9136-3452cbdea14b@googlegroups.com>
Date: Wed, 21 Aug 2013 13:52:18 -0400
Subject: Re: I wonder if I would be able to collect data from such page using Python
From: Joel Goldstick <joel.goldstick@gmail.com>
To: Comment Holder <commentholder@gmail.com>
Content-Type: text/plain; charset=UTF-8
Cc: "python-list@python.org" <python-list@python.org>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.89.1377107547.19984.python-list@python.org>
Lines: 22
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:52777

On Wed, Aug 21, 2013 at 1:41 PM, Comment Holder <commentholder@gmail.com> wrote:
> Dear Joel,
>
> Many thanks for your help - I think I shall start with this way and see how it goes. My concerns were if the task can be accomplished with Python, and from your posts, I guess it can - so I shall give it a try :).
>
> Again, thanks a lot & all best//


You're welcome.  One thought popped into my mind.  Since the site
seems to be from the Wall Street Journal, you may want to look into
whether they have an API for searching and retrieving articles.  If
they do, this would be simpler and probably safer than parsing web
pages.  From time to time, websites change their layout, which would
probably break your program.  APIs, however, tend to be more stable.
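
To make the layout point concrete, here is a rough sketch using only
the standard library.  Everything in it is made up for illustration:
the HTML snippets, the "headline" class name, and the JSON shape are
assumptions, not the real site's markup or any API the Wall Street
Journal actually offers.  It just shows how a parser tied to one
layout silently returns nothing after a redesign, while code reading
structured JSON does not care how the page looks.

```python
import json
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collect text inside <h2 class="headline"> tags (a hypothetical layout)."""

    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h2" and ("class", "headline") in attrs:
            self.in_headline = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_headline = False

    def handle_data(self, data):
        if self.in_headline and data.strip():
            self.headlines.append(data.strip())


def scrape_headlines(html):
    """Extract headlines by relying on the page's tag and class names."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


def api_headlines(payload):
    """Extract headlines from a hypothetical JSON API response."""
    return [item["title"] for item in json.loads(payload)["articles"]]


# Invented sample data: the same article before and after a redesign,
# plus what an API response might look like.
old_page = '<div><h2 class="headline">Markets rally</h2></div>'
new_page = '<div><h3 class="title">Markets rally</h3></div>'
api_response = '{"articles": [{"title": "Markets rally"}]}'

print(scrape_headlines(old_page))   # finds the headline
print(scrape_headlines(new_page))   # finds nothing after the redesign
print(api_headlines(api_response))  # unaffected by page layout
```

Note that the scraper does not raise an error on the new layout, it
just comes back empty, so if you do go the parsing route it is worth
checking for empty results.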

Good luck to you.
-- 
Joel Goldstick
http://joelgoldstick.com