Path: csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail
MIME-Version: 1.0
In-Reply-To: <bfd5cc17-8901-47b4-944f-7841c8d7cc15@googlegroups.com>
References: <a50210f8-8959-46da-a386-2d9a7a17a79e@googlegroups.com> <mailman.81.1377099024.19984.python-list@python.org> <bfd5cc17-8901-47b4-944f-7841c8d7cc15@googlegroups.com>
Date: Wed, 21 Aug 2013 11:58:30 -0400
Subject: Re: I wonder if I would be able to collect data from such page using Python
From: Joel Goldstick <joel.goldstick@gmail.com>
To: Comment Holder <commentholder@gmail.com>
Content-Type: text/plain; charset=UTF-8
Cc: "python-list@python.org" <python-list@python.org>
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.83.1377100719.19984.python-list@python.org>
Lines: 32
NNTP-Posting-Host: 2001:888:2000:d::a6
Xref: csiph.com comp.lang.python:52770

On Wed, Aug 21, 2013 at 11:44 AM, Comment Holder
<commentholder@gmail.com> wrote:
> Many thanks Joel,
>
> You are right to some extent. I come from a finance background, but I am very familiar with what could be referred to as non-native languages such as Matlab and VBA; in fact, I have developed a couple of complete programs.
>
> I have asked this question, because I am a little worried about the structure of this particular page, as there are no specific defined classes.
>
> I know how powerful Python is, but I wonder if it could do the job with this particular page.
>
> Again, many thanks Joel, I appreciate your guidance.
> All Best//

Your biggest hurdle will be getting proficient with Python.  Give
yourself a weekend with a good tutorial.  You won't be very skilled,
but you will get the gist of things.

Also, google Beautiful Soup.  You need the latest version; it's v4, I
think.  They have a GREAT tutorial.  Spend a few hours with it and you
will see your way to getting the data you want from your web pages.
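
On your worry about the page having no defined classes: Beautiful Soup
can also pick elements by tag structure alone.  Here is a minimal
sketch, assuming the data sits in an ordinary HTML table.  The HTML
string and the table_rows name are made up for illustration; your real
page will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for the real page:
# a plain table with no class or id attributes.
HTML = """
<html><body>
<table>
  <tr><th>Ticker</th><th>Price</th></tr>
  <tr><td>ABC</td><td>10.50</td></tr>
  <tr><td>XYZ</td><td>7.25</td></tr>
</table>
</body></html>
"""


def table_rows(html):
    """Return each table row as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Header and data cells are matched by tag name only.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


if __name__ == "__main__":
    for row in table_rows(HTML):
        print(row)
```

If the page has several tables, you would likely need to narrow things
down by position or by nearby text before walking the rows.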

Since you gave a sample web page, I am guessing that you need to log
in to the site for 'real data'.  For that, you need to really
understand things you might not yet, such as how logins and cookies
work.  At any rate, study the Requests module documentation.  Python
comes with urllib and urllib2, which cover the same ground, but
Requests is a lot simpler to understand.
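
To give a rough idea of the log-in step with Requests, here is a
sketch, not a recipe for your site.  It assumes the site uses a simple
form-based login; the URLs, the form field names, and the
fetch_after_login helper are all hypothetical.  Many sites also want
hidden form fields or tokens, which this does not handle.

```python
import requests


def fetch_after_login(session, login_url, data_url, credentials):
    """Post the login form, then fetch a page with the same session.

    The session keeps any cookies set by the login response, so the
    second request is sent as the logged-in user.
    """
    response = session.post(login_url, data=credentials, timeout=30)
    response.raise_for_status()  # stop early on an HTTP error status
    page = session.get(data_url, timeout=30)
    page.raise_for_status()
    return page.text


# Example use (placeholder URLs and field names, adjust for the real site):
#
#   with requests.Session() as session:
#       html = fetch_after_login(
#           session,
#           "https://example.com/login",
#           "https://example.com/data",
#           {"username": "me", "password": "secret"},
#       )
```

The returned HTML can then be handed to Beautiful Soup for parsing.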

-- 
Joel Goldstick
http://joelgoldstick.com