


Re: I wonder if I would be able to collect data from such page using Python

References <a50210f8-8959-46da-a386-2d9a7a17a79e@googlegroups.com>
Date 2013-08-21 11:30 -0400
Subject Re: I wonder if I would be able to collect data from such page using Python
From Joel Goldstick <joel.goldstick@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.81.1377099024.19984.python-list@python.org>



On Wed, Aug 21, 2013 at 10:55 AM, Comment Holder
<commentholder@gmail.com> wrote:
> Hi,
> I am totally new to Python. I noticed that there are many videos showing how to collect data with Python, but I am not sure if I would be able to accomplish my goal using Python so I can start learning.
>
> Here is the example of the target page:
> http://and.medianewsonline.com/hello.html
> In this example, there are 10 articles.
>
> What I exactly need is to do the following:
> 1- Collect the article title, date, source, and contents.
> 2- I need to be able to export the final results to excel or a database client. That is, I need to have all of those specified in step 1 in one row, while each of them saved in separate column. For example:
>
> Title1    Date1   Source1   Contents1
> Title2    Date2   Source2   Contents2
>
> I appreciate any advice regarding my case.
>
> Thanks & Regards//
> --
> http://mail.python.org/mailman/listinfo/python-list

I'm guessing that you are not only new to Python, but that you haven't
much experience in writing computer programs at all.  So, you need to
learn the basics of programming first.  There is a good tutorial on the
Python site, and lots of links to other resources.

Then do this:

1. Write code to access the page you require.  The Requests module can
help with that.
2. Write code to select the data you want.  The BeautifulSoup module
is excellent for this.
3. Write code to save your data in comma-separated value (CSV) format.
4. Import the CSV file into Excel or wherever you need it.
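
To give a rough idea of the shape, here is a minimal sketch of steps 1
to 3.  It is not a finished solution: the CSS selectors are made up,
because the markup of the target page isn't shown in this thread, so
you would need to look at the page source and adjust them.  The
function names and the output column order are also just illustrative
choices.  Requests and BeautifulSoup are third-party packages; they are
imported inside the functions that use them so the CSV part works on
its own.

```python
import csv

# Hypothetical selectors: the real page's markup is not shown here,
# so inspect the HTML and change these to match it.
ARTICLE_SELECTOR = "div.article"
FIELD_SELECTORS = {
    "title": "h2",
    "date": ".date",
    "source": ".source",
    "contents": ".contents",
}


def fetch_html(url):
    """Step 1: download the page and return its HTML as text."""
    import requests  # third-party: pip install requests

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error page
    return response.text


def parse_articles(html):
    """Step 2: return one dict per article, keyed by field name."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for article in soup.select(ARTICLE_SELECTOR):
        row = {}
        for field, selector in FIELD_SELECTORS.items():
            node = article.select_one(selector)
            # A missing element becomes an empty cell rather than an error.
            row[field] = node.get_text(" ", strip=True) if node else ""
        rows.append(row)
    return rows


def write_csv(rows, path):
    """Step 3: write a header row, then one article per row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(FIELD_SELECTORS))
        writer.writeheader()
        writer.writerows(rows)
```

With selectors that match the page, something like
write_csv(parse_articles(fetch_html(url)), "articles.csv") would cover
steps 1 to 3, and step 4 is just opening articles.csv in Excel.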

Now, go off and write the code.  When you get stuck, post the portion
of the code that is giving you problems, along with the full
traceback.  You can also get help on the python-tutor mailing list.



-- 
Joel Goldstick
http://joelgoldstick.com


