
Re: I wonder if I would be able to collect data from such page using Python

Date Wed, 21 Aug 2013 13:52:18 -0400
Subject Re: I wonder if I would be able to collect data from such page using Python
From Joel Goldstick <joel.goldstick@gmail.com>
To Comment Holder <commentholder@gmail.com>



On Wed, Aug 21, 2013 at 1:41 PM, Comment Holder <commentholder@gmail.com> wrote:
> Dear Joel,
>
> Many thanks for your help - I think I shall start with this way and see how it goes. My concerns were if the task can be accomplished with Python, and from your posts, I guess it can - so I shall give it a try :).
>
> Again, thanks a lot & all best//


You're welcome.  One more thought: since the site seems to belong to
the Wall Street Journal, you may want to check whether they offer an
API for searching and retrieving articles.  If they do, using it would
be simpler and probably safer than parsing web pages.  Websites change
their layout from time to time, which would probably break your
program, whereas APIs tend to be more stable.
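
To illustrate the difference, here is a minimal sketch using only the
standard library.  Everything in it is made up for the example: the
sample HTML, the sample JSON, the "headline" class name and the
"articles"/"title" keys are assumptions, not the real markup or API of
any site.  It only shows why a layout change breaks a page parser
while a structured response keeps working.

```python
import json
from html.parser import HTMLParser

# Hypothetical samples; a real page or API response would look different.
SAMPLE_HTML = '<div class="headline"><a href="/a/1">Markets rally</a></div>'
REDESIGNED_HTML = '<h2 class="title"><a href="/a/1">Markets rally</a></h2>'
SAMPLE_JSON = '{"articles": [{"title": "Markets rally", "url": "/a/1"}]}'


class HeadlineParser(HTMLParser):
    """Collect link text found inside <div class="headline"> elements."""

    def __init__(self):
        super().__init__()
        self._in_headline = False
        self._in_link = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if tag == "div" and ("class", "headline") in attrs:
            self._in_headline = True
        elif tag == "a" and self._in_headline:
            self._in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_link = False
        elif tag == "div":
            self._in_headline = False

    def handle_data(self, data):
        if self._in_link:
            self.titles.append(data.strip())


def titles_from_html(html_text):
    """Scraping approach: depends on the exact page layout."""
    parser = HeadlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.titles


def titles_from_json(json_text):
    """API approach: depends only on the documented field names."""
    return [article["title"] for article in json.loads(json_text)["articles"]]


print(titles_from_html(SAMPLE_HTML))      # layout the parser was written for
print(titles_from_html(REDESIGNED_HTML))  # same content after a redesign
print(titles_from_json(SAMPLE_JSON))      # structured response
```

The parser returns nothing for the redesigned page even though the
headline is still there, which is the kind of silent breakage to
expect from scraping.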

Good luck to you.
-- 
Joel Goldstick
http://joelgoldstick.com


