


Best approach to get data from web page continuously

From Juan Christian <juan0christian@gmail.com>
Date Thu, 18 Sep 2014 10:30:46 -0300
Subject Best approach to get data from web page continuously
Newsgroups comp.lang.python
Message-ID <mailman.14102.1411047074.18130.python-list@python.org> (permalink)


I'm going to write a Python (3.4.1) script that continuously fetches new
data (topics) from this page:
http://steamcommunity.com/app/440/tradingforum

All topics follow this structure:

<a class="forum_topic_overlay" href="http://steamcommunity.com/app/440/tradingforum/TOPIC_ID/"> </a>
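For illustration, here is a minimal sketch of pulling the topic IDs out of markup shaped like the anchor above, using only the standard library's html.parser. It assumes TOPIC_ID is numeric, which the post does not state. The names TopicLinkParser and extract_topic_ids and the sample HTML are made up for this example; the real page may be structured differently.

```python
import re
from html.parser import HTMLParser

# Matches the topic URL shape quoted above.
# Assumption: TOPIC_ID consists of digits only.
TOPIC_RE = re.compile(r"/app/440/tradingforum/(\d+)/?$")


class TopicLinkParser(HTMLParser):
    """Collects topic IDs from <a class="forum_topic_overlay"> tags."""

    def __init__(self):
        super().__init__()
        self.topic_ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # An attribute without a value is reported as None, hence the "or".
        classes = (attr.get("class") or "").split()
        if "forum_topic_overlay" not in classes:
            return
        match = TOPIC_RE.search((attr.get("href") or "").strip())
        if match:
            self.topic_ids.append(match.group(1))


def extract_topic_ids(html):
    """Return the topic IDs found in an HTML string, in document order."""
    parser = TopicLinkParser()
    parser.feed(html)
    return parser.topic_ids


# Hypothetical sample markup, not copied from the real page.
sample = """
<a class="forum_topic_overlay" href="http://steamcommunity.com/app/440/tradingforum/12345/"> </a>
<a class="other" href="http://steamcommunity.com/app/440/tradingforum/99999/"> </a>
<a class="forum_topic_overlay" href="http://steamcommunity.com/app/440/tradingforum/67890/"> </a>
"""

print(extract_topic_ids(sample))  # ['12345', '67890']
```

Beautiful Soup 4 would express the same lookup more compactly. The standard-library parser is used here only to keep the sketch free of third-party dependencies.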

It will work like this: I'll get the latest topics and run some background
checks on user level, inventory value, account age, and other things. If
the user passes the checks, I'll print some info and links in the
terminal. The only thing I need to know to get started is: what's the best
way to get this data? Beautiful Soup 4 + requests? urllib? Something else?
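One possible shape for the "continuously" part is a plain polling loop that remembers which topic IDs it has already seen and reports only the new ones. This is a rough sketch under assumptions, not a recommendation of a particular library: poll_new_items, fetch_page, the 60-second default interval, and the fake data in the demo are all invented for illustration, and the background checks would go inside the handle callback.

```python
import time
import urllib.request


def fetch_page(url):
    """Download a page and return its body as text (assumes UTF-8)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def poll_new_items(fetch_ids, handle, interval=60.0, max_rounds=None,
                   sleep=time.sleep):
    """Call fetch_ids() repeatedly and pass each unseen ID to handle().

    fetch_ids  -- callable returning an iterable of topic IDs
    handle     -- callable invoked once per new ID (checks, printing, ...)
    interval   -- seconds to wait between rounds (arbitrary default)
    max_rounds -- stop after this many rounds; None means run forever
    """
    seen = set()
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        try:
            ids = list(fetch_ids())
        except OSError as exc:
            # urllib.error.URLError and socket timeouts derive from OSError;
            # skip this round and try again after the pause.
            print("fetch failed:", exc)
            ids = []
        for topic_id in ids:
            if topic_id not in seen:
                seen.add(topic_id)
                handle(topic_id)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            sleep(interval)
    return seen


# Demo with canned data instead of the network: three polling rounds.
batches = iter([["1", "2"], ["2", "3"], ["3"]])
reported = []
poll_new_items(lambda: next(batches), reported.append,
               interval=0, max_rounds=3)
print(reported)  # ['1', '2', '3']
```

In a real run, fetch_ids would wrap fetch_page plus whatever parser extracts the topic IDs, and the interval should be kept polite so the site is not hammered with requests.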

Thanks.
