| Header | Value |
|---|---|
| Date | Wed, 21 Aug 2013 11:58:30 -0400 |
| Subject | Re: I wonder if I would be able to collect data from such page using Python |
| From | Joel Goldstick <joel.goldstick@gmail.com> |
| To | Comment Holder <commentholder@gmail.com> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.83.1377100719.19984.python-list@python.org> (permalink) |
On Wed, Aug 21, 2013 at 11:44 AM, Comment Holder <commentholder@gmail.com> wrote:
> Many thanks Joel,
>
> You are right to some extent. I come from a finance background, but I am very familiar with what could be referred to as non-native languages such as Matlab, VBA, etc. Actually, I have developed a couple of complete programs.
>
> I have asked this question because I am a little worried about the structure of this particular page, as there are no specific defined classes.
>
> I know how powerful Python is, but I wonder if it could do the job with this particular page.
>
> Again, many thanks Joel, I appreciate your guidance.
> All Best//

Your biggest hurdle will be getting proficient with Python. Give yourself a weekend with a good tutorial. You won't be very skilled, but you will get the gist of things. Also, google Beautiful Soup. You need the latest version; it's v4, I think. They have a GREAT tutorial. Spend a few hours with it and you will see how to get the data you want from your web pages.

Since you gave a sample web page, I am guessing that you need to log in to the site for 'real data'. For that, you will need to understand some things you may not know yet. At any rate, study the Requests module documentation. Python comes with urllib and urllib2, which cover the same ground, but Requests is a lot simpler to understand.

--
Joel Goldstick
http://joelgoldstick.com
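To make the suggestion above a bit more concrete, here is a minimal sketch of how Beautiful Soup 4 and Requests might fit together when a page has no class or id attributes to hook onto. The sample HTML, the column names, and the helper names `extract_rows` and `fetch_html` are all made up for illustration; the real page in this thread will have a different structure, so treat this as a starting point rather than a working scraper for it.

```python
from bs4 import BeautifulSoup  # Beautiful Soup 4: pip install beautifulsoup4

# Hypothetical stand-in for the real page: a plain table with no class or id attributes.
SAMPLE_HTML = """
<html><body>
  <table>
    <tr><th>Ticker</th><th>Price</th></tr>
    <tr><td>AAA</td><td>10.50</td></tr>
    <tr><td>BBB</td><td>7.25</td></tr>
  </table>
</body></html>
"""


def extract_rows(html):
    """Return each table row as a list of cell strings, located by tag structure alone."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Header and data cells are both collected; no class names are needed.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


def fetch_html(url):
    """Download a page with Requests and return its text; raises on HTTP errors."""
    import requests  # imported here so the parsing part works without Requests installed

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


if __name__ == "__main__":
    # Parse the built-in sample; swap in fetch_html("<your URL>") for a live page.
    for row in extract_rows(SAMPLE_HTML):
        print(row)
```

For a site that requires a login, `requests.Session` keeps cookies across requests, but the exact login steps depend on the site and are not shown here.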