From: Ian Kelly
Newsgroups: comp.lang.python
Subject: Re: help with for loop----python 2.7.2
Date: Sun, 23 Mar 2014 11:49:11 -0600


On Mar 23, 2014 11:31 AM, "tad na" <teddybubu@gmail.com> wrote:
> OK . second problem :)
> I can print the date. not sure how to do this one..

Why not? What happens when you try?

> try:
>     from urllib2 import urlopen
> except ImportError:
>     from urllib.request import urlopen
> import urllib2
> from bs4 import BeautifulSoup
>
> soup = BeautifulSoup(urlopen('http://bl.ocks.org/mbostock.rss'))
> #print soup.find_all('item')
> #print (soup)
> data = soup.find_all("item")
>
> x=0
> for item in soup.find_all('item'):
>     title = item.find('title').text
>     link = item.find('link').text
>     date = item.find('pubDate')
>    # print date
>     print('+++++++++++++++++')
>     print data[x].title.text
>     print data[x].link.text
>     print data[x].guid.text
>     print data[x].pubDate
>     x = x + 1

data[x] should be the same object as item, no? If you want to keep track of the current iteration index, a cleaner way to do that is by using enumerate:

    for x, item in enumerate(soup.find_all('item')):
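
To make the comparison concrete, here is a minimal sketch that runs without the feed. The items list is a made-up stand-in for whatever soup.find_all('item') returns; only enumerate itself comes from the suggestion above.

```python
# Hypothetical stand-ins for the tags returned by soup.find_all('item').
items = ['first item', 'second item', 'third item']

# Manual counter, as in the quoted code.
manual = []
x = 0
for item in items:
    manual.append((x, item))
    x = x + 1

# enumerate produces the same (index, item) pairs without the bookkeeping.
paired = list(enumerate(items))

for x, item in paired:
    # items[x] and item are the same object, so indexing back in is redundant.
    print('%d %s %s' % (x, item, items[x] is item))
```

Both loops visit the same pairs; the enumerate version just has no counter to initialise or forget to increment.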

As far as printing the pubDate goes, why not start by getting its text property as you do with the other tags? From there you can either print the string out directly or parse it into a datetime object.
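
A rough sketch of both options, under some loudly stated assumptions: the feed below is invented, and the standard library's xml.etree.ElementTree stands in for BeautifulSoup so that it runs with no network access and no third-party package. The helper parse_pub_date is a name made up for this example. It assumes the date text is in the usual RSS (RFC 822 style) layout, which may not hold for every feed.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import parsedate_tz

# Invented two-item feed standing in for the downloaded RSS.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>First</title><pubDate>Sun, 23 Mar 2014 17:31:00 +0000</pubDate></item>
<item><title>Second</title><pubDate>Sat, 22 Mar 2014 09:05:30 +0000</pubDate></item>
</channel></rss>"""


def parse_pub_date(text):
    """Hypothetical helper: turn RFC 822 style date text into a naive datetime."""
    fields = parsedate_tz(text)
    if fields is None:
        raise ValueError('unrecognised date text: %r' % (text,))
    # Keep year..second; the UTC offset in fields[9] is ignored here.
    return datetime(*fields[:6])


root = ET.fromstring(SAMPLE_RSS)
dates = []
for x, item in enumerate(root.iter('item')):
    pub_text = item.find('pubDate').text  # the string, not the element
    parsed = parse_pub_date(pub_text)
    dates.append(parsed)
    print('%d %s -> %s' % (x, pub_text, parsed.isoformat()))
```

Printing pub_text is the "print the string out directly" route; parse_pub_date is one way to do the datetime route.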
