


parsing RSS XML feed for item value

Started by: Larry Wilson <itdlw1@gmail.com>
First post: 2013-11-19 19:39 -0800
Last post: 2013-11-20 15:44 -0800
Articles: 7 (3 participants)



Contents

  parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-19 19:39 -0800
    Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-19 20:06 -0800
    Re: parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-20 05:44 -0800
      Fwd: parsing RSS XML feed for item value Neil Cerutti <mr.cerutti@gmail.com> - 2013-11-20 09:48 -0500
      Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-20 08:17 -0800
      Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-20 08:31 -0800
    Re: parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-20 15:44 -0800

#60050 — parsing RSS XML feed for item value

From: Larry Wilson <itdlw1@gmail.com>
Date: 2013-11-19 19:39 -0800
Subject: parsing RSS XML feed for item value
Message-ID: <bcd7481a-45c2-4dbb-a9e3-c5faa80ac899@googlegroups.com>

Wanting to parse out the temperature value in the "<w:current" element, just after the guid element, using ElementTree or xml.sax.

I'm still learning Python and getting to know the XML terminology, so I need to ask for help. Many thanks in advance.


This RSS feed is from "http://rss.weather.com.au/nsw/newcastle":
===================================================
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Weather.com.au RSS Feed must be used in accordance with the terms and conditions listed at http://www.weather.com.au/about/rss -->
<rss version="2.0" xmlns:w="http://rss.weather.com.au/w.dtd">
	<channel>
		<title>Weather.com.au - Newcastle Weather</title>
		<link>http://www.weather.com.au/nsw/newcastle</link>
		<description>Current conditions and forecast for Newcastle, New South Wales.</description>
		<language>en-au</language>
		<copyright>Copyright 2013 - Weather.com.au Pty Ltd</copyright>
		<pubDate>Tue, 19 Nov 2013 05:00:00 GMT</pubDate>
		<lastBuildDate>Tue, 19 Nov 2013 05:00:00 GMT</lastBuildDate>
		<ttl>15</ttl>
		<item>
			<title>Newcastle Current Conditions</title>
			<link>http://www.weather.com.au/nsw/newcastle/current</link>
			<description>
				<![CDATA[
					<b>Temperature:</b> 20.3&deg;C<br />
					<b>Dew Point:</b> 18.6&deg;C<br />
					<b>Relative Humidity:</b> 90%<br />
					<b>Wind Speed:</b> 22.2km/h<br />
					<b>Wind Gusts:</b> 29.6km/h<br />
					<b>Wind Direction:</b> SSW<br />
					<b>Pressure:</b> 0.0hPa<br />
					<b>Rain Since 9AM:</b> 0.6mm<br />
				]]>
			</description>
			<pubDate>Tue, 19 Nov 2013 05:00:00 GMT</pubDate>
			<guid isPermaLink="false">C1384837200</guid>
			<w:current temperature="20.3" dewPoint="18.6" humidity="90" windSpeed="22.2" windGusts="29.6" windDirection="SSW" pressure="0.0" rain="0.6" />
		</item>
		<item>
...etc
===================================================



#60052

From: xDog Walker <thudfoo@gmail.com>
Date: 2013-11-19 20:06 -0800
Message-ID: <mailman.2951.1384920408.18130.python-list@python.org>
In reply to: #60050

On Tuesday 2013 November 19 19:39, Larry Wilson wrote:
> Wanting to parse out the temperature value in the "<w:current"
> element, just after the guid element using ElementTree or xml.sax.

When you get tired of that, take a look at Universal Feedparser, a Python
Package:

http://code.google.com/p/feedparser/

http://packages.python.org/feedparser

-- 
Yonder nor sorghum stenches shut ladle gulls stopper torque wet 
strainers.



#60075

From: Larry Wilson <itdlw1@gmail.com>
Date: 2013-11-20 05:44 -0800
Message-ID: <e3614644-c3fe-46e6-b5f1-7f285f1e81b1@googlegroups.com>
In reply to: #60050

>>> feed.entries[0].w_current
{'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2', 'winddirection': u'SSW'}
>>>

With feedparser I get the sub-item shown above. How do I extract the label/value pairs from it?



#60082

From: Neil Cerutti <mr.cerutti@gmail.com>
Date: 2013-11-20 09:48 -0500
Message-ID: <mailman.2963.1384958939.18130.python-list@python.org>
In reply to: #60075

Larry Wilson <itdlw1@gmail.com> wrote:
>
> Wanting to parse out the temperature value in the
> "<w:current" element, just after the guid element using
> ElementTree or xml.sax.

Since you aren't building up a complex data structure, xml.sax
will be an OK choice.

Here's a quick and dirty job:

import io
import xml.sax as sax

the_xml = io.StringIO("""SNIPPED XML""")

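class WeatherHandler(sax.handler.ContentHandler):
    def startDocument(self):
        # Collects the attributes of every <w:current> element seen.
        self.temperatures = []

    def startElement(self, name, attrs):
        # Matches the raw prefixed name; namespace processing is off by default.
        if name == 'w:current': # Nice namespace handling, eh?
            self.temperatures.append(attrs)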


handler = WeatherHandler()
sax.parse(the_xml, handler)
for temp in handler.temperatures:
    for key, val in temp.items():
        print("{}: {}".format(key, val))

Output (from your example):

windGusts: 29.6
dewPoint: 18.6
pressure: 0.0
windDirection: SSW
humidity: 90
rain: 0.6
temperature: 20.3
windSpeed: 22.2

For most jobs you would want to keep track of your nesting level, but
that's left out here. I didn't try to capture location or info you
might want but didn't specify, either; left that as an exercise.
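
Since the original question also mentioned ElementTree, here is a minimal sketch of the same lookup done with real namespace handling. It assumes a trimmed, single-item copy of the feed from the first post; the names FEED, NAMESPACES, and current_conditions are made up for illustration and are not part of any library.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed from the first post (assumption: one item only,
# and the encoding declaration is dropped because this is already a str).
FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:w="http://rss.weather.com.au/w.dtd">
  <channel>
    <title>Weather.com.au - Newcastle Weather</title>
    <item>
      <title>Newcastle Current Conditions</title>
      <guid isPermaLink="false">C1384837200</guid>
      <w:current temperature="20.3" dewPoint="18.6" humidity="90" windSpeed="22.2" windGusts="29.6" windDirection="SSW" pressure="0.0" rain="0.6" />
    </item>
  </channel>
</rss>
"""

# Map the "w" prefix to the namespace URI declared on the <rss> element.
NAMESPACES = {"w": "http://rss.weather.com.au/w.dtd"}


def current_conditions(xml_text):
    """Return the attributes of the first <w:current> element, or None."""
    root = ET.fromstring(xml_text)  # root is the <rss> element
    element = root.find("channel/item/w:current", NAMESPACES)
    return dict(element.attrib) if element is not None else None


conditions = current_conditions(FEED)
print(conditions["temperature"])
```

The prefix in the search path is resolved through the mapping, so the lookup keeps working even if the feed later declares the same namespace under a different prefix.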



#60105

From: xDog Walker <thudfoo@gmail.com>
Date: 2013-11-20 08:17 -0800
Message-ID: <mailman.2972.1384964274.18130.python-list@python.org>
In reply to: #60075

On Wednesday 2013 November 20 05:44, Larry Wilson wrote:
> {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain':
> u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2',
> 'winddirection': u'SSW'}
Python 2.7.2 (default, Oct 10 2011, 10:47:36)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> w_current = {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2', 'winddirection': u'SSW'}
>>> for label, value in w_current.iteritems():
...     print label, value
...
pressure 0.0
windspeed 22.2
temperature 20.3
dewpoint 18.6
windgusts 29.6
winddirection SSW
rain 0.6
humidity 90
>>>

-- 
Yonder nor sorghum stenches shut ladle gulls stopper torque wet 
strainers.



#60109

From: xDog Walker <thudfoo@gmail.com>
Date: 2013-11-20 08:31 -0800
Message-ID: <mailman.2976.1384965090.18130.python-list@python.org>
In reply to: #60075

On Wednesday 2013 November 20 05:44, Larry Wilson wrote:
> >>> feed.entries[0].w_current
>
> {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain':
> u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2',
> 'winddirection': u'SSW'}
>
>
> In the above I get the sub-item as shown. How do I extract the label/value
> pairs?

Python 3.3.0 (default, Sep 30 2012, 09:02:56)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> w_current = {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2', 'winddirection': u'SSW'}
>>> for label, value in w_current.items():
...     print(label, value)
...
dewpoint 18.6
temperature 20.3
rain 0.6
pressure 0.0
windspeed 22.2
humidity 90
winddirection SSW
windgusts 29.6
>>>
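
The values in that dict are all strings, so a possible next step is converting the numeric ones before using them. This is only a sketch under the assumption that anything float() accepts should become a number; to_number and readings are made-up names for illustration.

```python
# Sample dict copied from the session above; every value arrives as a string.
w_current = {'temperature': '20.3', 'dewpoint': '18.6', 'windgusts': '29.6',
             'rain': '0.6', 'humidity': '90', 'pressure': '0.0',
             'windspeed': '22.2', 'winddirection': 'SSW'}


def to_number(text):
    """Return float(text) when text is numeric, otherwise the text unchanged."""
    try:
        return float(text)
    except ValueError:
        return text


# Build a new dict so the original strings stay available.
readings = {label: to_number(value) for label, value in w_current.items()}

# Sorting the labels gives a stable order regardless of dict ordering.
for label in sorted(readings):
    print(label, readings[label])
```

After this, readings['temperature'] is a float that can be compared or averaged, while readings['winddirection'] stays the string 'SSW'.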
-- 
Yonder nor sorghum stenches shut ladle gulls stopper torque wet 
strainers.



#60144

From: Larry Wilson <itdlw1@gmail.com>
Date: 2013-11-20 15:44 -0800
Message-ID: <603b5211-3254-4008-8ea4-0e628fd85075@googlegroups.com>
In reply to: #60050

Thank you folks, now I know what I don't know and have a solution. 


