| Started by | Larry Wilson <itdlw1@gmail.com> |
|---|---|
| First post | 2013-11-19 19:39 -0800 |
| Last post | 2013-11-20 15:44 -0800 |
| Articles | 7 — 3 participants |
parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-19 19:39 -0800
Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-19 20:06 -0800
Re: parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-20 05:44 -0800
Fwd: parsing RSS XML feed for item value Neil Cerutti <mr.cerutti@gmail.com> - 2013-11-20 09:48 -0500
Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-20 08:17 -0800
Re: parsing RSS XML feed for item value xDog Walker <thudfoo@gmail.com> - 2013-11-20 08:31 -0800
Re: parsing RSS XML feed for item value Larry Wilson <itdlw1@gmail.com> - 2013-11-20 15:44 -0800
| From | Larry Wilson <itdlw1@gmail.com> |
|---|---|
| Date | 2013-11-19 19:39 -0800 |
| Subject | parsing RSS XML feed for item value |
| Message-ID | <bcd7481a-45c2-4dbb-a9e3-c5faa80ac899@googlegroups.com> |
Wanting to parse out the temperature value in the "<w:current" element, just after the guid element, using ElementTree or xml.sax. Still learning Python and getting to know the XML terminology, so I need to ask for help. Many thanks in advance.

This RSS is from "http://rss.weather.com.au/nsw/newcastle"

===================================================
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- Weather.com.au RSS Feed must be used in accordance with the terms and conditions listed at http://www.weather.com.au/about/rss -->
<rss version="2.0" xmlns:w="http://rss.weather.com.au/w.dtd">
  <channel>
    <title>Weather.com.au - Newcastle Weather</title>
    <link>http://www.weather.com.au/nsw/newcastle</link>
    <description>Current conditions and forecast for Newcastle, New South Wales.</description>
    <language>en-au</language>
    <copyright>Copyright 2013 - Weather.com.au Pty Ltd</copyright>
    <pubDate>Tue, 19 Nov 2013 05:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 19 Nov 2013 05:00:00 GMT</lastBuildDate>
    <ttl>15</ttl>
    <item>
      <title>Newcastle Current Conditions</title>
      <link>http://www.weather.com.au/nsw/newcastle/current</link>
      <description>
        <![CDATA[
          <b>Temperature:</b> 20.3°C<br />
          <b>Dew Point:</b> 18.6°C<br />
          <b>Relative Humidity:</b> 90%<br />
          <b>Wind Speed:</b> 22.2km/h<br />
          <b>Wind Gusts:</b> 29.6km/h<br />
          <b>Wind Direction:</b> SSW<br />
          <b>Pressure:</b> 0.0hPa<br />
          <b>Rain Since 9AM:</b> 0.6mm<br />
        ]]>
      </description>
      <pubDate>Tue, 19 Nov 2013 05:00:00 GMT</pubDate>
      <guid isPermaLink="false">C1384837200</guid>
      <w:current temperature="20.3" dewPoint="18.6" humidity="90" windSpeed="22.2" windGusts="29.6" windDirection="SSW" pressure="0.0" rain="0.6" />
    </item>
    <item>
    ...etc
===================================================
| From | xDog Walker <thudfoo@gmail.com> |
|---|---|
| Date | 2013-11-19 20:06 -0800 |
| Message-ID | <mailman.2951.1384920408.18130.python-list@python.org> |
| In reply to | #60050 |
On Tuesday 2013 November 19 19:39, Larry Wilson wrote:
> Wanting to parse out the temperature value in the "<w:current"
> element, just after the guid element using ElementTree or xml.sax.

When you get tired of that, take a look at Universal Feedparser, a Python
Package:

    http://code.google.com/p/feedparser/
    http://packages.python.org/feedparser

--
Yonder nor sorghum stenches shut ladle gulls stopper torque wet strainers.
| From | Larry Wilson <itdlw1@gmail.com> |
|---|---|
| Date | 2013-11-20 05:44 -0800 |
| Message-ID | <e3614644-c3fe-46e6-b5f1-7f285f1e81b1@googlegroups.com> |
| In reply to | #60050 |
>>> feed.entries[0].w_current
{'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2', 'winddirection': u'SSW'}
>>>
In the above I get the subitem as shown. How do I extract the label/value pairs?
| From | Neil Cerutti <mr.cerutti@gmail.com> |
|---|---|
| Date | 2013-11-20 09:48 -0500 |
| Message-ID | <mailman.2963.1384958939.18130.python-list@python.org> |
| In reply to | #60075 |
Larry Wilson <itdlw1@gmail.com> wrote:
>
> Wanting to parse out the temperature value in the
> "<w:current" element, just after the guid element using
> ElementTree or xml.sax.
Since you aren't building up a complex data structure, xml.sax
will be an OK choice.
Here's a quick and dirty job:
import io
import xml.sax as sax

the_xml = io.StringIO("""SNIPPED XML""")

class WeatherHandler(sax.handler.ContentHandler):
    def startDocument(self):
        self.temperatures = []

    def startElement(self, name, attrs):
        if name == 'w:current': # Nice namespace handling, eh?
            self.temperatures.append(attrs)

handler = WeatherHandler()
sax.parse(the_xml, handler)

for temp in handler.temperatures:
    for key, val in temp.items():
        print("{}: {}".format(key, val))
Output (from your example):
windGusts: 29.6
dewPoint: 18.6
pressure: 0.0
windDirection: SSW
humidity: 90
rain: 0.6
temperature: 20.3
windSpeed: 22.2
For most jobs you would want to keep track of your nesting level, but
that's left out here. I didn't try to capture location or info you
might want but didn't specify, either; left that as an exercise.
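
Since the original question also mentioned ElementTree, here is a minimal sketch of that route, which resolves the "w" prefix through its namespace URI instead of matching the literal string 'w:current'. SAMPLE_RSS is a trimmed copy of the feed from the first post, and the names NAMESPACES and current_temperatures are made up for this illustration; treat it as one possible approach rather than the definitive one.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the feed from the first post (one item only).
SAMPLE_RSS = """<rss version="2.0" xmlns:w="http://rss.weather.com.au/w.dtd">
  <channel>
    <title>Weather.com.au - Newcastle Weather</title>
    <item>
      <title>Newcastle Current Conditions</title>
      <guid isPermaLink="false">C1384837200</guid>
      <w:current temperature="20.3" dewPoint="18.6" humidity="90"
                 windSpeed="22.2" windGusts="29.6" windDirection="SSW"
                 pressure="0.0" rain="0.6" />
    </item>
  </channel>
</rss>"""

# Map the "w" prefix to the namespace URI declared on the <rss> element.
NAMESPACES = {"w": "http://rss.weather.com.au/w.dtd"}


def current_temperatures(rss_text):
    """Return the temperature attribute of every <w:current> element as a float."""
    root = ET.fromstring(rss_text)
    return [
        float(element.get("temperature"))
        for element in root.iterfind(".//w:current", NAMESPACES)
    ]


print(current_temperatures(SAMPLE_RSS))
```

The other attributes (dewPoint, humidity, and so on) are available the same way through element.get() or the element.attrib dictionary.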
| From | xDog Walker <thudfoo@gmail.com> |
|---|---|
| Date | 2013-11-20 08:17 -0800 |
| Message-ID | <mailman.2972.1384964274.18130.python-list@python.org> |
| In reply to | #60075 |
On Wednesday 2013 November 20 05:44, Larry Wilson wrote:
> {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain':
> u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2',
> 'winddirection': u'SSW'}
Python 2.7.2 (default, Oct 10 2011, 10:47:36)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> w_current = {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts':
u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed':
u'22.2', 'winddirection': u'SSW'}
>>> for label, value in w_current.iteritems():
...     print label, value
...
pressure 0.0
windspeed 22.2
temperature 20.3
dewpoint 18.6
windgusts 29.6
winddirection SSW
rain 0.6
humidity 90
>>>
--
Yonder nor sorghum stenches shut ladle gulls stopper torque wet
strainers.
| From | xDog Walker <thudfoo@gmail.com> |
|---|---|
| Date | 2013-11-20 08:31 -0800 |
| Message-ID | <mailman.2976.1384965090.18130.python-list@python.org> |
| In reply to | #60075 |
On Wednesday 2013 November 20 05:44, Larry Wilson wrote:
> >>> feed.entries[0].w_current
>
> {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts': u'29.6', 'rain':
> u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed': u'22.2',
> 'winddirection': u'SSW'}
>
>
> in the above I get the subitem as shown. How do I extract the label, values
> pairs?
Python 3.3.0 (default, Sep 30 2012, 09:02:56)
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> w_current = {'temperature': u'20.3', 'dewpoint': u'18.6', 'windgusts':
u'29.6', 'rain': u'0.6', 'humidity': u'90', 'pressure': u'0.0', 'windspeed':
u'22.2', 'winddirection': u'SSW'}
>>> for label, value in w_current.items():
...     print(label, value)
...
dewpoint 18.6
temperature 20.3
rain 0.6
pressure 0.0
windspeed 22.2
humidity 90
winddirection SSW
windgusts 29.6
>>>
--
Yonder nor sorghum stenches shut ladle gulls stopper torque wet
strainers.
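
The two sessions above print the pairs in different orders because those Python versions do not keep dictionary keys in any particular order, and every value is still a string. A small follow-up sketch, assuming Python 3 and the dictionary exactly as printed in the thread: it sorts the labels for a stable listing and converts numeric values to floats. The names to_number and readings are hypothetical helpers for this example, not part of feedparser.

```python
# Dictionary as shown in the post above (feedparser lower-cases the attribute names).
w_current = {
    'temperature': '20.3', 'dewpoint': '18.6', 'windgusts': '29.6',
    'rain': '0.6', 'humidity': '90', 'pressure': '0.0',
    'windspeed': '22.2', 'winddirection': 'SSW',
}


def to_number(text):
    """Convert text to a float when possible, otherwise return it unchanged."""
    try:
        return float(text)
    except ValueError:
        return text


# Build a new dictionary with numeric values where the text parses as a number.
readings = {label: to_number(value) for label, value in w_current.items()}

# Iterate over the labels in alphabetical order so the listing is reproducible.
for label in sorted(readings):
    print(label, readings[label])

# A single value can be looked up directly by its label.
print(readings['temperature'])
```

Values such as 'SSW' that are not numbers are left as strings, so the wind direction survives the conversion.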
| From | Larry Wilson <itdlw1@gmail.com> |
|---|---|
| Date | 2013-11-20 15:44 -0800 |
| Message-ID | <603b5211-3254-4008-8ea4-0e628fd85075@googlegroups.com> |
| In reply to | #60050 |
Thank you folks, now I know what I don't know and have a solution.