From: Stefan Behnel
To: python-list@python.org
Subject: Re: xml data or other?
Date: Tue, 20 Nov 2012 06:48:20 +0100
Newsgroups: comp.lang.python

Prasad, Ramit, 19.11.2012 22:42:
> Artie Ziff wrote:
>> Writing XML files so to see whats happening. My plan is to
>> keep xml data in memory and parse with xml.etree.ElementTree.
>>
>> Unfortunately, xml parsing fails due to angle brackets inside
>> description tags. In particular, xml.etree.ElementTree.parse()
>> aborts on '<' inside xml data such as the following:
>>
>>
>> This testcase tests if crontab installs the cronjob
>> and cron schedules the job correctly.
>> <\description>
>>
>> ##
>>
>> What is right way to handle the extra angle brackets?
>> Substitute on line-by-line basis, if that works?
>> Or learn to write a simple stack-style parser, or
>> recursive descent, it may be called?
>
> I think your description text should be in a CDATA section.
> http://en.wikipedia.org/wiki/CDATA#CDATA_sections_in_XML

Ah, don't bother with CDATA. Just make sure the data gets properly escaped,
any XML serialiser will do that for you.
Just generate the XML using ElementTree and you'll be fine. Generating XML
as literal text is not a good idea.

Stefan
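
A minimal sketch of the advice above, assuming a made-up document layout: the
element names "testcase" and "description" and the "<username>" placeholder
in the text are hypothetical, since the original markup did not survive in
the quoted post. It builds the tree with xml.etree.ElementTree, lets the
serialiser escape the angle brackets, and parses the result back.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; the real ones are not visible in the post.
testcase = ET.Element("testcase")
description = ET.SubElement(testcase, "description")

# Assumed example text containing literal angle brackets.
description.text = (
    "This testcase tests if crontab <username> installs the cronjob "
    "and cron schedules the job correctly."
)

# The serialiser escapes '<', '>' and '&' in text content for us.
xml_text = ET.tostring(testcase, encoding="unicode")
print(xml_text)

# Parsing the serialised document returns the original, unescaped text.
parsed = ET.fromstring(xml_text)
assert parsed.find("description").text == description.text
```

Because the text is set through the tree rather than pasted into a string
template, no manual substitution or CDATA section should be needed.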