From: Marko Rauhamaa
Newsgroups: comp.lang.python
Subject: Re: parsing multiple root element XML into text
Date: Fri, 09 May 2014 15:31:20 +0300

Alain Ketterlin:

> Marko Rauhamaa writes:
>> Sometimes the XML elements come through a pipe as an endless
>> sequence. You can still use the wrapping technique and a SAX parser.
>> However, the other option is to write a tiny XML scanner that
>> identifies the end of each element. Then, you can cut out the
>> complete XML element and hand it over to a DOM parser.
>
> Well maybe, even though I see no point in doing so. If the whole
> transaction is a single document and you need to get sub-elements on
> the fly, just use the SAX parser: there is no need to use a "tiny XML
> scanner" (whatever that is), and building a DOM for a part of the
> document in your SAX handler is easy if needed (for the OP's case a
> simple state machine would be enough, probably).

An example is . The "document" is potentially infinitely long. The
elements are messages.
The programmer would rather process the elements as DOM trees than
follow the meandering SAX parser.


Marko
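
For illustration, here is a rough sketch of what such a "tiny XML scanner" might look like. It is not taken from any real implementation; the names ElementSplitter and dom_messages are made up for this example. It assumes the stream contains only elements, attributes and character data, so comments, CDATA sections, processing instructions and DOCTYPE declarations are deliberately not handled. The scanner only tracks nesting depth and attribute quoting, cuts out each complete top-level element, and leaves the real parsing to xml.dom.minidom.

```python
import xml.dom.minidom


class ElementSplitter:
    """Cut complete top-level elements out of an endless character stream.

    Simplifying assumption: no comments, CDATA sections, processing
    instructions or DOCTYPE declarations appear in the stream.
    """

    def __init__(self):
        self.buf = []        # characters of the element being collected
        self.tag = []        # characters of the tag being scanned
        self.depth = 0       # number of currently open elements
        self.in_tag = False  # True between '<' and the matching '>'
        self.quote = None    # active attribute quote character, if any

    def feed(self, chunk):
        """Consume a chunk of text; yield each element it completes."""
        for c in chunk:
            if not self.in_tag:
                if c == "<":
                    self.in_tag = True
                    self.tag = [c]
                    self.buf.append(c)
                elif self.depth > 0:
                    self.buf.append(c)  # character data inside an element
                # anything between top-level elements is dropped
                continue
            self.buf.append(c)
            self.tag.append(c)
            if self.quote:
                if c == self.quote:
                    self.quote = None   # attribute value ends
            elif c in "\"'":
                self.quote = c          # attribute value starts
            elif c == ">":
                self.in_tag = False
                tag = "".join(self.tag)
                if tag.startswith("</"):
                    self.depth -= 1     # end tag
                elif not tag.endswith("/>"):
                    self.depth += 1     # start tag (not self-closing)
                if self.depth == 0:
                    yield "".join(self.buf)
                    self.buf = []


def dom_messages(chunks):
    """Yield the DOM root element of each message found in the chunks."""
    splitter = ElementSplitter()
    for chunk in chunks:
        for text in splitter.feed(chunk):
            yield xml.dom.minidom.parseString(text).documentElement


if __name__ == "__main__":
    # The chunk boundaries are arbitrary, as they would be on a pipe.
    stream = ['<msg id="1"><body>hi</bo', 'dy></msg>\n<msg id="2"/>']
    for msg in dom_messages(stream):
        print(msg.getAttribute("id"))
```

The chunks can come from anywhere (a pipe, a socket), and each message is handed over as an ordinary DOM tree as soon as its end tag arrives, without waiting for the "document" to end.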