From: boris
Newsgroups: comp.lang.java.programmer
Subject: Re: large xml file...
Date: Wed, 24 Aug 2011 14:40:11 -0400

On 08/22/2011 09:59 PM, Arne Vajhøj wrote:
> On 8/22/2011 8:05 PM, boris wrote:
>> I need to process large xml file and dump some documents to a different
>> file based on content of some elements.
>>
>> let's say I need to check content of and dump the whole to
>> a different file:
>>
>> ... etc
>>
>> I'm trying to do this using sax. Are there any examples how to do this?
>> Is using sax ok for this task?
>
> SAX or StAX seems as the most obvious choices given the context.
>
> Any textbook SAX example should lead you to working code.
>
> I can post some code, but I doubt that it will show anything
> various books and tutorials does not.
>
> Arne
>

I tried to accumulate the whole xml(...) as a string using SAX, but in
that case all special characters are processed by the parser and come
back as plain characters rather than "predefined entities" like &quot;.

Using StAX, I get correct XML if I print the events right away, but if
I store them in a collection and print them later, I don't get the same
result.
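
For reference, here is a minimal sketch of the "store the events, print
them later" approach. It is written under assumptions, not taken from my
real code: the element names "doc" and "type" and the class name
BufferedStaxFilter are made up for illustration, and it uses the
event-based XMLEventReader / XMLEventWriter API. Each XMLEvent is its own
object, so it can sit in a list and be replayed later. That would not be
true of a cursor-style XMLStreamReader, where a stored reference always
reflects the current position (possibly what went wrong above, but that
is only a guess). The writer re-escapes & and < on output, so the result
is well-formed XML, though not necessarily byte-identical to the input
(a &quot; inside text may come out as a plain quote, which is legal).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

final class BufferedStaxFilter {

    // Hypothetical element names; the real ones are not given in the thread.
    static final String RECORD = "doc";   // the element to dump as a whole
    static final String KEY = "type";     // the child whose text decides

    /** Returns the RECORD elements whose KEY text equals 'wanted', as XML. */
    static String extract(String xml, String wanted) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer =
                XMLOutputFactory.newInstance().createXMLEventWriter(out);

        List<XMLEvent> buffer = null;              // non-null while inside a RECORD
        StringBuilder keyText = new StringBuilder();
        boolean inKey = false;
        boolean matched = false;

        while (reader.hasNext()) {
            XMLEvent ev = reader.nextEvent();

            if (ev.isStartElement()) {
                String name = ev.asStartElement().getName().getLocalPart();
                if (name.equals(RECORD)) {
                    buffer = new ArrayList<>();    // start collecting a new record
                    matched = false;
                } else if (buffer != null && name.equals(KEY)) {
                    inKey = true;
                    keyText.setLength(0);
                }
            }

            if (buffer != null) {
                buffer.add(ev);                    // keep the event object itself
            }
            if (inKey && ev.isCharacters()) {
                // Text may arrive in several chunks, so accumulate it.
                keyText.append(ev.asCharacters().getData());
            }

            if (ev.isEndElement()) {
                String name = ev.asEndElement().getName().getLocalPart();
                if (inKey && name.equals(KEY)) {
                    inKey = false;
                    matched |= keyText.toString().trim().equals(wanted);
                } else if (buffer != null && name.equals(RECORD)) {
                    if (matched) {
                        for (XMLEvent e : buffer) {
                            writer.add(e);         // replay; the writer escapes text
                        }
                    }
                    buffer = null;                 // drop the record either way
                }
            }
        }
        writer.flush();
        writer.close();
        reader.close();
        return out.toString();
    }
}
```

For a really large file the StringReader / StringWriter would be replaced
by file streams; only one record is held in memory at a time, so the
buffer stays small as long as the individual records are.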