From: boris
Newsgroups: comp.lang.java.programmer
Subject: Re: large xml file...
Date: Mon, 22 Aug 2011 20:53:36 -0400
Organization: Aioe.org NNTP Server

On 08/22/2011 08:43 PM, Ian Shef wrote:
> boris wrote in news:j2uqp4$n8h$1@speranza.aioe.org:
>
>> hi all,
>> I need to process a large xml file and dump some documents to a different
>> file based on the content of some elements.
>>
>> let's say I need to check content of and dump the whole to
>> a different file:
>>
>> ... etc
>>
>> I'm trying to do this using sax. Are there any examples of how to do this?
>> Is using sax ok for this task?
>> thanks.
>
> What you are asking is unclear to me.
> Do you mean that will determine whether you dump the whole to
> another file?
> Do you mean that will determine what file the whole will be
> dumped to?
> Or do you mean that the whole will be dumped to some other file, and
> while you are at it, will also be checked and reported in some way?
>
> Can you read the "large xml file" twice?
> Can you put the whole "large xml file" (or at least the part preceding
> ) into memory?
> Can you copy the "large xml file" to another file while it is being
> processed?
>
> Sorry about the questions, but I need clarification. I have used SAX and
> may be able to provide enlightenment. SAX has its uses, but is not so good
> when 'memory' is involved unless _you_ provide the memory.
> SAX appears to excel when processing can take place in a single pass with
> very little looking backwards. Consequently, it does not use as much memory
> as some other methods.

> Do you mean that will determine whether you dump the whole to
> another file?

yes

> Can you read the "large xml file" twice?

I would like to read it once.

> Can you put the whole "large xml file" (or at least the part preceding
> ) into memory?

no.
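
For what it's worth, a single-pass SAX filter along the following lines seems to fit those three answers (decide per document, read the file once, never hold the whole file in memory). It is only a sketch under stated assumptions: the element names `doc` and `id` are placeholders, since the real names are not shown in the thread, and the class `RecordFilter`, its `filter` method, and the exact-match test on the key text are made up for illustration. It also assumes that records do not nest and that a single record fits comfortably in memory.

```java
import java.io.BufferedInputStream;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Copies each record whose key element matches; buffers one record at a time. */
class RecordFilter extends DefaultHandler {
    // Hypothetical element names: the real ones are not given in the thread.
    private static final String RECORD = "doc";
    private static final String KEY = "id";

    private final String wanted;
    private final Writer out;
    private final StringBuilder record = new StringBuilder();  // current record, re-serialized
    private final StringBuilder keyText = new StringBuilder(); // text of the current key element
    private boolean inRecord, inKey, matched;

    RecordFilter(String wanted, Writer out) {
        this.wanted = wanted;
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (qName.equals(RECORD)) {          // a new record starts: reset the buffer
            inRecord = true;
            matched = false;
            record.setLength(0);
        }
        if (!inRecord) return;               // ignore everything outside a record
        if (qName.equals(KEY)) {
            inKey = true;
            keyText.setLength(0);
        }
        record.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            record.append(' ').append(atts.getQName(i))
                  .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        record.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!inRecord) return;
        String text = new String(ch, start, length);
        if (inKey) keyText.append(text);     // may arrive in several chunks
        record.append(escape(text));
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (!inRecord) return;
        record.append("</").append(qName).append('>');
        if (qName.equals(KEY)) {
            inKey = false;
            if (keyText.toString().trim().equals(wanted)) matched = true;
        }
        if (qName.equals(RECORD)) {          // record complete: keep it or drop it
            inRecord = false;
            if (matched) {
                try {
                    out.write(record.toString());
                    out.write('\n');
                } catch (IOException e) {
                    throw new SAXException(e);
                }
            }
        }
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Streams the input once and writes the matching records to out. */
    static void filter(InputStream in, Writer out, String wanted) throws Exception {
        SAXParserFactory.newInstance().newSAXParser().parse(in, new RecordFilter(wanted, out));
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]));
             Writer out = new BufferedWriter(new OutputStreamWriter(
                     new FileOutputStream(args[1]), StandardCharsets.UTF_8))) {
            filter(in, out, args[2]);
        }
    }
}
```

It would be run as something like `java RecordFilter big.xml matches.xml 42`. This is the "you provide the memory" part Ian mentions: SAX itself keeps nothing, so the handler holds exactly one record until its end tag arrives and the decision can be made. Note that the record is rebuilt from SAX events rather than copied byte for byte, so comments, processing instructions and CDATA markers are not preserved, and the output is a sequence of records without a wrapping root element.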