Re: large xml file...

Newsgroups comp.lang.java.programmer
Subject Re: large xml file...
From Ian Shef <invalid@avoiding.spam>
References <j2uqp4$n8h$1@speranza.aioe.org>
Message-ID <Xns9F49B4434CADvaj4088ianshef@138.125.254.103> (permalink)
Date 2011-08-23 00:43 +0000
Organization Raytheon Company


boris <boris@localhost.domain> wrote in news:j2uqp4$n8h$1@speranza.aioe.org:

> hi all,
> I need to process large xml file and dump some documents to a different 
> file based on content of some elements.
> 
> let's say I need to check content of <text3> and dump the whole <doc> to 
> a different file:
> 
> <doc>
>      <text1>
>      <text2>
>      <text3>  ... etc
> 
> </doc>
> 
> I'm trying to do this using sax. Are there any examples how to do this?
> Is using sax ok for this task?
> thanks.
> 
>      
> 

What you are asking is unclear to me.  
Do you mean that <text3> will determine whether you dump the whole <doc> to 
another file?
Do you mean that <text3> will determine what file the whole <doc> will be 
dumped to?
Or do you mean that the whole <doc> will be dumped to some other file, and 
while you are at it, <text3> will also be checked and reported in some way?

Can you read the "large xml file" twice?
Can you put the whole "large xml file" (or at least the part preceding 
<text3>) into memory?
Can you copy the "large xml file" to another file while it is being 
processed?

Sorry about the questions, but I need clarification.  I have used SAX and 
may be able to provide enlightenment.  SAX has its uses, but is not so good 
when 'memory' is involved unless _you_ provide the memory: the parser hands 
you each event once and keeps nothing, so anything you need later must be 
saved by your own handler.  SAX appears to excel when processing can take 
place in a single pass with very little looking backwards.  Consequently, it 
does not use as much memory as some other methods, such as building a whole 
DOM tree.
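
To illustrate what "providing the memory yourself" might look like, here is a 
rough sketch for just one reading of the question: <text3> decides whether the 
whole <doc> gets dumped.  The class name DocFilter, the filter() helper, and 
the "contains a given string" rule are all made up for illustration, and the 
sketch assumes <doc> elements are not nested and that no namespaces are in 
use.  Each <doc> is re-serialized into a buffer as its events arrive, and the 
buffer is kept or thrown away once </doc> is seen.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: keeps every <doc> whose <text3> contains a given string.
class DocFilter extends DefaultHandler {
    private final String needle;                            // assumed selection rule
    private final List<String> kept = new ArrayList<>();
    private final StringBuilder doc = new StringBuilder();  // the "memory" we provide
    private final StringBuilder text3 = new StringBuilder();
    private boolean inDoc, inText3;

    DocFilter(String needle) { this.needle = needle; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (qName.equals("doc")) {          // a new <doc> starts: reset both buffers
            inDoc = true;
            doc.setLength(0);
            text3.setLength(0);
        }
        if (!inDoc) return;                 // ignore everything outside <doc>
        if (qName.equals("text3")) inText3 = true;
        doc.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            doc.append(' ').append(atts.getQName(i))
               .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        doc.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!inDoc) return;
        String s = new String(ch, start, length);
        doc.append(escape(s));
        if (inText3) text3.append(s);       // may arrive in several chunks
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (!inDoc) return;
        doc.append("</").append(qName).append('>');
        if (qName.equals("text3")) inText3 = false;
        if (qName.equals("doc")) {          // whole <doc> seen: decide its fate
            inDoc = false;
            if (text3.indexOf(needle) >= 0) kept.add(doc.toString());
        }
    }

    // Re-escape characters that the parser has already decoded.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Convenience entry point for small inputs; returns the kept <doc> elements.
    static List<String> filter(String xml, String needle) {
        DocFilter handler = new DocFilter(needle);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.kept;
    }
}
```

For a genuinely large file you would presumably hand the parser a File or 
InputStream and write each kept <doc> to its output file at </doc> time 
instead of collecting everything in a List, so that only one <doc> is ever 
held in memory.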






