Re: Threading model for reading 1,000 files quickly?

From: Eric Sosman <esosman@ieee-dot-org.invalid>
Newsgroups: comp.lang.java.programmer
Subject: Re: Threading model for reading 1,000 files quickly?
Date: 2012-10-01 09:32 -0400
Organization: A noiseless patient Spider
Message-ID: <k4c62a$o26$1@dont-email.me>
References: <051fc3d6-d22c-438a-b4d3-84378e447733@googlegroups.com>


On 10/1/2012 3:11 AM, leegee@gmail.com wrote:
> I have a directory with many sub-directories, each with many thousands of files.
>
> I wish to process each file, which takes one or two seconds.

     "Many" sub-directories (let's say a hundred) times "many
thousands" of files each (let's say ten thousand) times "one or
two" seconds per file (let's say 1.5) -- Okay, that's about two
and a half weeks if you process them one at a time.
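
     The arithmetic behind that estimate can be sketched as below.
The counts and the per-file time are the guesses stated above, not
measurements, and the class and method names are made up for
illustration. The second method shows what an even split across
some number of workers would give in the ideal case, with no
coordination or I/O contention at all.

```java
import java.time.Duration;

class SerialEstimate {
    // Wall-clock time when every file is processed one after another.
    static Duration serialTime(long dirs, long filesPerDir, double secondsPerFile) {
        long millis = Math.round(dirs * filesPerDir * secondsPerFile * 1000.0);
        return Duration.ofMillis(millis);
    }

    // Best case for an even split across `workers`; real runs will be slower.
    static Duration idealParallelTime(Duration serial, int workers) {
        return serial.dividedBy(workers);
    }

    public static void main(String[] args) {
        // Assumed figures: 100 sub-directories, 10,000 files each, 1.5 s per file.
        Duration serial = serialTime(100, 10_000, 1.5);
        System.out.println(serial.toDays() + " days");  // prints "17 days"
        System.out.println(idealParallelTime(serial, 1000).toMinutes() + " minutes");  // prints "25 minutes"
    }
}
```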

     If two and a half weeks is an acceptable amount of time, you
should probably turn your attention to things like checkpointing
of partial results, so that a power failure or crash on Day Nine
doesn't force you to restart from the very beginning.
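
     One simple way to checkpoint, sketched under assumptions: keep
a plain text log with one completed file path per line, reload it
at startup, and skip anything already listed. The class name
Checkpoint and the log format are hypothetical, chosen only for
this example. The sketch does not force the log to disk, so a
power failure could still lose the last few entries.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Checkpoint {
    private final Path log;
    private final Set<String> done = new HashSet<>();

    // Load the paths recorded by any earlier run, if the log exists.
    Checkpoint(Path log) {
        this.log = log;
        try {
            if (Files.exists(log)) {
                done.addAll(Files.readAllLines(log, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if a previous or current run already finished this file.
    synchronized boolean isDone(Path file) {
        return done.contains(file.toString());
    }

    // Record completion in memory and append one line to the log.
    synchronized void markDone(Path file) {
        if (!done.add(file.toString())) {
            return;  // already recorded
        }
        try {
            Files.write(log, Collections.singletonList(file.toString()),
                    StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

     The processing loop would call isDone() before starting a file
and markDone() only after that file's results are safely stored,
so a restart repeats at most the files that were in flight.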

> I wish to simultaneously process as many files as possible.

     "As many as possible" -- With what kind of equipment?  What's
"possible" for an Amazon data center might be beyond the reach
of an Amazon Kindle ...

> I'm not clear on the best approach to this.
>
> Should I use a thread pool? Is Java capable of maximising the hardware resources to determine how many threads it can simultaneously execute?

     You should use a thousand machines, each doing a thousandth
of the overall job.  They'll finish in about twenty-five minutes.
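
     For the single-machine case the question asks about, a
fixed-size thread pool is the usual starting point. Java will
report how many processors the JVM can see, but it cannot tell
you whether the work is CPU-bound or disk-bound, so the pool size
below is only a first guess to be tuned by measurement. The names
PoolSketch and processAll are hypothetical, and the per-file work
is passed in by the caller.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class PoolSketch {
    // Runs `work` on every file using one thread per visible processor.
    // Returns the number of files whose work finished without throwing.
    static int processAll(List<Path> files, Consumer<Path> work) {
        int threads = Math.max(1, Runtime.getRuntime().availableProcessors());
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        // Queue everything up front; for millions of files, submitting in
        // batches would keep the queue and this list smaller.
        List<Future<?>> pending = new ArrayList<>();
        for (Path file : files) {
            pending.add(pool.submit(() -> work.accept(file)));
        }
        pool.shutdown();  // accept no new tasks; queued ones still run

        int succeeded = 0;
        for (Future<?> result : pending) {
            try {
                result.get();  // wait for this task to finish
                succeeded++;
            } catch (ExecutionException e) {
                // This file's work threw; carry on with the others.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                pool.shutdownNow();
                break;
            }
        }
        return succeeded;
    }
}
```

     If the one or two seconds per file is mostly spent waiting on
the disk, more threads than processors may help or may only add
seek contention; timing a few pool sizes on a sample directory
is the only way to know.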

     (If you want more definite answers, you'll need to provide a
more definite problem statement.)

-- 
Eric Sosman
esosman@ieee-dot-org.invalid
