Re: Pattern suggestion

Date 2012-04-15 21:58 -0400
From Arne Vajhøj <arne@vajhoej.dk>
Newsgroups comp.lang.java.programmer
Subject Re: Pattern suggestion
References <jmel0t$jrh$1@news2.carnet.hr>
Message-ID <4f8b7cb6$0$293$14726298@news.sunsite.dk>
Organization SunSITE.dk - Supporting Open source


On 4/15/2012 10:11 AM, FrenKy wrote:
> I have a huge file (~10GB) which I'm reading line by line. Each line has
> to be analyzed by many number of different analyzers. The problem I have
> is that to make it at least a bit performance optimized due to sometimes
> time consuming processing (usually because of delays due to external
> interfaces) i would need to make it heavily multithreaded.
> File should be read only once to reduce IO on disks.
>
> So I need "1 driver to many workers" pattern where workers are
> multithreaded.
>
> I have a solution now based on Observable/Observer that I use (and it
> works) but I'm not sure if it is the best way.

As I see it, you need 3 things:
* A single reader thread. That is relatively simple; just be sure to
   read big chunks of data.
* N threads doing M analyses. There are various ways of doing this:
   manually started threads or a thread pool. I think the best choice
   between those will depend on the solution for the next bullet.
* A way of moving data from the reader to the M analyzers.

The first two solutions that come to my mind are:

A1) Use a single java.util.concurrent blocking queue, a custom
     thread pool and the command pattern: the reader puts M commands
     on the queue, each containing the same data and the analysis to
     perform, and the N threads read the commands from the queue and
     analyze as instructed.
A2) Use the standard ExecutorService thread pool and the command
     pattern: the reader submits M commands, which are also tasks,
     to the executor, each containing the same data and the analysis
     to perform, and the N threads take the commands from the
     executor's queue and analyze as instructed.
(A1 and A2 are really the same solution, just with slightly different
implementations.)
B) Use a non-persistent message queue and JMS with the publish/subscribe
    pattern: the reader publishes the data to a topic, and a
    multiple of M custom threads, each implementing a single analysis,
    subscribe to it, read and analyze.
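
A minimal sketch of what A2 could look like, assuming Java 8 or later.
The names Analyzer and ExecutorFanOut are made up for illustration, and
the list of lines stands in for the real file reader. Each
(line, analyzer) pair becomes one task submitted to a fixed pool of
N threads:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical interface: one analysis applied to one line.
interface Analyzer {
    String analyze(String line);
}

class ExecutorFanOut {
    // Submits one task per (line, analyzer) pair and collects the
    // results in submission order.
    static List<String> run(Iterable<String> lines, List<Analyzer> analyzers, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        List<Future<String>> futures = new ArrayList<>();
        try {
            for (String line : lines) {              // the single reader loop
                for (Analyzer a : analyzers) {       // M commands per line
                    Callable<String> task = () -> a.analyze(line);
                    futures.add(pool.submit(task));
                }
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());                // waits for that task to finish
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Note that Executors.newFixedThreadPool uses an unbounded internal
queue, and this sketch also keeps every Future in memory. For a 10 GB
file the reader would probably need some form of back-pressure, for
example a ThreadPoolExecutor built on a bounded queue.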

A has less overhead than B. A is also more efficient than B if some
analyses take longer than others, since any idle thread in the
shared pool can pick up the next command.
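
The shared-queue behaviour of A can be sketched as an A1-style
implementation. This is only an illustration under assumptions: the
names QueueFanOut, Analysis and Command are hypothetical, the queue
capacity of 1024 is arbitrary, and a sentinel STOP command is one
common way (not the only one) to shut the workers down. All N workers
take from one queue, so a slow analysis ties up only the thread
running it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;

class QueueFanOut {
    // Hypothetical analysis interface.
    interface Analysis { String apply(String line); }

    // Command = the data plus the analysis to perform on it.
    static final class Command {
        final String line;
        final Analysis analysis;
        Command(String line, Analysis analysis) {
            this.line = line;
            this.analysis = analysis;
        }
    }

    // Sentinel telling a worker to exit.
    private static final Command STOP = new Command(null, null);

    static List<String> run(List<String> lines, List<Analysis> analyses, int nThreads) {
        // Bounded queue: put() blocks the reader when the workers fall behind.
        BlockingQueue<Command> queue = new ArrayBlockingQueue<>(1024);
        Queue<String> results = new ConcurrentLinkedQueue<>();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < nThreads; i++) {
            Thread t = new Thread(() -> {
                try {
                    for (Command c = queue.take(); c != STOP; c = queue.take()) {
                        results.add(c.analysis.apply(c.line));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            workers.add(t);
        }
        try {
            for (String line : lines) {
                for (Analysis a : analyses) {
                    queue.put(new Command(line, a));   // M commands per line
                }
            }
            for (int i = 0; i < nThreads; i++) {
                queue.put(STOP);                       // one sentinel per worker
            }
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(results);               // completion order, not input order
    }
}
```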

But B can be used in a clustered approach.

(I guess you could also do an A3, with commands on a message queue
and a thread pool on each cluster member.)

Arne
