

Newsgroup: comp.lang.java.programmer

How can you make idle processors pick up java work?

Started by: qwertmonkey@syberiaoutpost.ru
First post: 2012-07-30 23:27 +0000
Last post: 2012-07-30 23:45 -0400
Articles: 5 (5 participants)



Contents

  How can you make idle processors pick up java work? qwertmonkey@syberiaoutpost.ru - 2012-07-30 23:27 +0000
    Re: How can you make idle processors pick up java work? David Lamb <dalamb@cs.queensu.ca> - 2012-07-30 19:34 -0400
    Re: How can you make idle processors pick up java work? Patricia Shanahan <pats@acm.org> - 2012-07-30 16:40 -0700
    Re: How can you make idle processors pick up java work? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-07-30 19:51 -0400
    Re: How can you make idle processors pick up java work? "John B. Matthews" <nospam@nospam.invalid> - 2012-07-30 23:45 -0400

#16692 — How can you make idle processors pick up java work?

From: qwertmonkey@syberiaoutpost.ru
Date: 2012-07-30 23:27 +0000
Subject: How can you make idle processors pick up java work?
Message-ID: <jv759b$s7$1@speranza.aioe.org>

>> Is there a way to make these processors pick up/share work also, or
>> do you have to use some sort of scheduling framework on top of java?

> Use multiple threads?
~
 a) I need to actually scan large text files (10+ million lines).
 b) On each line there is a NL sentence.
 c) That processing should be run only once, but as fast as possible.
~
 d) If you go:
 d.1) int iPrx = Runtime.getRuntime().availableProcessors();
 d.2) count all lines
 d.3) split the file in (total lines)/iPrx
 d.4) then run iPrx threads (or executable instances using a batch script)
 the time you waste on d.2) and d.3) will make all that strat senseless
~
 I have no way to influence how those large files are generated
~
 e) because of the large sizes of the files you can't even go
~
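 FileInputStream FIS = new FileInputStream(IFl);
 FileChannel IFlChnl = FIS.getChannel();
 // the (int) cast overflows once the file exceeds Integer.MAX_VALUE bytes (about 2 GB),
 // which is also the most a single map() call accepts
 int iChnlSz = (int)IFlChnl.size();
 MappedByteBuffer MptBytBfr = IFlChnl.map(FileChannel.MapMode.READ_ONLY, 0, iChnlSz);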
~
 so, apparently, the only option I have is:
~
     BufferedReader BfR = Files.newBufferedReader(DirPth, ChrStUTF8);
     String aSx = BfR.readLine();
     while(aSx != null){
       // process the current line (aSx) here
       aSx = BfR.readLine();
     }
~
 do you know of a faster way to go about this?
~
 lbrtchx



#16693

From: David Lamb <dalamb@cs.queensu.ca>
Date: 2012-07-30 19:34 -0400
Message-ID: <jv75mu$a21$1@dont-email.me>
In reply to: #16692

On 30/07/2012 7:27 PM, qwertmonkey@syberiaoutpost.ru wrote:
>>> Is there a way to make these processors pick up/share work also, or
>>> do you have to use some sort of scheduling framework on top of java?
>
>> Use multiple threads?
> ~
>   a) I need to actually scan large text files (10+ million lines).
>   b) On each line there is a NL sentence.
>   c) That processing should be run only once, but as fast as possible.
> ~
>   d) If you go:
>   d.1) int iPrx = Runtime.getRuntime().availableProcessors();
>   d.2) count all lines
>   d.3) split the file in (total lines)/iPrx
>   d.4) then run iPrx threads (or executable instances using a batch script)
>   the time you waste on d.2) and d.3) will make all that strat senseless

How slow is the NL processing? Does it make any sense to read lines in 
one thread and pass each off to one of the iPrx-1 other threads that 
might run on separate processors?
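
A minimal sketch of that hand-off, assuming a bounded queue between one reader thread and the workers. The class name LineHandOff, the sentinel string, and processLine (a stand-in that only counts characters) are hypothetical and not from the thread; the real NL processing would replace processLine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: one reader thread feeds lines to worker threads through a bounded queue.
final class LineHandOff {
    // Sentinel telling a worker that no more lines will arrive (compared by identity).
    private static final String POISON = new String("<end of input>");

    // Hypothetical stand-in for the real NL processing: counts characters.
    static long processLine(String line) {
        return line.length();
    }

    // Reads every line from `in` on the calling thread and hands it to `workers` threads.
    static long run(BufferedReader in, int workers) throws IOException, InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    // take() blocks until the reader supplies a line
                    for (String line = queue.take(); line != POISON; line = queue.take()) {
                        total.addAndGet(processLine(line));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }
        for (String line = in.readLine(); line != null; line = in.readLine()) {
            queue.put(line); // blocks when the workers fall behind, bounding memory use
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON); // one sentinel per worker
        }
        for (Thread t : threads) {
            t.join();
        }
        return total.get();
    }

    public static void main(String[] args) throws Exception {
        // iPrx-1 workers, leaving one processor for the reader
        int workers = Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
        BufferedReader in = new BufferedReader(new StringReader("first line\nsecond line\n"));
        System.out.println(run(in, workers));
    }
}
```

Whether this helps depends on the answer to the question above: if processLine is cheap, the single reader and the queue traffic are likely to dominate, and the extra threads buy little.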



#16694

From: Patricia Shanahan <pats@acm.org>
Date: 2012-07-30 16:40 -0700
Message-ID: <IdydnUpkR4P-horNnZ2dnUVZ_jednZ2d@earthlink.com>
In reply to: #16692

On 7/30/2012 4:27 PM, qwertmonkey@syberiaoutpost.ru wrote:
>>> Is there a way to make these processors pick up/share work also, or
>>> do you have to use some sort of scheduling framework on top of java?
>
>> Use multiple threads?
> ~
>   a) I need to actually scan large text files (10+ million lines).
>   b) On each line there is a NL sentence.
>   c) That processing should be run only once, but as fast as possible.
> ~
>   d) If you go:
>   d.1) int iPrx = Runtime.getRuntime().availableProcessors();
>   d.2) count all lines
>   d.3) split the file in (total lines)/iPrx
>   d.4) then run iPrx threads (or executable instances using a batch script)
>   the time you waste on d.2) and d.3) will make all that strat senseless

Why worry about splitting by actual line count, rather than by byte
position in file?

Patricia
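
Splitting by byte position can be sketched as follows, assuming the file uses '\n' line endings in UTF-8 or another encoding where the byte 0x0A only ever means newline. The class ByteRangeSplitter and its methods are hypothetical names for illustration; countLines is a stand-in for the per-chunk work.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: split a file into byte ranges whose edges fall on line boundaries.
final class ByteRangeSplitter {

    // Returns parts+1 offsets; chunk i covers bytes [offsets[i], offsets[i+1]).
    static long[] split(Path file, int parts) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long size = raf.length();
            long[] offsets = new long[parts + 1]; // offsets[0] stays 0
            offsets[parts] = size;
            for (int i = 1; i < parts; i++) {
                // Start from an even split, never moving behind the previous boundary.
                long guess = Math.max(offsets[i - 1], size / parts * i);
                raf.seek(guess);
                // Advance to just past the next '\n' so the chunk starts on a fresh line.
                int b;
                while ((b = raf.read()) != -1 && b != '\n') {
                    // skip the tail of the line the guess landed in
                }
                offsets[i] = raf.getFilePointer();
            }
            return offsets;
        }
    }

    // Stand-in for per-chunk work: counts the '\n' bytes in [start, end).
    static long countLines(Path file, long start, long end) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(start);
            byte[] buf = new byte[8192];
            long remaining = end - start, lines = 0;
            while (remaining > 0) {
                int n = raf.read(buf, 0, (int) Math.min(buf.length, remaining));
                if (n < 0) break; // file ended early
                for (int i = 0; i < n; i++) {
                    if (buf[i] == '\n') lines++;
                }
                remaining -= n;
            }
            return lines;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to the large text file
        int parts = Runtime.getRuntime().availableProcessors();
        long[] offsets = split(file, parts);
        // Sequential here for brevity; each iteration could run in its own thread.
        for (int i = 0; i < parts; i++) {
            System.out.println("chunk " + i + ": "
                    + countLines(file, offsets[i], offsets[i + 1]) + " lines");
        }
    }
}
```

Finding the boundaries costs one seek and at most one line of reading per chunk, so steps d.2) and d.3) no longer require a full pass over the file.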



#16695

From: Joshua Cranmer <Pidgeot18@verizon.invalid>
Date: 2012-07-30 19:51 -0400
Message-ID: <jv76m4$eja$1@dont-email.me>
In reply to: #16692

[Gah, your newsreader is incapable of threading posts correctly. Please 
find a non-broken one.]

On 7/30/2012 7:27 PM, qwertmonkey@syberiaoutpost.ru wrote:
>>> Is there a way to make these processors pick up/share work also, or
>>> do you have to use some sort of scheduling framework on top of java?
>
>> Use multiple threads?
> ~
>   a) I need to actually scan large text files (10+ million lines).
>   b) On each line there is a NL sentence.
>   c) That processing should be run only once, but as fast as possible.

Only 10M-line files?

The easiest way to do this is to just make a ThreadPoolExecutor and have 
your main thread dispatch requests as fast as possible to the pool. Or 
you can do the work pooling yourself, which may be faster since you're 
not continually posting Runnables, but timing results would be 
necessary to convince me.
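
The pool-based approach could look roughly like this. It is a sketch under assumptions: PoolDispatch and processLine are hypothetical names, processLine only counts characters in place of the real work, and the pool comes from Executors.newFixedThreadPool, which returns a ThreadPoolExecutor backed by an unbounded queue.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: the main thread reads lines and posts one task per line to a thread pool.
final class PoolDispatch {

    // Hypothetical stand-in for the real per-line processing.
    static long processLine(String line) {
        return line.length();
    }

    static long run(BufferedReader in, int threads) throws IOException, InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong total = new AtomicLong();
        try {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                final String current = line; // lambdas may only capture effectively final locals
                pool.execute(() -> total.addAndGet(processLine(current)));
            }
        } finally {
            pool.shutdown(); // accept no new tasks; tasks already queued still run
        }
        // Wait for the queued tasks to drain before reading the result.
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
        return total.get();
    }
}
```

Because the queue is unbounded, a reader that outruns the workers will pile pending lines up in memory; batching several lines per task would also cut down on the number of Runnables posted.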

There are other options, but chances are your disk drive is going to 
saturate first (in short, they involve reading non-consecutive pages of 
the file, which is generally a recipe for disaster).

-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth



#16698

From: "John B. Matthews" <nospam@nospam.invalid>
Date: 2012-07-30 23:45 -0400
Message-ID: <nospam-9C3463.23455530072012@news.aioe.org>
In reply to: #16692

In article <jv759b$s7$1@speranza.aioe.org>,
 qwertmonkey@syberiaoutpost.ru wrote:

>  d.2) count all lines

Maybe ask ProcessBuilder to `wc -l`, or similar?
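
A rough sketch of that idea, assuming a Unix-like system with `wc` on the PATH. The class name WcLineCount is hypothetical. Note that `wc -l` counts newline characters, so a final line without a trailing newline is not counted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: delegate line counting to the external `wc -l` command.
final class WcLineCount {

    static long countLines(Path file) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("wc", "-l", file.toString());
        pb.redirectErrorStream(true); // fold stderr into stdout so one read sees everything
        Process p = pb.start();
        String output;
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            output = r.readLine();
        }
        int exit = p.waitFor();
        if (exit != 0 || output == null) {
            throw new IOException("wc failed: " + output);
        }
        // wc prints "<count> <file name>", possibly with leading spaces
        return Long.parseLong(output.trim().split("\\s+")[0]);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countLines(Paths.get(args[0])));
    }
}
```

This still reads the whole file once, so it addresses how d.2) is done rather than whether a counting pass is needed at all.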

-- 
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>


