Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!eternal-september.org!feeder.eternal-september.org!mx05.eternal-september.org!.POSTED!not-for-mail
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: Parallel FIB
Date: Sat, 04 May 2013 13:45:30 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 69
Message-ID: <2013May4.154530@mips.complang.tuwien.ac.at>
References: <2013May3.142604@mips.complang.tuwien.ac.at> <19891400988434@frunobulax.edu>
Injection-Info: mx05.eternal-september.org; posting-host="d47d3421039fe8026514328ad0ebacae"; logging-data="1246"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18Jr38eyR7HhCweqKCe2Rbx"
X-newsreader: xrn 10.00-beta-3
Cancel-Lock: sha1:JSNkEUTbdHGxIbiBg01c6l72JEc=
Xref: csiph.com comp.lang.forth:22250

mhx@iae.nl (Marcel Hendrix) writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes Re: Parallel FIB
>
>> mhx@iae.nl (Marcel Hendrix) writes:
>>>In the context of a benchmark the actual speed of solving the problem 
>>>doesn't matter.
>
>> That's debatable.
>
>Please debate, then :-)

It's easy to produce a "benchmark" that does nothing, slowly, an
parallelizes well.  But what does it show?  Does it show that we can
usefully parallelize the solutions to interesting problems?  No.  So,
what does it show?

If you want to parallelize stuff with a structure similar to the
doubly-recursive Fibonacci benchmark, use mergesort or quicksort as
benchmark.

>> Actually my suggestion (based on your earlier version) is:
>
>> : pfib ( n1 -- n2 )
>>	TO n  n 6 < IF  n fib EXIT  ENDIF
>>	fibs OFF
>>	PAR
>>	  STARTP  n 4 - fib 3 * 
>>	          n 5 - fib 5 * + fibs LOCKED+! DROP  ENDP
>>	  STARTP  n 6 - fib 2*    fibs LOCKED+! DROP  ENDP
>>	ENDPAR
>>	fibs @ ;
>
>You want to swap n 4- with n 6 -

Yes.

>> For getting more cores to work, let's just start with the unoptimized
>> fib and parallelize the recursions:
>
>> : pfib1 ( n1 -- )
>>    dup 2 < if
>>	drop 1 fibs locked+! drop
>>    else {: n :}
>>	PAR
>>         STARTP n 1- recurse ENDP
>>	  STARTP n 2- recurse ENDP
>>        ENDPAR
>>    endif ;
>
>> : pfib ( n1 -- n2 )
>>  0 fibs ! pfib1 fibs @ ;
>
>I am not sure I understand this. We have only 4 (or 6) 
>cores, so this is just a test of how long it takes to queue 
>and dequeue tasks. As every recursive call needs a thread,
>this will be horrendously slow. That's not what I want
>to test.

That's why I then refined it to produce only a few dozen tasks each of
which runs so long that the threading overhead should become
negligible.  And that version should scale up to a dozen or so cores,
and if you want more (or less), you just need to change one parameter.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2013: http://www.euroforth.org/ef13/