Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!feeder.erje.net!eu.feeder.erje.net!eternal-september.org!feeder.eternal-september.org!mx05.eternal-september.org!.POSTED!not-for-mail From: anton@mips.complang.tuwien.ac.at (Anton Ertl) Newsgroups: comp.lang.forth Subject: Re: Parallel FIB Date: Sat, 04 May 2013 13:45:30 GMT Organization: Institut fuer Computersprachen, Technische Universitaet Wien Lines: 69 Message-ID: <2013May4.154530@mips.complang.tuwien.ac.at> References: <2013May3.142604@mips.complang.tuwien.ac.at> <19891400988434@frunobulax.edu> Injection-Info: mx05.eternal-september.org; posting-host="d47d3421039fe8026514328ad0ebacae"; logging-data="1246"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Jr38eyR7HhCweqKCe2Rbx" X-newsreader: xrn 10.00-beta-3 Cancel-Lock: sha1:JSNkEUTbdHGxIbiBg01c6l72JEc= Xref: csiph.com comp.lang.forth:22250 mhx@iae.nl (Marcel Hendrix) writes: >anton@mips.complang.tuwien.ac.at (Anton Ertl) writes Re: Parallel FIB > >> mhx@iae.nl (Marcel Hendrix) writes: >>>In the context of a benchmark the actual speed of solving the problem >>>doesn't matter. > >> That's debatable. > >Please debate, then :-) It's easy to produce a "benchmark" that does nothing, slowly, an parallelizes well. But what does it show? Does it show that we can usefully parallelize the solutions to interesting problems? No. So, what does it show? If you want to parallelize stuff with a structure similar to the doubly-recursive Fibonacci benchmark, use mergesort or quicksort as benchmark. >> Actually my suggestion (based on your earlier version) is: > >> : pfib ( n1 -- n2 ) >> TO n n 6 < IF n fib EXIT ENDIF >> fibs OFF >> PAR >> STARTP n 4 - fib 3 * >> n 5 - fib 5 * + fibs LOCKED+! DROP ENDP >> STARTP n 6 - fib 2* fibs LOCKED+! DROP ENDP >> ENDPAR >> fibs @ ; > >You want to swap n 4- with n 6 - Yes. >> For getting more cores to work, let's just start with the unoptimized >> fib and parallelize the recursions: > >> : pfib1 ( n1 -- ) >> dup 2 < if >> drop 1 fibs locked+! drop >> else {: n :} >> PAR >> STARTP n 1- recurse ENDP >> STARTP n 2- recurse ENDP >> ENDPAR >> endif ; > >> : pfib ( n1 -- n2 ) >> 0 fibs ! pfib1 fibs @ ; > >I am not sure I understand this. We have only 4 (or 6) >cores, so this is just a test of how long it takes to queue >and dequeue tasks. As every recursive call needs a thread, >this will be horrendously slow. That's not what I want >to test. That's why I then refined it to produce only a few dozen tasks each of which runs so long that the threading overhead should become negligible. And that version should scale up to a dozen or so cores, and if you want more (or less), you just need to change one parameter. - anton -- M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html New standard: http://www.forth200x.org/forth200x.html EuroForth 2013: http://www.euroforth.org/ef13/