So follw with me please

From	aminer <aminer@toto.net>
Newsgroups	comp.lang.pascal.misc
Subject	So follw with me please
Date	2014-05-18 19:51 -0700
Organization	albasani.net
Message-ID	<llbh33$muk$3@news.albasani.net> (permalink)

Show all headers | View raw

Hello,

We have to be smart please...

So follow with me please...


As i have told you before lockfree algorithms are complete crap
cause with more and more cores they generate higher and higher 
cache-coherence traffic that brings the lockfree algorithm to its Knees,
so lockfree algorithms don't scale, lockfree algorithms look like ticket 
spinlock without a proportional backoff that don't scale
cause of higher and higher cache-coherence traffic on more and more 
cores...


There is also the two locks concurrent FIFO algorithm that is not
so good cause it contains less parallelism,  than my new concurrent
FIFO queue here:

https://sites.google.com/site/aminer68/concurrent-fifo-queue-1


But my new algorithm of a concurrent FIFO queue above is a  scalable 
algorithm because it minimizes efficiently the cache-coherence traffic,
and more than that it contains more parallelism than the two locks
algorithm, because of the following reason:

I have found a paper that is interresting to read , here it is:

http://htor.inf.ethz.ch/publications/img/hoefler-ramos-hpdc13-cc_modeling.pdf


Since the new CPU architectures are using extended MESI protocols,  that 
are scalable, look at for example Xeon Phi processor in the above link , 
you will notice that it is using an extended MESI protocol and in its 
arhitecture its cores are arranged on a  bidirectional ring bus 
thatprovides high scalability, but here scalability means
that the cache-to-cache communication is scalable , that
means we can have many bidirectional cache-to-cache communications 
running in parallel , so that makes the things different , cause
my new algorithm of a concurrent FIFO queue
here: http://pages.videotron.com/aminer/CQueue2.htm
will be more scalable with this kind of extended MESI protocol and
with this kind of scalable architecture and those kind of scalable 
architectures will allow my scalable concurrent FIFO queue to scale 
better and to score more throughput, cause those  kind of scalable 
architectures will lower the contention and makes our
algorithms scalable.


So if you take a look at my new algorithm, you will notice on the
push() side that i am doing something like this:

==
repeat
if fcount1^[lastHead and fMask].flag1=0 then break;
{$IFDEF FPC}
ThreadSwitch;
{$ENDIF}
{$IFDEF Delphi}
sleep(0);
{$ENDIF}
until false;
setObject(lastHead,tm);
fcount1^[lastHead and fMask].flag1:=1;

==

So my new algorithm has more parallelism than the two locks algorithm,
because many threads can execute the if "fcount1^[lastHead and 
fMask].flag1=0" in parallel and hence it will higher the throughput, 
because in new CPU architechtures we  can have many bidirectional 
cache-to-cache communications  running in parallel... also my algorithm 
can execute the "setObject(lastHead,tm)" in parallel and higher the 
throughput, because of write-back caches that allows that..


That's the same for the pop() side , it has more parallelism than
the two locks version, so with more and more cores it will score
more throughput.. and since my new algorithm minimized efficiently
the cache coherence traffic etc. that means it is scalable, and more 
than that my new algorithm is starvation-free.
So all in all that makes me happy, so hope it will make you happy.




Thank you,
Amine Moulay Ramdane.

Back to comp.lang.pascal.misc | Previous | Next — Next in thread | Find similar

Thread

So follw with me please aminer <aminer@toto.net> - 2014-05-18 19:51 -0700
  Re: So follw with me please aminer <aminer@toto.net> - 2014-05-18 20:12 -0700

csiph-web