Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.pascal.misc > #822
| From | aminer <aminer@toto.net> |
|---|---|
| Newsgroups | comp.lang.pascal.misc |
| Subject | So follw with me please |
| Date | 2014-05-18 19:51 -0700 |
| Organization | albasani.net |
| Message-ID | <llbh33$muk$3@news.albasani.net> (permalink) |
Hello,
We have to be smart please...
So follow with me please...
As i have told you before lockfree algorithms are complete crap
cause with more and more cores they generate higher and higher
cache-coherence traffic that brings the lockfree algorithm to its Knees,
so lockfree algorithms don't scale, lockfree algorithms look like ticket
spinlock without a proportional backoff that don't scale
cause of higher and higher cache-coherence traffic on more and more
cores...
There is also the two locks concurrent FIFO algorithm that is not
so good cause it contains less parallelism, than my new concurrent
FIFO queue here:
https://sites.google.com/site/aminer68/concurrent-fifo-queue-1
But my new algorithm of a concurrent FIFO queue above is a scalable
algorithm because it minimizes efficiently the cache-coherence traffic,
and more than that it contains more parallelism than the two locks
algorithm, because of the following reason:
I have found a paper that is interresting to read , here it is:
http://htor.inf.ethz.ch/publications/img/hoefler-ramos-hpdc13-cc_modeling.pdf
Since the new CPU architectures are using extended MESI protocols, that
are scalable, look at for example Xeon Phi processor in the above link ,
you will notice that it is using an extended MESI protocol and in its
arhitecture its cores are arranged on a bidirectional ring bus
thatprovides high scalability, but here scalability means
that the cache-to-cache communication is scalable , that
means we can have many bidirectional cache-to-cache communications
running in parallel , so that makes the things different , cause
my new algorithm of a concurrent FIFO queue
here: http://pages.videotron.com/aminer/CQueue2.htm
will be more scalable with this kind of extended MESI protocol and
with this kind of scalable architecture and those kind of scalable
architectures will allow my scalable concurrent FIFO queue to scale
better and to score more throughput, cause those kind of scalable
architectures will lower the contention and makes our
algorithms scalable.
So if you take a look at my new algorithm, you will notice on the
push() side that i am doing something like this:
==
repeat
if fcount1^[lastHead and fMask].flag1=0 then break;
{$IFDEF FPC}
ThreadSwitch;
{$ENDIF}
{$IFDEF Delphi}
sleep(0);
{$ENDIF}
until false;
setObject(lastHead,tm);
fcount1^[lastHead and fMask].flag1:=1;
==
So my new algorithm has more parallelism than the two locks algorithm,
because many threads can execute the if "fcount1^[lastHead and
fMask].flag1=0" in parallel and hence it will higher the throughput,
because in new CPU architechtures we can have many bidirectional
cache-to-cache communications running in parallel... also my algorithm
can execute the "setObject(lastHead,tm)" in parallel and higher the
throughput, because of write-back caches that allows that..
That's the same for the pop() side , it has more parallelism than
the two locks version, so with more and more cores it will score
more throughput.. and since my new algorithm minimized efficiently
the cache coherence traffic etc. that means it is scalable, and more
than that my new algorithm is starvation-free.
So all in all that makes me happy, so hope it will make you happy.
Thank you,
Amine Moulay Ramdane.
Back to comp.lang.pascal.misc | Previous | Next — Next in thread | Find similar
So follw with me please aminer <aminer@toto.net> - 2014-05-18 19:51 -0700 Re: So follw with me please aminer <aminer@toto.net> - 2014-05-18 20:12 -0700
csiph-web