Groups > comp.programming.threads > #1868

About Lockfree and cache coherence-traffic

From	aminer <aminer@toto.net>
Newsgroups	comp.programming.threads, comp.programming
Subject	About Lockfree and cache coherence-traffic
Date	2013-10-07 22:11 -0700
Organization	albasani.net
Message-ID	<l2vpj2$f9g$1@news.albasani.net> (permalink)

Cross-posted to 2 groups.

Show all headers | View raw

I wrote:
 >Those lockfree FIFO queue algorithms higher the cache-coherency 
 >traffic when there is contention, in worst scenarios you can
 >have 1+2+3+4+ ..N = ((N^2+N)/2) or even worse N+N+N+N...N = N^2
 >cache lines transfers from the memory and those Lockfree FIFO queue
 >algorithms are not FIFO fair , hence you can also have starvation with 
 >Lockfree FIFO queue algorithms, and this is not good at all,

Amine, where did you come with the N^2 ?

Sorry it's not  N^2, i mean   with lockfree FIFO queue algorithms and in 
worst case scenarios when N threads are waiting and spinning on a global 
variable you will have 1+2+3+4...+N = (N^2+N)/2 cache lines tranfers , 
cause the first thread will cross the CAS() and it will have a cache 
miss and one cache-line transfer and the others threads waiting will 
spin and have N-1 cache misses and N-1 cache
line transfers , this will continue like that, so the second thread
will cross the CAS() and have one cache line transfer and the others
will have (N-1)-1 cache misses and  (N-1)-1 cache lines transfers
this will continue like that until the last thread , so this
will equal to 1+...N = (N^2 + N) /2 , but when you will change that and 
wrap the push() and pop() with two  Locks that lower the cache-coherence 
traffic such us  my scalable distributed Lock or the scalable Anderson 
Lock or the scalable MCS lock,
this (N^2+N)/2 cache lines tranfers, in worst cache scenarios, will be 
reduced to a constant qual to N(number of threads) cache lines 
transfers, where did i come with the constant N ? since in the scalable 
Anderson Lock or in my scalable distributed algorithm
(this is true also for the scalable MCS algorithm) every threads is 
spining on a separate element inside the array , so this will reduce a lot
the cache-coherence traffic to N cache misses and N cache lines 
transfers , so this is why the Lockfree FIFO queue algorithms are very 
bad, other than that the lockfree FIFO queue algorithms are not FIFO 
fair hence they are not starvation-free , so they can have starvation, 
so this is why i advice you to use the Locks that are also FIFO fair 
around the push() and the pop() such us my scalable distributed Lock or 
  the Anderson Lock or the MCS Lock.

Thank you,
Amine Moulay Ramdane.

Back to comp.programming.threads | Previous | Next — Next in thread | Find similar

Thread

About Lockfree and cache coherence-traffic aminer <aminer@toto.net> - 2013-10-07 22:11 -0700
  Re: About Lockfree and cache coherence-traffic aminer <aminer@toto.net> - 2013-10-08 09:03 -0700

csiph-web