Groups > comp.programming.threads > #1034

Re: CAS operations and scalability...

From	"aminer" <aminer@videotron.ca>
Newsgroups	comp.programming, comp.programming.threads
Subject	Re: CAS operations and scalability...
Date	2012-08-26 12:56 -0500
Organization	A noiseless patient Spider
Message-ID	<k1dkgg$uj0$2@dont-email.me> (permalink)
References	<k1dira$jt9$1@dont-email.me>

Cross-posted to 2 groups.

Show all headers | View raw

 wrote:

"When the CAS operation "goes on the bus", use of CAS can impair 
scalability.
but CAS can be accomplished locally -- that is, with no bus transactions --
and then it can scale.

If we then change the CAS operation that goes on the bus to a normal store
you'll also see a similar slow-down in terms of coherency bus traffic, CAS
isn't appreciably different than a normal store. Also the lock: prefix
caused
the LOCK# signal to be asserted, acquiring exclusive access to the bus.
This doesn't scale of course"


If you take a look at my threadpool engine

http://pages.videotron.com/aminer/

in the TThreadPool.execute() method i am using a
local_balance:=LockedIncLong(balance1) mod FThreadCount;
and this instruction is going on the bus and it can generate a lot of 
contention, so
it does not scale if there is a high arrival rate and a lot of contention on 
it.
But even though it does not scale , in the majority of real world 
applications
the processing rate of the worker threads is lower than the arrival rate to
the TThreadPool.execute(),  so this is not a problem i think.


Thank you,

Amine Moulay Ramdne



"aminer" <aminer@videotron.ca> wrote in message 
news:k1dira$jt9$1@dont-email.me...
>
> Hello,
>
>
> When the CAS operation "goes on the bus", use of CAS can impair 
> scalability.
> but CAS can be accomplished locally -- that is, with no bus 
> transactions --
> and then it can scale.
>
> If we then change the CAS operation that goes on the bus to a normal store
> you'll also see a similar slow-down in terms of coherency bus traffic, CAS
> isn't appreciably different than a normal store. Also the lock: prefix 
> caused
> the LOCK# signal to be asserted, acquiring exclusive access to the bus.
> This doesn't scale of course
>
> As you have noticed i  have wrote parallelhashlist (a parallel hashtable),
> you can find parallelhashlist here:
>
> http://pages.videotron.com/aminer/
>
> It's a parallel Hashtable with O(1) best case and O(log(n)) worst case
> access that uses lock striping and lightweight MREWs(multiple-readers
> -exclusive-writer) , this allows multiple threads to write and read
> concurently. also parallelhashlist maintains an independant counter , that
> counts the number of entries , for each segment of the hashtable and uses
> a lock for each counter, this is also for better scalability. and 
> parallelhashlist
> is scaling very well,  but since it is a parallel hashtable so the 
> possibility of
> contention is low so why doi need the distributed reader-writer lock of
> Dmitry Vyukov inside my parallel hashlist ?
>
> Other than that I have done some tests with the lightweight MREW that i am
> using inside my parallelhashlist and i have done also some tests with my
> lockfree mpmc fifo queue and what i think is that the CAS is generating
> a lot of contention this is is why the lightweight MREW and my 
> lockfree_mpmc
> are not scaling , but parallelhashlist is scaling very well cause i am 
> using
> lock-striping that is lowering contention.
>
> What are doing Dmitry Vyukov in his distributed rwlock is lowering
> the contention using the same method as lock striping that i am using 
> inside
> parallelhashlist it is why it is scaling, but there is still a possibility
> of contention in his distributed rwlock that can cause a problem to the
> scalability if there is too many threads and not a sufficient number of
> rwlocks in the Dmitry distributed rwlock to be able to lower the 
> contention.
>
> I have tested parallelhashlist(a parallel hashtable that i have 
> implemented)
> with four threads on a quad core and it's giving a very well scaling on 
> both
> reads and writes.
>
> Also i have done some scalability tests on my parallelsort library and i 
> have
> come
> to the conclusion that parallel heapsort is better on scalability than
> parallel quicksort
> cause the P part (of the Amdahl equation) is bigger in parallel heapsort
> than in parallel
> quicksort, the parallel heapsort is doing more on the parallel part, it's
> why it scales better than parallel quicksort, but parallel quicksort is
> still
> faster than parallel heapsort and parallel merge sort on my tests on a
> quad core processor.
>
> And about lockfree_mpmc( a lockfree fifo queue), i have done some tests
> and it's not scaling cause when you are using a single thread some 
> variables
> are updated locally on the L1 cache but using multiple threads those 
> variables are
> loaded from the L2 cache and it's more expensive to load them from the L2
> cache.and this does generate much more contention
>
> But even though lockfree_mpmc is not scalable, you can increase
> the P (parallel) part by doing more of the same: Increase the volume of
> data processed by the P part (and therefore the percentage p of time spent
> in computing). This is Gustafson's Law and you will get more scalability.
>
> For example i have used the IntToStr() function on each of the four 
> threads
> (on
> a quad core) on my lockfree_mpmc test programs to convert from and integer
> to a string, so i have increased the P (parallel) part and i have got more
> scalability,
> this is Gustafson's Law, and you have to remember Gustafson's Law ,
> this is very important.
>
>
> You can download my parallel libraries from
>
> http://pages.videotron.com/aminer/
>
>
>
>
> Sincerely,
> Amine Moulay Ramdane.
>
>
>
>

Back to comp.programming.threads | Previous | Next — Previous in thread | Next in thread | Find similar

Thread

CAS operations and scalability... "aminer" <aminer@videotron.ca> - 2012-08-26 12:28 -0500
  Re: CAS operations and scalability... "aminer" <aminer@videotron.ca> - 2012-08-26 12:56 -0500
  Re: CAS operations and scalability... Patricia Shanahan <pats@acm.org> - 2012-08-26 10:36 -0700
    Re: CAS operations and scalability... "aminer" <aminer@videotron.ca> - 2012-08-26 14:08 -0500
  Re: CAS operations and scalability... "aminer" <aminer@videotron.ca> - 2012-09-13 19:13 -0500

csiph-web