Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.pascal.misc > #757
| From | aminer <aminer@toto.net> |
|---|---|
| Newsgroups | comp.lang.pascal.misc |
| Subject | More about scalable algorithms |
| Date | 2014-04-28 12:06 -0700 |
| Organization | albasani.net |
| Message-ID | <ljlua7$2i5$3@news.albasani.net> (permalink) |
Hello,
You have to understand that the Chriss Thomasson concurrent FIFO queue
that uses the bakery algorithm, here it is:
https://groups.google.com/d/topic/lock-free/acjQ3-89abE/discussion
Look at the producer() source code:
void producer(double state) {
uint32_t ver = XADD(&head, 1);
cell& c = cells[ver & (N - 1)];
while (LOAD(&c.ver) != ver) backoff();
c.state = state;
STORE(&c.ver, ver + 1);
}
Since the algorithm is more parralelized so the "c.state = state;"
will be wrote in parallel by many threads to the cache or write-back
cache, so this will higher the throughput of the pop() side, but
even if the push() side is faster , notice that under contention the
throughtput islimited by the throughput of the pop() side.
But look at the consumer() function:
double consumer() {
uint32_t ver = XADD(&tail, 1);
cell& c = cells[ver & (N - 1)];
while (LOAD(&c.ver) != ver + 1) backoff();
double state = c.state;
STORE(&c.ver, ver + N);
return state;
}
So as you have noticed the "double state = c.state" is serialized
cause the "Bus" serializes the data movements between the caches and the
mmemory subsystem and the caches... but notice how many variables
causes data movements between caches and between memory and caches on
the Chriss Thomasson algorithm on consumer() side,
There is the "tail" variable right on "XADD(&tail, 1);", there
is the "&c.ver" variable right on the "while (LOAD(&c.ver) != ver + 1)"
and there is "c.state"variable right on "double state = c.state;"
but since those variables generate data movements between the caches
and between the memory and the caches, they will cause contention
on the Bus as i have explained it to you in my previous posts, cause
those data movements are serialized on the "Bus" subsystem , so this
contention will cause more waiting time and more waiting time means less
throughput... but notice with me that in my concurrent FIFO queue that
uses the two lock algorithm since almost all the variable are located
inside the critial section, that means they will generate less
contention , hence the two locks algorithm is more efficient than the
Chriss Thomasson algorithm cause it lowers the contention efficently,
this is why even if my two locks algorithm uses more variables (4 in
total) than the Chriss Thomasson algorithm that is using 3 variables,
the two locks algorithm is scoring the same throughput on the
pop() side than the Chriss Thomasson algorithm , that means
that the two locks algorithm is more efficient when it comes
to "contention" than the Chriss Thomasson algorithm.
Hope you have undertstood what i want you to understand in this post.
Thank you,
Amine Moulay Ramdane.
Back to comp.lang.pascal.misc | Previous | Next — Next in thread | Find similar
More about scalable algorithms aminer <aminer@toto.net> - 2014-04-28 12:06 -0700
Re: More about scalable algorithms aminer <aminer@toto.net> - 2014-04-28 12:11 -0700
Re: More about scalable algorithms aminer <aminer@toto.net> - 2014-04-28 12:28 -0700
csiph-web