Re: forcing the compiler to reload from memory with c++0x

From	Anthony Williams <anthony.ajw@gmail.com>
Newsgroups	comp.programming.threads
Subject	Re: forcing the compiler to reload from memory with c++0x
References	(11 earlier) <87sjw8w4b5.fsf@justsoftwaresolutions.co.uk> <iietha$f3a$1@news.eternal-september.org> <iifb30$9d3$1@news.eternal-september.org> <99c58954-6434-4135-b90a-da60dc1d865d@i14g2000prm.googlegroups.com> <iifuoj$1j3$1@news.eternal-september.org>
Date	2011-02-04 08:42 +0000
Message-ID	<87wrlg9omp.fsf@justsoftwaresolutions.co.uk>
Organization	CNNTP

Andy Venikov <swojchelowek@gmail.com> writes:

> On 2/3/2011 6:54 PM, Joshua Maurice wrote:
>
>>
>> This is part of the problem, I think, this bad kind of thinking about
>> threads. The very name of this thread, "Forcing the compiler to reload
>> from memory with C++0x" shows this "wrong" kind of thinking. Under
>> c++0x, just like POSIX pthreads, you don't talk about reloading from
>> memory. There is no "main memory" in the abstraction. Instead, you
>> talk about visibility guarantees. For each read, you need to identify
>> which write it is conditionally required to see, and then put in the
>> appropriate memory orderings.

Absolutely.

> Yes and I'm trying to make sure that C++0x has sufficient means to do
> that. Ordering and visibility guarantees are not enough (apparently)
> when it comes to the rules of the abstract machine. We need to make
> sure that rules for the abstract machine won't allow the linearization
> points to be optimized away. We need a way to control the
> rules. Volatile used to be a (brittle) way. I thought that the atomics
> of the new standard would allow us to do that too, but according to
> Anthony's replies (if I understand them correctly) it may not be
> necessarily so.
>
> Earlier in the thread I asked a question:
> atomic<int> a_int;
> ...
> while (!a_int.load(relaxed))
> {
> }
>
> Will the compiler be allowed to optimize the whole loop away? The
> answer was yes.

I didn't explain myself very well. The compiler is allowed to optimize
the loop away if it can prove that the FIRST call of a_int.load() is
permitted to return a non-zero value (e.g. because you just stored a
non-zero value to a_int), in which case the loop body never executes.

If the visibility rules mean that the compiler cannot prove this, then
it must emit the while loop, and consequently emit the reads.

Since you used a relaxed load for the loop condition, the constraints on
this read are minimal, so the loop body may never execute anyway (even
if the compiler emits the code).

If you want the loop to execute at least once, use do-while instead.
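
A minimal sketch of the difference, assuming the C++11 spelling of the
atomics interface (std::atomic, std::memory_order_relaxed). The helper
names spin_while and spin_do_while and the body_runs counter are
hypothetical and only there to make the behaviour observable:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helpers, not from the original post.
// Each returns how many times the loop body ran.

unsigned long long spin_while(std::atomic<int>& a_int) {
    unsigned long long body_runs = 0;
    // The condition is evaluated first: if the first load yields a
    // non-zero value, the body never runs.
    while (!a_int.load(std::memory_order_relaxed)) {
        ++body_runs;
    }
    return body_runs;
}

unsigned long long spin_do_while(std::atomic<int>& a_int) {
    unsigned long long body_runs = 0;
    // The body runs once before the first load is evaluated.
    do {
        ++body_runs;
    } while (!a_int.load(std::memory_order_relaxed));
    return body_runs;
}
```

If a_int already holds a non-zero value that the calling thread stored
itself, spin_while runs its body zero times and spin_do_while runs it
exactly once. If the value comes from another thread, both functions
spin until that store becomes visible.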

> And:  (I added atomics this time)

[and thus made the example clearer, and prevented the omitting of the
read]

> struct Node
> {
>   atomic<Node*> next;
> };
> atomic<Node *> sharedHead;
> Node * pLocalCurrent = sharedHead.load(consume);
> Node * pLocalNext = pLocalCurrent->next.load(acquire);
>
> if (pLocalCurrent == sharedHead.load(acquire))
> {
>    //Here we can assume that pLocalNext is valid
> }
>
> The answer was that the compiler would be allowed to always run the
> body of the if statement.

With the memory orderings as you've written them here, this is not true:
the acquire on the read of pLocalNext means that any changes written by
the thread that wrote pLocalCurrent->next before that write (assuming it
used a release store) must be visible to the current thread, including
any writes to sharedHead. The compiler must thus re-read sharedHead,
since a new value may have become visible.

[New values written by threads other than the one that wrote
pLocalCurrent->next need not be visible unless there is a transitive
happens-before relationship]
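
A sketch of the reader side in C++11 spelling, under some assumptions:
the function name next_if_head_unchanged is hypothetical, the initial
consume load is written as acquire (implementations commonly treat
consume as acquire anyway), a null check is added, and writers are
assumed to publish with release stores. Memory reclamation is not
addressed here.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    std::atomic<Node*> next{nullptr};
};

std::atomic<Node*> sharedHead{nullptr};

// Hypothetical helper: returns the successor of the head node if the
// head was still the same node on the second read, otherwise nullptr.
Node* next_if_head_unchanged() {
    Node* pLocalCurrent = sharedHead.load(std::memory_order_acquire);
    if (pLocalCurrent == nullptr) {
        return nullptr;  // empty list, nothing to validate
    }
    // Acquire: pairs with the (assumed) release store to next, so
    // writes made by that thread before the store are visible below.
    Node* pLocalNext = pLocalCurrent->next.load(std::memory_order_acquire);

    // This second load of sharedHead must really be performed, since
    // a newer value may have become visible via the acquire above.
    if (pLocalCurrent == sharedHead.load(std::memory_order_acquire)) {
        return pLocalNext;
    }
    return nullptr;
}
```

A writer would be expected to use something like
sharedHead.store(p, std::memory_order_release) and
node.next.store(q, std::memory_order_release) for the pairing to hold.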

> These are direct violations of the algorithms, as they eliminate
> linearization points and break the algorithms.

No, they don't. You just need to think carefully about what is actually
required of the algorithms.

> So, in addition to order and visibility, we need some means to be able
> to change the rules of the abstract machine. What are they?

I don't think any are needed.

>
>>
>> As to the specific question, why do we not get deadlock, I think we
>> have to rely on a special and perhaps underspecified guarantee in the
>> standard: each atomic write is guaranteed to become visible to every
>> atomic read in all threads in a "reasonable" amount of time. I think
>> this has been interpreted as saying that a compiler must ensure that
>> each atomic write will become visible to every atomic read in all
>> threads in a /finite/ amount of time. This is exactly the situation at
>> lines 3 and 4. The compiler cannot optimize away the atomic reads in
>> the loop at lines 3 and 4 because of that "finite time" requirement.
>
> I'm not so sure. If I understood Anthony correctly, the write to the
> turn variable may well be "visible", but no one is looking for it
> because the second read is not there. Visibility has nothing to do
> with optimizing the second read away. Visibility says that if someone
> is looking for the value, eventually that someone will see the new
> value, but if no one is looking then no one will see anything.

The compiler can omit a single read if it can provide a valid value for
that read to return which doesn't break any visibility rules. e.g. two
consecutive reads of the same variable without any intervening
operations can be assumed to read the same value; a read following a
write to that variable can be assumed to read the value written.
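
The two cases can be sketched as follows, again in C++11 spelling. The
function names store_then_load and two_loads_agree are hypothetical.
The comments describe what the rules permit; whether a particular
compiler actually performs these transformations is a separate matter.

```cpp
#include <atomic>
#include <cassert>

// A read following a write to the same variable: the compiler may
// supply the value just written (42) instead of emitting the load.
int store_then_load(std::atomic<int>& x) {
    x.store(42, std::memory_order_relaxed);
    return x.load(std::memory_order_relaxed);
}

// Two consecutive reads with no intervening operations: the compiler
// may treat the second read as returning the same value as the first.
bool two_loads_agree(std::atomic<int>& x) {
    int first = x.load(std::memory_order_relaxed);
    int second = x.load(std::memory_order_relaxed);
    return first == second;
}
```

With no other thread writing to x, store_then_load returns 42 and
two_loads_agree returns true whether or not the loads are emitted, so
eliding them does not change any observable behaviour.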

The compiler cannot omit a read in a loop, since that would break the
rule about writes being visible in a finite amount of time.

Anthony
-- 
Author of C++ Concurrency in Action     http://www.stdthread.co.uk/book/
just::thread C++0x thread library             http://www.stdthread.co.uk
Just Software Solutions Ltd       http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976