| Message-ID | <c7f2c29a-3cdb-409a-bf5b-9fb30efc90cf@googlegroups.com> (permalink) |
|---|---|
| Newsgroups | comp.std.c++ |
| From | itaj sherman <itajsherman@googlemail.com> |
| Subject | Re: Refreshing cpu cache before atomic relaxed loads |
| Organization | unknown |
| References | <f2af6f2a-c936-41c5-9461-9b0d89299993@googlegroups.com> |
| Date | 2015-06-26 01:29 -0600 |
On Monday, June 22, 2015 at 10:30:04 PM UTC+3, itaj sherman wrote:
> My question turned up while implementing spin-wait
> when a thread needs to wait for another thread to complete a short work,
> so short that locking/releasing a mutex might take longer than the work.
I guess it's important to add:
the waiting thread knows that the other thread is actually doing that work right now,
and that the work is shorter than a thread context switch (and mutex/condvar operations).
> The problem the way I see it is the standard does not explicitly define a
> clear way to refresh the cpu cache before re-loading an atomic variable.
>
> In my example code below, atomic_oneway_flag::spin_wait_flag has 3 suggested
> implementations. They are equivalent w.r.t memory ordering as seen by user
> code. But they might not be equal in speed.
>
> Specifically, it seems that practically (correct me if I'm wrong),
> implementation-2 of atomic_oneway_flag::spin_wait_flag below
> is faster/better than implementation-1.
> I.e. load( ..., memory_order_acquire ) or fence( memory_order_acquire )
>
> Now, is it somehow implied by the standard that an acquire operation might
> refresh the following loads faster than a relaxed operation?
> I cannot see that it is.
> Thus, I would expect to do something like implementation-3 below using an
> operation that explicitly and specifically refreshes the cache for the next
> relaxed load.
>
It seems I may be asking about something like the x86 "pause" instruction,
as explained here:
http://www.quora.com/What-is-the-purpose-of-the-pause-instruction-in-the-x86-ISA
I've seen that this instruction is used in the implementation of
boost::atomics::detail::pause().
So is there anything like that in the standard?
Was it ever discussed?
> ....
>
> public: void spin_wait_flag() //implementation 3
> {
>     while( true ) {
>         bool x( std::atomic::load( m, memory_order_relaxed ) );
>         if( x ) {
>             std::atomic::fence( memory_order_acquire );
>             return;
>         } else {
>             /* Some code that refreshes the cache for the */
>             /* following relaxed load. */
>             /* Supposedly std::atomic::load_memory_barrier(); */
>         }
>     }
> }
>
To fix implementation 3 with that:
public: void spin_wait_flag() //implementation 3
{
    while( true ) {
        // m is the atomic member holding the flag
        bool x( m.load( std::memory_order_relaxed ) );
        if( x ) {
            std::atomic_thread_fence( std::memory_order_acquire );
            return;
        } else {
            /* Some code that refreshes the cache for the */
            /* following relaxed load. */
            boost::atomics::detail::pause();
        }
    }
}
//code from boost_1_58_0\boost\atomic\detail\pause.hpp(30)
BOOST_FORCEINLINE void pause() BOOST_NOEXCEPT
{
#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
    _mm_pause();
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause;");
#endif
}
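
For anyone who wants to compile this, here is a minimal self-contained sketch of implementation 3 with such a pause hint. Several names are my assumptions and not from the code above: cpu_relax() is a hypothetical stand-in for boost::atomics::detail::pause(), set_flag() is a hypothetical setter, the flag member is assumed to be a std::atomic<bool>, and the std::this_thread::yield() fallback for other platforms is only a guess at something reasonable. I am not claiming the hint "refreshes the cache"; it only tells the cpu that this is a spin-wait loop.

```cpp
#include <atomic>
#include <thread>

#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
#include <intrin.h>   // declares _mm_pause on MSVC
#endif

// Hypothetical helper (not part of the standard): spin-wait hint,
// modelled on the boost pause() shown above.
inline void cpu_relax() noexcept
{
#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
    _mm_pause();
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause");
#else
    std::this_thread::yield();   // assumption: fallback where no hint is known
#endif
}

class atomic_oneway_flag
{
public:
    // Hypothetical setter: release store, so writes made before it are
    // visible to a waiter that has returned from spin_wait_flag().
    void set_flag() noexcept
    {
        m.store( true, std::memory_order_release );
    }

    // Implementation 3: relaxed loads while spinning, one acquire fence
    // after the flag has been observed as set.
    void spin_wait_flag() noexcept
    {
        while( !m.load( std::memory_order_relaxed ) ) {
            cpu_relax();
        }
        std::atomic_thread_fence( std::memory_order_acquire );
    }

private:
    std::atomic<bool> m{ false };   // assumption: the flag is an atomic bool
};
```

Usage would be: the working thread writes its results and then calls set_flag(); the waiting thread calls spin_wait_flag() and only then reads the results.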
> ...
>
> regards,
> itaj
>
regards,
itaj