From: itaj sherman
Newsgroups: comp.std.c++
Subject: Re: Refreshing cpu cache before atomic relaxed loads
Date: Fri, 26 Jun 2015 01:29:39 CST

On Monday, June 22, 2015 at 10:30:04 PM UTC+3, itaj sherman wrote:
> My question turned up while implementing spin-wait
> when a thread needs to wait for another thread to complete a short work,
> so short that locking/releasing a mutex might take longer than the work.

I guess it's important to add: knowing that the other thread is actually
currently doing that work, which is shorter than thread context switch (and
mutex/condvar operations).

> The problem the way I see it is the standard does not explicitly define a
> clear way to refresh the cpu cache before re-loading an atomic variable.
>
> In my example code below, atomic_oneway_flag::spin_wait_flag has 3 suggested
> implementations.
> They are equivalent w.r.t memory ordering as seen by user
> code. But they might not be equal in speed.
>
> Specifically, it seems that practically (correct me if I'm wrong),
> implementation-2 of atomic_oneway_flag::spin_wait_flag below
> is faster/better than implementation-1.
> I.e. load( ..., memory_order_acquire ) or fence( memory_order_acquire )
>
> Now, is it somehow implied by the standard that an acquire operation might
> refresh the following loads faster than a relaxed operation?
> I cannot see that it is.
> Thus, I would expect to do something like implementation-3 below using an
> operation that explicitly and specifically refreshes the cache for the next
> relaxed load.
>

It seems I may be asking about something like the x86 "pause" instruction,
as explained here:
http://www.quora.com/What-is-the-purpose-of-the-pause-instruction-in-the-x86-ISA

I've seen this instruction used in the implementation of
boost::atomics::detail::pause().
So is there anything like that in the standard? Was it ever discussed?

> ....
>
> public: void spin_wait_flag() //implementation 3
> {
>     while( true ) {
>         bool x( std::atomic::load( m, memory_order_relaxed ) );
>         if( x ) {
>             std::atomic::fence( memory_order_acquire );
>             return;
>         } else {
>             /* Some code that refreshes the cache for the */
>             /* following relaxed load. */
>             /* Supposedly std::atomic::load_memory_barrier(); */
>         }
>     }
> }
>

Implementation 3, fixed using that:

public: void spin_wait_flag() //implementation 3
{
    while( true ) {
        bool x( std::atomic::load( m, memory_order_relaxed ) );
        if( x ) {
            std::atomic::fence( memory_order_acquire );
            return;
        } else {
            /* Some code that refreshes the cache for the */
            /* following relaxed load. */
            boost::atomics::detail::pause();
        }
    }
}

//code from boost_1_58_0\boost\atomic\detail\pause.hpp(30)
BOOST_FORCEINLINE void pause() BOOST_NOEXCEPT
{
#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
    _mm_pause();
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause;");
#endif
}

> ...
>
> regards,
> itaj
>

regards,
itaj
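
For reference, here is a minimal compilable sketch of implementation 3 written
against the real C++11 calls (std::atomic<bool>::load and
std::atomic_thread_fence) instead of the shorthand used above. The names
cpu_relax(), set_flag() and run_demo() are assumptions made up for this sketch;
they are not in the standard, not in Boost, and not in the original code. Note
that "pause" is a spin-loop hint to the processor rather than an explicit cache
refresh, and on targets other than x86 this cpu_relax() compiles to nothing.

```cpp
#include <atomic>
#include <thread>

#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
#include <immintrin.h>  // declares _mm_pause()
#endif

// Hypothetical helper modelled on the Boost snippet quoted above:
// emits the x86 spin-loop hint, and is a no-op on other targets.
inline void cpu_relax() noexcept
{
#if defined(_MSC_VER) && (defined(_M_AMD64) || defined(_M_IX86))
    _mm_pause();
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ __volatile__("pause");
#endif
}

class atomic_oneway_flag
{
public:
    // Assumed setter (not shown in the post): publishes all prior writes.
    void set_flag() noexcept
    {
        m.store(true, std::memory_order_release);
    }

    // Implementation 3: relaxed loads while spinning, one acquire fence
    // after the flag has been observed as set.
    void spin_wait_flag() noexcept
    {
        while (!m.load(std::memory_order_relaxed)) {
            cpu_relax();
        }
        std::atomic_thread_fence(std::memory_order_acquire);
    }

private:
    std::atomic<bool> m{false};
};

// Hypothetical usage: the worker writes a plain int, then sets the flag;
// the waiter may read the int safely once spin_wait_flag() returns.
int run_demo()
{
    atomic_oneway_flag flag;
    int payload = 0;
    std::thread worker([&] {
        payload = 42;
        flag.set_flag();
    });
    flag.spin_wait_flag();
    int seen = payload;
    worker.join();
    return seen;
}
```

The release store in set_flag() together with the acquire fence placed after the
relaxed load that observes it is what makes the plain read of payload
well-defined; cpu_relax() only affects how politely the loop spins, not the
ordering guarantees.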