Refreshing cpu cache before atomic relaxed loads

Message-ID <f2af6f2a-c936-41c5-9461-9b0d89299993@googlegroups.com> (permalink)
Newsgroups comp.std.c++
From itaj sherman <itajsherman@googlemail.com>
Subject Refreshing cpu cache before atomic relaxed loads
Organization unknown
Date 2015-06-22 14:22 -0600


My question came up while implementing a spin-wait, where a thread needs
to wait for another thread to complete a piece of work so short that
locking and releasing a mutex might take longer than the work itself.
The problem, as I see it, is that the standard does not define an explicit
way to refresh the cpu cache before re-loading an atomic variable.

In my example code below, atomic_oneway_flag::spin_wait_flag has 3 suggested
implementations. They are equivalent w.r.t. memory ordering as seen by user
code, but they might not be equally fast.

Specifically, it seems that in practice (correct me if I'm wrong)
implementation-2 of atomic_oneway_flag::spin_wait_flag below
is faster/better than implementation-1.
I.e. performing an acquire operation on every iteration, whether
load( ..., memory_order_acquire ) or fence( memory_order_acquire ),
rather than only a relaxed load.

Now, is it somehow implied by the standard that an acquire operation might
make the following loads see fresh values sooner than a relaxed operation
would? I cannot see that it is.
Thus, I would expect to write something like implementation-3 below, using
an operation that explicitly and specifically refreshes the cache for the
next relaxed load.

#include <atomic>

// USE_IMPLEMENTATION() is assumed to be defined elsewhere as 1, 2 or 3.

class atomic_oneway_flag
{

//data
  private: std::atomic<bool> m;

//ctors
  public: atomic_oneway_flag()
  :
    m(false)
  {
  }

//methods
  public: void turn_on()
  {
    // release: writes made before this store become visible to a
    // thread that observes the flag through an acquire operation
    m.store( true, std::memory_order_release );
  }

  public: bool test()
  {
    bool x( m.load( std::memory_order_relaxed ) );
    if( x ) {
      // acquire fence pairs with the release store in turn_on()
      std::atomic_thread_fence( std::memory_order_acquire );
    }
    return x;
  }

#if USE_IMPLEMENTATION() == 1

  public: void spin_wait_flag() //implementation 1
  {
    while( true ) {
      bool x( m.load( std::memory_order_relaxed ) );
      if( x ) {
        std::atomic_thread_fence( std::memory_order_acquire );
        return;
      }
    }
  }

#elif USE_IMPLEMENTATION() == 2

  public: void spin_wait_flag() //implementation 2
  {
    while( true ) {
      bool x( m.load( std::memory_order_acquire ) );
      /* if x is false acquire might cause */
      /* cpu to refresh faster for next load */
      if( x ) {
        return;
      }
    }
  }

#elif USE_IMPLEMENTATION() == 3

  public: void spin_wait_flag() //implementation 3
  {
    while( true ) {
      bool x( m.load( std::memory_order_relaxed ) );
      if( x ) {
        std::atomic_thread_fence( std::memory_order_acquire );
        return;
      } else {
        /* Some code that refreshes the cache for the */
        /* following relaxed load. */
        /* Supposedly std::atomic::load_memory_barrier(); */
        /* (hypothetical, no such function exists) */
      }
    }
  }

#else
#error USE_IMPLEMENTATION() must expand to 1, 2 or 3
#endif

};


//user code

atomic_oneway_flag flag;

//thread 1
... do some very short work
flag.turn_on();

//threads 2..N
flag.spin_wait_flag(); //while thread 1 does short work.
... do some work
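
To make the user code above concrete, here is a rough, self-contained
sketch of how it could be exercised with std::thread. It is only an
illustration under my own assumptions: oneway_flag is a cut-down stand-in
for atomic_oneway_flag using implementation 1, and run_demo, payload and
seen are hypothetical names that appear nowhere else in this post. It
checks the memory ordering only; it says nothing about which
implementation is faster.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Cut-down stand-in for atomic_oneway_flag (implementation 1).
class oneway_flag {
    std::atomic<bool> m{false};
public:
    void turn_on() { m.store(true, std::memory_order_release); }

    void spin_wait_flag() {
        // Busy-wait on relaxed loads until the flag is observed as true.
        while (!m.load(std::memory_order_relaxed)) {
        }
        // Acquire fence pairs with the release store in turn_on().
        std::atomic_thread_fence(std::memory_order_acquire);
    }
};

// Hypothetical driver: one producer (thread 1) and `waiters` spinning
// consumers (threads 2..N). Returns the sum of the payload values that
// the consumers observed after the flag was turned on.
int run_demo(int waiters) {
    oneway_flag flag;
    int payload = 0;  // plain, non-atomic data published via the flag
    std::vector<int> seen(waiters, -1);
    std::vector<std::thread> threads;

    for (int i = 0; i < waiters; ++i) {
        threads.emplace_back([&flag, &payload, &seen, i] {
            flag.spin_wait_flag();  // wait while thread 1 does its work
            seen[i] = payload;      // ordered after the producer's write
        });
    }

    std::thread producer([&flag, &payload] {
        payload = 42;    // the "very short work"
        flag.turn_on();  // publish it
    });

    producer.join();
    for (auto& t : threads) {
        t.join();
    }

    int sum = 0;
    for (int v : seen) {
        sum += v;
    }
    return sum;
}
```

With three waiters, run_demo(3) returns 126, since every waiter must see
payload == 42 once spin_wait_flag() has returned.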

regards,
itaj


