Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.programming.threads > #1333

Re: condvars of ACE framwork - possible bug

Newsgroups comp.programming.threads
Date 2013-02-06 13:42 -0800
References <9c4fabd6-a4e1-4ef1-9301-5fe7ef5df016@googlegroups.com>
Message-ID <a665fcf7-0bab-4207-b265-dc84a90c0d59@googlegroups.com> (permalink)
Subject Re: condvars of ACE framwork - possible bug
From Codeplug <graham.greene@charter.net>

Show all headers | View raw


On Tuesday, February 5, 2013 1:38:11 AM UTC-5, Michael Podolsky wrote:
> Hello All,
> 
> 
> 
> There is a condvar implementation in ACE framework
> 
> made for systems which lack them.
> 
> This implementation, I suppose, may have a bug. 
> 
> 
> 
> The ACE implementation, first of all, has a known and 
> 
> documented limitation: 'signal'
> 
> and 'broadcast' functions must be called only when the mutex associated 
> 
> with the condvar is locked. Still even with this limitation
> 
> there may be a bug which I am going to demonstrate:
> 
> 
> 
> Consider the signal function (file OS_NS_Thread.cpp)
> 
> looking like (I removed error handling):
> 
> 
> 
> int ACE_OS::cond_signal (ACE_cond_t *cv)
> 
> {
> 
>   ACE_OS::thread_mutex_lock (&cv->waiters_lock_);
> 
>   bool const have_waiters = cv->waiters_ > 0;
> 
>   ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
> 
>  
> 
>   if (have_waiters)
> 
>        return ACE_OS::sema_post (&cv->sema_);
> 
>   else
> 
>       return 0; // No-op
> 
> } 
> 
> 
> 
> The supposed problem with it is:
> 
> the signaled semaphore cv->sema_  may be "consumed"
> 
> by any thread, not especially by a thread which is 
> 
> supposed to currently wait on it. It may very well be consumed
> 
> by the thread which called  'cond_signal' function and 
> 
> then decided to 'wait' for condvar by itself, that
> 
> is, the thread may be waken up by its own signal,
> 
> and the other thread which called the wait
> 
> function before the condvar was signaled, would
> 
> not be woken up at all!
> 
> 
> 
> Now the code of 'wait' function is a bit complicated
> 
> (error handling and some unimportant logic is removed):
> 
> 
> 
> int ACE_OS::cond_wait (ACE_cond_t *cv,  ACE_mutex_t *external_mutex)
> 
> {
> 
>   ACE_OS::thread_mutex_lock (&cv->waiters_lock_) ;
> 
>   ++cv->waiters_;
> 
>   ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
> 
>   int result = 0;
> 
>   if (external_mutex->type_ == USYNC_PROCESS)
> 
>     {
> 
>       // This call will automatically release the mutex and wait on the semaphore.
> 
> /* LINE A */      ::SignalObjectAndWait (external_mutex->proc_mutex_, cv->sema_, INFINITE, FALSE),
> 
>     }
> 
>   else
> 
>     {
> 
>       // We keep the lock held just long enough to increment the count of
> 
>        ACE_OS::mutex_unlock (external_mutex);
> 
>       /* LINE B */ result = ACE_OS::sema_wait (&cv->sema_);
> 
>     }
> 
> 
> 
>   ACE_OS::thread_mutex_lock (&cv->waiters_lock_);
> 
>   --cv->waiters_;
> 
>   bool const last_waiter = cv->was_broadcast_ && cv->waiters_ == 0;
> 
>   ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
> 
> 
> 
>     if(external_mutex->type_ == USYNC_PROCESS)
> 
>     {
> 
>       if (last_waiter)
> 
>            ::SignalObjectAndWait (cv->waiters_done_,
> 
>                                                                 external_mutex->proc_mutex_,
> 
>                                                                 INFINITE, FALSE);
> 
>       else
> 
>             ACE_OS::mutex_lock (external_mutex);
> 
>       return result;
> 
>     }
> 
>   else if (last_waiter)
> 
>         ACE_OS::event_signal (&cv->waiters_done_);
> 
> 
> 
>   ACE_OS::mutex_lock (external_mutex);
> 
> 
> 
>   return result;
> 
> }
> 
> 
> 
> So what I wanted to say is that on 'LINE B' (in the prev code) we have the external mutex unlocked and only then approach the wait on
> 
> the semaphore, but that means that any other thread may gain the external mutex, call cond_wait function on the same condvar
> 
> and inside it succeed with wait on the same semaphore before the first thread even called ACE_OS::sema_wait (&cv->sema_).
> 
> It may very well be a scenario in which: Thread A decides to wait on a codvar, releases the mutex, gets preempted
> 
> by Thread B which gains mutex, signals condvar, then waits on it and consumes its own signal!
> 
> 
> 
> As for 'LINE A' which uses some elaborated SignalObjectAndWait of Win API, this function is not considered to be atomic
> 
> anymore as for current MSDN documentation. So all arguments for LINE B stay true for LINE A.
> 
> 
> 
> The proposed fix (if the bug actually exists) is simple: 'broadcast' function (which I do not show here) has a special
> 
> functionality to wait until last waiter completes on its wait on semaphore (this particularly
> 
> works as broadcast is made while holding the external mutex). The same should be
> 
> done for the case of 'signal': the cond_signal function should wait until one thread completes
> 
> its wait.
> 
> 
> 
> Thanks for your attention and comments )

If ACE condvars are anything like Posix condvars, then looping on a predicate condition should solve the problem. See page 165 here (pdf): http://www.cs.wustl.edu/~schmidt/PDF/ACE-tutorial.pdf

gg

Back to comp.programming.threads | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-04 22:38 -0800
  Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 13:42 -0800
    Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 18:36 -0800
      Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 19:39 -0800
        Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 20:01 -0800
          Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 20:16 -0800
            Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 20:30 -0800
              Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-07 09:05 -0800
                Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-07 19:27 -0800
  Re: condvars of ACE framwork - possible bug Noob <root@127.0.0.1> - 2013-02-07 11:54 +0100
    Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-07 19:24 -0800

csiph-web