Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.programming.threads > #1333
| Newsgroups | comp.programming.threads |
|---|---|
| Date | 2013-02-06 13:42 -0800 |
| References | <9c4fabd6-a4e1-4ef1-9301-5fe7ef5df016@googlegroups.com> |
| Message-ID | <a665fcf7-0bab-4207-b265-dc84a90c0d59@googlegroups.com> (permalink) |
| Subject | Re: condvars of ACE framwork - possible bug |
| From | Codeplug <graham.greene@charter.net> |
On Tuesday, February 5, 2013 1:38:11 AM UTC-5, Michael Podolsky wrote:
> Hello All,
>
>
>
> There is a condvar implementation in ACE framework
>
> made for systems which lack them.
>
> This implementation, I suppose, may have a bug.
>
>
>
> The ACE implementation, first of all, has a known and
>
> documented limitation: 'signal'
>
> and 'broadcast' functions must be called only when the mutex associated
>
> with the condvar is locked. Still even with this limitation
>
> there may be a bug which I am going to demonstrate:
>
>
>
> Consider the signal function (file OS_NS_Thread.cpp)
>
> looking like (I removed error handling):
>
>
>
> int ACE_OS::cond_signal (ACE_cond_t *cv)
>
> {
>
> ACE_OS::thread_mutex_lock (&cv->waiters_lock_);
>
> bool const have_waiters = cv->waiters_ > 0;
>
> ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
>
>
>
> if (have_waiters)
>
> return ACE_OS::sema_post (&cv->sema_);
>
> else
>
> return 0; // No-op
>
> }
>
>
>
> The supposed problem with it is:
>
> the signaled semaphore cv->sema_ may be "consumed"
>
> by any thread, not especially by a thread which is
>
> supposed to currently wait on it. It may very well be consumed
>
> by the thread which called 'cond_signal' function and
>
> then decided to 'wait' for condvar by itself, that
>
> is, the thread may be waken up by its own signal,
>
> and the other thread which called the wait
>
> function before the condvar was signaled, would
>
> not be woken up at all!
>
>
>
> Now the code of 'wait' function is a bit complicated
>
> (error handling and some unimportant logic is removed):
>
>
>
> int ACE_OS::cond_wait (ACE_cond_t *cv, ACE_mutex_t *external_mutex)
>
> {
>
> ACE_OS::thread_mutex_lock (&cv->waiters_lock_) ;
>
> ++cv->waiters_;
>
> ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
>
> int result = 0;
>
> if (external_mutex->type_ == USYNC_PROCESS)
>
> {
>
> // This call will automatically release the mutex and wait on the semaphore.
>
> /* LINE A */ ::SignalObjectAndWait (external_mutex->proc_mutex_, cv->sema_, INFINITE, FALSE),
>
> }
>
> else
>
> {
>
> // We keep the lock held just long enough to increment the count of
>
> ACE_OS::mutex_unlock (external_mutex);
>
> /* LINE B */ result = ACE_OS::sema_wait (&cv->sema_);
>
> }
>
>
>
> ACE_OS::thread_mutex_lock (&cv->waiters_lock_);
>
> --cv->waiters_;
>
> bool const last_waiter = cv->was_broadcast_ && cv->waiters_ == 0;
>
> ACE_OS::thread_mutex_unlock (&cv->waiters_lock_);
>
>
>
> if(external_mutex->type_ == USYNC_PROCESS)
>
> {
>
> if (last_waiter)
>
> ::SignalObjectAndWait (cv->waiters_done_,
>
> external_mutex->proc_mutex_,
>
> INFINITE, FALSE);
>
> else
>
> ACE_OS::mutex_lock (external_mutex);
>
> return result;
>
> }
>
> else if (last_waiter)
>
> ACE_OS::event_signal (&cv->waiters_done_);
>
>
>
> ACE_OS::mutex_lock (external_mutex);
>
>
>
> return result;
>
> }
>
>
>
> So what I wanted to say is that on 'LINE B' (in the prev code) we have the external mutex unlocked and only then approach the wait on
>
> the semaphore, but that means that any other thread may gain the external mutex, call cond_wait function on the same condvar
>
> and inside it succeed with wait on the same semaphore before the first thread even called ACE_OS::sema_wait (&cv->sema_).
>
> It may very well be a scenario in which: Thread A decides to wait on a codvar, releases the mutex, gets preempted
>
> by Thread B which gains mutex, signals condvar, then waits on it and consumes its own signal!
>
>
>
> As for 'LINE A' which uses some elaborated SignalObjectAndWait of Win API, this function is not considered to be atomic
>
> anymore as for current MSDN documentation. So all arguments for LINE B stay true for LINE A.
>
>
>
> The proposed fix (if the bug actually exists) is simple: 'broadcast' function (which I do not show here) has a special
>
> functionality to wait until last waiter completes on its wait on semaphore (this particularly
>
> works as broadcast is made while holding the external mutex). The same should be
>
> done for the case of 'signal': the cond_signal function should wait until one thread completes
>
> its wait.
>
>
>
> Thanks for your attention and comments )
If ACE condvars are anything like Posix condvars, then looping on a predicate condition should solve the problem. See page 165 here (pdf): http://www.cs.wustl.edu/~schmidt/PDF/ACE-tutorial.pdf
gg
Back to comp.programming.threads | Previous | Next — Previous in thread | Next in thread | Find similar
condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-04 22:38 -0800
Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 13:42 -0800
Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 18:36 -0800
Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 19:39 -0800
Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 20:01 -0800
Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-06 20:16 -0800
Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-06 20:30 -0800
Re: condvars of ACE framwork - possible bug Codeplug <graham.greene@charter.net> - 2013-02-07 09:05 -0800
Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-07 19:27 -0800
Re: condvars of ACE framwork - possible bug Noob <root@127.0.0.1> - 2013-02-07 11:54 +0100
Re: condvars of ACE framwork - possible bug Michael Podolsky <michael.podolsky.69@gmail.com> - 2013-02-07 19:24 -0800
csiph-web