notifying from inside or outside

Started by: Bonita Montero <Bonita.Montero@gmail.com>
First post: 2025-05-09 15:05 +0200
Last post: 2025-05-13 16:58 +0200
Articles: 6, 4 participants



Contents

  notifying from inside or outside Bonita Montero <Bonita.Montero@gmail.com> - 2025-05-09 15:05 +0200
    Re: notifying from inside or outside NotAorB <atod101101@gmail.com> - 2025-05-12 16:11 -0400
    Re: notifying from inside or outside NotAorB <atod101101@gmail.com> - 2025-05-12 16:17 -0400
    Re: notifying from inside or outside Kaz Kylheku <643-408-1753@kylheku.com> - 2025-05-12 20:52 +0000
      Re: notifying from inside or outside "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-05-12 14:34 -0700
        Re: notifying from inside or outside Bonita Montero <Bonita.Montero@gmail.com> - 2025-05-13 16:58 +0200

#393284 — notifying from inside or outside

From: Bonita Montero <Bonita.Montero@gmail.com>
Date: 2025-05-09 15:05 +0200
Subject: notifying from inside or outside
Message-ID: <vvkuj8$2qbbc$1@raubtier-asyl.eternal-september.org>

These are the results:

10000 rounds
inside:
         notify_one:
                 2901.1 context switches per thread
         notify_all:
                 2851.94 context switches per thread
outside:
         notify_one:
                 10003.3 context switches per thread
         notify_all:
                 7292.81 context switches per thread

notify_one is done n times, notify_all only once.
So with glibc it's better to notify while holding the mutex.

For Windows I've got only the CPU-times:

10000 rounds
inside:
         one:
                 2.29688 seconds
         all:
                 5.5 seconds
outside:
         one:
                 6.10938 seconds
         all:
                 7.39062 seconds

So for Windows it's the best to notify individually while holding the
mutex.
All tests are with 31 threads waiting for a notification and one thread
which is notifying.



#if defined(_WIN32)
	#include <Windows.h>
#endif
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <semaphore>
#include <vector>
#include <string_view>
#include <cstdint>
#if defined(__unix__)
	#include <sys/resource.h>
#endif

using namespace std;

struct params
{
	params( unsigned argc, char **argv );
	bool outside, add, all;
};

int main( int argc, char **argv )
{
	constexpr size_t N = 10'000;
	cout << N << " rounds" << endl;
	int hc = thread::hardware_concurrency(), nClients = hc - 1;
	for( unsigned outside = 0; outside <= 1; ++outside )
	{
		cout << (outside ? "outside:" : "inside:") << endl;
		for( unsigned all = 0; all <= 1; ++all )
		{
			cout << (all ? "\tall:" : "\tone:") << endl;
			mutex mtx;
			int signalled = 0;
			condition_variable cv;
			atomic_int ai( 0 );
			binary_semaphore bs( false );
			vector<jthread> threads;
			atomic_int64_t nVoluntary( 0 );
			atomic_bool stop( false );
			for( int c = nClients; c; --c )
				threads.emplace_back( [&]
					{
						for( size_t r = N; r; --r )
						{
							unique_lock lock( mtx );
							cv.wait( lock, [&] { return (bool)signalled; } );
							--signalled;
							lock.unlock();
							if( ai.fetch_sub( 1, memory_order_relaxed ) == 1 )
								bs.release( 1 );
						}
#if defined(__unix__)
						rusage ru;
						getrusage( RUSAGE_THREAD, &ru );
						nVoluntary.fetch_add( ru.ru_nvcsw, memory_order_relaxed );
#endif
					} );
			for( size_t r = N; r; --r )
			{
				auto notify = [&]
				{
					if( all )
						cv.notify_all();
					else
						for( int c = nClients; c; cv.notify_one(), --c );
				};
				unique_lock lock( mtx );
				signalled = nClients;
				if( !outside )
					notify();
				ai.store( nClients, memory_order_relaxed );
				lock.unlock();
				if( outside )
					notify();
				bs.acquire();
			}
			stop.store( true, memory_order_relaxed );
			threads.resize( 0 );
#if defined(_WIN32)
			FILETIME ftDummy, ftKernel, ftUser;
			GetProcessTimes( GetCurrentProcess(), &ftDummy, &ftDummy, &ftKernel, &ftUser );
			auto ftToU64 = []( FILETIME const &ft ) { return (uint64_t)ft.dwHighDateTime << 32 | ft.dwLowDateTime; };
			int64_t t = ftToU64( ftKernel ) + ftToU64( ftUser );
			cout << "\t\t" << t / 1.0e7 << " seconds" << endl;
#elif defined(__unix__)
			cout << "\t\t" << (double)nVoluntary.load( memory_order_relaxed ) / nClients << " context switches per thread" << endl;
#endif
		}
	}
}



#393357

From: NotAorB <atod101101@gmail.com>
Date: 2025-05-12 16:11 -0400
Message-ID: <m2v7q5fv1x.fsf@gmail.com>
In reply to: #393284

Looks good to me.



#393358

From: NotAorB <atod101101@gmail.com>
Date: 2025-05-12 16:17 -0400
Message-ID: <m2o6vxfusq.fsf@gmail.com>
In reply to: #393284

Thanks for the results!



#393363

From: Kaz Kylheku <643-408-1753@kylheku.com>
Date: 2025-05-12 20:52 +0000
Message-ID: <20250512133259.596@kylheku.com>
In reply to: #393284

On 2025-05-09, Bonita Montero <Bonita.Montero@gmail.com> wrote:
> So for Windows it's the best to notify individually while holding the
> mutex.
> All tests are with 31 threads waiting for a notification and one thread
> which is notifying.

When you hit a condition variable while holding the mutex, you're
including, in the mutex's critical region, all those instructions needed
to perform that operation, possibly requiring a trip to the kernel.

There have to be conditions (no pun intended) under which that causes
a problem; you're just not hitting them in your test case.
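
To make the two placements concrete, here is a minimal sketch of one
post/wait handshake. The names Signal, post, wait and handshake are
hypothetical and not taken from the benchmark above; only the position
of the notify call relative to unlock() matters.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper, not from the benchmark: wakes one waiter either
// while the mutex is still held ("inside") or after releasing it ("outside").
struct Signal
{
	std::mutex mtx;
	std::condition_variable cv;
	bool ready = false;

	void post( bool outside )
	{
		std::unique_lock<std::mutex> lock( mtx );
		ready = true;
		if( !outside )
			cv.notify_one();	// inside: the notify is part of the critical region
		lock.unlock();
		if( outside )
			cv.notify_one();	// outside: the mutex is already released here
	}

	void wait()
	{
		std::unique_lock<std::mutex> lock( mtx );
		cv.wait( lock, [this] { return ready; } );	// predicate guards against spurious wakeups
		ready = false;
	}
};

// Runs one handshake and reports whether the waiter observed the signal.
bool handshake( bool outside )
{
	Signal s;
	bool seen = false;
	std::thread waiter( [&] { s.wait(); seen = true; } );
	s.post( outside );
	waiter.join();	// join orders the write to seen before the read below
	return seen;
}
```

Both variants are correct because the predicate is changed under the
mutex; they differ only in how long the mutex is held and in whether a
woken waiter may find it still locked.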

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca



#393364

From: "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com>
Date: 2025-05-12 14:34 -0700
Message-ID: <vvtpg9$1aacu$1@dont-email.me>
In reply to: #393363

On 5/12/2025 1:52 PM, Kaz Kylheku wrote:
> On 2025-05-09, Bonita Montero <Bonita.Montero@gmail.com> wrote:
>> So for Windows it's the best to notify individually while holding the
>> mutex.
>> All tests are with 31 threads waiting for a notification and one thread
>> which is notifying.
> 
> When you hit a condition variable while holding the mutex, you're
> including, in the mutex's critical region, all those instructions needed
> to perform that operation, possibly requiring a trip to the kernel.
> 
> There have to be conditions (no pun intended) under which that causes
> a problem; you're just not hitting them in your test case.
> 

Signalling from outside vs inside is a pretty old debate. I say signal 
from outside. Wait morphing aside for a moment...

He made a "correction" over in comp.lang.c++:
_______________
The Windows-times were summed-up times where each iteration included
the former iterations. Now it's corrected:

10000 rounds
inside:
         one:
                 2.04687 seconds
         all:
                 4 seconds
outside:
         one:
                 1.03125 seconds
         all:
                 1.14062 seconds

On 09.05.2025 at 15:06, Bonita Montero wrote:
[...]
__________
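
The correction quoted above comes down to reporting a difference of a
cumulative counter instead of the counter itself. The corrected code
was not posted in this thread, so the following is only a sketch of
that idea under that assumption; DeltaMeter and lap are hypothetical
names. On Windows the reader function could return the ftKernel plus
ftUser sum that the original program computes from GetProcessTimes,
which reports process times accumulated since process start.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper: turns a cumulative counter into per-interval deltas.
class DeltaMeter
{
	std::function<std::uint64_t()> read_;	// returns the cumulative value
	std::uint64_t last_;					// value at the previous lap
public:
	explicit DeltaMeter( std::function<std::uint64_t()> read ) :
		read_( std::move( read ) ),
		last_( read_() )	// baseline taken at construction
	{
	}

	// Returns how much the counter grew since construction or the last lap.
	std::uint64_t lap()
	{
		std::uint64_t now = read_();
		std::uint64_t delta = now - last_;
		last_ = now;
		return delta;
	}
};
```

Calling lap() once after each of the four configurations would give
each configuration its own time rather than a running total.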



#393380

From: Bonita Montero <Bonita.Montero@gmail.com>
Date: 2025-05-13 16:58 +0200
Message-ID: <vvvmma$1snhd$1@raubtier-asyl.eternal-september.org>
In reply to: #393364

On 12.05.2025 at 23:34, Chris M. Thomasson wrote:

> Signalling from outside vs inside is a pretty old debate. I say signal
> from outside. Wait morphing aside for a moment...

It's less efficient with glibc.
