Using threads for audio computing?

Started by	lgabiot <lgabiot@hotmail.com>
First post	2014-05-11 16:18 +0200
Last post	2014-05-14 16:23 +0200
Articles	12 — 5 participants

  Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-11 16:18 +0200
    Re: Using threads for audio computing? Roy Smith <roy@panix.com> - 2014-05-11 10:40 -0400
      Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-11 17:40 +0200
        Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-12 07:33 +0200
          Re: Using threads for audio computing? Chris Angelico <rosuav@gmail.com> - 2014-05-12 15:41 +1000
            Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-12 07:54 +0200
              Re: Using threads for audio computing? Chris Angelico <rosuav@gmail.com> - 2014-05-12 15:58 +1000
                Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-12 08:02 +0200
          Re: Using threads for audio computing? Stefan Behnel <stefan_ml@behnel.de> - 2014-05-12 08:13 +0200
            Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-12 10:14 +0200
              Re: Using threads for audio computing? lgabiot <lgabiot@hotmail.com> - 2014-05-12 11:17 +0200
          Re: Using threads for audio computing? Sturla Molden <sturla.molden@gmail.com> - 2014-05-14 16:23 +0200

#71318 — Using threads for audio computing?

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-11 16:18 +0200
Subject	Using threads for audio computing?
Message-ID	<536f869c$0$2178$426a74cc@news.free.fr>

Hello,

I'd like to be able to analyze incoming audio from a sound card using 
Python, and I'm trying to establish a correct architecture for this.

Getting the audio is OK (using PyAudio), and so are the calculations 
needed, so I won't discuss those. My question is about the general idea: 
getting audio and performing calculations on it at (roughly) the same 
time, without losing any incoming audio.
I also assume that my calculations on the audio will run faster than the 
time needed to acquire the audio itself, so that the application would 
be almost real time.


So far my idea (which works according to the small tests I did) consists 
of using a Queue object as a buffer for the incoming audio and two 
threads, one to feed the queue, the other to consume it.


The queue could store the audio as a collection of numpy arrays of x samples each.
The first thread's job would be to put() new chunks of audio into the 
queue as they are received from the audio card, while the second would 
get() chunks from the queue and perform the necessary calculations on them.

Am I in the right direction, or is there a better general idea?

Thanks!

#71319

From	Roy Smith <roy@panix.com>
Date	2014-05-11 10:40 -0400
Message-ID	<roy-73C8F2.10404911052014@news.panix.com>
In reply to	#71318

In article <536f869c$0$2178$426a74cc@news.free.fr>,
 lgabiot <lgabiot@hotmail.com> wrote:

> Hello,
> 
> I'd like to be able to analyze incoming audio from a sound card using 
> Python, and I'm trying to establish a correct architecture for this.
> 
> Getting the audio is OK (using PyAudio), as well as the calculations 
> needed, so won't be discussing those, but the general idea of being able 
> at (roughly) the same time: getting audio, and performing calculation on 
> it, while not losing any incoming audio.
> I also make the assumption that my calculations on audio will be done 
> faster than the time I need to get the audio itself, so that the 
> application would be almost real time.
> 
> 
> So far my idea (which works according to the small tests I did) consist 
> of using a Queue object as a buffer for the incoming audio and two 
> threads, one to feed the queue, the other to consume it.
> 
> 
> The queue could store the audio as a collection of numpy array of x samples.
> The first thread work would be to put() into the queue new chunks of 
> audio as they are received from the audio card, while the second would 
> get() from the queue chunks and perform the necessary calculations on them.
> 
> Am I in the right direction, or is there a better general idea?
> 
> Thanks!

If you are going to use threads, the architecture you describe seems 
perfectly reasonable.  It's a classic producer-consumer pattern.
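
For illustration, here is a minimal sketch of that producer-consumer layout. It uses the Python 3 `queue` module and fake data in place of PyAudio; the names `producer`, `consumer`, `run_pipeline` and the `None` sentinel are assumptions made for this example, not anything from the original post.

```python
import queue
import threading

SENTINEL = None   # hypothetical marker telling the consumer no more data is coming


def producer(q, n_chunks):
    """Stand-in for the audio thread: put() one chunk at a time."""
    for i in range(n_chunks):
        chunk = [i] * 4            # fake "audio"; real code would read from PyAudio
        q.put(chunk)
    q.put(SENTINEL)


def consumer(q, results):
    """Analysis thread: get() chunks until the sentinel arrives."""
    while True:
        chunk = q.get()            # blocks until a chunk is available
        if chunk is SENTINEL:
            break
        results.append(sum(chunk)) # placeholder for the real calculation


def run_pipeline(n_chunks=5):
    q = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, n_chunks)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline())   # [0, 4, 8, 12, 16]
```

The queue preserves order, so the consumer sees the chunks in the order they were captured.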

But, I wonder if you even need anything this complicated.  Using a queue 
to buffer work between threads makes sense if the workload presented is 
uneven.  Sometimes you'll get a burst of work all at once and don't have 
the capacity to process it in real-time, so you want to buffer it up.

I would think sampling audio would be a steady stream.  Every x ms, you 
get another chunk of samples, like clockwork.  Is this not the case?

#71321

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-11 17:40 +0200
Message-ID	<536f99eb$0$2109$426a74cc@news.free.fr>
In reply to	#71319

On 11/05/14 16:40, Roy Smith wrote:
> If you are going to use threads, the architecture you describe seems
> perfectly reasonable.  It's a classic producer-consumer pattern.
>
> But, I wonder if you even need anything this complicated.  Using a queue
> to buffer work between threads makes sense if the workload presented is
> uneven.  Sometimes you'll get a burst of work all at once and don't have
> the capacity to process it in real-time, so you want to buffer it up.
>
> I would think sampling audio would be a steady stream.  Every x ms, you
> get another chunk of samples, like clockwork.  Is this not the case?
>

Thanks for your answer,

Yes, I guess I can consider audio as a steady stream. PyAudio gives me 
the audio samples in small chunks (2048 samples at a time, for instance, 
while the sound card delivers 48,000 samples per second). I accumulate the 
samples into a numpy array, and once the array has reached the 
needed size (for instance 5 seconds of audio), I put it in 
the queue. So I think you are right that every 5 seconds I 
get a new chunk of audio to work on. Then I perform a calculation on 
these 5 seconds of audio (which must finish in less than 5 seconds, 
so that the consumer is ready for the next 5-second chunk), but 
meanwhile I still need to keep collecting the next 5-second 
chunk from PyAudio. Hence my system.
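
A rough sketch of that accumulation step follows. It works on raw bytes (which is what PyAudio reads return) rather than numpy arrays, and assumes 16-bit mono samples; the `BlockAccumulator` class is a hypothetical helper made up for this example. Note that 5 s at 48 kHz is 240,000 samples, which is not a whole number of 2048-sample chunks, so the leftover has to be carried into the next block.

```python
RATE = 48000             # samples per second, as in the post
BYTES_PER_SAMPLE = 2     # assumption: 16-bit mono audio
BLOCK_SECONDS = 5


class BlockAccumulator:
    """Collect small chunks and hand back fixed-size blocks."""

    def __init__(self, block_bytes=RATE * BLOCK_SECONDS * BYTES_PER_SAMPLE):
        self.block_bytes = block_bytes
        self._pending = bytearray()

    def add(self, chunk):
        """Append one chunk; return a list of completed blocks (often empty)."""
        self._pending.extend(chunk)
        blocks = []
        while len(self._pending) >= self.block_bytes:
            blocks.append(bytes(self._pending[:self.block_bytes]))
            del self._pending[:self.block_bytes]   # keep only the leftover
        return blocks
```

The producer thread would call `add()` for every chunk it reads and `put()` each returned block into the queue.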

I guess if my calculation only needed a small number of 
samples (i.e. no more than the PyAudio buffer size, 2048 samples 
for instance), and if it took less time than 
getting the next 2048 samples from PyAudio, I wouldn't need the 
Queue and Thread system.
But in my case, where I need a large buffer, it might not work?
Unless I ask PyAudio to feed me 5-second chunks directly (instead 
of the usual buffer sizes: 1024, 2048, etc.), which I didn't try 
because I hadn't thought of it.

#71370

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-12 07:33 +0200
Message-ID	<53705d34$0$2374$426a74cc@news.free.fr>
In reply to	#71321

On 11/05/14 17:40, lgabiot wrote:

> I guess if my calculation had to be performed on a small number of
> samples (i.e. under the value of the Pyaudio buffer size (2048 samples
> for instance), and that the calculation would last less than the time it
> takes to get the next 2048 samples from Pyaudio, I wouldn't need the
> Queue and Thread system.
> But in my case where I need a large buffer, it might not work?
> Unless I ask pyaudio to feed me directly with 5 seconds chunks (instead
> of the usual buffer sizes: 1024, 2048, etc...), which I didn't try,
> because I hadn't thought of it.

I guess this solution would probably not work, since the 
calculation would have to finish within one sample period 
(1/48000 s for instance): while the calculation runs, no 
audio would be ingested (unless PyAudio has some kind of internal 
concurrency system).
Which leads me to think that a buffer (queue) and separate threads 
(producer and consumer) are necessary for this task.

But AFAIK the Python GIL (or, on smaller or older computers, having 
only one core) does not permit true parallel execution of two threads. I 
believe it is much like the way an OS handles multiple processes 
on a single-CPU computer: process A gets x CPU cycles, then process B gets 
y CPU cycles, etc.
So in my case, I must have a way to make sure that 
thread 1 (which gets audio from PyAudio and put()s it in the Queue) is 
not interrupted long enough to miss a sample.
If I take a worst-case computer, like a 
Raspberry Pi, the CPU speed is 700 MHz, which gives approx. 14,000 CPU 
cycles between consecutive audio samples (at a 48 kHz sampling rate). I 
don't know whether 14,000 CPU cycles is a lot or not for the task at hand.
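
The arithmetic can be checked in a few lines. The per-chunk figures are added here on the assumption that samples arrive in 2048-sample buffers (the chunk size mentioned earlier in the thread) rather than one at a time, in which case the relevant budget is per chunk, not per sample.

```python
CPU_HZ = 700_000_000     # Raspberry Pi clock from the post
RATE = 48_000            # samples per second
CHUNK = 2048             # samples per buffer (assumed, from earlier in the thread)

cycles_per_sample = CPU_HZ / RATE
chunk_seconds = CHUNK / RATE
cycles_per_chunk = CPU_HZ * chunk_seconds

print(round(cycles_per_sample))          # 14583
print(round(chunk_seconds * 1000, 1))    # 42.7 (milliseconds per chunk)
print(round(cycles_per_chunk))           # 29866667
```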

Well, at least, that is what I understand, but since I'm really both a 
beginner and a hobbyist, I might be totally wrong...

#71371

From	Chris Angelico <rosuav@gmail.com>
Date	2014-05-12 15:41 +1000
Message-ID	<mailman.9908.1399873286.18130.python-list@python.org>
In reply to	#71370

On Mon, May 12, 2014 at 3:33 PM, lgabiot <lgabiot@hotmail.com> wrote:
> But AFAIK the python GIL (and in smaller or older computers that have only
> one core) does not permit true parallel execution of two threads. I believe
> it is quite like the way multiple processes are handled by an OS on a single
> CPU computer: process A has x CPU cycles, then process B has y CPU cycles,
> etc...
> So in my case, I must have a way to make sure that:
> thread 1 (which gets audio from Pyaudio and put() it in the Queue) is not
> interrupted long enough to miss a sample.
> If I suppose a worst case scenario for the computer, like a raspberry-pi,
> the CPU speed is 700MHz, which gives approx 14 000 CPU cycles between each
> audio samples (at 48 kHz FS). I don't know if 14 000 CPU cycle is a lot or
> not for the tasks at hands.
>
> Well, at least, it is what I understand, but since I'm really both a
> beginner and an hobbyist, I might be totally wrong...

The GIL is almost completely insignificant here. One of your threads
will be blocked practically the whole time (waiting for more samples;
collecting them into a numpy array doesn't take long), and the other
is, if I understand correctly, spending most of its time inside numpy,
which releases the GIL. You should be able to thread just fine.

ChrisA

#71372

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-12 07:54 +0200
Message-ID	<53706210$0$2921$426a74cc@news.free.fr>
In reply to	#71371

On 12/05/14 07:41, Chris Angelico wrote:
>
> The GIL is almost completely insignificant here. One of your threads
> will be blocked practically the whole time (waiting for more samples;
> collecting them into a numpy array doesn't take long), and the other
> is, if I understand correctly, spending most of its time inside numpy,
> which releases the GIL. You should be able to thread just fine.
>
> ChrisA
>
Thanks Chris for your answer.

So back to my original question: does a Queue with two threads 
(producer/consumer) seem like a good answer to my problem, or is there a 
better way to solve it?
(Again, I'm really a beginner, so I made up this solution, but I really 
wonder whether I'm missing a well-known, obviously better idea.)

#71373

From	Chris Angelico <rosuav@gmail.com>
Date	2014-05-12 15:58 +1000
Message-ID	<mailman.9909.1399874311.18130.python-list@python.org>
In reply to	#71372

On Mon, May 12, 2014 at 3:54 PM, lgabiot <lgabiot@hotmail.com> wrote:
> So back to my original question: A Queue and two threads (producer/consumer)
> seems a good answer to my problem, or is there a better way to solve it?
> (again, I'm really a beginner, so I made up this solution, but really wonder
> if I do not miss a well known obvious much better idea).

Well, the first thing I'd try is simply asking for more data when
you're ready for it - can you get five seconds of data all at once?
Obviously this won't work if your upstream buffers only a small
amount, in which case your thread is there to do that buffering; also,
if you can't absolutely *guarantee* that you can process the data
quickly enough, every time, then you need to use the queue to buffer
that.

But otherwise, it sounds like a quite reasonable way to do things.
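
As a sketch of "asking for more data at once": a PyAudio input stream has a blocking `read(num_frames)` method, so a five-second request is just a larger frame count. The `read_block` helper and the `FakeStream` class below are hypothetical, added only so the example runs without a sound card; whether a single large read works on real hardware depends on how much the upstream buffers, as noted above.

```python
def read_block(stream, seconds, rate=48000):
    """Blocking read of `seconds` worth of frames from a PyAudio-style stream."""
    n_frames = int(seconds * rate)
    return stream.read(n_frames)


class FakeStream:
    """Hypothetical stand-in for a PyAudio stream (returns silence)."""

    def __init__(self, bytes_per_frame=2):
        self.bytes_per_frame = bytes_per_frame

    def read(self, num_frames):
        return bytes(num_frames * self.bytes_per_frame)


# With PyAudio installed, the real stream would be opened roughly like this:
#   p = pyaudio.PyAudio()
#   stream = p.open(format=pyaudio.paInt16, channels=1, rate=48000,
#                   input=True, frames_per_buffer=2048)
block = read_block(FakeStream(), seconds=5)
print(len(block))   # 480000 bytes = 240000 16-bit samples
```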

ChrisA

#71374

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-12 08:02 +0200
Message-ID	<537063d9$0$2291$426a74cc@news.free.fr>
In reply to	#71373

On 12/05/14 07:58, Chris Angelico wrote:

> Well, the first thing I'd try is simply asking for more data when
> you're ready for it - can you get five seconds' of data all at once?
> Obviously this won't work if your upstream buffers only a small
> amount, in which case your thread is there to do that buffering; also,
> if you can't absolutely *guarantee* that you can process the data
> quickly enough, every time, then you need to use the queue to buffer
> that.
>
> But otherwise, it sounds like a quite reasonable way to do things.
>
> ChrisA
>

Ok, thanks a lot!

#71375

From	Stefan Behnel <stefan_ml@behnel.de>
Date	2014-05-12 08:13 +0200
Message-ID	<mailman.9910.1399875227.18130.python-list@python.org>
In reply to	#71370

lgabiot, 12.05.2014 07:33:
> On 11/05/14 17:40, lgabiot wrote:
> 
>> I guess if my calculation had to be performed on a small number of
>> samples (i.e. under the value of the Pyaudio buffer size (2048 samples
>> for instance), and that the calculation would last less than the time it
>> takes to get the next 2048 samples from Pyaudio, I wouldn't need the
>> Queue and Thread system.
>> But in my case where I need a large buffer, it might not work?
>> Unless I ask pyaudio to feed me directly with 5 seconds chunks (instead
>> of the usual buffer sizes: 1024, 2048, etc...), which I didn't try,
>> because I hadn't thought of it.
> 
> I guess this solution might probably not work, since it would mean that the
> calculation should be quick enough so it wouldn't last longer than 1 sample
> (1/48000 s for instance), since while doing the calculation, no audio would
> be ingested (unless pyAudio possess some kind of internal concurrency system).
> Which leads me to think that a buffer (queue) and separate threads
> (producer and consumer) are necessary for this task.

This sounds like a use case for double buffering. Use two buffers, start
filling one. When it's full, switch buffers, start filling the second and
process the first. When the second is full, switch again.

Note that you have to make sure that the processing always terminates
within the time it takes to fill the other buffer. If you can't assure
that, however, you have a problem anyway and should see if there's a way to
improve your algorithm.

If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
only after filling the buffer), then you definitely need two threads for this.
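
A minimal sketch of the double-buffering idea with two threads, assuming a blocking fill call. The two small queues only pass buffer indices back and forth; `fill`, `run_double_buffer` and the tiny buffer size are made up for the example. With live audio, the filler having to wait for a free buffer would mean data is being lost, which is the timing constraint described above.

```python
import queue
import threading

BUFFER_SIZE = 8    # hypothetical; stands in for 5 s of samples


def fill(buf, round_no):
    """Stand-in for the blocking 'fill my buffer' call."""
    for i in range(len(buf)):
        buf[i] = round_no


def run_double_buffer(n_rounds=4):
    buffers = [bytearray(BUFFER_SIZE), bytearray(BUFFER_SIZE)]
    free = queue.Queue()   # indices of buffers that may be overwritten
    full = queue.Queue()   # indices of buffers ready for processing
    free.put(0)
    free.put(1)
    results = []

    def filler():
        for round_no in range(n_rounds):
            idx = free.get()               # waits if processing has fallen behind
            fill(buffers[idx], round_no)
            full.put(idx)
        full.put(None)                     # signal: no more data

    def processor():
        while True:
            idx = full.get()
            if idx is None:
                break
            results.append(sum(buffers[idx]))   # placeholder calculation
            free.put(idx)                  # hand the buffer back for refilling

    threads = [threading.Thread(target=filler),
               threading.Thread(target=processor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_double_buffer())   # [0, 8, 16, 24]
```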


> But AFAIK the python GIL (and in smaller or older computers that have only
> one core) does not permit true parallel execution of two threads.

Not for code that runs in the *interpreter*, but it certainly allows I/O
and low-level NumPy array processing to happen in parallel, as they do not
need the interpreter.

Stefan

#71377

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-12 10:14 +0200
Message-ID	<537082f2$0$2141$426a74cc@news.free.fr>
In reply to	#71375

On 12/05/14 08:13, Stefan Behnel wrote:
> This sounds like a use case for double buffering. Use two buffers, start
> filling one. When it's full, switch buffers, start filling the second and
> process the first. When the second is full, switch again.
>
> Note that you have to make sure that the processing always terminates
> within the time it takes to fill the other buffer. If you can't assure
> that, however, you have a problem anyway and should see if there's a way to
> improve your algorithm.
>
> If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
> only after filling the buffer), then you definitely need two threads for this.
>
>
>> But AFAIK the python GIL (and in smaller or older computers that have only
>> one core) does not permit true parallel execution of two threads.
>
> Not for code that runs in the *interpreter*, but it certainly allows I/O
> and low-level NumPy array processing to happen in parallel, as they do not
> need the interpreter.
>
> Stefan
>
>
Thanks for your answer.

If I follow your explanations, I guess I have to revisit my understanding 
of Python's execution model (I have to admit it is quite crude anyway).

In my understanding, without threads, I would have two functions:
- get_audio() would get the 5 seconds of audio from Pyaudio
- process_audio() would process the 5 seconds of audio

the main code would roughly execute this:
while True:
    get_audio()
    process_audio()

So since the audio is a live feed (unlike, say, 
an audio file analyser program), the get_audio() part must take 5 
seconds to execute (though most probably the processor sits idle most of 
the time during get_audio()).
Then once get_audio() is done, process_audio() begins.
process_audio() will take some time. If that time is greater than the 
time it takes for the next audio samples to arrive, I have a problem
(which you may have already explained differently with:
 > If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
 > only after filling the buffer), then you definitely need two threads for this.
)

So if I follow you, if the PyAudio part is non-blocking, there would be 
a way to make it work without the two-threads approach. I'm going back to the 
PyAudio docs to try to get my head around the callback method, which 
might be a good lead.

#71382

From	lgabiot <lgabiot@hotmail.com>
Date	2014-05-12 11:17 +0200
Message-ID	<53709196$0$2074$426a74cc@news.free.fr>
In reply to	#71377

On 12/05/14 10:14, lgabiot wrote:
> So if I follow you, if the Pyaudio part is "Non-blocking" there would be
> a way to make it work without the two threads things. I'm back to the
> Pyaudio doc, and try to get my head around the callback method, which
> might be the good lead.

So far, if I understand PyAudio correctly, the callback method is a way 
to do some sort of computing on a PyAudio stream by declaring a 
function (the "callback") at stream opening time; the callback 
function is executed in a separate thread (as per the PyAudio 
documentation)...
Still investigating.
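
A rough sketch of what such a callback could look like, assuming it only hands each chunk to a queue and leaves the analysis to another thread. `make_callback` is a hypothetical helper; the four-argument signature and the `(out_data, flag)` return value follow PyAudio's callback mode, and the flag is passed in as a parameter so the example runs without PyAudio installed.

```python
import queue


def make_callback(q, continue_flag):
    """Build a PyAudio-style input callback that only enqueues the data."""

    def callback(in_data, frame_count, time_info, status):
        q.put(in_data)                  # keep the callback short: no analysis here
        return (None, continue_flag)    # input-only stream, so no output data

    return callback


# With PyAudio installed, this would be wired up roughly as:
#   p = pyaudio.PyAudio()
#   stream = p.open(format=pyaudio.paInt16, channels=1, rate=48000,
#                   input=True, frames_per_buffer=2048,
#                   stream_callback=make_callback(q, pyaudio.paContinue))
```

The consumer side stays the same as in the two-thread design: it calls `get()` on the queue and runs the calculation.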

#71562

From	Sturla Molden <sturla.molden@gmail.com>
Date	2014-05-14 16:23 +0200
Message-ID	<mailman.10008.1400077460.18130.python-list@python.org>
In reply to	#71370

On 12/05/14 07:33, lgabiot wrote:

> But AFAIK the python GIL (and in smaller or older computers that have
> only one core) does not permit true parallel execution of two threads. I
> believe it is quite like the way multiple processes are handled by an OS
> on a single CPU computer: process A has x CPU cycles, then process B has
> y CPU cycles, etc...

Python threads are native OS threads. The GIL serializes access to the 
Python interpreter.

If your thread is waiting for i/o or running computations in C or 
Fortran (e.g. with NumPy), it does not need the Python interpreter.

Scientists and engineers use Python threads for "true parallel 
processing" all the time. The FUD you will find about the GIL is written 
by people who don't fully understand the issue.

> So in my case, I must have a way to make sure that:
> thread 1 (which gets audio from Pyaudio and put() it in the Queue) is
> not interrupted long enough to miss a sample.

Here you are mistaken. The DMA controller takes care of the audio i/o. 
Your audio acquisition thread is asleep while its buffer fills up. You 
don't miss a sample because your thread is interrupted.

You do, however, have to make sure your thread doesn't block on the write 
to the Queue (use block=False in the call to Queue.put), but it is not a 
"GIL issue".
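
A small sketch of the non-blocking put, using the Python 3 `queue` module. The `offer` helper is a made-up name. Note that `put()` can only block for lack of space on a bounded queue (`maxsize` > 0); on a full bounded queue, `block=False` raises `queue.Full` instead of waiting.

```python
import queue


def offer(q, block_of_audio):
    """Try to enqueue without blocking; report whether it was accepted."""
    try:
        q.put(block_of_audio, block=False)   # same effect as q.put_nowait(...)
        return True
    except queue.Full:
        return False    # consumer is behind; the caller decides what to do


q = queue.Queue(maxsize=2)
print(offer(q, b"a"), offer(q, b"b"), offer(q, b"c"))   # True True False
```

What to do when the queue is full (drop the block, count the overrun, log it) is an application decision.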

In your case you basically have one thread waiting for the DMA controller 
to fill up a buffer and another doing computations in NumPy. Neither 
needs the GIL for most of their work.

If you are worried about the GIL you can always use processes 
(multiprocessing, subprocess, or os.fork) instead of threads.

Sturla
