Groups > comp.dsp > #33802 > unrolled thread

How to downsample this signal?

Started by	Piotr Wyderski <peter.pan@neverland.mil>
First post	2017-05-13 09:23 +0200
Last post	2017-05-15 20:39 +0200
Articles	11 on this page of 31 — 8 participants

Back to article view | Back to comp.dsp

  How to downsample this signal? Piotr Wyderski <peter.pan@neverland.mil> - 2017-05-13 09:23 +0200
    Re: How to downsample this signal? Evgeny Filatov <filatov.ev@mipt.ru> - 2017-05-13 15:42 +0300
    Re: How to downsample this signal? bitterlemon40@yahoo.ie - 2017-05-13 07:53 -0700
    Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-13 11:48 -0400
    Re: How to downsample this signal? Randy Yates <yates@digitalsignallabs.com> - 2017-05-13 12:58 -0400
      Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-13 16:04 -0400
        Re: How to downsample this signal? Randy Yates <yates@digitalsignallabs.com> - 2017-05-14 14:19 -0400
      Re: How to downsample this signal? Piotr Wyderski <peter.pan@neverland.mil> - 2017-05-15 00:42 +0200
        Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-15 00:46 -0400
          Re: How to downsample this signal? Piotr Wyderski <peter.pan@neverland.mil> - 2017-05-15 07:21 +0200
            Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-15 10:51 -0400
              Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-15 11:49 -0400
                Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-15 17:20 -0400
                Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-16 02:59 -0400
                  Re: How to downsample this signal? eric.jacobsen@ieee.org - 2017-05-16 15:59 +0000
                    Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-16 15:36 -0400
                  Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 13:11 -0400
                    Re: How to downsample this signal? eric.jacobsen@ieee.org - 2017-05-16 18:08 +0000
                      Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 15:35 -0400
                        Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 15:45 -0400
                          Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 15:50 -0400
                            Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-16 21:36 -0400
                    Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-16 15:35 -0400
                      Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 15:36 -0400
                        Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-16 15:40 -0400
                          Re: How to downsample this signal? rickman <gnuarm@gmail.com> - 2017-05-16 22:06 -0400
                            Re: How to downsample this signal? Randy Yates <yates@digitalsignallabs.com> - 2017-05-17 01:25 -0400
        Re: How to downsample this signal? Evgeny Filatov <filatov.ev@mipt.ru> - 2017-05-15 14:51 +0300
        Re: How to downsample this signal? Randy Yates <randyy@garnerundergroundinc.com> - 2017-05-15 10:37 -0400
    Re: How to downsample this signal? makolber@yahoo.com - 2017-05-15 06:27 -0700
      Re: How to downsample this signal? Piotr Wyderski <peter.pan@neverland.mil> - 2017-05-15 20:39 +0200

Page 2 of 2 — ← Prev page 1 [2]

#33847

From	Randy Yates <randyy@garnerundergroundinc.com>
Date	2017-05-16 15:50 -0400
Message-ID	<877f1gmvi4.fsf@garnerundergroundinc.com>
In reply to	#33846

Randy Yates <randyy@garnerundergroundinc.com> writes:

> Randy Yates <randyy@garnerundergroundinc.com> writes:
>
>> eric.jacobsen@ieee.org writes:
>>
>>> On Tue, 16 May 2017 13:11:25 -0400, Randy Yates
>>> <randyy@garnerundergroundinc.com> wrote:
>>>
>>>>rickman <gnuarm@gmail.com> writes:
>>>>
>>>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>>
>>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>>>> rickman wrote:
>>>>>>>>
>>>>>>>>> Are the 248 words for data or also program?
>>>>>>>>
>>>>>>>> Data only, there are dedicated code memories. The control part
>>>>>>>> is totally crazy: there are two separate code memories working
>>>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>>>> provide instructions with no delay in the case of a branch).
>>>>>>>> But there is no such thing as a program counter, the code
>>>>>>>> memories store instructions at consecutive addresses without
>>>>>>>> any deeper structure. A run of instructions is called a block
>>>>>>>> and is ended with the presence of a jump instruction. There
>>>>>>>> is also a third control memory which stores the information
>>>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>>>> which adds fun.
>>>>>>>>
>>>>>>>>> That should be plenty of room for coefficients.
>>>>>>>>
>>>>>>>> But doesn't the FIR require a lot of cells for
>>>>>>>> the past data values?
>>>>>>>
>>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>>>> data comes in the output samples can be built up and when one is
>>>>>>> complete it is outputted and replaced by a new one.
>>>>>>>
>>>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>>>> for you.
>>>>>>
>>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>>>
>>>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>>>> implement a polyphase filter that was the part I had trouble getting
>>>>> my head around but finally understood how it worked.  At the time this
>>>>> was being done on a very early DSP chip with very limited memory, so
>>>>> storing the N-1 previous samples was not an option.  The guy
>>>>> presenting the problem to me was the classic engineer who could read
>>>>> and understand things, but couldn't explain anything to other people.
>>>>> So I had to take the paper he gave me and finally figure it out for
>>>>> myself.
>>>>>
>>>>> I will dig around when I get a chance and figure out exactly how this
>>>>> worked.  But the basic idea is that instead of saving multiple inputs
>>>>> and calculating the outputs one at a time (iterating over the stored
>>>>> inputs) - as each input comes in it is iterated over the stored
>>>>> outputs and each output is ... well output, when it has been fully
>>>>> calculated. Since there are fewer outputs than inputs (by the
>>>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>>>
>>>>> This is not hard to understand if you just stop thinking it *has* to
>>>>> be the way you are currently seeing it.  (shown here with simplified
>>>>> notation since it is too messy to show the indexes with decimation)
>>>>>
>>>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>>>
>>>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>>>> time (and so store the last N*M values of x), consider that when x(i)
>>>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>>>> of fewer values by a factor of M.  It does require more memory fetches
>>>>> than the standard MAC calculation since you have to fetch not only the
>>>>> coefficient and the input, but also the intermediate output value
>>>>> being calculated on each MAC operation.
>>>>>
>>>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>>>> minimum number of calculations on arrival of each input since not all
>>>>> the coefficients are used on each input.  So the coefficients are
>>>>> divided into "phases" (N/M coefficients) with only one phase used for
>>>>> any given input sample.
>>>>
>>>>Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>>>>8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>>>>the sets of input values for different outputs are disjoint.
>>>
>>> How each input will be used for each output is known when that input
>>> arrives, so if all possible ouputs for that input have their own
>>> partial product accumulator, the input can be multiplied by the
>>> various coefficients for each partial product and accumulated in each,
>>> and then forgotten.
>>>
>>> It's just another way to do it, but it can save a lot of hardware in a
>>> hardware implementation, especially in an FPGA where the
>>> multiply-accumulators are already built and just laying around,
>>> anyway.
>>
>> So are you asserting this is not a counter-example? 
>>
>> To state that an N-tap FIR decimator does not require all N inputs for
>> an output at time k is, in general, not true, as I have shown. 
>>
>> I believe what you are claiming is that IN CERTAIN CASES you can get
>> away with not using all N inputs at output time k. That I can agree
>> with. However, neither you nor Rick put a constraint on your assertion.
>>
>> And I can see that, in certain situations, this can be mo' better,
>> namely when you have one or more MACs sitting around not being used (as
>> you said) and N > M. But in general it hardly seems a storage saver as a
>> MAC is a storage element+.
>
> In light of the light coming on due to Rick's post that came about the
> same time as I was writing this, I'll have to retract. 

This is a type of systolic array?
-- 
Randy Yates, Embedded Firmware Developer
Garner Underground, Inc.
866-260-9040, x3901
http://www.garnerundergroundinc.com

[toc] | [prev] | [next] | [standalone]

#33858

From	rickman <gnuarm@gmail.com>
Date	2017-05-16 21:36 -0400
Message-ID	<ofg983$lsf$1@dont-email.me>
In reply to	#33847

On 5/16/2017 3:50 PM, Randy Yates wrote:
> Randy Yates <randyy@garnerundergroundinc.com> writes:
>
>> Randy Yates <randyy@garnerundergroundinc.com> writes:
>>
>>> eric.jacobsen@ieee.org writes:
>>>
>>>> On Tue, 16 May 2017 13:11:25 -0400, Randy Yates
>>>> <randyy@garnerundergroundinc.com> wrote:
>>>>
>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>
>>>>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>>>
>>>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>>>>> rickman wrote:
>>>>>>>>>
>>>>>>>>>> Are the 248 words for data or also program?
>>>>>>>>>
>>>>>>>>> Data only, there are dedicated code memories. The control part
>>>>>>>>> is totally crazy: there are two separate code memories working
>>>>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>>>>> provide instructions with no delay in the case of a branch).
>>>>>>>>> But there is no such thing as a program counter, the code
>>>>>>>>> memories store instructions at consecutive addresses without
>>>>>>>>> any deeper structure. A run of instructions is called a block
>>>>>>>>> and is ended with the presence of a jump instruction. There
>>>>>>>>> is also a third control memory which stores the information
>>>>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>>>>> which adds fun.
>>>>>>>>>
>>>>>>>>>> That should be plenty of room for coefficients.
>>>>>>>>>
>>>>>>>>> But doesn't the FIR require a lot of cells for
>>>>>>>>> the past data values?
>>>>>>>>
>>>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>>>>> data comes in the output samples can be built up and when one is
>>>>>>>> complete it is outputted and replaced by a new one.
>>>>>>>>
>>>>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>>>>> for you.
>>>>>>>
>>>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>>>>
>>>>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>>>>> implement a polyphase filter that was the part I had trouble getting
>>>>>> my head around but finally understood how it worked.  At the time this
>>>>>> was being done on a very early DSP chip with very limited memory, so
>>>>>> storing the N-1 previous samples was not an option.  The guy
>>>>>> presenting the problem to me was the classic engineer who could read
>>>>>> and understand things, but couldn't explain anything to other people.
>>>>>> So I had to take the paper he gave me and finally figure it out for
>>>>>> myself.
>>>>>>
>>>>>> I will dig around when I get a chance and figure out exactly how this
>>>>>> worked.  But the basic idea is that instead of saving multiple inputs
>>>>>> and calculating the outputs one at a time (iterating over the stored
>>>>>> inputs) - as each input comes in it is iterated over the stored
>>>>>> outputs and each output is ... well output, when it has been fully
>>>>>> calculated. Since there are fewer outputs than inputs (by the
>>>>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>>>>
>>>>>> This is not hard to understand if you just stop thinking it *has* to
>>>>>> be the way you are currently seeing it.  (shown here with simplified
>>>>>> notation since it is too messy to show the indexes with decimation)
>>>>>>
>>>>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>>>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>>>>
>>>>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>>>>> time (and so store the last N*M values of x), consider that when x(i)
>>>>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>>>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>>>>> of fewer values by a factor of M.  It does require more memory fetches
>>>>>> than the standard MAC calculation since you have to fetch not only the
>>>>>> coefficient and the input, but also the intermediate output value
>>>>>> being calculated on each MAC operation.
>>>>>>
>>>>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>>>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>>>>> minimum number of calculations on arrival of each input since not all
>>>>>> the coefficients are used on each input.  So the coefficients are
>>>>>> divided into "phases" (N/M coefficients) with only one phase used for
>>>>>> any given input sample.
>>>>>
>>>>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>>>>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>>>>> the sets of input values for different outputs are disjoint.
>>>>
>>>> How each input will be used for each output is known when that input
>>>> arrives, so if all possible ouputs for that input have their own
>>>> partial product accumulator, the input can be multiplied by the
>>>> various coefficients for each partial product and accumulated in each,
>>>> and then forgotten.
>>>>
>>>> It's just another way to do it, but it can save a lot of hardware in a
>>>> hardware implementation, especially in an FPGA where the
>>>> multiply-accumulators are already built and just laying around,
>>>> anyway.
>>>
>>> So are you asserting this is not a counter-example?
>>>
>>> To state that an N-tap FIR decimator does not require all N inputs for
>>> an output at time k is, in general, not true, as I have shown.
>>>
>>> I believe what you are claiming is that IN CERTAIN CASES you can get
>>> away with not using all N inputs at output time k. That I can agree
>>> with. However, neither you nor Rick put a constraint on your assertion.
>>>
>>> And I can see that, in certain situations, this can be mo' better,
>>> namely when you have one or more MACs sitting around not being used (as
>>> you said) and N > M. But in general it hardly seems a storage saver as a
>>> MAC is a storage element+.
>>
>> In light of the light coming on due to Rick's post that came about the
>> same time as I was writing this, I'll have to retract.
>
> This is a type of systolic array?

I've not seen the two things connected.  This method of calculating a 
polyphase filter does not require multiple processors which is the only 
context I've seen the term "systolic array" used.  I believe that is 
just a collection of processing elements that each pump data between 
them at a regular rate and a run-time fixed pattern.  Certainly any FIR 
filter would be a good candidate for a systolic array.

I think systolic arrays are not so much used these days.  Processing 
speeds have increased so much that there are a lot more things that can 
be done on a single processing element without breaking it down into 
pieces.  I suppose radar work will always find a need for more 
processing and perhaps sonar.  I think the cell phone industry has been 
a real boon to many other fields in terms of pushing the state of the 
art in DSP hardware.

-- 

Rick C

[toc] | [prev] | [next] | [standalone]

#33841

From	rickman <gnuarm@gmail.com>
Date	2017-05-16 15:35 -0400
Message-ID	<offk46$fi4$2@dont-email.me>
In reply to	#33837

On 5/16/2017 1:11 PM, Randy Yates wrote:
> rickman <gnuarm@gmail.com> writes:
>
>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>> rickman <gnuarm@gmail.com> writes:
>>>
>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>> rickman wrote:
>>>>>
>>>>>> Are the 248 words for data or also program?
>>>>>
>>>>> Data only, there are dedicated code memories. The control part
>>>>> is totally crazy: there are two separate code memories working
>>>>> in parallel (they take both paths simultaneosly in order to
>>>>> provide instructions with no delay in the case of a branch).
>>>>> But there is no such thing as a program counter, the code
>>>>> memories store instructions at consecutive addresses without
>>>>> any deeper structure. A run of instructions is called a block
>>>>> and is ended with the presence of a jump instruction. There
>>>>> is also a third control memory which stores the information
>>>>> where the i-th block begins and to which j-th state the FSM
>>>>> should go in the case of a branch. In short, hardware basic
>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>> which adds fun.
>>>>>
>>>>>> That should be plenty of room for coefficients.
>>>>>
>>>>> But doesn't the FIR require a lot of cells for
>>>>> the past data values?
>>>>
>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>> be asking.  I can't picture how it works, but I specifically remember
>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>> outputs which will be fewer because of the decimation.  As the input
>>>> data comes in the output samples can be built up and when one is
>>>> complete it is outputted and replaced by a new one.
>>>>
>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>> for you.
>>>
>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>> a length N polyphase FIR decimation by M. Just think first principles:
>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>
>> I am pretty sure that conclusion is not correct.  When I was asked to
>> implement a polyphase filter that was the part I had trouble getting
>> my head around but finally understood how it worked.  At the time this
>> was being done on a very early DSP chip with very limited memory, so
>> storing the N-1 previous samples was not an option.  The guy
>> presenting the problem to me was the classic engineer who could read
>> and understand things, but couldn't explain anything to other people.
>> So I had to take the paper he gave me and finally figure it out for
>> myself.
>>
>> I will dig around when I get a chance and figure out exactly how this
>> worked.  But the basic idea is that instead of saving multiple inputs
>> and calculating the outputs one at a time (iterating over the stored
>> inputs) - as each input comes in it is iterated over the stored
>> outputs and each output is ... well output, when it has been fully
>> calculated. Since there are fewer outputs than inputs (by the
>> decimation factor of M) you store fewer outputs than you would inputs.
>>
>> This is not hard to understand if you just stop thinking it *has* to
>> be the way you are currently seeing it.  (shown here with simplified
>> notation since it is too messy to show the indexes with decimation)
>>
>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>
>> Instead of thinking you have to calculate all the terms of y(k) at one
>> time (and so store the last N*M values of x), consider that when x(i)
>> arrives, you can calculate the x(i) terms and add them into y(k)
>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>> of fewer values by a factor of M.  It does require more memory fetches
>> than the standard MAC calculation since you have to fetch not only the
>> coefficient and the input, but also the intermediate output value
>> being calculated on each MAC operation.
>>
>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>> worked. It ties in nicely with the "polyphase" aspect to allow a
>> minimum number of calculations on arrival of each input since not all
>> the coefficients are used on each input.  So the coefficients are
>> divided into "phases" (N/M coefficients) with only one phase used for
>> any given input sample.
>
> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
> the sets of input values for different outputs are disjoint.

Yes, so what?  You still don't need to store any inputs to calculate the 
output.

-- 

Rick C

[toc] | [prev] | [next] | [standalone]

#33843

From	Randy Yates <randyy@garnerundergroundinc.com>
Date	2017-05-16 15:36 -0400
Message-ID	<87pof8mw57.fsf@garnerundergroundinc.com>
In reply to	#33841

rickman <gnuarm@gmail.com> writes:

> On 5/16/2017 1:11 PM, Randy Yates wrote:
>> rickman <gnuarm@gmail.com> writes:
>>
>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>> rickman <gnuarm@gmail.com> writes:
>>>>
>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>> rickman wrote:
>>>>>>
>>>>>>> Are the 248 words for data or also program?
>>>>>>
>>>>>> Data only, there are dedicated code memories. The control part
>>>>>> is totally crazy: there are two separate code memories working
>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>> provide instructions with no delay in the case of a branch).
>>>>>> But there is no such thing as a program counter, the code
>>>>>> memories store instructions at consecutive addresses without
>>>>>> any deeper structure. A run of instructions is called a block
>>>>>> and is ended with the presence of a jump instruction. There
>>>>>> is also a third control memory which stores the information
>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>> which adds fun.
>>>>>>
>>>>>>> That should be plenty of room for coefficients.
>>>>>>
>>>>>> But doesn't the FIR require a lot of cells for
>>>>>> the past data values?
>>>>>
>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>> data comes in the output samples can be built up and when one is
>>>>> complete it is outputted and replaced by a new one.
>>>>>
>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>> for you.
>>>>
>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>
>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>> implement a polyphase filter that was the part I had trouble getting
>>> my head around but finally understood how it worked.  At the time this
>>> was being done on a very early DSP chip with very limited memory, so
>>> storing the N-1 previous samples was not an option.  The guy
>>> presenting the problem to me was the classic engineer who could read
>>> and understand things, but couldn't explain anything to other people.
>>> So I had to take the paper he gave me and finally figure it out for
>>> myself.
>>>
>>> I will dig around when I get a chance and figure out exactly how this
>>> worked.  But the basic idea is that instead of saving multiple inputs
>>> and calculating the outputs one at a time (iterating over the stored
>>> inputs) - as each input comes in it is iterated over the stored
>>> outputs and each output is ... well output, when it has been fully
>>> calculated. Since there are fewer outputs than inputs (by the
>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>
>>> This is not hard to understand if you just stop thinking it *has* to
>>> be the way you are currently seeing it.  (shown here with simplified
>>> notation since it is too messy to show the indexes with decimation)
>>>
>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>
>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>> time (and so store the last N*M values of x), consider that when x(i)
>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>> of fewer values by a factor of M.  It does require more memory fetches
>>> than the standard MAC calculation since you have to fetch not only the
>>> coefficient and the input, but also the intermediate output value
>>> being calculated on each MAC operation.
>>>
>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>> minimum number of calculations on arrival of each input since not all
>>> the coefficients are used on each input.  So the coefficients are
>>> divided into "phases" (N/M coefficients) with only one phase used for
>>> any given input sample.
>>
>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>> the sets of input values for different outputs are disjoint.
>
> Yes, so what?  You still don't need to store any inputs to calculate
> the output.

Where would they come from if they were not stored?
-- 
Randy Yates, Embedded Firmware Developer
Garner Underground, Inc.
866-260-9040, x3901
http://www.garnerundergroundinc.com

[toc] | [prev] | [next] | [standalone]

#33845

From	Randy Yates <randyy@garnerundergroundinc.com>
Date	2017-05-16 15:40 -0400
Message-ID	<87inl0mvyx.fsf@garnerundergroundinc.com>
In reply to	#33843

Randy Yates <randyy@garnerundergroundinc.com> writes:

> rickman <gnuarm@gmail.com> writes:
>
>> On 5/16/2017 1:11 PM, Randy Yates wrote:
>>> rickman <gnuarm@gmail.com> writes:
>>>
>>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>
>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>>> rickman wrote:
>>>>>>>
>>>>>>>> Are the 248 words for data or also program?
>>>>>>>
>>>>>>> Data only, there are dedicated code memories. The control part
>>>>>>> is totally crazy: there are two separate code memories working
>>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>>> provide instructions with no delay in the case of a branch).
>>>>>>> But there is no such thing as a program counter, the code
>>>>>>> memories store instructions at consecutive addresses without
>>>>>>> any deeper structure. A run of instructions is called a block
>>>>>>> and is ended with the presence of a jump instruction. There
>>>>>>> is also a third control memory which stores the information
>>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>>> which adds fun.
>>>>>>>
>>>>>>>> That should be plenty of room for coefficients.
>>>>>>>
>>>>>>> But doesn't the FIR require a lot of cells for
>>>>>>> the past data values?
>>>>>>
>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>>> data comes in the output samples can be built up and when one is
>>>>>> complete it is outputted and replaced by a new one.
>>>>>>
>>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>>> for you.
>>>>>
>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>>
>>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>>> implement a polyphase filter that was the part I had trouble getting
>>>> my head around but finally understood how it worked.  At the time this
>>>> was being done on a very early DSP chip with very limited memory, so
>>>> storing the N-1 previous samples was not an option.  The guy
>>>> presenting the problem to me was the classic engineer who could read
>>>> and understand things, but couldn't explain anything to other people.
>>>> So I had to take the paper he gave me and finally figure it out for
>>>> myself.
>>>>
>>>> I will dig around when I get a chance and figure out exactly how this
>>>> worked.  But the basic idea is that instead of saving multiple inputs
>>>> and calculating the outputs one at a time (iterating over the stored
>>>> inputs) - as each input comes in it is iterated over the stored
>>>> outputs and each output is ... well output, when it has been fully
>>>> calculated. Since there are fewer outputs than inputs (by the
>>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>>
>>>> This is not hard to understand if you just stop thinking it *has* to
>>>> be the way you are currently seeing it.  (shown here with simplified
>>>> notation since it is too messy to show the indexes with decimation)
>>>>
>>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>>
>>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>>> time (and so store the last N*M values of x), consider that when x(i)
>>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>>> of fewer values by a factor of M.  It does require more memory fetches
>>>> than the standard MAC calculation since you have to fetch not only the
>>>> coefficient and the input, but also the intermediate output value
>>>> being calculated on each MAC operation.
>>>>
>>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>>> minimum number of calculations on arrival of each input since not all
>>>> the coefficients are used on each input.  So the coefficients are
>>>> divided into "phases" (N/M coefficients) with only one phase used for
>>>> any given input sample.
>>>
>>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>>> the sets of input values for different outputs are disjoint.
>>
>> Yes, so what?  You still don't need to store any inputs to calculate
>> the output.
>
> Where would they come from if they were not stored?

OK I see what you mean now. You can do a running MAC as they come in.
That is certainly true.
-- 
Randy Yates, Embedded Firmware Developer
Garner Underground, Inc.
866-260-9040, x3901
http://www.garnerundergroundinc.com

[toc] | [prev] | [next] | [standalone]

#33859

From	rickman <gnuarm@gmail.com>
Date	2017-05-16 22:06 -0400
Message-ID	<ofgb0r$pfq$1@dont-email.me>
In reply to	#33845

On 5/16/2017 3:40 PM, Randy Yates wrote:
> Randy Yates <randyy@garnerundergroundinc.com> writes:
>
>> rickman <gnuarm@gmail.com> writes:
>>
>>> On 5/16/2017 1:11 PM, Randy Yates wrote:
>>>> rickman <gnuarm@gmail.com> writes:
>>>>
>>>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>>
>>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>>>> rickman wrote:
>>>>>>>>
>>>>>>>>> Are the 248 words for data or also program?
>>>>>>>>
>>>>>>>> Data only, there are dedicated code memories. The control part
>>>>>>>> is totally crazy: there are two separate code memories working
>>>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>>>> provide instructions with no delay in the case of a branch).
>>>>>>>> But there is no such thing as a program counter, the code
>>>>>>>> memories store instructions at consecutive addresses without
>>>>>>>> any deeper structure. A run of instructions is called a block
>>>>>>>> and is ended with the presence of a jump instruction. There
>>>>>>>> is also a third control memory which stores the information
>>>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>>>> which adds fun.
>>>>>>>>
>>>>>>>>> That should be plenty of room for coefficients.
>>>>>>>>
>>>>>>>> But doesn't the FIR require a lot of cells for
>>>>>>>> the past data values?
>>>>>>>
>>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>>>> data comes in the output samples can be built up and when one is
>>>>>>> complete it is outputted and replaced by a new one.
>>>>>>>
>>>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>>>> for you.
>>>>>>
>>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>>>
>>>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>>>> implement a polyphase filter that was the part I had trouble getting
>>>>> my head around but finally understood how it worked.  At the time this
>>>>> was being done on a very early DSP chip with very limited memory, so
>>>>> storing the N-1 previous samples was not an option.  The guy
>>>>> presenting the problem to me was the classic engineer who could read
>>>>> and understand things, but couldn't explain anything to other people.
>>>>> So I had to take the paper he gave me and finally figure it out for
>>>>> myself.
>>>>>
>>>>> I will dig around when I get a chance and figure out exactly how this
>>>>> worked.  But the basic idea is that instead of saving multiple inputs
>>>>> and calculating the outputs one at a time (iterating over the stored
>>>>> inputs) - as each input comes in it is iterated over the stored
>>>>> outputs and each output is ... well output, when it has been fully
>>>>> calculated. Since there are fewer outputs than inputs (by the
>>>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>>>
>>>>> This is not hard to understand if you just stop thinking it *has* to
>>>>> be the way you are currently seeing it.  (shown here with simplified
>>>>> notation since it is too messy to show the indexes with decimation)
>>>>>
>>>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>>>
>>>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>>>> time (and so store the last N*M values of x), consider that when x(i)
>>>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>>>> of fewer values by a factor of M.  It does require more memory fetches
>>>>> than the standard MAC calculation since you have to fetch not only the
>>>>> coefficient and the input, but also the intermediate output value
>>>>> being calculated on each MAC operation.
>>>>>
>>>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>>>> minimum number of calculations on arrival of each input since not all
>>>>> the coefficients are used on each input.  So the coefficients are
>>>>> divided into "phases" (N/M coefficients) with only one phase used for
>>>>> any given input sample.
>>>>
>>>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>>>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>>>> the sets of input values for different outputs are disjoint.
>>>
>>> Yes, so what?  You still don't need to store any inputs to calculate
>>> the output.
>>
>> Where would they come from if they were not stored?
>
> OK I see what you mean now. You can do a running MAC as they come in.
> That is certainly true.

Ok, glad we got past that.

-- 

Rick C

[toc] | [prev] | [next] | [standalone]

#33860

From	Randy Yates <yates@digitalsignallabs.com>
Date	2017-05-17 01:25 -0400
Message-ID	<87h90km4vw.fsf@digitalsignallabs.com>
In reply to	#33859

rickman <gnuarm@gmail.com> writes:

> On 5/16/2017 3:40 PM, Randy Yates wrote:
>> Randy Yates <randyy@garnerundergroundinc.com> writes:
>>
>>> rickman <gnuarm@gmail.com> writes:
>>>
>>>> On 5/16/2017 1:11 PM, Randy Yates wrote:
>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>
>>>>>> On 5/15/2017 11:49 AM, Randy Yates wrote:
>>>>>>> rickman <gnuarm@gmail.com> writes:
>>>>>>>
>>>>>>>> On 5/15/2017 1:21 AM, Piotr Wyderski wrote:
>>>>>>>>> rickman wrote:
>>>>>>>>>
>>>>>>>>>> Are the 248 words for data or also program?
>>>>>>>>>
>>>>>>>>> Data only, there are dedicated code memories. The control part
>>>>>>>>> is totally crazy: there are two separate code memories working
>>>>>>>>> in parallel (they take both paths simultaneosly in order to
>>>>>>>>> provide instructions with no delay in the case of a branch).
>>>>>>>>> But there is no such thing as a program counter, the code
>>>>>>>>> memories store instructions at consecutive addresses without
>>>>>>>>> any deeper structure. A run of instructions is called a block
>>>>>>>>> and is ended with the presence of a jump instruction. There
>>>>>>>>> is also a third control memory which stores the information
>>>>>>>>> where the i-th block begins and to which j-th state the FSM
>>>>>>>>> should go in the case of a branch. In short, hardware basic
>>>>>>>>> blocks. The data path is a VLIW with exposed pipelining,
>>>>>>>>> which adds fun.
>>>>>>>>>
>>>>>>>>>> That should be plenty of room for coefficients.
>>>>>>>>>
>>>>>>>>> But doesn't the FIR require a lot of cells for
>>>>>>>>> the past data values?
>>>>>>>>
>>>>>>>> Like I said, I wrote this a long time ago, so I am not the resource to
>>>>>>>> be asking.  I can't picture how it works, but I specifically remember
>>>>>>>> NOT needing to store previous data inputs.  I am thinking you store
>>>>>>>> outputs which will be fewer because of the decimation.  As the input
>>>>>>>> data comes in the output samples can be built up and when one is
>>>>>>>> complete it is outputted and replaced by a new one.
>>>>>>>>
>>>>>>>> But read a proper reference.  If I wasn't busy today I'd dig this up
>>>>>>>> for you.
>>>>>>>
>>>>>>> Yes you do need to store N-1 previous inputs (plus 1 current input) for
>>>>>>> a length N polyphase FIR decimation by M. Just think first principles:
>>>>>>> for any output y[k], you need N inputs to compute it, even if k = n*M.
>>>>>>
>>>>>> I am pretty sure that conclusion is not correct.  When I was asked to
>>>>>> implement a polyphase filter that was the part I had trouble getting
>>>>>> my head around but finally understood how it worked.  At the time this
>>>>>> was being done on a very early DSP chip with very limited memory, so
>>>>>> storing the N-1 previous samples was not an option.  The guy
>>>>>> presenting the problem to me was the classic engineer who could read
>>>>>> and understand things, but couldn't explain anything to other people.
>>>>>> So I had to take the paper he gave me and finally figure it out for
>>>>>> myself.
>>>>>>
>>>>>> I will dig around when I get a chance and figure out exactly how this
>>>>>> worked.  But the basic idea is that instead of saving multiple inputs
>>>>>> and calculating the outputs one at a time (iterating over the stored
>>>>>> inputs) - as each input comes in it is iterated over the stored
>>>>>> outputs and each output is ... well output, when it has been fully
>>>>>> calculated. Since there are fewer outputs than inputs (by the
>>>>>> decimation factor of M) you store fewer outputs than you would inputs.
>>>>>>
>>>>>> This is not hard to understand if you just stop thinking it *has* to
>>>>>> be the way you are currently seeing it.  (shown here with simplified
>>>>>> notation since it is too messy to show the indexes with decimation)
>>>>>>
>>>>>> y(k)   = a(0)x(i)   + a(1)x(i-1) + a(2)x(i-2) + ...
>>>>>> y(k+1) = a(0)x(i+1) + a(1)x(i)   + a(2)x(i-1) + ...
>>>>>> y(k+2) = a(0)x(i+2) + a(1)x(i+1) + a(2)x(i)   + ...
>>>>>> y(k+3) = a(0)x(i+3) + a(1)x(i+2) + a(2)x(i+1) + ...
>>>>>>
>>>>>> Instead of thinking you have to calculate all the terms of y(k) at one
>>>>>> time (and so store the last N*M values of x), consider that when x(i)
>>>>>> arrives, you can calculate the x(i) terms and add them into y(k)
>>>>>> (which is then output) and y(k+1), y(k+2)...  This allows the storage
>>>>>> of fewer values by a factor of M.  It does require more memory fetches
>>>>>> than the standard MAC calculation since you have to fetch not only the
>>>>>> coefficient and the input, but also the intermediate output value
>>>>>> being calculated on each MAC operation.
>>>>>>
>>>>>> Maybe I won't have to dig it up.  I'm pretty sure this is how it
>>>>>> worked. It ties in nicely with the "polyphase" aspect to allow a
>>>>>> minimum number of calculations on arrival of each input since not all
>>>>>> the coefficients are used on each input.  So the coefficients are
>>>>>> divided into "phases" (N/M coefficients) with only one phase used for
>>>>>> any given input sample.
>>>>>
>>>>> Counter-example: Consider M = 8, N = 8 (8 coefficients and decimating by
>>>>> 8). Then each output y(k) depends on x(k*M + n), 0 < n < M - 1, and thus
>>>>> the sets of input values for different outputs are disjoint.
>>>>
>>>> Yes, so what?  You still don't need to store any inputs to calculate
>>>> the output.
>>>
>>> Where would they come from if they were not stored?
>>
>> OK I see what you mean now. You can do a running MAC as they come in.
>> That is certainly true.
>
> Ok, glad we got past that.

Yeah, thanks for hanging with me, Rick.

For some reason I kept on overlooking the key concept, which you plainly
stated:

  Instead of thinking you have to calculate all the terms of y(k) AT ONE
  TIME (and so store the last N*M values of x), consider that when x(i) ...

Mea culpa.
-- 
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com

[toc] | [prev] | [next] | [standalone]

#33825

From	Evgeny Filatov <filatov.ev@mipt.ru>
Date	2017-05-15 14:51 +0300
Message-ID	<xDgSA.95425$se.72631@fx38.iad>
In reply to	#33816

On 15.05.2017 1:42, Piotr Wyderski wrote:
> BTW, the processor is not as sledgehammer as it may sound

Then, have you considered a cascade of halfband low-pass FIR filters?

You would still need more multiplies per input sample than with IIR 
filters, but yet it's a very nice option.

Also, implementing a halfband decimating low-pass FIR filter is the 
easiest possible way to introduce yourself to polyphase techniques.

Read some fred harris or Lyon's "Understanding DSP", and you are fine.

Gene

[toc] | [prev] | [next] | [standalone]

#33827

From	Randy Yates <randyy@garnerundergroundinc.com>
Date	2017-05-15 10:37 -0400
Message-ID	<871srqrxte.fsf@garnerundergroundinc.com>
In reply to	#33816

Piotr Wyderski <peter.pan@neverland.mil> writes:

> Thank you all for your input.
>
> Randy Yates wrote:
>
>> What I don't understand is why, with the sledgehammer processor
>> you have, you don't do a decent polyphase filter with a good
>> FIR?
>
> It is a hobby project and I had 15 years of break with any form
> of DSP, at least understood as *signal* processing. I believe
> I still understand the math behind it, but I know only a limited
> number of "tricks". Polyphase representation is totally new to
> me, so are its applications to decimation/interpolation.
>
> The FIR filter has many advantages, e.g. maintaining linear phase
> will be easy, but the "direct" approach implied a FIR of insanely
> high order, so I didn't pursue the idea and drifted towards IIR,
> with which I had some positive experience, especially with CICs
> used in my SDR project.
>
> As far as I can see, the polyphase FIR approach should be the right
> choice for me. I am still not sure whether I am able to design a
> filter in this topology correctly, but well, it's what we call
> "learning". Thank you for bringing it to my attention.

Hey, you are most welcome. 

"Polyphase filtering" sounds really complicated, but (like many things
in this field) it's just a term for something quite simple: don't
compute values you're just going to throw away. For example, if you are
decimating by 8, you ultimately throw away 7 out of every 8 samples. So
if you're throwing them away, don't compute them in the first place.
Duh!

Armed with this technique, you essentially increase the effective number
of filter taps by M, where M is the decimation factor. So if you could
afforst 200 taps at full rate, you could actually afford 8*200 = 1600
taps when decimating by 8.

>
> BTW, the processor is not as sledgehammer as it may sound, especially
> due to the very limited RAM resources available to the DSP unit (2 banks
> of 128 24-bit words each, period.). The main processor (an ARM) has
> everything needed to the job, except of low latency guarantees, which
> is enough to kick it out of the scene. Whatever I can do, I must do
> using the DFB block. Currently the unknown is the definition of
> "whatever", but I'm aiming at saturating the block.

Hmm. Well then you may be up against a wall. Also if you have a
requirement for low-latency, a long FIR is out. But note that even
though CICs can be considered IIR, they also can be considered a simple
boxcar averager so they will have N/2 latency for a length-N boxcar.

I think there are ways to compute polyphase IIRs too but I've never done
that.

PS: Rick has some good material in his book on polyphase filtering,
CIC decimators, etc.

@BOOK{lyonsthird,
  title = "{Understanding Digital Signal Processing}",
  edition = "third",
  author = "{Richard~G.~Lyons}",
  publisher = "Prentice Hall",
  year = "2011"}

Good luck!
-- 
Randy Yates, Embedded Firmware Developer
Garner Underground, Inc.
866-260-9040, x3901
http://www.garnerundergroundinc.com

[toc] | [prev] | [next] | [standalone]

#33826

From	makolber@yahoo.com
Date	2017-05-15 06:27 -0700
Message-ID	<20203e7b-bf4d-4302-81cc-24514d306603@googlegroups.com>
In reply to	#33802

On Saturday, May 13, 2017 at 3:23:24 AM UTC-4, Piotr Wyderski wrote:
> Hello,
> 
> I have the 50Hz mains voltage signal sampled at 12bit/100kHz. This high 
> frequency is for rapid detection of overvoltage conditions. However, for
> further processing this sample rate is way too high, so I'd like to
> change the sampling rate by a factor ~100 (not a very critical value,
> just "sensibly low").
> 
> 
then the upper limit of transient frequencies that you see will be limited to about 500 Hz.

if you want to "see" transients with frequencies above 500 Hz, maybe its better to let them alias and see something is there ...depends on what you are trying to do.

m

[toc] | [prev] | [next] | [standalone]

#33831

From	Piotr Wyderski <peter.pan@neverland.mil>
Date	2017-05-15 20:39 +0200
Message-ID	<ofcslj$pfs$1@node2.news.atman.pl>
In reply to	#33826

makolber@yahoo.com wrote:

> then the upper limit of transient frequencies that you see will be limited to about 500 Hz.

No, the range checking will be performed on the high bandwidth stream, 
as a part of sample inception procedure. I care about U(t), not about
dU/dt. Then this stream should be decimated for less demanding tasks.

    Best regards, Piotr

[toc] | [prev] | [standalone]

Page 2 of 2 — ← Prev page 1 [2]

csiph-web

How to downsample this signal?

Contents

#33847

#33858

#33841

#33843

#33845

#33859

#33860

#33825

#33827

#33826

#33831