Re: About divisions on the Taylor series in the error function

From	hopcode <hopcode@nospicedham.invalid.de>
Subject	Re: About divisions on the Taylor series in the error function
Message-ID	<jm660p$8ve$1@dont-email.me> (permalink)
Date	2012-04-12 11:07 +0200
Organization	A noiseless patient Spider
References	<jlur7p$42p$1@dont-email.me> <jm3hkg$6md$3@speranza.aioe.org>
Newsgroups	comp.programming, comp.lang.asm.x86, alt.lang.asm, sci.math.num-analysis

Cross-posted to 4 groups.

Show all headers | View raw

Hi Rui,
i think i will not get such approximation to deserve
the honourable "sci" prefix ;-) anyway,
i speak only assembly and most people on "sci" doesnt like that.

embarassing computational situation:
there's no way to exponentiate without relying on FPU instro-set.
and Intel (SSE) RCPPS instruction works on single precision,
and the precious RSQRTPS too !!!

also, whenever using Abramowitz & Stegun numbers
i avoid exp(-x^2) because too much exp-ensive :-)
NOTE: all methods use exponentiation !

for this reason, i cannot go beyond the 3x10^-7 error
and practically this is the method of the Taylor's series
suggested from someone there.

have you any idea how to exp it ? i have heard of
10^-22 margin error using Chebyshev, but no way
to avoid exponentiation till now.

what about using Intel SQRTSD instruction in some way ? (it computes
square root of the low double-precision floating-point value)

finally, i dont like the method of the cmath lib where
in the Intel Approximate Math,free,

http://www.intel.com/design/pentiumiii/devtools/AMaths.zip
sample code in there uses single precision

ok,i am missing something surely :-)

my routine for now, it can be improved a lot.
it executes in very few cycles (not tested yet),
~4 cacheline, 1 SSE division, and only XMM0 --> XMM3 registers.

usage:

  mov rcx,1.9f
  call .erf_cheby

.erf_cheby:
  ;--- in RCX zscore float double
  ;--- OUT RAX probability float double

  ;--- http://en.wikipedia.org
  ;--- /wiki/Error_function#Approximation_with_elementary_functions
  ;--- Abramowitz and Stegun using Chebyshev polynomials
  ;--- (maximum error: 3·10-7)
  ;-------------------------------------------------------
  ;--- erf(x)=1-[1/(1+a1*x+a2*x^2+a3*x^3+a4*x^4+a5*x^5+a6*x^6)^16]
  ;--- NOTE: no exp here. what for ? to obtain only
  ;--- a poor 1,5*10-7 error ? any idea ?

  mov rax,.erf_chebyt
  movq xmm1,rcx
  movq xmm0,[rax+8]
  andnpd xmm0,xmm1 ;--- absolute it

  movdqa xmm1,xmm0	
  movdqa xmm3,xmm0 ;--- x
  mulpd xmm3,xmm1  ;--- x*x

  xorpd xmm0,xmm0

  movq xmm2,[rax+16] ;--- a1
  mulpd xmm2,xmm1
  addpd xmm0,xmm2

  movq xmm2,[rax+24] ;--- a2
  mulpd xmm2,xmm3
  addpd xmm0,xmm2

  movq xmm2,[rax+32] ;--- a3
  mulpd xmm2,xmm3
  mulpd xmm2,xmm1
  addpd xmm0,xmm2

  movq xmm2,[rax+40] ;--- a4
  mulpd xmm2,xmm3
  mulpd xmm2,xmm3
  addpd xmm0,xmm2

  movq xmm2,[rax+48] ;--- a5
  mulpd xmm2,xmm3
  mulpd xmm2,xmm3
  mulpd xmm2,xmm1
  addpd xmm0,xmm2

  movq xmm2,[rax+56] ;--- a6
  mulpd xmm2,xmm3
  mulpd xmm2,xmm3
  mulpd xmm2,xmm3
  addpd xmm0,xmm2

  movq xmm2,[rax]
  addpd xmm0,xmm2
  addpd xmm0,xmm2

  mulpd xmm0,xmm0 ;--- x2
  mulpd xmm0,xmm0 ;--- x4
  mulpd xmm0,xmm0 ;--- x8
  mulpd xmm0,xmm0 ;--- x16

  movq xmm2,[rax]
  addpd xmm2,xmm2
  divpd xmm2,xmm0

  movq xmm0,[rax]
  addpd xmm0,xmm0
  subpd xmm0,xmm2
  movq rax,xmm0
  ret 0

align 16
.erf_chebyt:
  dq 0.5f
  dq 80000000'00000000h
  dq 0.0705230784f ;--- a1
  dq 0.0422820123f ;--- a2
  dq 0.0092705272f ;--- a3
  dq 0.0001520143f ;--- a4
  dq 0.0002765672f ;--- a5
  dq 0.0000430638f ;--- a6


Il 11.04.2012 11:07, Rui Maciel ha scritto:
> hopcode wrote:
>
>> Hi,
>> i am actually using the Marsaglia's method
>> to calculate the error function from
>> (snipped) this series
>> x + x3/3 + x5/(3 · 5) + x7/(3 · 5 · 7)
>>
>> is there a way to avoid divisions in there?
>> what about a method interpolating on a precalculated
>> lookup table ?
>>
>> Thanks in advance for your help,
>> Cheers,
>
> On a side note, why not post this to sci.math.num-analysis?
>
>
> Rui Maciel


-- 
.:mrk[hopcode]
   .:x64lab:.
  group http://groups.google.com/group/x64lab
  site http://sites.google.com/site/x64lab

Back to comp.programming | Previous | Next — Previous in thread | Find similar

Thread

About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-09 16:20 +0200
  Re: About divisions on the Taylor series in the error function Robert Wessel <robertwessel2@yahoo.com> - 2012-04-09 14:52 -0500
    Re: About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-10 03:02 +0200
      Re: About divisions on the Taylor series in the error function "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> - 2012-04-10 10:14 +0200
      Re: About divisions on the Taylor series in the error function "BartC" <bc@freeuk.com> - 2012-04-10 09:28 +0100
      Re: About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-11 04:33 +0200
  Re: About divisions on the Taylor series in the error function Rui Maciel <rui.maciel@gmail.com> - 2012-04-11 10:07 +0100
    Re: About divisions on the Taylor series in the error function hopcode <hopcode@nospicedham.invalid.de> - 2012-04-12 11:07 +0200

csiph-web