Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.programming > #1464
| From | hopcode <hopcode@nospicedham.invalid.de> |
|---|---|
| Subject | Re: About divisions on the Taylor series in the error function |
| Message-ID | <jm660p$8ve$1@dont-email.me> (permalink) |
| Date | 2012-04-12 11:07 +0200 |
| Organization | A noiseless patient Spider |
| References | <jlur7p$42p$1@dont-email.me> <jm3hkg$6md$3@speranza.aioe.org> |
| Newsgroups | comp.programming, comp.lang.asm.x86, alt.lang.asm, sci.math.num-analysis |
Cross-posted to 4 groups.
Hi Rui, i think i will not get such approximation to deserve the honourable "sci" prefix ;-) anyway, i speak only assembly and most people on "sci" doesnt like that. embarassing computational situation: there's no way to exponentiate without relying on FPU instro-set. and Intel (SSE) RCPPS instruction works on single precision, and the precious RSQRTPS too !!! also, whenever using Abramowitz & Stegun numbers i avoid exp(-x^2) because too much exp-ensive :-) NOTE: all methods use exponentiation ! for this reason, i cannot go beyond the 3x10^-7 error and practically this is the method of the Taylor's series suggested from someone there. have you any idea how to exp it ? i have heard of 10^-22 margin error using Chebyshev, but no way to avoid exponentiation till now. what about using Intel SQRTSD instruction in some way ? (it computes square root of the low double-precision floating-point value) finally, i dont like the method of the cmath lib where in the Intel Approximate Math,free, http://www.intel.com/design/pentiumiii/devtools/AMaths.zip sample code in there uses single precision ok,i am missing something surely :-) my routine for now, it can be improved a lot. it executes in very few cycles (not tested yet), ~4 cacheline, 1 SSE division, and only XMM0 --> XMM3 registers. usage: mov rcx,1.9f call .erf_cheby .erf_cheby: ;--- in RCX zscore float double ;--- OUT RAX probability float double ;--- http://en.wikipedia.org ;--- /wiki/Error_function#Approximation_with_elementary_functions ;--- Abramowitz and Stegun using Chebyshev polynomials ;--- (maximum error: 3·10-7) ;------------------------------------------------------- ;--- erf(x)=1-[1/(1+a1*x+a2*x^2+a3*x^3+a4*x^4+a5*x^5+a6*x^6)^16] ;--- NOTE: no exp here. what for ? to obtain only ;--- a poor 1,5*10-7 error ? any idea ? mov rax,.erf_chebyt movq xmm1,rcx movq xmm0,[rax+8] andnpd xmm0,xmm1 ;--- absolute it movdqa xmm1,xmm0 movdqa xmm3,xmm0 ;--- x mulpd xmm3,xmm1 ;--- x*x xorpd xmm0,xmm0 movq xmm2,[rax+16] ;--- a1 mulpd xmm2,xmm1 addpd xmm0,xmm2 movq xmm2,[rax+24] ;--- a2 mulpd xmm2,xmm3 addpd xmm0,xmm2 movq xmm2,[rax+32] ;--- a3 mulpd xmm2,xmm3 mulpd xmm2,xmm1 addpd xmm0,xmm2 movq xmm2,[rax+40] ;--- a4 mulpd xmm2,xmm3 mulpd xmm2,xmm3 addpd xmm0,xmm2 movq xmm2,[rax+48] ;--- a5 mulpd xmm2,xmm3 mulpd xmm2,xmm3 mulpd xmm2,xmm1 addpd xmm0,xmm2 movq xmm2,[rax+56] ;--- a6 mulpd xmm2,xmm3 mulpd xmm2,xmm3 mulpd xmm2,xmm3 addpd xmm0,xmm2 movq xmm2,[rax] addpd xmm0,xmm2 addpd xmm0,xmm2 mulpd xmm0,xmm0 ;--- x2 mulpd xmm0,xmm0 ;--- x4 mulpd xmm0,xmm0 ;--- x8 mulpd xmm0,xmm0 ;--- x16 movq xmm2,[rax] addpd xmm2,xmm2 divpd xmm2,xmm0 movq xmm0,[rax] addpd xmm0,xmm0 subpd xmm0,xmm2 movq rax,xmm0 ret 0 align 16 .erf_chebyt: dq 0.5f dq 80000000'00000000h dq 0.0705230784f ;--- a1 dq 0.0422820123f ;--- a2 dq 0.0092705272f ;--- a3 dq 0.0001520143f ;--- a4 dq 0.0002765672f ;--- a5 dq 0.0000430638f ;--- a6 Il 11.04.2012 11:07, Rui Maciel ha scritto: > hopcode wrote: > >> Hi, >> i am actually using the Marsaglia's method >> to calculate the error function from >> (snipped) this series >> x + x3/3 + x5/(3 · 5) + x7/(3 · 5 · 7) >> >> is there a way to avoid divisions in there? >> what about a method interpolating on a precalculated >> lookup table ? >> >> Thanks in advance for your help, >> Cheers, > > On a side note, why not post this to sci.math.num-analysis? > > > Rui Maciel -- .:mrk[hopcode] .:x64lab:. group http://groups.google.com/group/x64lab site http://sites.google.com/site/x64lab
Back to comp.programming | Previous | Next — Previous in thread | Find similar
About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-09 16:20 +0200
Re: About divisions on the Taylor series in the error function Robert Wessel <robertwessel2@yahoo.com> - 2012-04-09 14:52 -0500
Re: About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-10 03:02 +0200
Re: About divisions on the Taylor series in the error function "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> - 2012-04-10 10:14 +0200
Re: About divisions on the Taylor series in the error function "BartC" <bc@freeuk.com> - 2012-04-10 09:28 +0100
Re: About divisions on the Taylor series in the error function hopcode <hopcode@invalid.de> - 2012-04-11 04:33 +0200
Re: About divisions on the Taylor series in the error function Rui Maciel <rui.maciel@gmail.com> - 2012-04-11 10:07 +0100
Re: About divisions on the Taylor series in the error function hopcode <hopcode@nospicedham.invalid.de> - 2012-04-12 11:07 +0200
csiph-web