Path: csiph.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: minforth <minforth@gmx.net>
Newsgroups: comp.lang.forth
Subject: Re: Parsing timestamps?
Date: Thu, 10 Jul 2025 07:37:02 +0200
Lines: 36
Message-ID: <md91rtFtejdU1@mid.individual.net>
References: <1f433fabcb4d053d16cbc098dedc6c370608ac01@i2pn2.org> <2025Jul2.172222@mips.complang.tuwien.ac.at> <nnd$77366e3c$215e3e20@1580fe9081551b96> <300ba9a1581bea9a01ab85d5d361e6eaeedbf23a@i2pn2.org> <nnd$619ca290$2bff25f3@fa4b7a265c28888c> <4d440297d7e17251ebc50774bacfec73e184f9bc@i2pn2.org> <2025Jul5.104922@mips.complang.tuwien.ac.at> <6fd9f665e73ad93270fff88eca894ba69424cac7@i2pn2.org> <87a55dxbft.fsf@nightsong.com> <md8f7aFqhh4U1@mid.individual.net> <87y0swwtqt.fsf@nightsong.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: individual.net mUJ1n4pSKuOzb2W0PHpLXg8QpP6n5TKa8Y7YlSlFRXdgMARRJp
Cancel-Lock: sha1:gpeFNlK8UvirE07NHhMLItExCms= sha256:w7nFWSmZY1ppoFMSHLmRBJ472xguOp3+dN4v/qORcso=
User-Agent: Mozilla Thunderbird
In-Reply-To: <87y0swwtqt.fsf@nightsong.com>
Xref: csiph.com comp.lang.forth:134002

On 10.07.2025 at 06:32, Paul Rubin wrote:
> minforth <minforth@gmx.net> writes:
>> You don't need 64-bit doubles for signal or image processing.
>> Most vector/matrix operations on streaming data don't require
>> them either. Whether SSE2 is adequate or not to handle such data
>> depends on the application.
> 
> Sure, and for that matter, AI inference uses 8 bit and even 4 bit
> floating point. 

Or fuzzy control for instance.

> Kahan on the other hand was interested in engineering
> and scientific applications like PDE solvers (airfoils, fluid dynamics,
> FEM, etc.).  That's an area where roundoff error builds up after many
> iterations, thus extended precision.


That's why I use Kahan summation for dot products. It is slow, but
accumulated rounding error stays small. A while ago I read an
article on this issue whose author(s) ran extensive tests of
different dot product algorithms on many serial data sets from
finance, geology, the oil industry, meteorology, etc.
Their goal was to find an acceptable balance between
computational speed and minimal error.
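
For reference, a minimal sketch in C of a dot product with Kahan
(compensated) summation. The name dot_kahan is only illustrative and
this is not code from the article. It assumes IEEE 754 doubles and a
compiler that does not reassociate floating-point expressions (so no
-ffast-math), otherwise the compensation term may be optimized away.

```c
#include <stddef.h>

/* Dot product with Kahan (compensated) summation.
 * dot_kahan is a hypothetical name used for illustration. */
double dot_kahan(const double *x, const double *y, size_t n)
{
    double sum = 0.0;  /* running sum */
    double c = 0.0;    /* compensation: rounding error of the last addition */

    for (size_t i = 0; i < n; i++) {
        double term = x[i] * y[i] - c;  /* apply the correction from the previous step */
        double t = sum + term;          /* low-order bits of term may be lost here */
        c = (t - sum) - term;           /* recover what was lost (as computed in fp) */
        sum = t;
    }
    return sum;
}
```

The extra three floating-point operations per element are where the
slowness comes from. Note that only the summation is compensated here;
the rounding of each product x[i]*y[i] is not.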

The 'winner' was a chained fused-multiply-add algorithm (many
CPUs/GPUs can perform FMA in hardware), which makes for shorter code
(good for caching). It also allows a speed-up through
parallelization (recursive halving of the sets down to a manageable
vector size, followed by parallel computation).

I don't do parallelization, but I was still surprised by the good
results using FMA. In other words, increasing floating-point number
size is not always the way to go. Anyhow, the first step is to select
the best fp rounding method ...