Re: EuroForth 2025 preliminary proceedings

From peter <peter.noreply@tin.it>
Newsgroups comp.lang.forth
Subject Re: EuroForth 2025 preliminary proceedings
Date 2026-01-19 23:26 +0100
Organization A noiseless patient Spider
Message-ID <20260119232635.00007bcd@tin.it> (permalink)
References <69688c01$1@news.ausics.net> <2026Jan15.130413@mips.complang.tuwien.ac.at> <nnd$3a148ef5$137ee4b5@b1e8191b89e23503> <87wm1gpvdr.fsf@nightsong.com>


On Fri, 16 Jan 2026 23:10:24 -0800
Paul Rubin <no.email@nospam.invalid> wrote:

> Hans Bezemer <the.beez.speaks@gmail.com> writes:
> > 5. I added GCC extension support to 4tH in version 3.62.0. At the
> > time, it improved performance by about 25%. By accident I found out
> > that was no longer true. switch() based was faster. I didn't know
> > there had been changes in that regard to GCC.
> 
> If you mean the goto *a feature, these days you might try using tail
> calls instead.  GCC and LLVM both now support a musttail attribute that
> ensures this optimization, or signals a compile-time error if it can't.
> 
> https://lwn.net/Articles/1033373/

I got interested in how tail calls compare with computed gotos,
so I took the first five "opcodes" from the VM in NTF64/LXF64 and
compared the generated asm.
The VM was written from the beginning in X64 assembler (13 years ago).
4 years ago I also implemented the VM in C to simplify porting to ARM64.
At that time the asm version was about 10% faster than the generated
C code; today the speed is about the same. C compilers have improved.
The C version was implemented using computed gotos, with the following
macro as the nesting code ending each "opcode":

#define RELOAD()  code=*ip++; goto *jmp_table[code]	
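
To make the dispatch pattern concrete, here is a minimal sketch of a computed-goto loop that can actually be run. It is not the NTF64/LXF64 code: the `halt` opcode, the `NEXT()` macro name and the `run_goto` function are assumptions added for illustration, and the bytecode is assumed to be trusted (no bounds check on the opcode). It relies on the GCC/Clang "labels as values" extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-opcode instruction set, for illustration only. */
enum { OP_HALT, OP_SWAP, OP_NEGATE };

/* Same shape as RELOAD(): fetch the next byte and jump to its label. */
#define NEXT() do { code = *ip++; goto *jmp_table[code]; } while (0)

/* `top` caches the top of stack; `sp` points at the second stack item. */
uint64_t run_goto(const uint8_t *ip, uint64_t top, uint64_t *sp) {
    /* Label addresses (&&label) are a GCC/Clang extension. */
    static const void *const jmp_table[] = { &&halt, &&swap, &&negate };
    uint8_t code;
    uint64_t tmp;

    assert(ip != 0 && sp != 0);
    NEXT();

halt:                   /* assumed exit opcode: return the cached top */
    return top;
swap:
    tmp = sp[0];
    sp[0] = top;
    top = tmp;
    NEXT();
negate:
    top = -top;         /* unsigned negate, wraps modulo 2^64 */
    NEXT();
}
```

Running the program `swap negate halt` with 5 in `top` and 3 in `sp[0]` leaves 5 in `sp[0]` and returns the two's-complement negation of 3.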

For the tail call version the macro was changed to

#define RELOAD() opcode func=(opcode)tbl[*ip++]; __attribute__((musttail)) \
                 return func(ip, tbl, TOP, FTOP, sp, rp, fp, lp)

(line broken to be readable)

The noop "opcode" has just the nesting and produces the following code

	movzx	r9d, byte ptr [rcx]
	inc	rcx
	jmp	qword ptr [rax + 8*r9]

and for the tailcall version

	movzx	eax, byte ptr [r12]
	inc	r12
	mov	rax, qword ptr [r13 + 8*rax]
	rex64 jmp	rax   

both compiled with 
clang -S -Wall -O2 -masm=intel -o vm8test3.asm vm8tail.c       

As I suspected, the code is practically identical!

It also turns out that the musttail attribute is not necessary.
Clang will generate a tail call anyway. The difference is that with
musttail it will report an error if it cannot do the tail call.
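
A small runnable sketch of that point, under stated assumptions: the `MUSTTAIL` and `NEXT_OP` macros, the `halt` opcode and `run_tail` are hypothetical names, and the handlers take fewer parameters than the real VM. Where the compiler does not report the attribute, the macro expands to nothing and the code still works; whether the call then becomes a jump is up to the optimizer.

```c
#include <assert.h>
#include <stdint.h>

/* Use musttail only where the compiler advertises it. */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL /* no guarantee; the optimizer may still emit a jump */
#endif

enum { OP_HALT, OP_NEGATE, OP_EQZERO, OP_COUNT };

/* The table is passed as an untyped pointer, as in the post's tbl parameter. */
typedef uint64_t (*op_fn)(const uint8_t *ip, const void *tbl, uint64_t top);

/* Two statements on purpose: *ip++ must be sequenced before ip is passed. */
#define NEXT_OP() \
    op_fn next_ = ((const op_fn *)tbl)[*ip++]; \
    MUSTTAIL return next_(ip, tbl, top)

static uint64_t op_halt(const uint8_t *ip, const void *tbl, uint64_t top) {
    (void)ip; (void)tbl;
    return top;                     /* assumed exit opcode */
}

static uint64_t op_negate(const uint8_t *ip, const void *tbl, uint64_t top) {
    top = -top;
    NEXT_OP();
}

static uint64_t op_eqzero(const uint8_t *ip, const void *tbl, uint64_t top) {
    top = -(uint64_t)(top == 0);    /* Forth flag: all bits set or zero */
    NEXT_OP();
}

static const op_fn op_table[OP_COUNT] = { op_halt, op_negate, op_eqzero };

uint64_t run_tail(const uint8_t *program, uint64_t top) {
    assert(program != 0);
    op_fn first = op_table[*program++];
    return first(program, op_table, top);
}
```

All handlers share one signature, which is what lets a compiler that enforces musttail accept the call.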

Much more important is the __attribute__((preserve_none)) before
each function. This indicates that more registers will be used to pass
parameters. As seen above I pass 8 parameters to each function and
they need to be in registers to match the assembler-written code.
This is done automatically in the goto version as everything is in
one function there.

In the end it is more a matter of how you like to write your VM: as
one function, or as one function for each "opcode".

Unfortunately GCC does not recognize preserve_none and uses the stack
for some parameters.
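
One way to keep a single source building on compilers with and without the attribute is to hide it behind a macro. This is only a sketch: `VM_CC`, `VM_HAS_PRESERVE_NONE`, `vm_handler` and the two handlers are hypothetical names, and without the attribute the parameters simply follow the platform's default calling convention, so the register-passing benefit is lost there.

```c
#include <assert.h>
#include <stdint.h>

/* Expands to preserve_none only where the compiler reports support. */
#if defined(__has_attribute)
#  if __has_attribute(preserve_none)
#    define VM_CC __attribute__((preserve_none))
#    define VM_HAS_PRESERVE_NONE 1
#  endif
#endif
#ifndef VM_CC
#  define VM_CC
#  define VM_HAS_PRESERVE_NONE 0
#endif

/* The attribute goes on the pointer type as well as on every handler,
   so that call site and callee agree on the calling convention. */
typedef VM_CC uint64_t (*vm_handler)(uint64_t top, uint64_t *sp);

static VM_CC uint64_t vm_add(uint64_t top, uint64_t *sp) {
    return top + sp[0];
}

static VM_CC uint64_t vm_eqzero(uint64_t top, uint64_t *sp) {
    (void)sp;
    return -(uint64_t)(top == 0);
}

/* Ordinary function (default convention) that enters a handler. */
uint64_t vm_call(vm_handler h, uint64_t top, uint64_t *sp) {
    assert(h != 0);
    return h(top, sp);
}
```

`VM_HAS_PRESERVE_NONE` can be checked at build time to warn when the fallback is in use.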

Here is my test code

// VM8 C variant using computed goto

#include <stdint.h>

#define UNS8  unsigned char
#define INT64 long long int
#define UNS64 unsigned long long int

#define RELOAD()  code=*ip++; goto *jmp_table[code]	

void VM8(UNS8 *ip, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp ) {

const static void* jmp_table[] = {	
	&&noop,
	&&swap,
	&&rot,
	&&eqzero,
	&&negate,
};
    
	UNS8 code=*ip;
	UNS64 tmp;
	UNS64 TOP=*sp++;
//	double FTOP=*fp++;

	RELOAD();
	

	noop: 			// do nothing
		RELOAD();
	swap: 			//  swap
		tmp=sp[0];
		sp[0]=TOP;
		TOP=tmp;
		RELOAD();
	rot: 			//  rot
		tmp=TOP;
		TOP=sp[1];
		sp[1]=sp[0];
		sp[0]=tmp;
		RELOAD();
	eqzero: 		//  0=
		TOP=-(TOP==0);
		RELOAD();
	negate:  		// negate
		TOP=-TOP;
		RELOAD();
	
	
} //vm8


And here is the tail call version. Sorry for the long lines!

// VM8 C variant using tailcalls

#include <stdint.h>

#define UNS8  unsigned char
#define INT64 long long int
#define UNS64 unsigned long long int


typedef  __attribute__((preserve_none)) void (*opcode) (UNS8*, UNS64*, UNS64, double, UNS64*, UNS64*, double*, UNS64*); 

#define RELOAD() opcode func=(opcode)tbl[*ip++]; __attribute__((musttail)) return func(ip, tbl, TOP, FTOP, sp, rp, fp, lp)	

#define FUNC   __attribute__((preserve_none)) void

FUNC	noop(UNS8 *ip, UNS64 *tbl, UNS64 TOP, double FTOP, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp )  			// do nothing
        {
		RELOAD();
        }
        
FUNC	swap(UNS8 *ip, UNS64 *tbl, UNS64 TOP, double FTOP, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp )  			//  swap
		{UNS64 tmp;
		tmp=sp[0];
		sp[0]=TOP;
		TOP=tmp;
		RELOAD();}
        
FUNC	rot(UNS8 *ip, UNS64 *tbl, UNS64 TOP, double FTOP, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp )  			//  rot
		{UNS64 tmp=TOP;
		TOP=sp[1];
		sp[1]=sp[0];
		sp[0]=tmp;
		RELOAD();}

FUNC	eqzero(UNS8 *ip, UNS64 *tbl, UNS64 TOP, double FTOP, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp )  		//  0=
		{TOP=-(TOP==0);
		RELOAD();}

FUNC	negate(UNS8 *ip, UNS64 *tbl, UNS64 TOP, double FTOP, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp )   		// negate
		{TOP=-TOP;
		RELOAD();}

opcode jmp_table[]={	
	noop,
	swap,
	rot,
	eqzero,
	negate,
};
        

        
void VM8(UNS8 *ip, UNS64 *sp, UNS64 *rp, double *fp, UNS64 *lp ) {


    UNS64 *tbl=(UNS64*)&jmp_table;
    UNS64 TOP=*sp++;
    double FTOP=*fp++;
        

    opcode func=(opcode)tbl[*ip++]; 
    func( ip, tbl, TOP, FTOP, sp, rp, fp, lp);

} //vm8

BR
Peter



