Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: Kaz Kylheku <864-117-4973@kylheku.com>
Newsgroups: comp.compilers
Subject: Re: Is This a Dumb Idea? paralellizing byte codes
Date: Thu, 27 Oct 2022 14:51:04 -0000 (UTC)
Organization: A noiseless patient Spider
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-10-061@comp.compilers>
References: <22-10-046@comp.compilers> <22-10-060@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="26535"; mail-complaints-to="abuse@iecc.com"
Keywords: optimize, interpreter, comment
Posted-Date: 27 Oct 2022 11:52:56 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:3231

On 2022-10-27, gah4 <gah4@u.washington.edu> wrote:
> On Saturday, October 22, 2022 at 11:51:31 AM UTC-7, nob...@gmail.com wrote:
>> Modern CPUs employ all kinds of clever techniques to improve
>> instruction level parallelism (ILP). I was wondering if it
>> makes sense to try to employ similar techniques in the
>> virtual machines used to execute byte code produced by language
>> compilers.
>
> Seems to me it is not that parallelizing byte codes that is
> a dumb idea, but byte codes themselves are.

I think you're taking "byte code" too literally, as referring to
a virtual machine where the instructions are byte-wide
opcodes that implicitly refer to operands on a stack.

I think that nowadays the term refers to any software-based
synthetic instruction set oriented toward supporting a higher
level language.

> This was known when Alpha replaced VAX. Work on making faster VAX
> systems was stuck with the byte oriented instruction stream which was
> impossible to pipeline.

Not impossible for Intel and AMD, obviously.

Variable-length instruction encodings do not inherently hamper
pipelining.

What might hamper pipelining would be variable-length instruction
encodings /where the length is not known until the instruction is
executed, due to depending on its output somehow/.

If you can decode an instruction and then immediately know where
the next one starts, you can pipeline.
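
A minimal sketch of that point in C, using a made-up toy encoding
(not VAX, x86, or TXR): the low two bits of the first byte give the
number of operand bytes that follow. The names insn_length and
find_boundaries are hypothetical. Every instruction boundary is found
from the encoding alone, without executing anything.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy encoding: the low two bits of the first byte give
   the number of operand bytes that follow it. */
static size_t insn_length(uint8_t first_byte)
{
    return 1 + (size_t)(first_byte & 0x03);
}

/* Record the start offset of each instruction without executing any.
   Returns the number of offsets stored in starts[] (at most max). */
static size_t find_boundaries(const uint8_t *code, size_t len,
                              size_t *starts, size_t max)
{
    size_t pc = 0, count = 0;

    while (pc < len && count < max) {
        starts[count++] = pc;
        pc += insn_length(code[pc]);  /* depends only on the bytes */
    }
    return count;
}
```

Since the length comes out of the first byte, a decoder can start on
the next instruction as soon as that byte has been examined.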

The internal representation in a pipeline doesn't use the original
variable-length representation any more; the instruction bytes do not
literally move through the pipeline.

> So it seems that the real answer is to devise a word oriented, or in
> other words RISC, virtual machine. (Actual RISC hardware might not be
> a good choice.)

I designed one in TXR Lisp; but the "byte code" terminology appears
numerous times in the source code, and leaks into the name of one
API function called vm-desc-bytecode, which accesses the code vector
of a virtual machine description.

The opcodes are actually four-byte words, stored in the local byte
order. (When a compiled file is loaded that was compiled on a system
of the opposite endianness, the load function will swap the byte order
of all the four-byte words in the "bytecode" vector.)
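
A rough sketch of that kind of load-time fixup, in C. This is not the
actual TXR code; the names swap32 and fix_code_endian and the
file_is_foreign flag are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit word. */
static uint32_t swap32(uint32_t w)
{
    return (w >> 24) | ((w >> 8) & 0x0000FF00u) |
           ((w << 8) & 0x00FF0000u) | (w << 24);
}

/* Hypothetical loader step: if the file was written on a machine of
   the opposite endianness, swap every word of the code vector in
   place; otherwise leave the vector untouched. */
static void fix_code_endian(uint32_t *code, size_t nwords,
                            int file_is_foreign)
{
    if (!file_is_foreign)
        return;
    for (size_t i = 0; i < nwords; i++)
        code[i] = swap32(code[i]);
}
```

Doing the swap once at load time means the interpreter loop can fetch
each instruction as a native word, with no per-instruction conversion.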

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
[The VAX had auto-increment indexed address modes like

  addl3 (ra)+[rb],(rc)+[rd],(re)+[rf]

which means to take the word that ra+4*rb points to, add 4 to ra, take
the word that rc+4*rd points to, add 4 to rc, put their sum in the word
that re+4*rf points to, and add 4 to re.  If any of those registers are
the same register, the fetches and increments have to happen as if it was
all done sequentially.   There were instructions that took six operands.
While this much address complication was rare, it had to work. -John]
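
The sequential behavior described in the note above can be sketched
as a toy C model. It follows only the description given there and is
not a faithful VAX emulator; the toy_vax struct, its sizes, and the
function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Toy machine state: 16 registers and a small memory of longwords.
   mem[a / 4] is the longword at byte address a. */
typedef struct {
    uint32_t reg[16];
    uint32_t mem[64];
} toy_vax;

/* Evaluate one (base)+[index] operand: form the address from the
   current register values, then add 4 to the base register before
   the next operand is looked at. */
static uint32_t autoinc_indexed(toy_vax *m, int base, int index)
{
    uint32_t addr = m->reg[base] + 4 * m->reg[index];
    m->reg[base] += 4;
    return addr;
}

/* addl3 (ra)+[rb],(rc)+[rd],(re)+[rf], operands strictly in order. */
static void addl3(toy_vax *m, int ra, int rb, int rc, int rd,
                  int re, int rf)
{
    uint32_t a = m->mem[autoinc_indexed(m, ra, rb) / 4];
    uint32_t b = m->mem[autoinc_indexed(m, rc, rd) / 4];
    uint32_t dst = autoinc_indexed(m, re, rf);
    m->mem[dst / 4] = a + b;
}
```

If ra and rc are the same register, the second fetch sees the base
already advanced by 4, which is the ordering a pipelined
implementation has to preserve.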