Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #616

Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text?

Path csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.glorb.com!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!news.iecc.com!nerds-end
From BGB <cr88192@hotmail.com>
Newsgroups comp.compilers
Subject Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text?
Date Sun, 22 Apr 2012 13:45:44 -0700
Organization albasani.net
Lines 175
Sender news@iecc.com
Approved comp.compilers@iecc.com
Message-ID <12-04-073@comp.compilers> (permalink)
References <12-04-019@comp.compilers> <12-04-056@comp.compilers> <12-04-060@comp.compilers> <12-04-066@comp.compilers>
NNTP-Posting-Host news.iecc.com
X-Trace leila.iecc.com 1335145228 34657 64.57.183.58 (23 Apr 2012 01:40:28 GMT)
X-Complaints-To abuse@iecc.com
NNTP-Posting-Date Mon, 23 Apr 2012 01:40:28 +0000 (UTC)
Keywords parse, design
Posted-Date 22 Apr 2012 21:40:28 EDT
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:616

Show key headers only | View raw


On 4/22/2012 4:51 AM, BartC wrote:
> "BGB"<cr88192@hotmail.com>  wrote in message
>> On 4/21/2012 2:22 AM, Uli Kusterer wrote:
>
>>> - Recursive descent parsers: It's the obvious way to write a parser.
>
>> although I use recursive descent, the above sounds different from what I
>> usually do.
>
>> ReadStatement:
>>    checks for and handles vaious statement types
>>    calls ReadExpression
>>
>> ReadExpression:
>>    (actually, this is a tower of functions, one for each precedence
>> level, working from lowest to highest precedence)
>
> Sounds like it's influenced by the C grammar, which defines expressions
> using something like 13 or 17 layers of precedence.
>
> Beyond about 3-4 levels, I found that unmanageable. For expression syntax, I
> don't use any precedence in the grammar at all; I have precedence as an
> attribute of an operator, and an expression can be parsed with a single
> function.
>
> Or rather two: readfactor(priority), and readterm(). Readfactor() deals with
> the  binary operators linking successive terms, while readterm() does all
> the real work (since my syntax doesn't distinguish between expressions and
> statements, that's quite a big workload).


I have functions for each precedence level, and there is a lot of them.

luckily, I tend to have the stack of expression precedence level
functions in their own source-file, so it is not all that bad.


>>> - Tokenizing: Essentially grab all the words in your source text and
>>> build an array with an entry for each so you can more quickly walk
>>> forward and backward without having to scan individual characters.
>
>> partly agreed, except that it isn't really strictly necessary to build
>> an array up-front.
>
>> most of my parsers don't bother using an array, but instead just call
>> the tokenizer function directly to peek and parse tokens based on the
>> current "stream position" (generally a raw char pointer in my language
>> parsers).
>
> I've tried reading all the tokens in a separate pass, but didn't really like
> it. And it takes a lot more storage as well, especially with macro
> expansions.
>
> Instead I read them as I go along, but with provision for a one-symbol
> look-ahead.

my parsers typically look at 1 or 2 tokens at a time, as usually this
seems sufficient for nearly any "ordinary" construction in a C style syntax.

a partial exception here is declarations, which often have to be tried
to be parsed, and will generally fail (revert to a prior point) if the
syntax turns out to not be an expression.

an annoyance (part of how this makes parsing C slower) is that pretty
much every normal identifier as the first token on a statement ends up
being checked if it refers to a typedef, although I guess an
optimization could be to check if the next token is in a "known bad"
list (ex: "=", "->", ...) and then skip out on looking up a typedef in
this case (a known-bad token meaning "this is obviously not a
declaration"). probably not that this would help when parsing header
contents though (as pretty much everything there is a declaration).


<snip>

yes, ok.


>> some past experiments (special purpose tests), have implied that
>> potentially threaded code can pull off "reasonably solid" interpreter
>> performance (potentially within 3-5x of native).
>
> Assuming you're talking about dynamic typing, I found it difficult to get
> within 3-5x, unless some sort of type hinting is used, or perhaps you're
> comparing with not too highly optimised native code. Or it's code that is
> memory-intensive, then memory access will dominate.


no, this was assuming that the output was resolved to static types
(although the input HLL is largely dynamically-typed, the goal is to
have largely statically-typed logic going on in the lower levels of the
VM, due either to the use of explicit annotation, and/or, type-inference).

so, these tests assumed that the VM would be able to effectively resolve
the types to their primitive forms (basically, they were intended to try
some things out in advance to help guide the design of a then-planned
new interpreter core, which never got fully implemented).

all this was approx 6 months ago.


the tests involved various numeric calculations (generally using integer
and floating point types), and compared them against their analogues as
plain C code.

there would have been an approx 3-5x slowdown between the interpreter
and native code compiled with the same optimization settings (trying
different settings, although changing the total running times, seemed to
have little impact on the relative execution times).

these weren't really general-purpose tests though, but rather were
involving hand-crafted "instruction sequences" (as lists of function
pointers).

the "stack" in the case of the VM was represented as a sliding pointer,
and was using fixed 64-bit elements (as was the argument list).

so, a double addition would have looked something like:
*((double *)(ctx->stack+1))=
	*((double *)(ctx->stack+0))+
	*((double *)(ctx->stack+1));
ctx->stack++;


I ended up not actually using an interpreter structured this way
(instead continuing to use and develop my older interpreter), but
performance was sufficiently good to show that threaded-code was a
viable option and "not drastically slower" than native code.

the result was, me eventually, migrating the older interpreter to the
use of threaded code (for a modest speed-up vs the "big switch()"
strategy I had been using previously).

as-is, the current interpreter performance is closer to around 100x
slower than native, for tests like Fib and bubble-sort (and generally
much smaller if a lot of C-side logic is involved).


for the newer version of the interpreter, with faster/improved "Boxed
Value Type" operations, here is a version of the opcode-handler logic:

BSVM_ThreadOp *BSVM_ThOp_ADD_XD(BSVM_SVMState *ctx, BSVM_ThreadOp *cur)
{	BVT_DefPopUV_Ty(f64, u, v); BVT_AddUV(u, v);
	BVT_FreeF64(ctx, v); return(cur->next); }

and if all macros are manually expanded out...

BSVM_ThreadOp *BSVM_ThOp_ADD_XD(BSVM_SVMState *ctx, BSVM_ThreadOp *cur)
{
	f64 *u, *v;
	v=(f64 *)(ctx->stack[--ctx->stackpos]);
	u=(f64 *)(ctx->stack[ctx->stackpos-1])
	*u=(*u)+(*v);
	*(f64 **)v=ctx->fbvt_double;
	ctx->fbvt_double=v;
	return(cur->next);
}

where "f64" is a typedef for "double".

as can be seen, this isn't quite as efficient, but this sort of logic
should be a little faster than the current (dynamically-typed) logic.

note that "SVMState *ctx" is the interpreter context, and "BSVM_ThreadOp
*cur" is the current opcode (it contains a function-pointer to the
handler, as well as any operands used by the opcode).

note that, in this interpreter, the stack is an array of pointers.


I don't yet have an estimate for "overall" performance of the new logic,
considering that a good deal more logic is still be noted before I will
be able to actually test and benchmark a lot of this stuff... (but, at
least, with the changes made thus far the VM hasn't started violently
blowing up, which is at least itself a good sign...).

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-17 21:28 +0000
  Re: Good practical language and OS agnostic text? Philip Herron <redbrain@gcc.gnu.org> - 2012-04-18 14:25 +0100
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 16:32 +0000
      Re: Good practical language and OS agnostic text? arnold@skeeve.com (Aharon Robbins) - 2012-04-20 03:58 +0000
        Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-22 10:10 +0000
      Re: Good practical language and OS agnostic text? "BartC" <bc@freeuk.com> - 2012-04-20 09:45 +0100
        Re: Good practical language and OS agnostic text? "Jonathan Thornburg" <jthorn@astro.indiana.edu> - 2012-04-21 15:04 +0000
  Re: Good practical language and OS agnostic text? BGB <cr88192@hotmail.com> - 2012-04-18 08:39 -0700
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 17:32 +0000
  Re: Good practical language and OS agnostic text? Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-04-18 18:24 +0200
    Re: Good practical language and OS agnostic text? Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-04-19 13:53 +0200
      Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-21 03:07 +0000
      Re: Good practical language and OS agnostic text? "BartC" <bc@freeuk.com> - 2012-04-21 12:01 +0100
        Re: code quality, was Good practical language and OS agnostic text? Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-04-22 12:41 +0200
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 11:31 +0000
      Re: Good practical language and OS agnostic text? "Jonathan Thornburg" <jthorn@astro.indiana.edu> - 2012-04-20 16:19 +0000
  Re: Good practical language and OS agnostic text? "Derek M. Jones" <derek@knosof.co.uk> - 2012-04-18 18:16 +0100
    Re: Good practical language and OS agnostic text? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-04-18 22:43 +0000
      Re: Good practical language and OS agnostic text? BGB <cr88192@hotmail.com> - 2012-04-19 00:05 -0700
      Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 11:31 +0000
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 16:32 +0000
  Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-18 19:30 +0000
    Re: Good practical language and OS agnostic text? "BartC" <bc@freeuk.com> - 2012-04-19 18:43 +0100
  Re: Good practical language and OS agnostic text? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-04-18 20:29 +0000
    Re: Good practical language and OS agnostic text? Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-04-19 14:20 +0200
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 19:05 +0000
      Re: Good practical language and OS agnostic text? Uli Kusterer <ulimakesacompiler@googlemail.com> - 2012-04-21 11:30 +0200
  Re: Good practical language and OS agnostic text? Roberto Waltman <usenet@rwaltman.com> - 2012-04-18 22:00 -0400
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-19 11:31 +0000
      Re: Good practical language and OS agnostic text? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-04-20 07:02 +0000
        Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-22 11:10 +0000
          Re: Good practical language and OS agnostic text? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-04-22 23:56 +0000
        Re: PL/360, was Good practical language and OS agnostic text? ArarghMail204@Arargh.com - 2012-04-24 19:13 -0500
  Re: Good practical language and OS agnostic text? Bakul Shah <usenet@bitblocks.com> - 2012-04-18 21:15 -0700
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-20 16:06 +0000
  Re: Good practical language and OS agnostic text? torbenm@diku.dk (Torben Ægidius Mogensen) - 2012-04-19 14:58 +0200
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-20 16:06 +0000
  Re: Good practical language and OS agnostic text? "Joe Schmo" <askmeforit@myisp.com> - 2012-04-21 02:53 -0600
    Re: Writing parsers, was Good practical language and OS agnostic text? Uli Kusterer <ulimakesacompiler@googlemail.com> - 2012-04-22 16:18 +0200
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-23 19:12 +0000
  Re: Good practical language and OS agnostic text? Uli Kusterer <ulimakesacompiler@googlemail.com> - 2012-04-21 11:22 +0200
    Re: Good practical language and OS agnostic text? BGB <cr88192@hotmail.com> - 2012-04-21 18:58 -0700
      Re: writing interpreters, was Good practical language and OS agnostic text? Uli Kusterer <ulimakesacompiler@googlemail.com> - 2012-04-22 12:53 +0200
        Re: writing interpreters, was Good practical language and OS agnostic text? BGB <cr88192@hotmail.com> - 2012-04-22 12:29 -0700
      Re: generating bytecode, was Good practical language and OS agnostic text? Uli Kusterer <ulimakesacompiler@googlemail.com> - 2012-04-22 13:12 +0200
      Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text? "BartC" <bc@freeuk.com> - 2012-04-22 12:51 +0100
        Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text? "Dmitry A. Kazakov" <mailbox@dmitry-kazakov.de> - 2012-04-22 18:18 +0200
          Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text? "Bartc" <bartc@freeuk.com> - 2012-04-23 10:59 +0100
        Re: Recursive descent parsing and optimization, was Good practical language and OS agnostic text? BGB <cr88192@hotmail.com> - 2012-04-22 13:45 -0700
    Re: Good practical language and OS agnostic text? compilers@is-not-my.name - 2012-04-22 22:11 +0000
      Re: Good practical language and OS agnostic text? "BartC" <bc@freeuk.com> - 2012-04-23 18:41 +0100
  Re: Good practical language and OS agnostic text? basile@starynkevitch.net - 2012-05-02 22:16 -0700
    Re: Good practical language and OS agnostic text? Johann 'Myrkraverk' Oskarsson <johann@2ndquadrant.com> - 2012-06-06 16:52 +0000

csiph-web