Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.postscript > #3334

Parser Combinators revisited

Newsgroups comp.lang.postscript
Date 2019-02-02 19:51 -0800
Message-ID <6f879c02-caec-49a5-bc6c-a62931a36378@googlegroups.com> (permalink)
Subject Parser Combinators revisited
From luser droog <luser.droog@gmail.com>

Show all headers | View raw


I rediscovered some abandoned code for parser combinators
and I realized that I had rewritten the underlying library
more recently. So the obvious thing was to rewrite the
parser combinators to use the new library and then maybe
everything would work out more nicely.

And with a few stumbles, it has.

So here are two files and a very short output. First is 
the latest version of struct2.ps for functional programming
features. The most detailed write up on this code is in
my codereview post, but some tweaks and fixes have been
applied with the benefit of a little actual use.

$ cat struct2.ps
%!
% struct2.ps   An enhanced PostScript syntax for defining functions with named,
% type-checked arguments. Using @func within a block or other construct that uses
% 'pairs' accomplishes a sort of compile-time macro expansion of the shorthand function description.
<<
    /pairs-begin { pairs begin        }
    /pairs-def   { pairs {def} forall } pop { pairs currentdict copy pop }
    /pairs       { << exch explode >> }
    /explode     { { @exec } forall   }
    /@exec       { dup type /nametype eq { exec-if-@ } if }
    /exec-if-@   { dup dup length string cvs dup first (@) first eq { exec@ }{ pop } ifelse }
    /first       { 0 get }            /exec@ {  exch pop rest cvn cvx exec  }
    /rest        { 1 1 index length 1 sub getinterval }
>> begin {
    block { pairs-begin main end }
    func  { 1 index type /stringtype eq { typed-func }{ simple-func } ifelse }
    simple-func { func-begin { end } compose }
    typed-func  { exch args-and-types reverse { make-type-name } map check-stack 3 1 roll
                  exch simple-func   compose }
    func-begin  { exch reverse /args-begin load curry exch compose }
    args-begin  { dup length dict begin { exch def } forall }
    args-and-types { /was_x false def [ exch { each-specifier } fortokens fix-last ] dup args exch types  }
    each-specifier { dup xcheck /is_x exch def  is_x was_x and { null exch } if  /was_x is_x def }
    fix-last       { counttomark 2 mod 1 eq { null } if }
    check-stack    { {pop} 4 index cvlit { cvx /stackunderflow signalerror } curry compose
                     /if cvx 2 array astore cvx {check-count} exch compose curry
                     3 index cvlit { cvx /typecheck signalerror } curry
                     /if cvx 2 array astore cvx {check-types} exch compose compose }
    check-count { dup length count 2 sub gt }
    check-types { dup length 1 add copy true exch { check-type and } forall exch pop not }
    check-type  { dup null eq { 3 -1 roll pop pop true }{ 3 -1 roll type eq } ifelse }
    make-type-name { dup type /nametype eq { dup length 4 add string dup dup 4 2 roll cvs
                     2 copy 0 exch putinterval length (type) putinterval cvn } if }
    args    { [ exch 2 { 0 get } fortuple ] cvx }
    types   { [ exch 2 { 1 get } fortuple ] }
    map     { 1 index xcheck 3 1 roll  [ 3 1 roll forall ]  exch {cvx} if }
    reduce  { exch dup first exch rest 3 -1 roll forall }
    rreduce { exch aload length 1 sub dup 3 add -1 roll repeat }
    curry   { [ 3 1 roll {} forall ] cvx } @pop
    {   dup xcheck 3 1 roll
        dup length 1 add array dup 0 5 -1 roll put dup 1 4 -1 roll putinterval
        exch {cvx}if }
    compose { 2 array astore cvx { {} forall } map } @pop
    {   1 index xcheck 3 1 roll
        1 index length 1 index length add array  dup 0 4 index putinterval
        dup 4 -1 roll length 4 -1 roll putinterval
        exch {cvx} if }
    reverse {
        dup xcheck exch
        [ exch dup length 1 sub -1 0 { 2 copy get 3 1 roll pop } for pop ]
        exch {cvx} if }
} pairs-def {
    fortokens {src proc}{ { src token {exch /src exch store}{exit}ifelse  proc } loop } @func
    fortuple {a n p}{ 0 n /a load length 1 sub
        { /a exch /n getinterval /p exec } {load-if-literal-name} map end for
    } @func-begin
    load-if-literal-name { dup type /nametype eq 1 index xcheck not and { load } if }
} pairs-def


So using pairs-begin or block lets the code look really nice IMO without
the slants on all the names being defined.

Next is the PC code. A few short mnemonics at the start. term, seq2, and 
alt are the basic primitives, but almost too primitive to use. The actual
API is: parser + * ? action. 

The full deal is defining an action which takes a parser and a procedure
body and returns a parser with the side effect of executing the procedure.
The proc gets passed the string contents matched by the parser.
A parser description can also be passed to action without the user needing
to call parser on it first.


$ cat pc7.ps
(struct2.ps) run {
    #    {length}
    aa   {array astore}
    x    {2 index}
     y   {1 index}
      z  {dup}
     zy  {exch}
    zxy  {3 1 roll}
    yzx  {3 2 roll}
    head {0 zy getinterval}
    tail {y # y sub getinterval}
    max  {y y lt{zy}if pop}
    is   {y type eq}
    ps   {(stack:)= pstack}
    pc   {ps clear}
    pq   {ps quit}

    pass   { # 1 aa zy pop }
    fail   { pop pop [] }
    failed { z # 0 eq }
    passed { failed not }
    next   { y tail }
    sum-up { zy {add} curry forall }

    test { y #        y # lt { fail }{
           y y # head y   eq { pass }{ fail }  ifelse }  ifelse }

    ll   { {load-if-literal-name} deep-map }
    term {str}{ {/str test} ll } @func
    alt2 {p q}{ {z /p exec zy /q exec compose} ll } @func
    seq2 {p q}{ {z /p exec zy {next /q exec sum-up} curry map} ll } @func

    some  { z many seq2 }
    maybe { {()pass} alt2 }
    many  { {{{}exec}exec} {} deep-map
            z 0 get zxy seq2  maybe
            y y 0 zy put zy pop }
    parser { /stringtype is { term      }{
             /arraytype  is { sequence  }{
             /dicttype   is { alternate }{ } ifelse } ifelse } ifelse }
    sequence { z xcheck { }{ {parser} map {seq2} reduce } ifelse }
    alternate { {fix-up} each-map {parser} map {alt2} reduce }
    fix-up { /nulltype is { pop            }{
             /nametype is { z # string cvx }{ } ifelse } ifelse }

    *  { parser many }
    +  { parser some }
    ?  { parser maybe }

    char-class { << zy 1 {} fortuple >> }
    (>>) { [ {counttomark 2 mod 1 eq {null} if
              counttomark dup 1 add copy array astore exch pop
              mark exch 2 { aload pop pop } fortuple ] /.name-order exch
             } {} forall
             (>>) load dup type /arraytype eq {/exec cvx} if ] cvx } @exec
    inverse {
        << 0 1 127 { 1 string z 0 4 -1 roll put null } for >>
        exch {} each-map {fix-up} map << exch {null} forall >>
        minus-keys }
    minus-keys {
        y zxy {  pop dup /.name-order ne {undef}{pop pop} ifelse z  } forall pop
        dup /.name-order 2 copy get [ exch {
            counttomark 2 add index  1 index known not { pop } if
        } forall ] put }


    deep-map { y type /arraytype ne { exec }{
        y xcheck 3 1 roll  [ 3 1 roll /deep-map cvx 2 array astore cvx forall ] exch {cvx} if } ifelse }

    each-map { 1 index xcheck 3 1 roll  [ 3 1 roll foreach ]  exch {cvx} if }
    foreach {d p}{/d load type /dicttype eq {
                      /d load /.name-order known {
                          /d load /.name-order get  { /d 1 index get /p exec } ll
                      }{/d load /p load} ifelse
                  }{/d load /p load} ifelse
                  end forall } @func-begin

    action { % parser-desc {proc}  ->  parser
        % { z parser passed {   % input [#*]  =>  input' {0}
        %   {max} reduce y y head {proc} 4 2 roll {tail} curry curry compose exec
        % }{
        %   pop       %  => input' {0}
        % } ifelse }
        zy parser {z} zy compose
        {passed} compose            % {proc} {z parser passed}
        {{max} reduce y y head} yzx % {z pars..} {{max}..} {proc}
        {4 2 roll} curry compose    % {z pars..} {{max}...{proc} 4 2 roll}
        {{tail {0}} curry curry compose exec} compose % {z pars..} {{max}..} {{proc} 4 2 roll {tail {0}} curry curry compose}
        {{pop {}} ifelse} curry
        compose }
}
%block
%pairs-begin main
pairs-begin
  %currentdict {exch == ==} forall
0 dict begin

%(abc) [ (a) (b) (c) ] parser dup == exec pc
/integer (0123456789) char-class + def
/on-integer /integer load { [ exch /INT exch dup cvi ] } action def
(123) on-integer pq



And finally the output from a very small but successful test. The proc
passed to action simply builds a little array of the pieces. None of 
the parsers variables are present on the stack when proc executes and
subsequent parsers ignore whatever new objects proc may leave on the 
stack. A failed parser will not execute the proc but return the remaining
input string and a zero-length array.

[note that I've hacked my copy of 'gsnd' so it doesn't set -DSAFER.]

$ gsnd pc7.ps
GPL Ghostscript 9.22 (2017-10-04)
Copyright (C) 2017 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
stack:
{0}
()
[/INT (123) 123]


Any comments welcome.

Back to comp.lang.postscript | Previous | NextNext in thread | Find similar


Thread

Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-02 19:51 -0800
  Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 11:56 -0800
    Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 12:39 -0800

csiph-web