Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.postscript > #3334
| Newsgroups | comp.lang.postscript |
|---|---|
| Date | 2019-02-02 19:51 -0800 |
| Message-ID | <6f879c02-caec-49a5-bc6c-a62931a36378@googlegroups.com> (permalink) |
| Subject | Parser Combinators revisited |
| From | luser droog <luser.droog@gmail.com> |
I rediscovered some abandoned code for parser combinators
and I realized that I had rewritten the underlying library
more recently. So the obvious thing was to rewrite the
parser combinators to use the new library and then maybe
everything would work out more nicely.
And with a few stumbles, it has.
So here are two files and a very short output. First is
the latest version of struct2.ps for functional programming
features. The most detailed write up on this code is in
my codereview post, but some tweaks and fixes have been
applied with the benefit of a little actual use.
$ cat struct2.ps
%!
% struct2.ps An enhanced PostScript syntax for defining functions with named,
% type-checked arguments. Using @func within a block or other construct that uses
% 'pairs' accomplishes a sort of compile-time macro expansion of the shorthand function description.
<<
/pairs-begin { pairs begin }
/pairs-def { pairs {def} forall } pop { pairs currentdict copy pop }
/pairs { << exch explode >> }
/explode { { @exec } forall }
/@exec { dup type /nametype eq { exec-if-@ } if }
/exec-if-@ { dup dup length string cvs dup first (@) first eq { exec@ }{ pop } ifelse }
/first { 0 get } /exec@ { exch pop rest cvn cvx exec }
/rest { 1 1 index length 1 sub getinterval }
>> begin {
block { pairs-begin main end }
func { 1 index type /stringtype eq { typed-func }{ simple-func } ifelse }
simple-func { func-begin { end } compose }
typed-func { exch args-and-types reverse { make-type-name } map check-stack 3 1 roll
exch simple-func compose }
func-begin { exch reverse /args-begin load curry exch compose }
args-begin { dup length dict begin { exch def } forall }
args-and-types { /was_x false def [ exch { each-specifier } fortokens fix-last ] dup args exch types }
each-specifier { dup xcheck /is_x exch def is_x was_x and { null exch } if /was_x is_x def }
fix-last { counttomark 2 mod 1 eq { null } if }
check-stack { {pop} 4 index cvlit { cvx /stackunderflow signalerror } curry compose
/if cvx 2 array astore cvx {check-count} exch compose curry
3 index cvlit { cvx /typecheck signalerror } curry
/if cvx 2 array astore cvx {check-types} exch compose compose }
check-count { dup length count 2 sub gt }
check-types { dup length 1 add copy true exch { check-type and } forall exch pop not }
check-type { dup null eq { 3 -1 roll pop pop true }{ 3 -1 roll type eq } ifelse }
make-type-name { dup type /nametype eq { dup length 4 add string dup dup 4 2 roll cvs
2 copy 0 exch putinterval length (type) putinterval cvn } if }
args { [ exch 2 { 0 get } fortuple ] cvx }
types { [ exch 2 { 1 get } fortuple ] }
map { 1 index xcheck 3 1 roll [ 3 1 roll forall ] exch {cvx} if }
reduce { exch dup first exch rest 3 -1 roll forall }
rreduce { exch aload length 1 sub dup 3 add -1 roll repeat }
curry { [ 3 1 roll {} forall ] cvx } @pop
{ dup xcheck 3 1 roll
dup length 1 add array dup 0 5 -1 roll put dup 1 4 -1 roll putinterval
exch {cvx}if }
compose { 2 array astore cvx { {} forall } map } @pop
{ 1 index xcheck 3 1 roll
1 index length 1 index length add array dup 0 4 index putinterval
dup 4 -1 roll length 4 -1 roll putinterval
exch {cvx} if }
reverse {
dup xcheck exch
[ exch dup length 1 sub -1 0 { 2 copy get 3 1 roll pop } for pop ]
exch {cvx} if }
} pairs-def {
fortokens {src proc}{ { src token {exch /src exch store}{exit}ifelse proc } loop } @func
fortuple {a n p}{ 0 n /a load length 1 sub
{ /a exch /n getinterval /p exec } {load-if-literal-name} map end for
} @func-begin
load-if-literal-name { dup type /nametype eq 1 index xcheck not and { load } if }
} pairs-def
So using pairs-begin or block lets the code look really nice IMO without
the slants on all the names being defined.
Next is the PC code. A few short mnemonics at the start. term, seq2, and
alt are the basic primitives, but almost too primitive to use. The actual
API is: parser + * ? action.
The full deal is defining an action which takes a parser and a procedure
body and returns a parser with the side effect of executing the procedure.
The proc gets passed the string contents matched by the parser.
A parser description can also be passed to action without the user needing
to call parser on it first.
$ cat pc7.ps
(struct2.ps) run {
# {length}
aa {array astore}
x {2 index}
y {1 index}
z {dup}
zy {exch}
zxy {3 1 roll}
yzx {3 2 roll}
head {0 zy getinterval}
tail {y # y sub getinterval}
max {y y lt{zy}if pop}
is {y type eq}
ps {(stack:)= pstack}
pc {ps clear}
pq {ps quit}
pass { # 1 aa zy pop }
fail { pop pop [] }
failed { z # 0 eq }
passed { failed not }
next { y tail }
sum-up { zy {add} curry forall }
test { y # y # lt { fail }{
y y # head y eq { pass }{ fail } ifelse } ifelse }
ll { {load-if-literal-name} deep-map }
term {str}{ {/str test} ll } @func
alt2 {p q}{ {z /p exec zy /q exec compose} ll } @func
seq2 {p q}{ {z /p exec zy {next /q exec sum-up} curry map} ll } @func
some { z many seq2 }
maybe { {()pass} alt2 }
many { {{{}exec}exec} {} deep-map
z 0 get zxy seq2 maybe
y y 0 zy put zy pop }
parser { /stringtype is { term }{
/arraytype is { sequence }{
/dicttype is { alternate }{ } ifelse } ifelse } ifelse }
sequence { z xcheck { }{ {parser} map {seq2} reduce } ifelse }
alternate { {fix-up} each-map {parser} map {alt2} reduce }
fix-up { /nulltype is { pop }{
/nametype is { z # string cvx }{ } ifelse } ifelse }
* { parser many }
+ { parser some }
? { parser maybe }
char-class { << zy 1 {} fortuple >> }
(>>) { [ {counttomark 2 mod 1 eq {null} if
counttomark dup 1 add copy array astore exch pop
mark exch 2 { aload pop pop } fortuple ] /.name-order exch
} {} forall
(>>) load dup type /arraytype eq {/exec cvx} if ] cvx } @exec
inverse {
<< 0 1 127 { 1 string z 0 4 -1 roll put null } for >>
exch {} each-map {fix-up} map << exch {null} forall >>
minus-keys }
minus-keys {
y zxy { pop dup /.name-order ne {undef}{pop pop} ifelse z } forall pop
dup /.name-order 2 copy get [ exch {
counttomark 2 add index 1 index known not { pop } if
} forall ] put }
deep-map { y type /arraytype ne { exec }{
y xcheck 3 1 roll [ 3 1 roll /deep-map cvx 2 array astore cvx forall ] exch {cvx} if } ifelse }
each-map { 1 index xcheck 3 1 roll [ 3 1 roll foreach ] exch {cvx} if }
foreach {d p}{/d load type /dicttype eq {
/d load /.name-order known {
/d load /.name-order get { /d 1 index get /p exec } ll
}{/d load /p load} ifelse
}{/d load /p load} ifelse
end forall } @func-begin
action { % parser-desc {proc} -> parser
% { z parser passed { % input [#*] => input' {0}
% {max} reduce y y head {proc} 4 2 roll {tail} curry curry compose exec
% }{
% pop % => input' {0}
% } ifelse }
zy parser {z} zy compose
{passed} compose % {proc} {z parser passed}
{{max} reduce y y head} yzx % {z pars..} {{max}..} {proc}
{4 2 roll} curry compose % {z pars..} {{max}...{proc} 4 2 roll}
{{tail {0}} curry curry compose exec} compose % {z pars..} {{max}..} {{proc} 4 2 roll {tail {0}} curry curry compose}
{{pop {}} ifelse} curry
compose }
}
%block
%pairs-begin main
pairs-begin
%currentdict {exch == ==} forall
0 dict begin
%(abc) [ (a) (b) (c) ] parser dup == exec pc
/integer (0123456789) char-class + def
/on-integer /integer load { [ exch /INT exch dup cvi ] } action def
(123) on-integer pq
And finally the output from a very small but successful test. The proc
passed to action simply builds a little array of the pieces. None of
the parsers variables are present on the stack when proc executes and
subsequent parsers ignore whatever new objects proc may leave on the
stack. A failed parser will not execute the proc but return the remaining
input string and a zero-length array.
[note that I've hacked my copy of 'gsnd' so it doesn't set -DSAFER.]
$ gsnd pc7.ps
GPL Ghostscript 9.22 (2017-10-04)
Copyright (C) 2017 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
stack:
{0}
()
[/INT (123) 123]
Any comments welcome.
Back to comp.lang.postscript | Previous | Next — Next in thread | Find similar
Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-02 19:51 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 11:56 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 12:39 -0800
csiph-web