Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.postscript > #3335
| Newsgroups | comp.lang.postscript |
|---|---|
| Date | 2019-02-05 11:56 -0800 |
| References | <6f879c02-caec-49a5-bc6c-a62931a36378@googlegroups.com> |
| Message-ID | <ecdf8380-c4cc-4efc-9b7f-a3504b210931@googlegroups.com> (permalink) |
| Subject | Re: Parser Combinators revisited |
| From | luser droog <luser.droog@gmail.com> |
On Saturday, February 2, 2019 at 9:51:57 PM UTC-6, luser droog wrote:
> I rediscovered some abandoned code for parser combinators
> and I realized that I had rewritten the underlying library
> more recently. So the obvious thing was to rewrite the
> parser combinators to use the new library and then maybe
> everything would work out more nicely.
>
> And with a few stumbles, it has.
>
> So here are two files and a very short output. First is
> the latest version of struct2.ps for functional programming
> features. The most detailed write up on this code is in
> my codereview post, but some tweaks and fixes have been
> applied with the benefit of a little actual use.
>
> $ cat struct2.ps
> %!
> % struct2.ps An enhanced PostScript syntax for defining functions with named,
> % type-checked arguments. Using @func within a block or other construct that uses
> % 'pairs' accomplishes a sort of compile-time macro expansion of the shorthand function description.
> <<
> /pairs-begin { pairs begin }
> /pairs-def { pairs {def} forall } pop { pairs currentdict copy pop }
> /pairs { << exch explode >> }
> /explode { { @exec } forall }
> /@exec { dup type /nametype eq { exec-if-@ } if }
> /exec-if-@ { dup dup length string cvs dup first (@) first eq { exec@ }{ pop } ifelse }
> /first { 0 get } /exec@ { exch pop rest cvn cvx exec }
> /rest { 1 1 index length 1 sub getinterval }
> >> begin {
> block { pairs-begin main end }
> func { 1 index type /stringtype eq { typed-func }{ simple-func } ifelse }
> simple-func { func-begin { end } compose }
> typed-func { exch args-and-types reverse { make-type-name } map check-stack 3 1 roll
> exch simple-func compose }
> func-begin { exch reverse /args-begin load curry exch compose }
> args-begin { dup length dict begin { exch def } forall }
> args-and-types { /was_x false def [ exch { each-specifier } fortokens fix-last ] dup args exch types }
> each-specifier { dup xcheck /is_x exch def is_x was_x and { null exch } if /was_x is_x def }
> fix-last { counttomark 2 mod 1 eq { null } if }
> check-stack { {pop} 4 index cvlit { cvx /stackunderflow signalerror } curry compose
> /if cvx 2 array astore cvx {check-count} exch compose curry
> 3 index cvlit { cvx /typecheck signalerror } curry
> /if cvx 2 array astore cvx {check-types} exch compose compose }
> check-count { dup length count 2 sub gt }
> check-types { dup length 1 add copy true exch { check-type and } forall exch pop not }
> check-type { dup null eq { 3 -1 roll pop pop true }{ 3 -1 roll type eq } ifelse }
> make-type-name { dup type /nametype eq { dup length 4 add string dup dup 4 2 roll cvs
> 2 copy 0 exch putinterval length (type) putinterval cvn } if }
> args { [ exch 2 { 0 get } fortuple ] cvx }
> types { [ exch 2 { 1 get } fortuple ] }
> map { 1 index xcheck 3 1 roll [ 3 1 roll forall ] exch {cvx} if }
> reduce { exch dup first exch rest 3 -1 roll forall }
> rreduce { exch aload length 1 sub dup 3 add -1 roll repeat }
> curry { [ 3 1 roll {} forall ] cvx } @pop
> { dup xcheck 3 1 roll
> dup length 1 add array dup 0 5 -1 roll put dup 1 4 -1 roll putinterval
> exch {cvx}if }
> compose { 2 array astore cvx { {} forall } map } @pop
> { 1 index xcheck 3 1 roll
> 1 index length 1 index length add array dup 0 4 index putinterval
> dup 4 -1 roll length 4 -1 roll putinterval
> exch {cvx} if }
> reverse {
> dup xcheck exch
> [ exch dup length 1 sub -1 0 { 2 copy get 3 1 roll pop } for pop ]
> exch {cvx} if }
> } pairs-def {
> fortokens {src proc}{ { src token {exch /src exch store}{exit}ifelse proc } loop } @func
> fortuple {a n p}{ 0 n /a load length 1 sub
> { /a exch /n getinterval /p exec } {load-if-literal-name} map end for
> } @func-begin
> load-if-literal-name { dup type /nametype eq 1 index xcheck not and { load } if }
> } pairs-def
>
>
> So using pairs-begin or block lets the code look really nice IMO without
> the slants on all the names being defined.
>
> Next is the PC code. A few short mnemonics at the start. term, seq2, and
> alt are the basic primitives, but almost too primitive to use. The actual
> API is: parser + * ? action.
>
> The full deal is defining an action which takes a parser and a procedure
> body and returns a parser with the side effect of executing the procedure.
> The proc gets passed the string contents matched by the parser.
> A parser description can also be passed to action without the user needing
> to call parser on it first.
>
>
Some improvements. Alternations now have a short-circuiting semantic so
it stops trying branches after one succeeds. Reorganized to be more top-down.
Expanded testing section.
In order to test that the user proc associated with an action wasn't
causing any stack interference, this version uses a function called
'below-mark' to get the result (a little bit) out of the way.
$ cat pc7.ps
(struct2.ps) run {
action{pa ac}{ % parser-desc {proc} -> parser
/pa /pa load parser def
{ % String
z /pa exec passed { % S [#]
dup {max} reduce zy zxy head % [#] S[0..^#]
/ac 3 2 roll % S[0..^#] {proc} [#]
1 aa cvx compose exec % PASS => [#]
}{ % S []
zy pop % FAIL => []
} ifelse
} ll
} @func
parser { /stringtype is { term }{
/arraytype is { sequence }{
/dicttype is { alternate }{ } ifelse } ifelse } ifelse }
* { parser many }
+ { parser some }
? { parser maybe }
char-class { << zy 1 {} fortuple >> }
inverse { << 0 1 127 { 1 string z 0 4 -1 roll put null } for >> exch
{} each-map {fix-up} map << exch {null} forall >> minus-keys }
pass { # 1 aa zy pop }
fail { pop pop [] }
failed { z # 0 eq }
passed { failed not }
next { y tail }
sum-up { zy {add} curry forall }
test { y # y # lt { fail }{
y y # head y eq { pass }{ fail } ifelse } ifelse }
term {str}{ {/str test} ll } @func
%alt2 {p q}{ {z /p exec zy /q exec compose} ll } @func
alt2 {p q}{ {z /p exec failed { zy /q exec compose }{ zy pop } ifelse } ll } @func
seq2 {p q}{ {z /p exec zy {next /q exec sum-up} curry map} ll } @func
sequence { z xcheck { }{ {parser} map {seq2} reduce } ifelse }
alternate { {} each-map {fix-up} map {parser} map {alt2} reduce }
some { z many seq2 }
maybe { {()pass} alt2 }
many { {{{}exec}exec}{}deep-map z 0 get zxy seq2 maybe y y 0 zy put zy pop }
fix-up { /nulltype is { pop }{
/nametype is { z # string cvx }{ } ifelse } ifelse }
(>>) { [ {counttomark 2 mod 1 eq {null} if
counttomark dup 1 add copy array astore exch pop
mark exch 2 { aload pop pop } fortuple ] /.name-order exch} {} forall
(>>) load dup type /arraytype eq {/exec cvx} if ] cvx } @exec
minus-keys { y zxy { pop dup /.name-order ne {undef}{pop pop} ifelse z } forall pop
dup /.name-order 2 copy get [ exch {
counttomark 2 add index 1 index known not { pop } if
} forall ] put }
# {length}
aa {array astore}
ll { {load-if-literal-name} deep-map }
x {2 index}
y {1 index}
z {dup}
zy {exch}
zxy {3 1 roll}
yzx {3 2 roll}
head {0 zy getinterval}
tail {y # y sub getinterval}
max {y y lt{zy}if pop}
is {y type eq}
ps {(stack:)= pstack}
pc {ps clear}
pq {ps quit}
deep-map { y type /arraytype ne { exec }{
y xcheck 3 1 roll [ 3 1 roll /deep-map cvx 2 array astore cvx forall ] exch {cvx} if } ifelse }
each-map { 1 index xcheck 3 1 roll [ 3 1 roll foreach ] exch {cvx} if }
foreach {d p}{/d load type /dicttype eq {
/d load /.name-order known {
/d load /.name-order get { /d 1 index get /p exec } ll
}{/d load /p load} ifelse
}{/d load /p load} ifelse
end forall } @func-begin
accumulate { below-mark }
to-bottom { count 1 roll }
below-mark { counttomark 1 add 1 roll }
} pairs-begin
/delimiter ( \n()/%[]<>{}) char-class def
/regular-char delimiter inverse def
/digit (0123456789) char-class def
/number //digit + def
/opt-number //digit * def
/rad-integer [ //digit //digit ? (#) //number ] parser def
/integer [ (-) ? //number ] parser def
/real [ (-)? << [ //number (.) //opt-number ] [ (.) //number ] >> ] parser def
/name //regular-char + def
/on-integer << //rad-integer //integer >> { [ exch /INT exch dup cvi ] accumulate } action def
/on-real //real { [ exch /REAL exch dup cvr ] accumulate } action def
/on-name //name { [ exch /NAME exch dup cvn ] accumulate } action def
/on-delimiter //delimiter { [ exch /DEL exch ] accumulate } action def
/ps-token << //on-real //on-integer //on-name //on-delimiter >> parser def
mark (16#25) on-integer pc
mark (123) on-integer pc
mark (-123) on-integer pc
mark (123) on-real pc %% sb. FAIL
mark (-123) on-real pc %% sb. FAIL
mark (2.685) on-real pc
mark (-.5) on-real pc
mark (name) on-name pc
mark (word) on-name pc
mark (123) ps-token pc
mark (67.5) ps-token pc
mark (aname) ps-token pc
mark (aname) on-name pc
mark ({) ps-token pc
mark (<) ps-token pc
mark ({) on-delimiter pc
So each test lays down a 'mark' object and then if the user proc fires,
its result is tucked behind the mark and the parser's result is returned
normally on top of the stack. The result of a parser is an array, length=0
means failure, any other contents will be matched lengths. If the short-
circuit behavior is not used, then alternations may yield multiple matches.
Output:
$ gsnd pc7.ps
GPL Ghostscript 9.22 (2017-10-04)
Copyright (C) 2017 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
stack:
[5]
-mark-
[/INT (16#25) 37]
stack:
[3]
-mark-
[/INT (123) 123]
stack:
[4]
-mark-
[/INT (-123) -123]
stack:
[]
-mark-
stack:
[]
-mark-
stack:
[5]
-mark-
[/REAL (2.685) 2.685]
stack:
[3]
-mark-
[/REAL (-.5) -0.5]
stack:
[4]
-mark-
[/NAME (name) /name]
stack:
[4]
-mark-
[/NAME (word) /word]
stack:
[3]
-mark-
[/INT (123) 123]
stack:
[4]
-mark-
[/REAL (67.5) 67.5]
stack:
[5]
-mark-
[/NAME (aname) /aname]
stack:
[5]
-mark-
[/NAME (aname) /aname]
stack:
[1]
-mark-
[/DEL ({)]
stack:
[1]
-mark-
[/DEL (<)]
stack:
[1]
-mark-
[/DEL ({)]
GS>
Back to comp.lang.postscript | Previous | Next — Previous in thread | Next in thread | Find similar
Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-02 19:51 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 11:56 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 12:39 -0800
csiph-web