Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.postscript > #3336
| Newsgroups | comp.lang.postscript |
|---|---|
| Date | 2019-02-05 12:39 -0800 |
| References | <6f879c02-caec-49a5-bc6c-a62931a36378@googlegroups.com> <ecdf8380-c4cc-4efc-9b7f-a3504b210931@googlegroups.com> |
| Message-ID | <fe05b3ee-adde-4384-a841-51203907706f@googlegroups.com> (permalink) |
| Subject | Re: Parser Combinators revisited |
| From | luser droog <luser.droog@gmail.com> |
On Tuesday, February 5, 2019 at 1:56:47 PM UTC-6, luser droog wrote:
> On Saturday, February 2, 2019 at 9:51:57 PM UTC-6, luser droog wrote:
> > I rediscovered some abandoned code for parser combinators
> > and I realized that I had rewritten the underlying library
> > more recently. So the obvious thing was to rewrite the
> > parser combinators to use the new library and then maybe
> > everything would work out more nicely.
> >
> > And with a few stumbles, it has.
> >
> > So here are two files and a very short output. First is
> > the latest version of struct2.ps for functional programming
> > features. The most detailed write up on this code is in
> > my codereview post, but some tweaks and fixes have been
> > applied with the benefit of a little actual use.
> >
> > $ cat struct2.ps
> > %!
> > % struct2.ps An enhanced PostScript syntax for defining functions with named,
> > % type-checked arguments. Using @func within a block or other construct that uses
> > % 'pairs' accomplishes a sort of compile-time macro expansion of the shorthand function description.
> > <<
> > /pairs-begin { pairs begin }
> > /pairs-def { pairs {def} forall } pop { pairs currentdict copy pop }
> > /pairs { << exch explode >> }
> > /explode { { @exec } forall }
> > /@exec { dup type /nametype eq { exec-if-@ } if }
> > /exec-if-@ { dup dup length string cvs dup first (@) first eq { exec@ }{ pop } ifelse }
> > /first { 0 get } /exec@ { exch pop rest cvn cvx exec }
> > /rest { 1 1 index length 1 sub getinterval }
> > >> begin {
> > block { pairs-begin main end }
> > func { 1 index type /stringtype eq { typed-func }{ simple-func } ifelse }
> > simple-func { func-begin { end } compose }
> > typed-func { exch args-and-types reverse { make-type-name } map check-stack 3 1 roll
> > exch simple-func compose }
> > func-begin { exch reverse /args-begin load curry exch compose }
> > args-begin { dup length dict begin { exch def } forall }
> > args-and-types { /was_x false def [ exch { each-specifier } fortokens fix-last ] dup args exch types }
> > each-specifier { dup xcheck /is_x exch def is_x was_x and { null exch } if /was_x is_x def }
> > fix-last { counttomark 2 mod 1 eq { null } if }
> > check-stack { {pop} 4 index cvlit { cvx /stackunderflow signalerror } curry compose
> > /if cvx 2 array astore cvx {check-count} exch compose curry
> > 3 index cvlit { cvx /typecheck signalerror } curry
> > /if cvx 2 array astore cvx {check-types} exch compose compose }
> > check-count { dup length count 2 sub gt }
> > check-types { dup length 1 add copy true exch { check-type and } forall exch pop not }
> > check-type { dup null eq { 3 -1 roll pop pop true }{ 3 -1 roll type eq } ifelse }
> > make-type-name { dup type /nametype eq { dup length 4 add string dup dup 4 2 roll cvs
> > 2 copy 0 exch putinterval length (type) putinterval cvn } if }
> > args { [ exch 2 { 0 get } fortuple ] cvx }
> > types { [ exch 2 { 1 get } fortuple ] }
> > map { 1 index xcheck 3 1 roll [ 3 1 roll forall ] exch {cvx} if }
> > reduce { exch dup first exch rest 3 -1 roll forall }
> > rreduce { exch aload length 1 sub dup 3 add -1 roll repeat }
> > curry { [ 3 1 roll {} forall ] cvx } @pop
> > { dup xcheck 3 1 roll
> > dup length 1 add array dup 0 5 -1 roll put dup 1 4 -1 roll putinterval
> > exch {cvx}if }
> > compose { 2 array astore cvx { {} forall } map } @pop
> > { 1 index xcheck 3 1 roll
> > 1 index length 1 index length add array dup 0 4 index putinterval
> > dup 4 -1 roll length 4 -1 roll putinterval
> > exch {cvx} if }
> > reverse {
> > dup xcheck exch
> > [ exch dup length 1 sub -1 0 { 2 copy get 3 1 roll pop } for pop ]
> > exch {cvx} if }
> > } pairs-def {
> > fortokens {src proc}{ { src token {exch /src exch store}{exit}ifelse proc } loop } @func
> > fortuple {a n p}{ 0 n /a load length 1 sub
> > { /a exch /n getinterval /p exec } {load-if-literal-name} map end for
> > } @func-begin
> > load-if-literal-name { dup type /nametype eq 1 index xcheck not and { load } if }
> > } pairs-def
> >
> >
> > So using pairs-begin or block lets the code look really nice IMO without
> > the slants on all the names being defined.
> >
> > Next is the PC code. A few short mnemonics at the start. term, seq2, and
> > alt are the basic primitives, but almost too primitive to use. The actual
> > API is: parser + * ? action.
> >
> > The full deal is defining an action which takes a parser and a procedure
> > body and returns a parser with the side effect of executing the procedure.
> > The proc gets passed the string contents matched by the parser.
> > A parser description can also be passed to action without the user needing
> > to call parser on it first.
> >
> >
>
> Some improvements. Alternations now have a short-circuiting semantic so
> it stops trying branches after one succeeds. Reorganized to be more top-down.
> Expanded testing section.
>
> In order to test that the user proc associated with an action wasn't
> causing any stack interference, this version uses a function called
> 'below-mark' to get the result (a little bit) out of the way.
>
Now it's starting to look like something. I've implemented a replacement
for the 'token' operator for a small subset of the PostScript syntax.
Here's the result of parsing a string using both 'token' and 'mytoken'.
$ gsnd pc7.ps
GPL Ghostscript 9.22 (2017-10-04)
Copyright (C) 2017 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
stack:
[
2.65
123
name
37
stack:
[
2.65
123
name
37
GS>
And the code what done it:
$ cat pc7.ps
(struct2.ps) run {
action{pa ac}{ % parser-desc {proc} -> parser
/pa /pa load parser def
{ % String
z /pa exec passed { % S [#]
dup {max} reduce zy zxy head % [#] S[0..^#]
/ac 3 2 roll % S[0..^#] {proc} [#]
1 aa cvx compose exec % PASS => [#]
}{ % S []
zy pop % FAIL => []
} ifelse
} ll
} @func
parser { /stringtype is { term }{
/arraytype is { sequence }{
/dicttype is { alternate }{ } ifelse } ifelse } ifelse }
* { parser many }
+ { parser some }
? { parser maybe }
char-class { << zy 1 {} fortuple >> }
inverse { << 0 1 127 { 1 string z 0 4 -1 roll put null } for >> exch
{} each-map {fix-up} map << exch {null} forall >> minus-keys }
pass { # 1 aa zy pop }
fail { pop pop [] }
failed { z # 0 eq }
passed { failed not }
next { y tail }
sum-up { zy {add} curry forall }
test { y # y # lt { fail }{
y y # head y eq { pass }{ fail } ifelse } ifelse }
term {str}{ {/str test} ll } @func
%alt2 {p q}{ {z /p exec zy /q exec compose} ll } @func
alt2 {p q}{ {z /p exec failed { zy /q exec compose }{ zy pop } ifelse } ll } @func
seq2 {p q}{ {z /p exec zy {next /q exec sum-up} curry map} ll } @func
sequence { z xcheck { }{ {parser} map {seq2} reduce } ifelse }
alternate { {} each-map {fix-up} map {parser} map {alt2} reduce }
some { z many seq2 }
maybe { {()pass} alt2 }
many { {{{}exec}exec}{}deep-map z 0 get zxy seq2 maybe y y 0 zy put zy pop }
fix-up { /nulltype is { pop }{
/nametype is { z # string cvx }{ } ifelse } ifelse }
(>>) { [ {counttomark 2 mod 1 eq {null} if
counttomark dup 1 add copy array astore exch pop
mark exch 2 { aload pop pop } fortuple ] /.name-order exch} {} forall
(>>) load dup type /arraytype eq {/exec cvx} if ] cvx } @exec
minus-keys { y zxy { pop dup /.name-order ne {undef}{pop pop} ifelse z } forall pop
dup /.name-order 2 copy get [ exch {
counttomark 2 add index 1 index known not { pop } if
} forall ] put }
# {length}
aa {array astore}
ll { {load-if-literal-name} deep-map }
x {2 index}
y {1 index}
z {dup}
zy {exch}
zxy {3 1 roll}
yzx {3 2 roll}
head {0 zy getinterval}
tail {y # y sub getinterval}
max {y y lt{zy}if pop}
is {y type eq}
ps {(stack:)= pstack}
pc {ps clear}
pq {ps quit}
deep-map { y type /arraytype ne { exec }{
y xcheck 3 1 roll [ 3 1 roll /deep-map cvx 2 array astore cvx forall ] exch {cvx} if } ifelse }
each-map { 1 index xcheck 3 1 roll [ 3 1 roll foreach ] exch {cvx} if }
foreach {d p}{/d load type /dicttype eq {
/d load /.name-order known {
/d load /.name-order get { /d 1 index get /p exec } ll
}{/d load /p load} ifelse
}{/d load /p load} ifelse
end forall } @func-begin
accumulate { below-mark }
to-bottom { count 1 roll }
below-mark { counttomark 1 add 1 roll }
} pairs-begin
/delimiter ( \t\n()/%[]<>{}) char-class def
/regular-char delimiter inverse def
/digit (0123456789) char-class def
/number //digit + def
/opt-number //digit * def
/opt-spaces ( \t\n) char-class * def
/rad-integer [ //digit //digit ? (#) //number ] parser def
/integer [ (-) ? //number ] parser def
/real [ (-)? << [ //number (.) //opt-number ] [ (.) //number ] >> ] parser def
/name //regular-char + def
/on-integer << //rad-integer //integer >>
{ [ exch /INT exch dup cvi ] accumulate } action def
/on-real //real { [ exch /REAL exch dup cvr ] accumulate } action def
/on-lit-name [ (/) //name ] { [ exch /NAME exch dup cvn cvlit ] accumulate } action def
/on-name //name { [ exch /NAME exch dup cvn cvx ] accumulate } action def
/on-delimiter //delimiter { [ exch /DEL exch dup cvn cvx ] accumulate } action def
/ps-token << //on-real //on-integer //on-name //on-delimiter >> parser def
/mytoken {
dup //opt-spaces exec passed { 0 get tail }{ pop }ifelse
mark exch dup //ps-token exec passed {
0 get tail below-mark cleartomark
exch 2 get true
}{ cleartomark false } ifelse
} def
(16#25 name 123 2.65 [ ) {
mytoken {exch}{exit} ifelse
} loop
pc
(16#25 name 123 2.65 [ ) {
token {exch}{exit} ifelse
} loop
pc
Back to comp.lang.postscript | Previous | Next — Previous in thread | Find similar
Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-02 19:51 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 11:56 -0800
Re: Parser Combinators revisited luser droog <luser.droog@gmail.com> - 2019-02-05 12:39 -0800
csiph-web