Further struggles with parsers

Newsgroups comp.lang.postscript
Date 2019-09-22 01:13 -0700
Message-ID <e72196ed-4ca0-4fd1-97ac-3be30286fcad@googlegroups.com>
Subject Further struggles with parsers
From luser droog <luser.droog@gmail.com>



I turned back to my parser code and tried to add
string handling to the simple lexer for PostScript
tokens. It works for simple strings, but not
when I try to do something fancy like converting
backslash escapes.

The problem is that the outer handler code relies
on the matched string being the correct length
to advance in the input. If I convert \n to a
single newline character, the result is one
character shorter than the source text, so the
outer handler doesn't consume the final closing
paren from the input.
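
The length mismatch can be illustrated with a small sketch, written
in Python rather than PostScript for brevity. The function name
lex_string and the escape table are hypothetical and not taken from
the actual lexer; nested parentheses and octal escapes are left out.

```python
def lex_string(src):
    """Scan a parenthesized string literal at the start of src.

    Returns (decoded, consumed): the text with backslash escapes
    converted, and the number of source characters actually read.
    Simplified on purpose: no nested parens, no octal escapes.
    """
    assert src[0] == "("
    escapes = {"n": "\n", "t": "\t", "\\": "\\", "(": "(", ")": ")"}
    out = []
    i = 1
    while src[i] != ")":
        if src[i] == "\\":
            out.append(escapes[src[i + 1]])  # two source chars, one result char
            i += 2
        else:
            out.append(src[i])
            i += 1
    return "".join(out), i + 1  # +1 accounts for the closing paren


src = r"(a\nb) rest"
decoded, consumed = lex_string(src)

# Advancing by the decoded length (plus the two parens) falls one short.
naive = len(decoded) + 2
print(repr(src[naive:]))     # ') rest'
print(repr(src[consumed:]))  # ' rest'
```

Returning the consumed count next to the decoded value is one way
around the problem; the rest of this post takes a different route
and carries the information in the input stream itself.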

So, I wondered a lot and searched for some magical
way to get the length information by composing
the parser monads with a state monad. And I got
some interesting links to read, but nothing really
gelled. So I turned back to a paper by Graham
Hutton, "Higher-Order Functions for Parsing".

This paper describes preprocessing the input stream
to add (row, col) decorations to each character.
The 'satisfy(pred)' parser filters out the extra
decoration, and thus any parsers built out of
'satisfy' do not become any more complicated by
having to deal with the extra noise.
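
A rough sketch of that idea, in Python rather than PostScript. The
names decorate and then are made up for illustration, and the
list-of-results parser representation is an assumption loosely
modeled on the paper, not the code from this post.

```python
def decorate(text):
    """Lazily yield (char, (row, col)) pairs, one per input character."""
    row, col = 0, 0
    for ch in text:
        yield ch, (row, col)
        if ch == "\n":
            row, col = row + 1, 0
        else:
            col += 1


def satisfy(pred):
    """Parser for a single character accepted by pred.

    The input is a list of (char, pos) pairs. The result is a list of
    (value, remaining_input) pairs, empty on failure. The value is the
    bare character: the position is stripped off here and nowhere else.
    """
    def parse(inp):
        if inp and pred(inp[0][0]):
            return [(inp[0][0], inp[1:])]
        return []
    return parse


def then(p, q):
    """Run p, then q on what is left; pair up the two values."""
    def parse(inp):
        return [((a, b), rest2)
                for a, rest1 in p(inp)
                for b, rest2 in q(rest1)]
    return parse


inp = list(decorate("ab\nc"))
letter = satisfy(str.isalpha)
result = then(letter, letter)(inp)
print(result[0][0])  # ('a', 'b')
print(result[0][1])  # [('\n', (0, 2)), ('c', (1, 0))]
```

Note that then never looks at a position: only satisfy knows the
input is decorated, which is the point of the design.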

So, I've made some headway in rewriting everything
to follow this new idea. My string-input function
now has to produce a lazy list of [char [row col]] 
structures. Then there's a new function 'tok(p tag)'
which decorates the result from the parser p with 
a structure like [[/tag (matched)] [len row col]].
(The paper just has 'row col', but length was the
thing I really needed.)
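
How such a 'tok' might attach the length can be sketched as
follows, again in Python. The helper literal and the eager list
input are assumptions for illustration; the length is taken here
as the amount of input consumed, which may differ from how the
PostScript version computes it.

```python
def literal(s):
    """Parser matching the exact string s against decorated input."""
    def parse(inp):
        chars = [ch for ch, _ in inp[:len(s)]]
        if "".join(chars) == s:
            return [(s, inp[len(s):])]
        return []
    return parse


def tok(p, tag):
    """Wrap parser p so its value becomes ((tag, value), (len, row, col)).

    row and col come from the first input element; len is the number
    of input elements p consumed, so it stays correct even if p
    returns a value shorter than the text it read.
    """
    def parse(inp):
        if not inp:
            return []
        row, col = inp[0][1]
        return [(((tag, value), (len(inp) - len(rest), row, col)), rest)
                for value, rest in p(inp)]
    return parse


# Decorated input for "abc", written out by hand as (char, (row, col)).
inp = [("a", (0, 0)), ("b", (0, 1)), ("c", (0, 2))]
result = tok(literal("ab"), "AB")(inp)
print(result[0][0])  # (('AB', 'ab'), (2, 0, 0))
```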

And I'll spare you the code until it's more complete,
but it will be more usefully commented than the previous
incarnation (for my own sake). But here's a small
example and output.

$ tail -3 pc10.ps
(abc) string-input (ab) str exec ps clear / =
(abc) string-input (ab) str /AB tok exec ps first first ps 
quit

$ gsnd -q -DNOSAFER pc10.ps
stack:
[[[97 98] [[99 [2 2 0]] {[() [3 3 0]] string-next}]]]

stack:
[[[[/AB (ab)] [2 0 0]] [[99 [2 2 0]] {[() [3 3 0]] string-next}]]]
stack:
[[/AB (ab)] [2 0 0]]



Thread

Further struggles with parsers luser droog <luser.droog@gmail.com> - 2019-09-22 01:13 -0700
  Re: Further struggles with parsers luser droog <luser.droog@gmail.com> - 2019-09-26 04:07 -0700
    Re: Further struggles with parsers luser droog <luser.droog@gmail.com> - 2019-09-28 22:49 -0700
