| Newsgroups | comp.lang.postscript |
|---|---|
| Date | 2019-09-22 01:13 -0700 |
| Message-ID | <e72196ed-4ca0-4fd1-97ac-3be30286fcad@googlegroups.com> |
| Subject | Further struggles with parsers |
| From | luser droog <luser.droog@gmail.com> |
I turned back to my parser code and tried to add
string handling to the simple lexer for PostScript
tokens. And it works for simple strings, but not
when I try to do something fancy like converting
backslash escapes.
The problem is that the outer handler code relies
on the length of the matched string to know how far
to advance in the input. If I convert \n to a
single newline character, the result is one character
shorter than the source text, so the outer handler
doesn't consume the final closing paren from the
input.
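
A minimal sketch of the mismatch, written in Python rather than PostScript
for brevity. The helper name decode_escapes and the handful of escapes it
handles are assumptions for illustration, not code from the parser
discussed here. It shows that advancing by the length of the converted
string leaves the closing paren unconsumed.

```python
# Hypothetical helper (not from the original parser): convert a few
# backslash escapes inside the body of a PostScript-style string.
def decode_escapes(body):
    simple = {"n": "\n", "t": "\t", "\\": "\\", "(": "(", ")": ")"}
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            out.append(simple.get(nxt, nxt))  # unknown escape: keep the char
            i += 2                            # two source chars, one result char
        else:
            out.append(ch)
            i += 1
    return "".join(out)

source = r"(a\nb) rest"                 # raw text: backslash and 'n' are 2 chars
body = source[1:source.index(")")]      # text between the parens
value = decode_escapes(body)            # 'a', newline, 'b'

consumed = len(body) + 2    # what was really read, parens included
guessed = len(value) + 2    # what a handler infers from the converted value

remaining_wrong = source[guessed:]      # still starts with the closing paren
remaining_right = source[consumed:]     # closing paren consumed
```

Here consumed is 6 but guessed is 5, which is exactly the off-by-one
that leaves the ")" in the input.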
So, I wondered a lot and searched for some magical
way to get the length information by composing
the parser monads with a state monad. And I got
some interesting links to read, but nothing really
gelled. So I turned back to a paper by Graham
Hutton, Higher Order Functions for Parsing.
This paper describes preprocessing the input stream
to add (row, col) decorations to each character.
The 'satisfy(pred)' parser filters out the extra
decoration, and thus any parsers built out of
'satisfy' are no more complicated for having to
deal with the extra noise.
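
The idea can be sketched as follows, again in Python and under stated
assumptions: parsers return a list of (result, remaining input) pairs in
the list-of-successes style, rows and columns start at 0, and the names
decorate and then are made up for this illustration. Only satisfy ever
looks at the decoration, so a combinator such as then never sees it.

```python
def decorate(text):
    """Lazily yield (char, (row, col)) pairs; rows and columns start at 0."""
    row = col = 0
    for ch in text:
        yield (ch, (row, col))
        if ch == "\n":
            row, col = row + 1, 0
        else:
            col += 1

def satisfy(pred):
    """Accept one decorated item whose character passes pred.
    The (row, col) decoration is dropped from the result."""
    def parser(inp):
        if inp and pred(inp[0][0]):
            return [(inp[0][0], inp[1:])]
        return []                      # empty list means failure
    return parser

def then(p, q):
    """Run p, then q on what p left over; pair up the two results."""
    def parser(inp):
        return [((a, b), rest2)
                for a, rest1 in p(inp)
                for b, rest2 in q(rest1)]
    return parser

inp = list(decorate("ab\nc"))          # forced to a list for easy slicing
letter = satisfy(str.isalpha)
result = then(letter, letter)(inp)
```

The result holds the plain characters ('a', 'b'), while the remaining
input still carries its positions for whoever needs them later.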
So, I've made some headway in rewriting everything
to follow this new idea. My string-input function
now has to produce a lazy list of [char [row col]]
structures. Then there's a new function 'tok(p tag)'
which decorates the result from the parser p with
a structure like [[/tag (matched)] [len row col]].
(The paper just has 'row col', but length was the
thing I really needed.)
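
A rough Python sketch of what such a 'tok' might look like. Everything
here is an assumption for illustration: the helper names decorate and
literal, the list-of-successes result shape, and in particular the choice
to compute the length as the number of input items consumed rather than
from the matched value. The actual PostScript code may differ.

```python
def decorate(text):
    """Lazily yield (char, (row, col)) pairs; rows and columns start at 0."""
    row = col = 0
    for ch in text:
        yield (ch, (row, col))
        if ch == "\n":
            row, col = row + 1, 0
        else:
            col += 1

def satisfy(pred):
    """Accept one decorated item whose character passes pred."""
    def parser(inp):
        if inp and pred(inp[0][0]):
            return [(inp[0][0], inp[1:])]
        return []
    return parser

def literal(s):
    """Parser for the exact string s, built only from satisfy."""
    def parser(inp):
        results = [("", inp)]
        for want in s:
            results = [(got + ch, rest2)
                       for got, rest in results
                       for ch, rest2 in satisfy(lambda c, want=want: c == want)(rest)]
        return results
    return parser

def tok(p, tag):
    """Wrap each result of p as [[tag, matched], [len, row, col]].
    len counts consumed input items, so it stays correct even if p
    returns a converted (shorter) value."""
    def parser(inp):
        row, col = inp[0][1] if inp else (None, None)
        return [([[tag, matched], [len(inp) - len(rest), row, col]], rest)
                for matched, rest in p(inp)]
    return parser

inp = list(decorate("abc"))
result = tok(literal("ab"), "AB")(inp)
token, rest = result[0]
```

With this input, token has the shape [["AB", "ab"], [2, 0, 0]] and rest
still holds the undecoded item for 'c' along with its position.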
And I'll spare you the code until it's more complete,
but it will be more usefully commented than the previous
incarnation (for my own sake). But here's a small
example and output.
$ tail -3 pc10.ps
(abc) string-input (ab) str exec ps clear / =
(abc) string-input (ab) str /AB tok exec ps first first ps
quit
$ gsnd -q -DNOSAFER pc10.ps
stack:
[[[97 98] [[99 [2 2 0]] {[() [3 3 0]] string-next}]]]
stack:
[[[[/AB (ab)] [2 0 0]] [[99 [2 2 0]] {[() [3 3 0]] string-next}]]]
stack:
[[/AB (ab)] [2 0 0]]