Path: csiph.com!weretis.net!feeder6.news.weretis.net!feeder.usenetexpress.com!feeder-in1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Hans-Peter Diettrich Newsgroups: comp.compilers Subject: Re: Parser Reversed Date: Tue, 13 Mar 2018 12:21:58 +0100 Organization: Compilers Central Lines: 79 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <18-03-051@comp.compilers> References: <18-03-038@comp.compilers> <18-03-044@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="59977"; mail-complaints-to="abuse@iecc.com" Keywords: code Posted-Date: 13 Mar 2018 15:34:36 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:1998 Am 12.03.2018 um 22:00 schrieb Kaz Kylheku: > On 2018-03-11, Hans-Peter Diettrich wrote: >> A grammar can be used to *check* for valid sentences of a language, but >> it also can be used to *create* valid sentences. > > Grammars are *defined* in terms of generation. That's why the rules are > called "production rules" not "recognition rules". Isn't that depending on the grammar notation? > Using grammars to generate sentences randomly is algorithmically far > simpler than the cunning required to parse sentences according to a > grammar. Right, at least as long as semantical correctness is not of interest. > It's a common homework exercise in the first week of compiler courses. > (Sometimes called compiler courses, though rarely do compilers > get written.) I'm near 50 years past that stage ;-) > If you can use a high level, dynamic language, you can probably > write the code on a napkin. > > let L := { S } # let L be a list containing the start symbol > > while (at least one non-terminal symbol remains in L) { > choose a non-terminal symbol N from L > choose from the grammar a production rule R which has N on its left > replace S by the right side of the rule > } > > L now holds a random sentence Ah, double randomization is a nice trick :-) > Note that if the grammar can generate infinite sentences, then this > algorithm has no guarantee of termination. Right. I thought of giving weights to the alternatives of a NT, and using multiple sets of weights for initial, continued and final stage of the generation. > Some cunning could be built into such that, at a given configured > sentence length, it starts preferring the rules which lead to the > generation of terminal symbols. Fine, with eps (empty) counting as a terminal symbol. > If a given chosen N has two production rules like > > N -> N t # t is a terminal symbol > -> t > > then the second rule will be preferred (by some heuristic like choose > the rule with maximal terminal symbols and minimal non-terminals). In my simple grammar case this will be sufficient. Other grammars may deserve some initial analysis. > Speaking of termination, again, the generative POV dictates > the terminology: terminal symbols terminate the generation! Point taken :-) > If the grammar isn't context free (two or more symbols appear on > the left hand side of a rule) then you have a more complicated > job. Basically you have to gather all the left hand patterns > from the grammar, and then look for occurences of these patterns > in the sentential form L. From these occurrences, pick one > and expand it. Something like that. This is where replacing the (sequentially) first matching pattern may not cover all cases of interest. Thanks for the many hints :-) DoDi