Path: csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: "Ev. Drikos" <drikosev@gmail.com>
Newsgroups: comp.compilers
Subject: Re: Applesoft tokenization phases?
Date: Wed, 18 Mar 2020 00:14:51 +0200
Organization: Aioe.org NNTP Server
Lines: 61
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <20-03-017@comp.compilers>
References: <20-03-013@comp.compilers> <20-03-016@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="94207"; mail-complaints-to="abuse@iecc.com"
Keywords: Basic, history
Posted-Date: 19 Mar 2020 17:44:31 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-US
Xref: csiph.com comp.compilers:2488

On 16/03/2020 08:07, awanderin wrote:
> "Ev. Drikos" <drikosev@gmail.com> writes:
>> ...
>> Yet, I've found ie a program at "hoist-point.com" [2] that contains:
>> 110 DIFF = ABS(A(I)-N)
>
> If you type that into Applesoft BASIC, it parses it as:
>
>     110 D IF F =  ABS (A(I) - N)
>
> The spaces are how Applesoft lists it...
>

Another vague point, or simply one where I'm not sure I have read the
manual correctly, is how reserved keywords are recognized before a
certain delimiter. An Applesoft parser would likely have to reject this
valid UK101 code, since HIMEM: (colon included) appears to be reserved:

10 X=SHIMEM:
20 END
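
A minimal sketch in Python of why such a line might be rejected,
assuming keywords are searched for with spaces ignored and that HIMEM:
is reserved together with its colon. The keyword list is a small
hypothetical subset, the function name is made up for illustration, and
strings and comments are not handled; this is not Applesoft's actual
routine.

```python
# Hypothetical subset of Applesoft reserved words; the real table is larger.
KEYWORDS = ["HIMEM:", "ABS", "END", "IF"]


def find_keywords(text):
    """Return (position, keyword) pairs found while ignoring spaces.

    Positions index into the text with all spaces removed. The scan goes
    left to right; the first keyword in KEYWORDS that matches at the
    current position is taken and the scan resumes right after it.
    """
    s = text.replace(" ", "").upper()
    hits = []
    i = 0
    while i < len(s):
        for kw in KEYWORDS:
            if s.startswith(kw, i):
                hits.append((i, kw))
                i += len(kw)
                break
        else:
            i += 1  # no keyword starts here; move on one character
    return hits


print(find_keywords("X=SHIMEM:"))           # HIMEM: is found inside SHIMEM:
print(find_keywords("DIFF = ABS(A(I)-N)"))  # IF is found inside DIFF
```

Under these assumptions the name SHIMEM never survives as a whole: the
scanner sees S followed by the reserved word HIMEM:, just as DIFF
becomes D IF F.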


>> Also, an online AppleSoft simulator at calormen.com [3] accepts ie both
>> DIFF and FEND as valid variable names.
>
> It is doing things differently than actual Applesoft.
>
>> As it seems, this issue can affect a design choice for the tokenization
>> phases of an Applesoft front-end. Is the manual just informative or the
>> online simulator does not accept (precisely) the particular dialect?
>
> The latter; the simulator accepts a different dialect.
> --

I've also read your comment on the spacing rules of the Commodore
machines.

Thanks.


@everyone

IMHO, if spaces are significant then the lexer can be simpler. One can
build one (L1) with a lexer generator that supports
intersection/negation and just re-scan the DATA statements.

Due to the complex spacing rules of Applesoft II, one could scan in
advance:
1. DATA statements, preserving spaces in literals & strings, and
    1.1 Strings & comments, along with the keyword AT
2. All other keywords, once the remaining spaces have been skipped

Thereafter one has to scan just a few more tokens, i.e. names &
delimiters, a task too simple to need a generated lexer (L2).

With a proprietary tool (Syntaxis) that supports cascaded scanners, I
could model a solution that supports both forms (space optional or not)
by combining 1 and 2, and then, for space efficiency, just reusing L1
instead of L2 for the remaining tokens. The resulting parser handles
several examples, but in any case the implementation is too fresh to be
considered stable.

Obviously one could hand-code a lexer by following the above 3 phases.
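
A rough hand-coded sketch of the three phases in Python, under several
assumptions that are not taken from the manual: the keyword table is a
small hypothetical subset, REM and DATA are only recognized at the start
of a statement, the special treatment of AT is left out, and all names
(PROTECTED, REST, split_keywords, tokenize) are made up for
illustration. It is not the Syntaxis-based solution described above.

```python
import re

# Hypothetical subset of reserved words, longest first so that a longer
# keyword wins over a shorter one starting at the same position.
KEYWORDS = sorted(
    ["PRINT", "GOTO", "THEN", "HIMEM:", "ABS", "END", "IF"],
    key=len, reverse=True,
)

# Phase 1: chunks whose inner spaces must survive. REM and DATA tolerate
# embedded spaces and are only looked for at a statement start (assumption).
STMT_START = r'(?:^|(?<=:)) *'
PROTECTED = re.compile(
    r'(?P<string>"[^"]*"?)'                      # string, maybe unterminated
    rf'|(?P<rem>{STMT_START}R *E *M.*)'          # comment runs to end of line
    rf'|(?P<data>{STMT_START}D *A *T *A(?:"[^"]*"?|[^:"])*)',  # to unquoted ':'
    re.IGNORECASE,
)

# Phase 3: what is left is a number, a name, or a one-character delimiter.
REST = re.compile(
    r'(?P<num>\d+(?:\.\d*)?|\.\d+)'
    r'|(?P<name>[A-Z][A-Z0-9]*[$%]?)'
    r'|(?P<delim>.)'
)


def split_keywords(text):
    """Phase 2: drop spaces, then cut out keywords left to right."""
    s = text.replace(" ", "").upper()
    out, plain, i = [], "", 0
    while i < len(s):
        kw = next((k for k in KEYWORDS if s.startswith(k, i)), None)
        if kw is None:
            plain += s[i]
            i += 1
            continue
        if plain:
            out.append(("plain", plain))
            plain = ""
        out.append(("kw", kw))
        i += len(kw)
    if plain:
        out.append(("plain", plain))
    return out


def tokenize(line):
    """Run the three phases over one program line (line number excluded)."""
    chunks, pos = [], 0
    for m in PROTECTED.finditer(line):           # phase 1
        if m.start() > pos:
            chunks.append(("raw", line[pos:m.start()]))
        text = m.group() if m.lastgroup == "string" else m.group().lstrip()
        chunks.append((m.lastgroup, text))
        pos = m.end()
    if pos < len(line):
        chunks.append(("raw", line[pos:]))

    tokens = []
    for kind, text in chunks:
        if kind != "raw":
            tokens.append((kind, text))          # inner spaces preserved
            continue
        for sub_kind, sub in split_keywords(text):   # phase 2
            if sub_kind == "kw":
                tokens.append(("kw", sub))
            else:                                    # phase 3
                tokens.extend((m.lastgroup, m.group()) for m in REST.finditer(sub))
    return tokens


for tok in tokenize('PRINT "A B": REM hi there'):
    print(tok)
```

With this sketch, DIFF = ABS(A(I)-N) comes out as the name D, the
keyword IF, the name F and so on, which matches the listing quoted
earlier in the thread. The phases are kept as separate passes only to
mirror the description; a real implementation could fold them into one
scanner.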

Ev. Drikos