Post Reply 
HP-42S Compiler for Niklaus Wirth's PL/0 Language
01-14-2024, 04:01 PM
Post: #1
HP-42S Compiler for Niklaus Wirth's PL/0 Language
Introduction

Recently I stumbled upon the book Compilerbau by Niklaus Wirth.
A small language PL/0 is used to illustrate how to write an interpreter and a compiler.
For the compiler a stack-based machine is assumed.
Thus I wondered if an HP-calculator could be used as target.

There's a PL/0 Language Tools project but it appears to be abandoned.
The last commit on the GitHub repository was made 7 years ago.
It is still based on Python 2 which is now dead.

The Language PL/0

It is a simplified variant of Pascal with the following EBNF:
Code:
program =       block "." .
block =         ["CONST" ident "=" number {"," ident "=" number} ";"] 
                ["VAR" ident {"," ident} ";"]
                {"PROCEDURE" ident ";" block ";"} statement.
statement =     [ident ":=" expression | "CALL" ident |
                "?" ident | "!" expression |
                "BEGIN" statement {";" statement} "END" |
                "IF" condition "THEN" statement |
                "WHILE" condition "DO" statement] .
condition =     "ODD" expression |
                expression ("=" | "#" | "<" | "<=" | ">" | ">=") expression .
expression =    ["+" | "-"] term {("+" | "-") term} .
term =          factor {("*" | "/") factor} .
factor =        ident | number | "(" expression ")" .

Installation and Setup

Download Project

Download the ZIP-file and unzip it.

unzip pl0-compiler.zip

It should contain the following files:
Code:
Archive:  pl0-compiler.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     3427  01-14-2024 14:50   pl0_lexer.py
    12754  01-14-2024 14:52   pl0_parser.py
     4393  01-14-2024 12:04   pl0_node_visitor.py
     5141  01-14-2024 14:48   pl0_interpreter.py
     6240  01-14-2024 14:52   pl0_compiler.py
     5860  01-14-2024 14:50   pl0_hp42s.py
      347  01-14-2024 13:53   examples/combination.pl0
      153  01-14-2024 11:05   examples/constants.pl0
      328  01-14-2024 11:05   examples/fibonacci.pl0
      232  01-14-2024 12:51   examples/gcd.pl0
      133  01-14-2024 15:07   examples/multiply.pl0
      265  01-14-2024 13:42   examples/permutation.pl0
      234  01-14-2024 11:05   examples/scope.pl0
      232  01-14-2024 11:05   examples/square.pl0
---------                     -------
    39739                     14 files

Setup the Virtual Environment

python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install ply


Programs

Some example programs can be found in the examples directory.

Fibonacci

Code:
# This program calculates the first K values of the fibonacci sequence.

CONST K = 20;

VAR m, n, k, count;

BEGIN
    m := 1;
    n := 1;
    k := 1;
    count := 0;
    
    WHILE count <= K DO
    BEGIN
        ! k;
        
        k := n;
        n := m + n;
        m := k;
        
        count := count + 1
    END
END.

Compile to HP-42S

The program pl0_hp42s.py is based on pl0_compiler.py and creates commands for the HP-42S:

python pl0_hp42s.py examples/fibonacci.pl0

Code:
GTO 00     # main
# Register 01: m
# Register 02: n
# Register 03: k
# Register 04: count
LBL 00     # main
1
STO 01     # m
1
STO 02     # n
1
STO 03     # k
0
STO 04     # count
LBL 01     # while loop
20
RCL 04     # count
X>Y?
GTO 02     # end while loop if true
RCL 03     # k
PROMPT
RCL 02     # n
STO 03     # k
RCL 01     # m
RCL 02     # n
+
STO 02     # n
RCL 03     # k
STO 01     # m
RCL 04     # count
1
+
STO 04     # count
GTO 01     # loop again
LBL 02     # end while loop
END        # main

Caveat

Input Output

For input and output use ? and !:

?n; ?k; CALL perm; !p


Syntax

The syntax for comparisons is slightly changed.
Instead of = for equal use ==.
Instead of # for not equal use !=.

Integer Division

It is assumed that the machine uses only integers.
Therefore division results in integers.
However I find it more useful to use the ordinary division on an HP-42S.
It is easy to include the IP command if that is needed.
Or then replace ÷ by BASE÷ in the code:
Code:
            elif term[0] == "DIVIDE":
                print("÷")

Recursion

Due to the limited size of the return stack of the HP-42S I assume that recursion doesn't really work.

Scope of Variables

The same variable name can be used local to a procedure but these are still mapped to global registers.
Cf. examples/scope.pl0


.zip  pl0-compiler.zip (Size: 14.87 KB / Downloads: 11)
Find all posts by this user
Quote this message in a reply
02-17-2024, 10:41 AM
Post: #2
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
In his talk "Reinventing the Parser Generator" David Beazley explains his ply project that is used here:


Find all posts by this user
Quote this message in a reply
02-17-2024, 05:02 PM
Post: #3
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
(02-17-2024 10:41 AM)Thomas Klemm Wrote:  In his talk "Reinventing the Parser Generator" David Beazley explains his ply project that is used here:

That was fun to watch Smile Especially the part where he said he hopes the ply project dies...

"Reinventing the Parser Generator" is a big stretch IMO. More like "Implementing Lex and Yacc in Python and how to Mess it Up" Big Grin

Seriously. Ply doesn't even warn when token regex patterns are subsumed, only for us to find out the tokenizer we wrote does not work? Hacking around a way to define a "PRINT" token to not match ID is pure evil. And what about LALR conflicts? Yacc and Bison not only warn, but also output annotated LALR states (dot rules) that help identify the problem to correct. Ambiguous grammars lead to all sorts of nasty problems. Disingenuous to disparage the Dragon book. It's all there and better. Don't teach students tools to use. Teach them how to write them and write them well by teaching the beauty of the underlying algebraic structures and corresponding algorithms. Then use tools, because you don't have to reinvent them. Unless you can significantly improve them (good luck).

- Rob

"I count on old friends" -- HP 71B,Prime|Ti VOY200,Nspire CXII CAS|Casio fx-CG50...|Sharp PC-G850,E500,2500,1500,14xx,13xx,12xx...
Visit this user's website Find all posts by this user
Quote this message in a reply
02-18-2024, 09:32 AM
Post: #4
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
(02-17-2024 05:02 PM)robve Wrote:  Teach them how to write them and write them well by teaching the beauty of the underlying algebraic structures and corresponding algorithms.

What tools would you suggest if we want to compile a language to the HP-42S?
  • recursive decent parser
  • lex & yacc
  • ANTLR
  • parser combinators

Something totally different?

And which language would you use?
  • PL/0
  • Oberon-0
  • Python
  • Lua
  • Lisp
  • FORTH

Find all posts by this user
Quote this message in a reply
02-18-2024, 11:42 AM
Post: #5
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
prolog.
a prolog interpreter for lisp https://www.metalevel.at/lisprolog/

HP71 4TH/ASM & Multimod, HP41CV/X & Nov64d, PILBOX, HP-IL 821.62A & 64A & 66A, Deb11 64b-PC & PI2 3 4 w/ ILPER, VIDEO80, V41 & EMU71, DM41X, HP75D
Find all posts by this user
Quote this message in a reply
02-18-2024, 12:45 PM
Post: #6
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
There are also Parsing Expression Grammars and, for some reason, Lua seems to be the most popular language to implement them in.

There's an intro here but I've not gone through it closely so no guaranties as to its quality.
Find all posts by this user
Quote this message in a reply
02-18-2024, 02:59 PM
Post: #7
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
(02-18-2024 11:42 AM)floppy Wrote:  prolog.
a prolog interpreter for lisp https://www.metalevel.at/lisprolog/

Thanks for sharing. I like these kind of projects very much, that are a little different than the usual PL projects.

Here's a lazy functional language interpreter written in Prolog: Husky (disclaimer: which I wrote in the 90s).

On the topic of Lisp: a lua-to-lisp transpiler with the transpiler written in lex and yacc (actually, RE/flex and Bison, but that's just about the same). RE/flex does a lot more than lex (and flex) though, to make lexical analysis easier to specify and implement in C++.

Pick what you like:
  • lex/yacc (and proper variants) produce critical warnings and details to help correct lexical and grammatical issues, yacc and bison cover a wide range of LR grammars
  • recursive descent (for LL grammars) is generally only used for parsing small languages (class project PL are typically small, also PL/0 and Pascal are not that large), although C# was entirely written in R-D, which is not trivial
  • ANTLR is OK (but it is essentially R-D for LL grammars with some tweaks)
  • not so much heard of, but historically relevant: Prolog DCG to quickly implement recursive descent parsers
  • PEGs are great and a good choice
  • operator-precedence grammars for expressions (e.g. algebraic expressions, also Prolog is mostly an OP grammar), covers a subset of LR(1) and are very easy to implement as a general driver engine with a table that is modifiable at parsing time (like Prolog) to change/define operator precedence/associativity on the fly

Deterministic context-free LR(1) (LALR) grammars are far more common than LL. LL requires a rewrite to remove left recursion and also to explicitly deal with (or remove) ambiguity. PEGs originate from LL parsing (i.e. top down) but take a different approach to resolve some issues with LL.

The list of tools and choices is extensive and covers a wide range of PL to implement a parser.

A hand-written R-D parser for a tiny PL is fun and nice to implement too, especially if the grammar is already defined as a R-D grammar (LL, top down) and was tested, so you're not getting stuck in a long rinse-repeat cycle of testing and tweaking the grammar/parser. It is often a good choice to implement parsing on embedded devices and yesteryears 8-bit systems.

- Rob

"I count on old friends" -- HP 71B,Prime|Ti VOY200,Nspire CXII CAS|Casio fx-CG50...|Sharp PC-G850,E500,2500,1500,14xx,13xx,12xx...
Visit this user's website Find all posts by this user
Quote this message in a reply
02-21-2024, 12:35 PM
Post: #8
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
I've used Flex and Bison for many years, including use in the microassembler packaged with Nonpareil. Most of this was in C, and I haven't yet converted any of those to use the native C++ support in Flex or Bison. I wasn't previously aware of RE/flex, but it looks quite appealing since much of my development is now in C++. Thanks for pointing it out!

I've also in the last yew years used some PEG parsers, including pyparsing for Python, and PEGTL for C++. I've used PEG for some of my more recent, non-calculator-related programs. pyparsing has some good support for ignoring whitespace between tokens (which can of course be disabled), but unfortunately PEGTL does not.

For those not familiar with PEG (Precedence Expression Grammars):

PEG is interesting in that it's generally used with a single rule set that does both tokenization (as might be done by flex) along with parsing (as might be done with bison). While one could use two layers of PEG, one for scanning and one for parsing, I've never seen that done.

Debugging a PEG parser can be challenging because PEG doesn't detect shift/shift or shift/reduce conflicts. PEG uses the first matching production (hence the "Precedence"). Once it finds that first matching production, it does not care about, nor warn about, ambiguity, because the ordering always will disambiguate cases where multiple productions could match. This is simultaneously a blessing and a curse. Trying to adapt LR or LALR rgrammars for any non-trivial language to PEG, or writing a PEG grammar from an LR or LALR mindset, yields much frustration, because the syntax of productions is basically the same, but there's that huge difference in semantics. If you're accustomed to Yacc or Bison, it takes a lot of getting used to. For instance, a grammar to parse C style unsigned integers in decimal, octal with a leading zero, or hexadecimal with a leading "0x", doesn't work if you write it as:

dec_lit: [0-9]+
oct_lit: 0[0-9]+
hex_lit: 0x[0-9a-fA-F]+
lit: dec_lit | oct_lit | hex_lit

That will interpret intenxed octal literals as decimal, and intended hex literals as a decimal 0 with the x and subsequent digits not consumed as part of lit.

If you reverse the order of the alternatives in the lit production, then it will work as desired.

You may be able to guess how I learned this.
:-)
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 1 Guest(s)