HP-42S Compiler for Niklaus Wirth's PL/0 Language
02-21-2024, 12:35 PM
Post: #8
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language
I've used Flex and Bison for many years, including use in the microassembler packaged with Nonpareil. Most of this was in C, and I haven't yet converted any of those to use the native C++ support in Flex or Bison. I wasn't previously aware of RE/flex, but it looks quite appealing since much of my development is now in C++. Thanks for pointing it out!
I've also in the last few years used some PEG parsers, including pyparsing for Python and PEGTL for C++, mostly in my more recent, non-calculator-related programs. pyparsing has good support for ignoring whitespace between tokens (which can of course be disabled), but unfortunately PEGTL does not.
For those not familiar with PEG (Parsing Expression Grammars): PEG is interesting in that it's generally used with a single rule set that does both tokenization (as might be done by Flex) and parsing (as might be done with Bison). While one could use two layers of PEG, one for scanning and one for parsing, I've never seen that done.
Debugging a PEG parser can be challenging because PEG doesn't detect anything like the shift/reduce or reduce/reduce conflicts that Yacc or Bison report. PEG uses the first matching production: the choice operator is ordered, so earlier alternatives take priority. Once it finds that first matching production, it does not care about, nor warn about, ambiguity, because the ordering will always disambiguate cases where multiple productions could match. This is simultaneously a blessing and a curse.
Trying to adapt LR or LALR grammars for any non-trivial language to PEG, or writing a PEG grammar from an LR or LALR mindset, yields much frustration, because the syntax of productions is basically the same, but there's that huge difference in semantics. If you're accustomed to Yacc or Bison, it takes a lot of getting used to.
For instance, a grammar to parse C-style unsigned integers in decimal, octal with a leading zero, or hexadecimal with a leading "0x" doesn't work if you write it as:
    dec_lit: [0-9]+
    oct_lit: 0[0-7]+
    hex_lit: 0x[0-9a-fA-F]+
    lit: dec_lit | oct_lit | hex_lit
That will interpret intended octal literals as decimal, and intended hex literals as a decimal 0 with the x and subsequent digits not consumed as part of lit. If you reverse the order of the alternatives in the lit production, then it will work as desired. You may be able to guess how I learned this. :-)
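As a rough illustration of the ordering issue with the integer-literal grammar, here is a minimal Python sketch. It does not use pyparsing or PEGTL; it only imitates PEG-style ordered choice with the standard `re` module. The helper `ordered_choice` and the names `broken_lit` and `fixed_lit` are hypothetical and exist only for this example, and the octal rule is assumed to accept digits 0-7.

```python
import re

# Each rule is a (name, compiled regex) pair; the names mirror the grammar in the post.
DEC_LIT = ("dec_lit", re.compile(r"[0-9]+"))
OCT_LIT = ("oct_lit", re.compile(r"0[0-7]+"))  # assumption: octal digits are 0-7
HEX_LIT = ("hex_lit", re.compile(r"0x[0-9a-fA-F]+"))


def ordered_choice(alternatives, text, pos=0):
    """Imitate PEG ordered choice: commit to the first alternative that matches.

    Returns (rule_name, matched_text, unconsumed_rest), or None if no
    alternative matches at pos. Later alternatives are never tried once
    an earlier one succeeds, and no ambiguity is reported.
    """
    for name, pattern in alternatives:
        m = pattern.match(text, pos)
        if m:
            return name, m.group(0), text[m.end():]
    return None


# Order as first written: dec_lit shadows the other two alternatives.
broken_lit = [DEC_LIT, OCT_LIT, HEX_LIT]
# Reversed order: the alternatives with distinguishing prefixes are tried first.
fixed_lit = [HEX_LIT, OCT_LIT, DEC_LIT]

for source in ("0x1F", "017", "42"):
    print(source, ordered_choice(broken_lit, source), ordered_choice(fixed_lit, source))
```

With `broken_lit`, the input `0x1F` is accepted as `dec_lit` matching only `0`, leaving `x1F` unconsumed, and `017` is accepted as `dec_lit`. With `fixed_lit`, the same inputs come back as `hex_lit` and `oct_lit`, while `42` is still `dec_lit`.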
Messages In This Thread
HP-42S Compiler for Niklaus Wirth's PL/0 Language - Thomas Klemm - 01-14-2024, 04:01 PM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - Thomas Klemm - 02-17-2024, 10:41 AM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - robve - 02-17-2024, 05:02 PM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - Thomas Klemm - 02-18-2024, 09:32 AM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - floppy - 02-18-2024, 11:42 AM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - robve - 02-18-2024, 02:59 PM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - brouhaha - 02-21-2024, 12:35 PM
RE: HP-42S Compiler for Niklaus Wirth's PL/0 Language - BruceH - 02-18-2024, 12:45 PM