FORTH for the SHARP PC-E500 (S)
10-06-2021, 02:41 PM
Post: #21
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-05-2021 01:02 PM)Klaus Overhage Wrote:  In RUN mode: COPY "COM:" TO "E:debugger.fth",A
In Forth500: INCLUDE debugger.fth
-- after 90 second --

Exception #-13 stands for "undefined word"

The large debugger.fth is the one and original code example that Sébastien wrote for pceForth. It meta-interprets Forth to debug Forth. I made a minor change to it, but have not yet tested it. The debugger.fth file has a disclaimer:

\ Updated to Forth500 but may need some more testing!

When an error occurs, the error occurs in the definition of the first word displayed by WORDS. The definition is incomplete, so it cannot be executed. But WORDS displays all words, including incomplete ones.

There should be a better way by adding some code to display the -13 error message with the word that wasn't found. I will add that to my TODO list. The parsed input from a file is stored in the FIB buffer at position >IN.

Also, INCLUDE does not catch exceptions so the file may still be open and not closed. The file must be closed to read it again, e.g. try 4 CLOSE-FILE to close the fileid (typically 4). I will look into this too, to close automatically.

It would be nice to add more examples and additions to Forth500. For example, implementing SEE would be nice, e.g. to load from see.fth. These words should be optional, otherwise adding these as built-ins eats away free space. On an unexpanded machine the Forth500 free space will be about 7K after adding the floating point words. I don't want to push that down much lower. It was possible to reduce Forth500 below the 20628 bytes that pceForth required. I am pleased with that, because Forth500 adds words and new features (see the changelog in the repo).

- Rob

"I can count on my friends" -- HP 71B,Prime|Ti VOY200,Nspire CXII CAS|Casio fx-CG50...|Sharp PC-G850,E500,2500,1500,14xx,13xx,12xx...
10-06-2021, 11:35 PM
Post: #22
 Helix Member Posts: 225 Joined: Dec 2013
RE: FORTH for the SHARP PC-E500 (S)
(10-06-2021 02:41 PM)robve Wrote:  When an error occurs, the error occurs in the definition of the first word displayed by WORDS. The definition is incomplete, so it cannot be executed. But WORDS displays all words, including incomplete ones.

I've noticed that if there is an error in a definition, the corresponding word cannot be deleted from the dictionary with FORGET. I have to delete the previous valid definition if I want to get rid of this false entry. Is this normal?

(10-06-2021 02:41 PM)robve Wrote:  It would be nice to add more examples and additions to Forth500. For example, implementing SEE would be nice, e.g. to load from see.fth. These words should be optional, otherwise adding these as built-ins eats away free space. On an unexpanded machine the Forth500 free space will be about 7K after adding the floating point words. I don't want to push that down much lower. It was possible to reduce Forth500 below the 20628 bytes that pceForth required. I am pleased with that, because Forth500 adds words and new features (see the changelog in the repo).

I too am in favor of simple systems. When I tried different Forth packages some years ago, I found that F-PC Forth has 1523 words! I think this defeats the purpose of Forth.
Adding optional definitions for those who are interested is a better solution.

Jean-Charles
10-07-2021, 05:55 PM
Post: #23
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-06-2021 11:35 PM)Helix Wrote:  I've noticed that if there is an error in a definition, the corresponding word cannot be deleted from the dictionary with FORGET. I have to delete the previous valid definition if I want to get rid of this false entry. Is this normal?

Good point! Yep, this is normal. A word cannot be found (with FORGET, ' tick etc.) if it is hidden. To unhide and delete the last definition, use REVEAL then FORGET. Incomplete definitions are hidden to prevent accidentally running them, which would lead to a crash obviously. WORDS still shows all hidden words but FORGET can't find them.

FORGET implementations in Forth may slightly differ in this respect, but FORGET is considered obsolescent by the standard anyway. However, FORGET is still very useful as an easy way to redo a definition interactively. In general, placing a MARKER is preferred to delete code (see MARKER and ANEW). FORGET was removed by Sébastien from his pceForth (it was commented out). I rewrote parts of it to correct a bug in FORGET that caused a dictionary memory leak.

To go back to the question about improving error reporting, the word that caused the exception should be shown to the user with some context. This should be easy to implement by changing the last part of (ERROR) to report the location of the error on the line by showing the line up to and including the word that caused the error:

SOURCE >IN @ UMIN TYPE ." << exception #" S>D (D.) TYPE

where SOURCE returns the address and length of the input buffer (TIB or FIB) and >IN holds the offset in this buffer of the next word after the last word executed. This will report the error in user input and in source files (albeit without a line number, alas).
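For readers who want to try the idea off-device, here is a minimal Python sketch of the same report. It is only a model: `error_context`, `line` and `to_in` are made-up names, and the parse offset is assumed to sit just past the offending word.

```python
def error_context(source: str, to_in: int, code: int) -> str:
    """Return the input line up to the parse offset, followed by the exception number."""
    # Clamp the offset to the buffer length, as UMIN does in the Forth line.
    shown = source[:min(to_in, len(source))]
    return f"{shown} << exception #{code}"

line = ": test 1 2 foo + ;"
# Hypothetical parse offset: just past the undefined word "foo".
to_in = line.index("foo") + len("foo")
print(error_context(line, to_in, -13))
```

The clamp matters when the offset has already run past the end of the buffer, for example after the last word on a line.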

Closing a file after INCLUDE should be done as follows by catching INTERPRET exceptions in a new definition of INCLUDE-FILE:

Code:
: INCLUDE-FILE
    save-input n>r
    to source-id
    begin
      refill
    while
      ['] interpret catch ?dup if
        source-id close-file drop
        nr> restore-input drop      \ restore the saved input before rethrowing
        throw
      then
    repeat
    source-id close-file drop
    nr> restore-input drop ;

This new definition uses updated SAVE-INPUT and RESTORE-INPUT combined with N>R and NR>, making the code of INCLUDE-FILE and EVALUATE more compact, thus saving some memory.
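The control flow (interpret line by line, and on any exception close the file before passing the exception on) can be sketched in Python as follows. This is an analogy only, not Forth500 code: `include_file` and the toy `interpret` are hypothetical, and an in-memory file stands in for a file on the E: drive.

```python
import io

def include_file(f, interpret):
    """Interpret an open file line by line; always close it, then re-raise on error."""
    try:
        for line in f:                      # REFILL ... WHILE
            interpret(line.rstrip("\n"))    # ['] INTERPRET CATCH
    except Exception:
        f.close()                           # SOURCE-ID CLOSE-FILE DROP
        raise                               # THROW
    f.close()                               # normal end of file

seen = []

def interpret(line):
    # Hypothetical interpreter: a line starting with "bad" is an undefined word.
    if line.startswith("bad"):
        raise LookupError("exception #-13")
    seen.append(line)

src = io.StringIO(": ok 1 ;\nbad-word\n: never 2 ;\n")
try:
    include_file(src, interpret)
except LookupError as err:
    print("caught", err, "- file closed:", src.closed)
```

Because the file is closed on the error path as well, it can be reopened and read again without a manual CLOSE-FILE.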

(10-06-2021 11:35 PM)Helix Wrote:  I too am in favor of simple systems. When I tried different Forth packages some years ago, I found that F-PC Forth has 1523 words! I think this defeats the purpose of Forth.
Adding optional definitions for those who are interested is a better solution.

Wow. 1523 words is way too much and unnecessary for most applications. However, at least we should incorporate the most useful standard word sets in Forth500 and include E500-specific words for graphics, sound and the file system. It is always possible to load extra words from source files.

I'd like to add that at this point anyone interested in this project can suggest improvements and additions to the Forth500 core, as long as there is sufficient space for user programs. Placing the additions in source files to load on demand is probably best.

As a note about standard Forth compliance, I noticed that there is no REQUIRE and REQUIRED implemented yet in Forth500. So I came up with the following quick-and-dirty implementation that simply stores the filename in the dictionary with a leading space in the name:

Code:
: REQUIRED
    which-pocket dup>r bl over c! 1+ swap dup 1+ >r cmove
    2r@ find-word nip if r>drop r>drop exit then
    2r@ included
    2r> (created) ;

: REQUIRE
    parse-name required ;

Adding a leading space means that the filename word in the dictionary cannot clash with other words and cannot even be executed by accident. This word will be added after the file was successfully INCLUDED. ANEW and MARKER in the loaded file will also delete the filename, thus the next REQUIRE will load the file again if it was deleted from memory. It's a bit of a hack, but should work I believe. Note that the code above uses a new built-in system word (CREATED), which I added as a replacement of (LINK) and (NAME).
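The bookkeeping can be modelled in a few lines of Python. This is a rough sketch under stated assumptions: the dictionary is a plain list, `included` only pretends to compile a file, and `forget_from` stands in for what MARKER or ANEW would do. None of these names exist in Forth500.

```python
dictionary = []      # newest entry last, like the linked Forth dictionary
load_count = {}      # how often each file was really loaded

def included(name):
    # Stand-in for INCLUDED: pretend the file defines one word.
    load_count[name] = load_count.get(name, 0) + 1
    dictionary.append(f"word-from-{name}")

def required(name):
    marker = " " + name            # leading space: cannot clash with a parsed word
    if marker in dictionary:
        return False               # already loaded, nothing to do
    included(name)
    dictionary.append(marker)      # recorded only after a successful load
    return True

def forget_from(index):
    # Stand-in for MARKER/ANEW: drop everything defined from `index` onward.
    del dictionary[index:]

start = len(dictionary)
first = required("see.fth")        # loads the file
second = required("see.fth")       # marker found, file is skipped
forget_from(start)                 # removes the words and the marker alike
third = required("see.fth")        # loads the file again
print(first, second, third, load_count["see.fth"])
```

Since the marker entry is added after the file's own words, anything that rolls the dictionary back past those words removes the marker too, which is why the third call loads the file again.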

Updating the code and testing all of the additions and improvements will take a bit of time, but hopefully not too long.

I also would like to run the NQUEENS benchmark as suggested by xerxes. To this end, based on my understanding of the NQUEENS benchmarks I changed the NQUEENS Forth code slightly to use "nicer" standard Forth constructs, such as VALUE and by using POSTPONE to compile inlined versions of RCLAA and STOAA instead of calling them (disclaimer: this is yet untested on my end):

Code:
ANEW _NQUEENS_

8 CONSTANT RR
0 VALUE SS
0 VALUE XX
0 VALUE YY

CREATE AA RR 1+ ALLOT

: RCLAA POSTPONE AA POSTPONE + POSTPONE C@ ; IMMEDIATE
: STOAA POSTPONE AA POSTPONE + POSTPONE C! ; IMMEDIATE

: NQCORE
  0 TO SS
  0 TO XX
  BEGIN
    1 +TO XX RR XX STOAA
    BEGIN
      1 +TO SS
      XX TO YY
      BEGIN YY 1 > WHILE
        -1 +TO YY
        XX RCLAA YY RCLAA - DUP
        0= SWAP ABS XX YY - = OR IF
          0 TO YY
          BEGIN XX RCLAA 1- DUP XX STOAA 0= WHILE
            -1 +TO XX
          REPEAT
        THEN
      REPEAT
    YY 1 = UNTIL
  RR XX = UNTIL
;

: NQUEENS
  ( STARTTIMER )
  NQCORE
  ." S=" SS .
  ( DISPLAYTIMER ) CR
;
NQCORE should be run in a loop to execute multiple times to get a manual stopwatch timing (the E500 has no RTC or system clock).
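To cross-check the logic on a desktop before timing it on the E500, the same loop structure can be ported to Python. This is an unofficial port for checking only: `nqueens`, `a`, `s`, `x` and `y` mirror NQCORE, AA, SS, XX and YY.

```python
def nqueens(r=8):
    """Port of NQCORE: returns the step count and the first placement found."""
    a = [0] * (r + 1)          # a[1..r]: column of the queen in each row (AA)
    s = 0                      # step counter (SS)
    x = 0                      # current row (XX)
    while True:
        x += 1
        a[x] = r               # start each new row at the highest column
        while True:
            s += 1
            y = x
            while y > 1:
                y -= 1
                t = a[x] - a[y]
                if t == 0 or abs(t) == x - y:   # same column or same diagonal
                    y = 0                       # conflict: leave the check loop
                    a[x] -= 1                   # try the next column down
                    while a[x] == 0:            # row exhausted: backtrack
                        x -= 1
                        a[x] -= 1
            if y == 1:         # checked against all earlier rows without conflict
                break
        if x == r:             # all rows placed
            break
    return s, a[1:]

steps, queens = nqueens(8)
print("S=", steps, "queens:", queens)
```

If the Forth version is correct, the value it prints for S should agree with `steps` from this port.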

There are still a few speed and code size optimizations to make in Forth500, which can affect this benchmark. Safety versus performance is an important consideration. For example, I opted to test for the BREAK key (+15 CPU cycles) and to check stack overflows (+19 CPU cycles) but only when necessary and not too frequently. BREAK key tests are only done when a colon definition is called and in loops. Stack overflow checks are done only in loops and when interpreting Forth code from an input source to keep the overhead low. I also reduced the overhead of stack checking to just 19 cycles with some coding tricks. Removing these tests speeds things up, but at the cost of possible runaway programs when a coding mistake was made.

Inspiration for picking up pceForth to create an updated version Forth500 came from SuperForth and Forth for HP-71B. Back in the 80s my first encounter and tryout with Forth was SuperForth for the QL by Garry Jackson. I learned the language, studied the Forth implementation in detail and wrote some stuff, but found the lack of file access to load files wanting. It supported blocks, which even at that time felt like a huge step back to ancient times. So no blocks in Forth500, but blocks can be added from a source file if necessary.

There will be more to come soon, hopefully in a couple of days when I'm back to work on this project.

- Rob

10-07-2021, 11:48 PM
Post: #24
 Helix Member Posts: 225 Joined: Dec 2013
RE: FORTH for the SHARP PC-E500 (S)
(10-07-2021 05:55 PM)robve Wrote:  Good point! Yep, this is normal. A word cannot be found (with FORGET, ' tick etc.) if it is hidden. To unhide and delete the last definition, use REVEAL then FORGET. Incomplete definitions are hidden to prevent accidentally running them, which would lead to a crash obviously. WORDS still shows all hidden words but FORGET can't find them.

Thank you! I still have a lot to learn about Forth.

Jean-Charles
10-08-2021, 01:04 PM
Post: #25
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-05-2021 01:02 PM)Klaus Overhage Wrote:  Thank you Helix for BINTOTXT.EXE. From Forth500.bin it directly generates the 71k byte text file required for the MBSharpNotepad. And with your tip in the OPEN command to replace the parameter C with L, I can now use your original BASIC program except for this small change. The runtime has surprisingly remained at 18 minutes, it is probably given by MBSharpNotepad.

A 71KB file takes some time to load, but I'm surprised it takes 18 minutes. Have you set SIO to 9600 baud? If this can't be done faster, then I'm sticking with the cassette transfer method that takes 90 seconds.

(10-05-2021 01:02 PM)Klaus Overhage Wrote:  Next I tried to load the file debugger.fth from the folder "additions".

In RUN mode: COPY "COM:" TO "E:debugger.fth",A
In Forth500: INCLUDE debugger.fth
-- after 90 second --

The dictionary search is not optimized in the original code. As a consequence the loading and compilation of Forth takes some time and the program you are loading is not small. The original pceForth code compares the word length and if equal compares the word's names. I've made the comparison case insensitive. This adds only a few cycles with some clever bit bashing in assembly and won't add overhead that is noticeable, because the chars compared typically differ in their lower 5 bits that are checked first:

Code:
                mv      (!el),il                ; Set the counter
lbl4:           mv      il,[x++]                ; Read next character of the current word string
                mv      a,[y++]                 ; Read next character of the searched string
                sub     a,il                    ; Compare the characters
                jrz     lbl5
; CASE-INSENSITIVE FIND-WORD (COMMENT OUT FOR CASE-SENSITIVE FIND-WORD)
                test    a,$1f                   ; If not the same 32-byte block ASCII offset, no match
                jrnz    lbl4a
                add     a,il                    ; Restore character
                or      a,$20                   ; Make it lower case
                cmp     a,'a'                   ; If less than 'a', no match
                jrc     lbl4a
                cmp     a,'{'                   ; If greater than 'z', no match
                jrnc    lbl4a
                sub     a,il                    ; Compare the characters again,
                test    a,$c0                   ; but this time with a case-insensitive match
                jrz     lbl5
; END CASE-INSENSITIVE FIND-WORD

However, the dictionary search can be optimized, like most Forth implementations. For example, the HP-71b limits searching based on the word length, thus checks dictionary entries for words of the same length only. Other implementations use trees or hashing. There are also simple and practical ways to speed up dictionary search, which I will try. For starters, comparing the length and the first character simultaneously to check a dictionary entry will speed things up.

- Rob

10-08-2021, 05:36 PM
Post: #26
 Klaus Overhage Junior Member Posts: 40 Joined: Jan 2016
RE: FORTH for the SHARP PC-E500 (S)
Thank you for all the detailed information. Large files are slower with MBSharpNotepad, but editing and loading normal source texts, whether BASIC or Forth, is really easy with it.

I tried the example on strings at the end of manual.md on two PC-E500s. Both times most of it worked only strlower and strupper not:

name type John Doe
name strlower Exception #-4
name type John Doe
name strupper Exception #-4
name type éohn Doe

Exception #-4: "stack underflow"

Is Forth500 only generated from the assembler source code Forth500.s or are there other Forth source codes that are part of Forth500?
10-08-2021, 07:16 PM
Post: #27
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-08-2021 05:36 PM)Klaus Overhage Wrote:  I tried the example on strings at the end of manual.md on two PC-E500s. Both times most of it worked only strlower and strupper not:

There is a typo in the manual. I had tested these examples on my E500, but I remember adding this example later to the manual but then changed it (bad idea), unfortunately causing this issue. There is a missing DUP and a missing SWAP in strlower/strupper. A missing SWAP can unfortunately cause a dictionary overwrite. Here is the corrected version:

Code:
Additional words to convert characters and string buffers to upper and lower case:

    : toupper   ( char -- char ) DUP 'a '{ WITHIN IF $20 - THEN ;
    : tolower   ( char -- char ) DUP 'A '[ WITHIN IF $20 + THEN ;
    : strupper  ( string len -- ) 0 ?DO DUP I + DUP C@ toupper SWAP C! LOOP DROP ;
    : strlower  ( string len -- ) 0 ?DO DUP I + DUP C@ tolower SWAP C! LOOP DROP ;

For example:

    name strupper name TYPE ↲
    JOHN DOE OK[0]

(10-08-2021 05:36 PM)Klaus Overhage Wrote:  Is Forth500 only generated from the assembler source code Forth500.s or are there other Forth source codes that are part of Forth500?

Everything is generated from the single Forth500.s file. This file implements the entire dictionary. Words are defined in machine code or in "compiled" Forth. I don't like such a large monolithic file like this, but the dictionary linkage across all word definitions is essential.

Eventually it would be nice to add Forth source files for extra words, such as for the SEE word to view compiled Forth definitions. Right now, Forth500 has over 456 built-in words that cover a large portion of the optional standard Forth word sets, not yet counting the 63 words in the float and float-ext word sets to be added soon. All this still fits in about 20KB.

- Rob

10-10-2021, 01:38 AM
Post: #28
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-08-2021 01:04 PM)robve Wrote:  However, the dictionary search can be optimized, like most Forth implementations. For example, the HP-71b limits searching based on the word length, thus checks dictionary entries for words of the same length only. Other implementations use trees or hashing. There are also simple and practical ways to speed up dictionary search, which I will try. For starters, comparing the length and the first character simultaneously to check a dictionary entry will speed things up.

A quick update for those interested in this project, or in Forth, or in the E500's CPU.

The new FIND-WORD assembly code listed further below runs about twice as fast as the old FIND-WORD code (the version shown in the previous post). This means that case-insensitive dictionary searches in Forth500 should speed up quite a bit. Loading and compiling a Forth source file is largely determined by dictionary search speed.

The new CPU cycle stats compared to the old FIND-WORD, expressed in CPU cycles per word compared:

mismatching length: old = 54 cycles, new = 34 cycles
matching length but first characters differ: old = 108 cycles, new = 48 cycles
matching words, character-by-character comparison: old = 53 cycles, new = 43 cycles

The cost of a word length mismatch is 34 cycles. If the length matches, the cost of a first character mismatch is 48 cycles total (i.e. including the length match). Assuming a dictionary size of 519 words (expected with Forth500), this means that a full dictionary search takes 23ms to 32ms or slightly longer, depending on the word being searched:

34x519/768KHz = 23ms
48x519/768KHz = 32ms

For example, an integer value 123 in the Forth source input matches the length of all 3-character words, but matches none of the words that start with a 1, thus taking 48x519 cycles to complete or 32ms. Explanation: all words, including integers, are first searched in the dictionary before being pushed on the stack or compiled as an integer.

The new FIND-WORD assembly, annotated with CPU cycles (disclaimer: this may not be the final version):

Code:
find_word:      dw      to_body
                db      $09
                db      'FIND-WORD'             ; ( c-addr u -- 0 0 | xt 1 | xt -1 )
find_word_xt:   local
                mv      (!gl),a                 ; (gl) holds the string length (length < 64 checked next)
                mv      il,64                   ; Compare the string length
                sub     ba,i                    ; to the max of 63 characters
                popu    ba                      ; BA holds the string address
                pushu   x                       ; Save IP
                jrnc    lbl6                    ; String too long?
                mv      y,!base_address
                add     y,ba                    ; Y holds the string address
                mv      (!fl),[y++]             ; (fl) holds the first character of the string to search
                mv      (!yi),y                 ; (yi) holds the string address + 1
                mv      (!zi),y                 ; Set 2nd byte of (zi) to base address segment $b
                mvw     (!zi),[!last_xt+3]      ; (zi) holds the 20 bit LAST address
;               LOOP OVER DICTIONARY
lbl1:           mv      y,(!yi)         ; 5     ; Y holds the string address + 1
                mv      il,(!gl)        ; 4     ; IL holds the string length
                                        ; =9 cycles
;               NEXT WORD IN THE DICTIONARY
lbl2:           mv      x,(!zi)         ; 5     ; X holds the address of the dictionary entry
                or      (!zi),(!zi+1)   ; 6     ; Check if the address of the dictionary entry is zero
                jrz     lbl6            ; 2/3   ; Dictionary entry address is zero?
                mvw     (!zi),[x++]     ; 7     ; (zi) holds the previous dictionary link address
                mv      ba,[x++]        ; 5     ; A holds the word length and B holds the first character
;               COMPARE STRING LENGTHS
                sub     a,il            ; 3     ; Compare string lengths
                test    a,$7f           ; 3     ; Check string lengths, ignore immediate bit, keep smudge bit to force mismatch
                jrnz    lbl2            ; 2/3   ; String lengths are not the same?
                                        ; =33 cycles +1 for jump if the length does not match
;               COMPARE FIRST CHARACTERS
                ex      a,b             ; 3     ; B holds immediate bit to save for later, A holds first character
                xor     a,(!fl)         ; 4     ; Compare first characters
                jrz     lbl4            ; 2/3   ; First characters match?
                test    a,$df           ; 3     ; Check if case insensitive bits match
                jrnz    lbl2            ; 2/3   ; Case insensitive characters differ?
                                        ; =33+14=47 cycles +1 for jump if the length does not match and the first character did not match
                mv      a,(!fl)         ; 3     ; A holds the first character of the string to search
                or      a,$20           ; 3     ; Make it lower case (if A is a letter, checked next)
                cmp     a,'a'           ; 3
                jrc     lbl2            ; 2/3   ; A is not a letter?
                cmp     a,'{'           ; 3
                jrnc    lbl2            ; 2/3   ; A is not a letter?
                dec     il              ; 3     ; Decrement string length
                jrz     lbl5            ; 2/3   ; String length is zero?
                                        ; =47+22=69 cycles if the length matched and the first character matched
;               LOOP OVER STRINGS TO COMPARE
lbl3:           mv      a,[x++]         ; 4     ; A holds the next character of the word
                mv      (!el),[y++]     ; 6     ; (el) holds the next character of the string to match
                xor     a,(!el)         ; 4     ; Compare characters
                jrz     lbl4            ; 2/3   ; Characters match?
                test    a,$df           ; 3     ; Check if case insensitive bits match
                jrnz    lbl1            ; 2/3   ; Case insensitive characters differ?
                mv      a,(!el)         ; 3     ; A holds the next character of the string to match
                or      a,$20           ; 3     ; Make it lower case (if A is a letter, checked next)
                cmp     a,'a'           ; 3     ; A is not a letter?
                jrc     lbl1            ; 2/3
                cmp     a,'{'           ; 3     ; A is not a letter?
                jrnc    lbl1            ; 2/3
lbl4:           dec     il              ; 3     ; Decrement string length
                jrnz    lbl3            ; 2/3   ; String length is not zero?
                                        ; =43 cycles for each subsequent character matched
;               FOUND A MATCHING WORD IN THE DICTIONARY
lbl5:           add     ba,ba                   ; Check immediate bit stored in B
                mv      ba,x                    ; BA holds the execution token
                popu    x                       ; Restore IP
                pushu   ba                      ; Save new 2OS execution token
                mv      ba,-1                   ; Set new TOS to -1, word is not immediate
                jrnc    lbl7                    ; Immediate bit is unset?
                mv      ba,1                    ; Set new TOS to 1, word is immediate
                jr      lbl7
;               NOT FOUND
lbl6:           popu    x                       ; Restore IP
                sub     ba,ba                   ; Set TOS to zero
                pushu   ba                      ; Set 2OS to zero
lbl7:           jp      !cont__
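
The per-character test in the listing (xor, then test against $df, then the 'a' to '{' range check) can be modelled in Python to see which character pairs it accepts. This is a reading of the listing, not verified against the hardware: `ci_match` and `ci_equal` are names chosen here.

```python
def ci_match(c1: int, c2: int) -> bool:
    """Model of the character test: equal, or letters that differ only in bit 5."""
    diff = c1 ^ c2                  # xor a,(!el)
    if diff == 0:
        return True                 # exact match
    if diff & 0xDF:                 # test a,$df: some bit other than $20 differs
        return False
    lower = c2 | 0x20               # or a,$20
    return ord("a") <= lower < ord("{")   # cmp a,'a' and cmp a,'{'

def ci_equal(word: str, name: str) -> bool:
    """Compare two names the way the loop does, after the length check."""
    if len(word) != len(name):
        return False
    return all(ci_match(ord(p), ord(q)) for p, q in zip(word, name))

print(ci_equal("dup", "DUP"), ci_equal("c@", "c`"), ci_equal("[", "{"))
```

The range check is what stops pairs such as `@` and a backquote, or `[` and `{`, from matching, even though they also differ only in bit 5.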

The new code is only one byte longer when assembled to binary than the old code!
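
The search-time estimates quoted earlier in this post are simple arithmetic and can be reproduced with a short Python sketch. The 519-word dictionary and the 768 kHz rate are the figures from the post; `search_ms` is a made-up helper name.

```python
CLOCK_HZ = 768_000      # cycle rate used in the post
ENTRIES = 519           # expected dictionary size

def search_ms(cycles_per_entry, entries=ENTRIES, clock_hz=CLOCK_HZ):
    """Time in milliseconds to visit every dictionary entry once."""
    return 1000 * cycles_per_entry * entries / clock_hz

for label, old, new in [("length mismatch", 54, 34),
                        ("first character mismatch", 108, 48)]:
    print(f"{label}: old {search_ms(old):.0f} ms, new {search_ms(new):.0f} ms")
```

These are worst-case figures for a word that is not in the dictionary at all, such as a number, since every entry has to be visited.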

- Rob

10-10-2021, 09:11 AM
Post: #29
 Klaus Overhage Junior Member Posts: 40 Joined: Jan 2016
RE: FORTH for the SHARP PC-E500 (S)
I am able to read and understand assembly language. I have already written smaller assembly routines and can certainly learn a lot from you. At the moment, however, I would first like to report on my experiences from the point of view of a Forth500 user.

Your implementation of toupper and tolower shown above does not work for me. It throws exception # -13: undefined word with toupper as the first entry in the dictionary. The old implementation with the addition of DUP and SWAP works:
Code:
: toupper   ( char -- char ) DUP [CHAR] a [CHAR] { WITHIN IF $20 - THEN ;
: tolower   ( char -- char ) DUP [CHAR] A [CHAR] [ WITHIN IF $20 + THEN ;
: strupper  ( string len -- ) 0 ?DO DUP I + DUP C@ toupper SWAP C! LOOP DROP ;
: strlower  ( string len -- ) 0 ?DO DUP I + DUP C@ tolower SWAP C! LOOP DROP ;

I was also able to run the RC4 cipher program from the wikipedia entry for Forth without any problems.
(see https://en.wikipedia.org/wiki/Forth_(pro..._language)

When using WORDS, it happened to me that a BREAK via the ON key crashed the computer. I think that's what happens when the ON button bounces. And there is always an exception #-28 for "user interrupt", which is not so nice. Is it possible to use another key especially for WORDS, for example C-CE, for normal exit without exception? That would help a lot if you just want to look at the new words.
10-10-2021, 01:22 PM (This post was last modified: 10-10-2021 04:48 PM by robve.)
Post: #30
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-10-2021 09:11 AM)Klaus Overhage Wrote:  And there is always an exception #-28 for "user interrupt", which is not so nice. Is it possible to use another key especially for WORDS, for example C-CE, for normal exit without exception?

Good idea! This can also improve a break from FILES.

With respect to your issue with BRK from WORDS, a debounce loop is used. I'm curious what the problem could be. I have not had this problem. Perhaps the timing of the second BRK bounce exceeded the debounce timing, implemented as follows:

Code:
break__:        local
                pre_on
lbl1:           mv      il,$ff                  ; Test if the break
lbl2:           test    ($ff),$08               ; key was intentionally
                jrnz    lbl1                    ; released
                dec     i                       ; (break action is triggered
                jrnz    lbl2                    ; when the break key is released)
                pre_off
                endl
                mv      il,-28                  ; User interrupt

test ($ff),$08 sets the z flag if BRK is not pressed. The debounce time is 4.3ms (13x255 cycles), which is rather short. A typical debounce time is 20ms or longer. Increasing the timing to 20ms should help. Also the inner jrnz lbl1 was changed to reset the debounce counter when a key bounce/hit reoccurs.

With respect to Forth source file loading time, a relatively large file such as debugger.fth should take no more than about 30 seconds to compile with the new FIND-WORD. This can be further reduced to a couple of seconds, but this requires a redesign of the dictionary.

A simple approach is the HP-71b implementation, which does not offer WORDS (or similar). This simplifies the search, because the order of dictionary words does not need to be preserved across the entire dictionary, only the relative order of words with the same name length.

A hybrid approach could work well: limit WORDS to only list the user-defined words (and words loaded with INCLUDE). Built-in words are searched by name length to speed up compilation. This hybrid approach works with FORGET and MARKER, and does not require memory to store trees. Adding a small index table to search built-in words suffices. However, WORDS will not show the built-in words.

- Rob

10-10-2021, 01:36 PM
Post: #31
 Helix Member Posts: 225 Joined: Dec 2013
RE: FORTH for the SHARP PC-E500 (S)
I'm not at all an expert in assembly language, but I find the explanations on how the system works always interesting.

(10-10-2021 09:11 AM)Klaus Overhage Wrote:  When using WORDS, it happened to me that a BREAK via the ON key crashed the computer.

I have no crash with my Sharp. A Break just causes an exception error.

Jean-Charles
10-13-2021, 02:38 AM (This post was last modified: 10-16-2021 01:04 AM by robve.)
Post: #32
 robve Member Posts: 185 Joined: Sep 2020
RE: FORTH for the SHARP PC-E500 (S)
(10-10-2021 01:36 PM)Helix Wrote:  (10-10-2021 09:11 AM)Klaus Overhage Wrote:  When using WORDS, it happened to me that a BREAK via the ON key crashed the computer.

I have no crash with my Sharp. A Break just causes an exception error.

I believe what may have happened is that the missing SWAP in the strupper example overwrote the start of the dictionary that contains the break logic. This caused instability. My bad to leave out the SWAP in the example.

I'll take this opportunity for a quick update. I spent a bit of time to redesign the core Forth interpreter assembly to improve execution speeds. It looks feasible to accelerate Forth500 as follows:

- colon call and return sequence (docol__xt + doret__xt): 22% faster
- fetch-execute (cont__): 13% faster
- deferred word vectoring (dodefer__xt): 23% faster
- constant fetch (docon__xt): 16% faster
- does> execution (does__xt): 17% faster

The redesign uses a RAM register to extend 16 bit addresses to 20 bit by presetting the 3rd byte (high order byte) to the 11th segment $b of the memory address space (the CPU is little endian). This is cheaper than the current method of converting a 16 bit register to a 20 bit register. These 16 to 20 bit conversions happen a lot, because Forth500 cells are 16 bit when the machine is 20 bit.

The register assignments remain the same as before:
20 bit register X holds the IP (instruction pointer)
20 bit register U holds the SP (stack pointer)
20 bit register S holds the RP (return stack pointer)
16 bit registers BA (A low and B high) hold the TOS (top of stack)

Other registers available:
20 bit register Y
16 bit register I, assigning IL (I low) also sets IH (I high) to zero

Internal RAM is addressed as (N) with 8 bit N. Internal RAM can hold 8, 16 and 20 (24) bit values to load/store to/from registers and to/from external RAM.

To cover 16 bit to 20 bit addresses, we load a 16 bit address into a RAM "register", say (yi) and (yi+1) (two bytes internal RAM). We set and keep (yi+2) to $b (segment). To get the 20 bit address we simply load X from (yi).

The changes to the core Forth500 execution words are summarized in this outline:

Code:
yi:             equ     $36
zi:             equ     $39
ps:             equ     $b                      ; 11th segment
base_address:   equ     $b0000                  ; 11th segment address
;-------------------------------------------------------------------------------
                org     $b9000                  ; $b0000 or $b1000 or $b9000 ...
;-------------------------------------------------------------------------------
                pre_off
boot:           ;...
                mv      (!yi+2),!ps             ; Store segment in 3rd byte
                mv      (!zi+2),!ps             ; Store segment in 3rd byte
                ;...
;-------------------------------------------------------------------------------
docol__xt:      mv      i,x             ; 2     ; I holds the IP
                pushs   i               ; 6     ; Push IP (return address)
                pmdf    (!yi),3         ; 4     ; Set new IP
                mv      x,(!yi)         ; 5     = 17 cycles
;---------------
interp__:       pre_on                          ; cycles = 7 + 13 = 20
                test    ($ff),$08       ; 5     ; Is break pushed?
                pre_off
                jrnz    break__         ; 2/3   ; Break was pushed
;---------------                        ; cycles = 13
cont__:         mvw     (!yi),[x++]     ; 7     ; Set (yi) to new execution token
                jp      (!yi)           ; 6     ; Execute new token
;-------------------------------------------------------------------------------
break__:        ;...
;-------------------------------------------------------------------------------
doret__xt:      mvw     (!yi),[s++]     ; 7     ; Pop IP (return address)
                mv      x,(!yi)         ; 5     ; X holds the IP
                mvw     (!yi),[x++]     ; 7     ; Fetch new execution token
                jp      (!yi)           ; 6     = 25 cycles versus 30
;-------------------------------------------------------------------------------
dovar__xt:      pushu   ba              ; 4     ; Save old TOS
                pmdf    (!yi),3         ; 4     ; Set new TOS
                mv      ba,(!yi)        ; 4     ; to the address of the data
                jr      !cont__         ; 3     = 15+13 cycles versus 15+15
;-------------------------------------------------------------------------------
docon__xt:      pushu   ba              ; 4     ; Save TOS
                mv      ba,[(!yi)+3]    ; 12    ; Set new TOS
                jr      !cont__         ; 3     = 19+13 cycles versus 23+15
;-------------------------------------------------------------------------------
dodefer__xt:    mvw     (!yi),[(yi)+3]  ; 14
                jp      (!yi)           ; 6     = 20 cycles versus 26
;-------------------------------------------------------------------------------
does__xt:       pushu   ba              ; 4     ; Save TOS
                pmdf    (!yi),3         ; 4     ; Set new TOS
                mv      ba,(!yi)        ; 4     ; to the address of the data
                mvw     (!yi),[s]       ; 7     ; The CALL does__xt return short address is the execution token
                mv      i,x             ; 2     ; I holds the IP
                mv      [s],i           ; 5     ; Push old IP
                mv      x,(!yi)         ; 5     ; Set new IP
                jp      !cont__         ; 4     = 35+13 versus 43+15

The pieces of this puzzle nicely fall in place, which is satisfying. I've used some of the more exotic instructions, such as PMDF (pointer modify) that operates on internal RAM 20 bit addresses, and JP (yi) to jump to the 20 bit address in (yi).
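
The segment trick can be pictured with a small Python model of one internal RAM "register". This is an illustration only: `RamPointer`, `store16` and `load20` are invented names, and three bytes of a `bytearray` stand in for (yi), (yi+1) and (yi+2).

```python
SEGMENT = 0x0B                      # 11th segment, as in ps: equ $b

class RamPointer:
    """Three bytes of internal RAM holding a little-endian 20 (24) bit address."""

    def __init__(self):
        self.ram = bytearray(3)
        self.ram[2] = SEGMENT       # preset once at boot: mv (!yi+2),!ps

    def store16(self, value):
        # A 16 bit store (mvw) writes only the two low bytes;
        # the segment byte stays as it was preset.
        self.ram[0] = value & 0xFF
        self.ram[1] = (value >> 8) & 0xFF

    def load20(self):
        # A 20 bit load (mv x,(!yi)) reads all three bytes as one address.
        return int.from_bytes(self.ram, "little") & 0xFFFFF

yi = RamPointer()
yi.store16(0x9123)                  # a 16 bit execution token or cell address
print(hex(yi.load20()))
```

The widening costs nothing at run time in this model: the high byte is written once, and every later 16 bit store followed by a 20 bit load yields an address in segment $b.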

A colon-return sequence is reduced to 66 cycles from 79: 4 JP docol__xt + 17 (docol__xt) + 20 (interp__) + 25 (doret__xt). This is the execution overhead of a word defined as a colon definition and includes a check for a BREAK key press to interrupt execution. A colon definition internally in the dictionary starts with a JP docol__xt. A constant starts with JP docon__xt, a variable starts with JP dovar__xt.

A word fetch-execute overhead is reduced to 13 cycles from 15. This is the fetch-execute overhead of words defined in assembly, by fetching them as 16 bit addresses to execute by jumping to their machine code located at a 20 bit address.
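
The cycle totals and percentages can be tallied in Python, taking the figures in the outline at face value (the PS below revises the PMDF-based ones). Only entries for which both old and new counts are given in this post are included, and `percent_faster` is a made-up helper.

```python
# Colon call and return: the parts listed in the post.
colon_return = {"JP docol__xt": 4, "docol__xt": 17, "interp__": 20, "doret__xt": 25}
total = sum(colon_return.values())

def percent_faster(new, old):
    """Cycle reduction as a rounded percentage of the old count."""
    return round(100 * (old - new) / old)

# name: (new cycles, old cycles), with the fetch-execute cost added where the post adds it
timings = {
    "cont__": (13, 15),
    "docon__xt": (19 + 13, 23 + 15),
    "dodefer__xt": (20, 26),
    "does__xt": (35 + 13, 43 + 15),
}

print("colon-return total:", total, "cycles")
for name, (new, old) in timings.items():
    print(f"{name}: {new} vs {old} cycles, {percent_faster(new, old)}% faster")
```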

I want to first roll out the floating-point addition, fully working and tested, for the next Forth500 update to the repo in two weeks (or so, because I need to make time for this). I will focus later on implementing further optimizations to speed up Forth500, e.g. using the outline above.

PS (edit): according to the CPU technical manual, PMDF may not operate on a 20 bit pointer stored in internal RAM but rather on a single byte internal RAM pointer to internal RAM. Oops. Using inc x three times instead brings the colon-return sequence to 71 cycles rather than 66, versus 79 for the current code. Still a worthwhile speed improvement to consider.

- Rob
