How much memory may a HPPL program use on a G2?
09-21-2023, 01:27 AM
Post: #81
RE: How much memory may a HPPL program use on a G2?
(09-18-2023 06:48 AM)komame Wrote:  (lots of good stuff)

That's why I mentioned that to apply your suggestion with binary encoding, two dictionaries would need to be maintained (even before the program is run).

If accents are to be kept in a single dictionary, it would seem that stepping up to 6 bits per character could work (with 64-bit ints, this would then allow 10-character blocks to be processed at once). (Optimizations of this sort -- picking a dense encoding -- certainly limit applicability...)

If matches are not to distinguish accents, it seems possible to arrange the encoding so that bitwise operations may still be used. Here is a possible encoding of the sort I'm imagining:

Code:

E: 0
È: 2
É: 3
Ê: 4
Ë: 5

A: 8
À: 9
Â: 11

I: 12
Î: 13 
Ï: 14

O: 16
Ô: 17 
Ö: 18 

U: 20
Û: 21
Ü: 22

C: 24
Ç: 25

-: 26

B: 27
D: 29
F: 31

Letters with accents are arranged into blocks so that upper bits may be used to mask for matches. (To mask for 'E's and allow for accents, a mask of #111000b would be used instead of #111111b.)
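To make the masking concrete, here is a minimal PPL sketch of an accent-insensitive test of one letter slot (the names and calling convention are hypothetical): 'word' is one packed dictionary word with its first letter in the lowest 6-bit slot, 'pos' is the 0-based slot, 'base' is the unaccented code from the table, and 'mask' is that letter's block mask (#111000b for E, #111100b for A, and so on).

Code:

// Accent-insensitive test of one 6-bit letter slot.
// BITSR brings the slot down to the low bits; BITAND drops
// the accent bits before comparing against the base code.
EXPORT MatchLetter(word, pos, base, mask)
BEGIN
  RETURN BITAND(BITSR(word, 6*pos), mask) == base;
END;

For example, MatchLetter(w, 0, 0, #111000b) tests whether the first letter of w is an E, accented or not.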

There still is a fair amount of latitude in the encoding. To minimize the number of translations during encoding and decoding, above I'm implying that A,B,...Z would be shifted to 26,27,...,51. (Although A eventually ends up at 8, as there are accented variants.) For amusement, the table above has the accented E characters placed so that a common shift [of 198] is employed for them (to and from Unicode codepoints) and for the accented U characters.

(Hmmm.... looking over the above now, though... I cooked up the above by looking at convAccents(), but now I notice there isn't a "Ù" in there, but I do see "Ù" in dictionary words... So, dear reader, please make the necessary adjustments to this post :-&.)

(09-18-2023 06:48 AM)komame Wrote:  
In the case of binary encoding, we would have to perform this conversion by iterating over individual words and converting letter by letter, additionally performing many bitwise operations,

To defend the honour of lists Big Grin, one could try doing as much work as possible with lists. A faster encoding / decoding may be possible by employing lists in the encoding / decoding; just as a single BITAND could be used to mask all the words in an encoded dictionary (of short words - <= 10 characters) for matching purposes, BITAND, comparisons, etc. may be used on lists of lists during encoding / decoding.
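As a concrete sketch of that single-BITAND matching (hypothetical names: 'dict' is a list of 64-bit integers, one packed word each; 'mask' keeps the 6-bit slots of the known letters, with the accent bits dropped; 'want' holds the base codes at those slots):

Code:

// One BITAND call masks every packed word in the dictionary;
// the elementwise == then yields a list of 1s (match) and
// 0s (no match).
EXPORT MaskMatches(dict, mask, want)
BEGIN
  RETURN BITAND(dict, mask) == want;
END;

ΣLIST of the result gives the number of matching words.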

It may well be that adding some list / string slicing methods to PPL might make this sort of coding more efficient.

The first step to go from strings to a 6-bit-per-char encoding might, if working with a 9-character dictionary, be to use MAKELIST(ASC(MID(...)),...) on 9-character chunks of the string dictionary (to generate a list of lists - with possibly one more level of looping / construction due to the limit of 10,000 elements per list).
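That first step might look something like this (a sketch, assuming 'words' is one long string of concatenated 9-character words with no separators; ASC applied to a string returns the list of its character codes):

Code:

// Turn a flat string of 9-character words into a list of
// 9-element code lists. Handles up to 10,000 words; a larger
// dictionary needs an outer loop over slices because of the
// 10,000-element-per-list limit.
EXPORT ChunkDict(words)
BEGIN
  RETURN MAKELIST(ASC(MID(words, X, 9)), X, 1, DIM(words) - 8, 9);
END;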

My intuition would be that we'd want to get the letters packed into integers with the fewest number of PPL operations as, once packed, one level of work is being parallelised intrinsically (at the C++ level).

A single PPL subtraction (of 39) with the list of lists could shift all the characters (with "B" going to 27, "D" going to 29, ...). The next step would be to shift the accented letters closer to the unaccented ones so that integers may be assembled. One way to do this would be something like "decomposed+(decomposed>51)*-153" if decomposed is the list of lists. (The >51 masks for the accented characters, the multiplication slots in the translation to apply to accented characters, the + applies the translations: 0 for encoded A...Z, -153 for encoded accented characters; the preceding would likely be faster with integers instead of floating-point.)
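In PPL, that could look something like the following sketch (hypothetical name; 'codes' is the list of code lists from the previous step). The single -153 offset is schematic, as above: the real table needs an offset per accent block, but the masking pattern is the same.

Code:

// Shift all characters at once. One list subtraction moves
// 'A'..'Z' (65..90) down to 26..51; the comparison mask then
// slots in a further offset for the accented codepoints only.
// Integers would likely be faster than floating-point here.
EXPORT ToSixBit(codes)
BEGIN
  LOCAL d := codes - 39;        // unaccented letters land at 26..51
  RETURN d + (d > 51)*(-153);   // accented letters shift further down
END;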

(The above overlooks "-"...)

The next step would be to assemble the integers (with this 6-bit "in between" encoding) so that the shifting and masking can be done in larger chunks (at the C++ level). If lists can be arranged so that one list has the first character of each word, another has the second character, etc. (as I wrote earlier, additional PPL list / string slicing operations might be quite useful for this sort of coding; it may well be that introducing a bit of such things may be necessary to help lists shine with this kind of coding) then a single BITSL can apply a shift to all of the encoded first letters, another BITSL can apply a shift to all of the encoded second letters, etc.

Once collected so that 10 characters per word are packed into 64-bit integers, then masks and translations can reorder the encoding in bulk.
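A hedged sketch of that packing step (hypothetical names; 'cols' is a list of 10 lists, where cols(1) holds the 6-bit code of every word's first letter, cols(2) the second letters, and so on):

Code:

// Pack 10 letters per word into one 64-bit integer each.
// Each BITSL shifts a whole column of the dictionary at the
// C++ level; elementwise addition then assembles the words.
// Letter k occupies bits 6*(k-1) .. 6*k-1 in this layout.
EXPORT PackWords(cols)
BEGIN
  LOCAL k;
  LOCAL packed := cols(1);
  FOR k FROM 2 TO 10 DO
    packed := packed + BITSL(cols(k), 6*(k-1));
  END;
  RETURN packed;
END;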

I started out writing this with the intention to clarify my earlier comments, but after writing all of the above... hmmmm... :-&
09-21-2023, 04:07 AM
Post: #82
RE: How much memory may a HPPL program use on a G2?
(09-18-2023 07:37 PM)komame Wrote:  
(09-18-2023 08:27 AM)jte Wrote:  
one possible result of it would be to change REPLACE so that lists (in the strings case) could be given as the second and third arguments, to enact multiple replacements with a single PPL call. (e.g., allow “replace(str,accents,latin);” to be used in CROSSWORDS’ convAccents(-) instead of the looped call to REPLACE that is currently being done.)
I think using a list as the second and third parameter here won’t make much difference.

Yes… in general. My immediate thought was to have REPLACE check for cases where optimized kernels could be employed — the relevant one here being a list of single-character replacement rules.

Quote:Notice that you have to perform the replacement sequentially; you can’t replace all found values at once in a single iteration over the text string because they may overlap.

This is connected to my second thought — what, exactly, would the semantics be for this more general REPLACE use? I was thinking that, whatever the general semantics would be, they could simplify down to the same meaning for when a list of single-character replacement rules is presented.

That still doesn’t directly lead to an exact definition in the more general case. The on-the-face-of-it definition (for a more general REPLACE) would be to quickly fall back to the current REPLACE rule for strings: have the meaning of REPLACE(string,{a,b,…},{A,B,…}) be to invoke REPLACE with a->A, b->B, … (each in turn, each starting with the result of the previous). This would still allow a list of single-character replacement rules to be identified and more efficiently enacted.
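Written out as a PPL user function, those sequential semantics amount to the following (a sketch; the name is hypothetical, and REPLACE is used with plain string arguments, as today):

Code:

// Sequential semantics: apply rule 1 to the whole string,
// then rule 2 to that result, and so on.
EXPORT ReplaceList(s, from, onto)
BEGIN
  LOCAL k;
  FOR k FROM 1 TO SIZE(from) DO
    s := REPLACE(s, from(k), onto(k));
  END;
  RETURN s;
END;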

It is certainly possible different semantics might be better (useful in other cases). Just off the top of my head:
  • scanning from front to back and using the earliest match (of the list) is another possibility (i.e., see if match 1 applies starting from position 1, then match 2 and position 1, … — and skipping past a replacement once it is performed) or
  • have later matches only trigger on portions of the string unmatched by earlier rules.
I haven’t thought much at all about possible more-general semantics…

There are two aspects to keep in mind for a more general REPLACE: one being to allow for faster execution, another being to allow for shorter / simpler / more natural code.

Quote:
It would only make sense for 1-character elements, but I guess you won't impose a limitation that only 1-character elements can be provided in the list, right? The iteration itself in 'accentConv(-)', excluding REPLACE, only takes 1ms, and that's the only time you can save. To illustrate this: imagine someone uses REPLACE like this: replace("AABBABAB",{"AB","BA"},{"BA","AB"}).

This sort of example popped right into my mind too, but mine was nowhere near as compact! Big Grin

And, again, we seem to be in agreement on the nature of interpreters! Big Grin (Well, interpreter overhead — in this sort of case, to be more precise.)
09-21-2023, 04:12 AM
Post: #83
RE: How much memory may a HPPL program use on a G2?
(09-21-2023 01:27 AM)jte Wrote:  
(Hmmm.... looking over the above now, though... I cooked up the above by looking at convAccents(), but now I notice there isn't a "Ù" in there, but I do see "Ù" in dictionary words... So, dear reader, please make the necessary adjustments to this post :-&.)
In the entire dictionary, there is only one occurrence of the character "Ù", and it's in the 2-character words that weren't originally supposed to be there, which is why this character was omitted. It would probably make more sense to remove the 2-character dictionary; otherwise I will add a dedicated condition for it in the conversion procedure. Nevertheless, thanks for this observation.

(09-21-2023 01:27 AM)jte Wrote:  To defend the honour of lists, one could try doing as much work as possible with lists. A faster encoding / decoding may be possible by employing lists in the encoding / decoding; just as a single BITAND could be used to mask all the words in an encoded dictionary (of short words - <= 10 characters) for matching purposes, BITAND, comparisons, etc. may be used on lists of lists during encoding / decoding.

It may well be that adding some list / string slicing methods to PPL might make this sort of coding more efficient.

The first step to go from strings to a 6-bit-per-char encoding might, if working with a 9-character dictionary, be to use MAKELIST(ASC(MID(...)),...) on 9-character chunks of the string dictionary (to generate a list of lists - with possibly one more level of looping / construction due to the limit of 10,000 elements per list).
If binary search could somehow be integrated here, it might have a chance of being fast. However, if we have to operate on entire lists of 10,000 elements each (and it's well known that such large lists perform very slowly in PPL), such a solution is probably not feasible at the moment. Performing a single BITAND or BITSL operation on a list of 10,000 binary integers (64-bit) takes over 70ms, and you would need at least a few similar operations for what you're describing. Moreover, even the dictionary preparation itself, which involves something like MAKELIST(ASC(MID(words,X,9)),X,1,DIM(words)/9,9), takes over 1800ms, and this operation needs to be repeated multiple times for a larger dictionary when the word count exceeds 10,000. Such a dictionary would have to be stored in its 6-bit binary form before the program is executed, but I'm afraid the time required for decoding the result would completely rule this solution out in terms of performance. And yet we haven't even mentioned 21-character words Wink

Text strings are a very basic data type, and low-level operations will always be faster on them than on lists, which are a compound data type and have several disadvantages in PPL.
Nevertheless, thank you for these considerations. Regardless of the fact that we're planning a solution in PPL here, what you're writing is a very interesting approach. It's clear that you're familiar with algorithms and can come up with creative solutions.

Piotr
09-21-2023, 04:19 AM
Post: #84
RE: How much memory may a HPPL program use on a G2?
(09-19-2023 05:19 AM)Tyann Wrote:  
For REPLACE, I could see something like REPLACE(str1,{"é", "è", "ê"}, "e");

Yes, this seems quite natural. At least to us two, as when thinking about the more general REPLACE semantics, this popped into my head too. But then I thought: wouldn't it be nice to be able to do multiple rules of this sort? And that leads to REPLACE(str1,{{"é", "è", "ê"},{"à","â"}}, {"e","a"})! Hmmm...
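In the meantime, user code can already emulate such multi-rules on top of the current single-rule REPLACE; a minimal sketch (the function name is hypothetical):

Code:

// Apply multi-rules: every string in groups(j) is replaced
// by onto(j). For example:
//   MultiReplace(str1, {{"é","è","ê"},{"à","â"}}, {"e","a"})
EXPORT MultiReplace(s, groups, onto)
BEGIN
  LOCAL j, k, g;
  FOR j FROM 1 TO SIZE(groups) DO
    g := groups(j);
    FOR k FROM 1 TO SIZE(g) DO
      s := REPLACE(s, g(k), onto(j));
    END;
  END;
  RETURN s;
END;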
09-21-2023, 04:27 AM (This post was last modified: 09-21-2023 04:32 AM by komame.)
Post: #85
RE: How much memory may a HPPL program use on a G2?
(09-21-2023 04:19 AM)jte Wrote:  
(09-19-2023 05:19 AM)Tyann Wrote:  
For REPLACE, I could see something like REPLACE(str1,{"é", "è", "ê"}, "e");

Yes, this seems quite natural. At least to us two, as when thinking about the more general REPLACE semantics, this popped into my head too. But then I thought: wouldn't it be nice to be able to do multiple rules of this sort? And that leads to REPLACE(str1,{{"é", "è", "ê"},{"à","â"}}, {"e","a"})! Hmmm...
The solution proposed by Tyann (a single multi-rule) makes sense and will indeed be more efficient. However, if you add more rules, we go back to the topic of iteration because each multi-rule must be considered independently, and here there will be no performance gain beyond avoiding the PPL call itself. But it's hard for me to assess how much overhead there is in this case.

Nevertheless, I think you can start implementing Tyann's version. Will it be ready by tomorrow? Big Grin

Piotr Kowalewski
09-21-2023, 04:59 AM
Post: #86
RE: How much memory may a HPPL program use on a G2?
Hello Jte
Actually, REPLACE currently works on both lists and strings, so if I enter REPLACE(str,{"é","è"},"e") I get an error.
But string-only functions can accept lists:
INSTRING({"abce","cfer"},"e") works very well, for example.
So perhaps it would be better to make a string-specific function, SREPLACE for example, which could then accept lists of strings more easily?

Hello Komane
There's no specific code, in fact: I get the AVars sub-list into a local variable, say 'l', then I do s:=convAccent(STRING(l)) and then l1:=EXPR(s).
'l' contains the words with accents that I return in the solution, 'l1' contains the words without accents for the comparison, and 's' contains something like "{"abaca","acras"....}".

Sorry for my english
09-21-2023, 07:59 PM (This post was last modified: 09-21-2023 08:00 PM by jte.)
Post: #87
RE: How much memory may a HPPL program use on a G2?
(09-21-2023 04:27 AM)komame Wrote:  
(09-21-2023 04:19 AM)jte Wrote:  Yes, this seems quite natural. At least to us two, as when thinking about the more general REPLACE semantics, this popped into my head too. But then I thought: wouldn't it be nice to be able to do multiple rules of this sort? And that leads to REPLACE(str1,{{"é", "è", "ê"},{"à","â"}}, {"e","a"})! Hmmm...
The solution proposed by Tyann (a single multi-rule) makes sense and will indeed be more efficient. However, if you add more rules, we go back to the topic of iteration because each multi-rule must be considered independently, and here there will be no performance gain beyond avoiding the PPL call itself.

Well, in either case, I'd think I'd have the REPLACE code look for the special case and employ a special routine when that case is identified.

While I didn't design or implement PPL, I've implemented special-case handling in some parts of the project. The Function app, for example, doesn't use the main PPL interpreter for the sorts of things typically graphed by high-school students. Instead, it uses something closer to a "bytecode" interpreter. Before graphing begins, the PPL (the part after the "F1(X)=") is analyzed. If it is of a simple-enough form to fit the special-case interpreter, a simple list of instructions is generated for an abstract machine (of a "push" form - things like "add register 1 to register 5, put the result in register 7"). Things like common subexpressions are recognized and only evaluated once (when evaluating a user-defined function for a particular value).

I implemented this first for the Graph 3D app as even very simple plots do involve many function evaluations. With the Graph 3D app, the analysis (of PPL structures) breaks apart the function so that things like SIN(X) can be evaluated only once per column rather than once per cell. (The Function app similarly analyzes for dependencies, but there, in effect, it is just constant folding as the categories are either "depends on X" or "doesn't depend on X".) The "bytecode" evaluation engine should produce identical numerical results (down to the last digit) as the standard PPL interpreter.

My intuition is that part of the speed variability that can be seen with the standard PPL interpreter has to do with the heap. (The special-case interpreters I've just mentioned above do not use the heap during evaluations.)

(The "first for the Graph 3D app": the Advanced Graphing app also uses a "bytecode" interpreter, and I implemented this earlier, although that case is a bit different.)

There's a whole variety of approaches that could be considered for increasing the speed of PPL evaluations. Things like benchmarks can help guide development. To date, my focus has been more on the parts typically used by typical high-school students in typical high-school settings (as that was my role).
09-22-2023, 07:29 AM (This post was last modified: 09-23-2023 05:17 AM by komame.)
Post: #88
RE: How much memory may a HPPL program use on a G2?
(09-21-2023 07:59 PM)jte Wrote:  While I didn't design or implement PPL, I've implemented special-case handling in some parts of the project. The Function app, for example, doesn't use the main PPL interpreter for the sorts of things typically graphed by high-school students. Instead, it uses something closer to a "bytecode" interpreter. Before graphing begins, the PPL (the part after the "F1(X)=") is analyzed. If it is of a simple-enough form to fit the special-case interpreter, a simple list of instructions is generated for an abstract machine (of a "push" form - things like "add register 1 to register 5, put the result in register 7"). Things like common subexpressions are recognized and only evaluated once (when evaluating a user-defined function for a particular value).
I don't fully understand. After all, PPL is also a bytecode interpreter. Is what you're analyzing the result of tokenization by the PPL mechanism, or is a different bytecode generated for your mechanism?

(09-21-2023 07:59 PM)jte Wrote:  My intuition is that part of the speed variability that can be seen with the standard PPL interpreter has to do with the heap. (The special-case interpreters I've just mentioned above do not use the heap during evaluations.)
I've always been puzzled by why in PPL the program size affects performance. Even dead code slows it down. It's a very peculiar implementation, even for an interpreter.
From what Cyrille explained here: https://www.hpmuseum.org/forum/thread-12585.html it seems that when handling variables there is some kind of linear scan to look for them, and that it happens multiple times, instead of a one-time scan that keeps references in a hash table or a binary tree for optimal performance. It's an unusual approach, and all of this negatively impacts PPL's performance.

That also makes me wonder:
(03-11-2019 06:53 AM)cyrille de brébisson Wrote:  - your 100 "dummy" function are of course slowing down any lookup of non local variables as they enter relatively high in the variable precedence order and the lookup is (sorry about this) linear (the name list is not sorted). So adding 100 extra items to look through will slow down any search that has to traverse it.
The order of functions in PPL doesn't matter, so by scanning the code once you could cache the addresses of the call targets, sorted by name, allowing for efficient searching.

EDIT: It was only the next day after writing this post that I realized Cyrille meant that labels are scanned in such a way that functions are searched first, and then global variables. This doesn't refer to scanning the code, but rather the lists where the labels for functions and variables are stored. Nevertheless, it's still a linear search, which means it can't work efficiently, and what I wrote about the hash table and binary tree still applies here.

BR,
Piotr
09-22-2023, 07:47 PM
Post: #89
RE: How much memory may a HPPL program use on a G2?
(09-22-2023 07:29 AM)komame Wrote:  
(09-21-2023 07:59 PM)jte Wrote:  While I didn't design or implement PPL, I've implemented special-case handling in some parts of the project. The Function app, for example, doesn't use the main PPL interpreter for the sorts of things typically graphed by high-school students. Instead, it uses something closer to a "bytecode" interpreter. Before graphing begins, the PPL (the part after the "F1(X)=") is analyzed. If it is of a simple-enough form to fit the special-case interpreter, a simple list of instructions is generated for an abstract machine (of a "push" form - things like "add register 1 to register 5, put the result in register 7"). Things like common subexpressions are recognized and only evaluated once (when evaluating a user-defined function for a particular value).
I don't fully understand. After all, PPL is also a bytecode interpreter. Is what you're analyzing the result of tokenization by the PPL mechanism, or is a different bytecode generated for your mechanism?

Yes, for the Function app, there is another evaluation engine that handles formula evaluation for when the PPL is simple enough (for this alternative evaluation engine to handle). This second engine doesn't, for example, handle strings or matrices or lists, ... (or branching...) The restrictions allow the engine to preallocate storage and simply run through a series of floating-point function evaluations.

The input to the analysis is not raw strings, but internal structures used by the PPL system. (The analysis doesn't need to [re-]do tokenization, doesn't need to [re-]handle precedence rules etc.)

I first implemented this second evaluation engine (and associated "bytecode") that targets BCD floating-point evaluations for the Graph 3D app; there it can allow for substantially fewer BCD evaluations. For example, for a 40x40 grid of points, evaluating SIN(X)*COS(Y), with this alternative engine, uses 40 SIN evaluations and 40 COS evaluations (not 1,600 each) and 1,600 BCD multiplications. (For the "value" evaluations; there are also the evaluations of partial derivatives, for lighting.)
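The saving can be mimicked at the PPL level; a minimal sketch (a hypothetical function on a hypothetical grid, not the app's actual code), where SIN and COS each run 40 times and only the multiplication runs per cell:

Code:

// Hoisted evaluation of SIN(X)*COS(Y) on a 40x40 grid:
// 40 SIN calls, 40 COS calls, and 1,600 multiplications.
EXPORT GridEval()
BEGIN
  LOCAL sx := MAKELIST(SIN(X/4), X, 1, 40);
  LOCAL cy := MAKELIST(COS(Y/4), Y, 1, 40);
  RETURN MAKEMAT(sx(I)*cy(J), 40, 40);
END;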

The Advanced Graphing app needs its own evaluation engine as it uses structures (floating-point intervals) not currently present in PPL. (It too crawls over internal PPL structures to produce its plans for evaluations to come.)
09-22-2023, 08:37 PM
Post: #90
RE: How much memory may a HPPL program use on a G2?
(09-21-2023 04:12 AM)komame Wrote:  ⋮ (many many good points)
Such a dictionary should already be stored in a 6-bit binary version before the program is executed, but I'm afraid the time required for decoding the result would completely exclude this solution in terms of performance. And yet we haven't even mentioned 21-character words Wink

Thanks for giving some actual timing measurements. Big Grin

Ideally, in case it isn’t obvious, lists would be shortened as soon as possible for decoding. As I’ve mentioned, it may well be that some changes to PPL (adding a few routines, for example), might substantially improve the efficiency of this type of coding. Actual real-world measurements, like you’ve helpfully done, can help convert idle thought experiments / conjectures into practical code. One possible already-present routine that could be used for trimming is SORT. This does more than we want (we’re wanting to separate the zeroes from the non-zeroes; the “not a match” from the “matches”). (To determine the boundary after sorting, a ΣLIST of a list of zeroes / ones could be used, or a binary search on the sorted list.)
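A sketch of that SORT-based trimming (hypothetical names; 'dict' is the packed dictionary, 'flags' the 0/1 match list from the masking step; it assumes at least one match):

Code:

// Zero out the non-matches, sort so the non-zeroes come
// first, then keep only the first ΣLIST(flags) entries.
// SORT fully orders the survivors, which is more work than
// the zero / non-zero separation actually required.
EXPORT TrimMatches(dict, flags)
BEGIN
  LOCAL n := ΣLIST(flags);
  LOCAL s := REVERSE(SORT(dict*flags));
  RETURN SUB(s, 1, n);
END;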

At a more abstract level, one can ask what the goal of the program is. If it is to determine the number of matches, decoding isn’t necessary. If it is to show matches, only a screenful need be shown at once. (This is just a general comment.)
09-23-2023, 06:06 AM
Post: #91
RE: How much memory may a HPPL program use on a G2?
Hello
Just a thought: in the '90s I remember an interpreted language known for its great speed on the Atari ST, GFA BASIC.
It offered a number of instructions close to the processor:
INC var and DEC var faster than var=var+1 and var=var-1
MUL var,expr and DIV var,expr faster than var=var*expr and var=var/expr
ADD var,expr and SUB var,expr faster than var=var+expr and var=var-expr
It also let you type variables, so they could hold only integers, for example for loops, array indices, or screen pixels, which again made things faster.
Could this be applied to PPL?

Sorry for my english
09-23-2023, 11:15 AM (This post was last modified: 09-23-2023 11:21 AM by komame.)
Post: #92
RE: How much memory may a HPPL program use on a G2?
(09-23-2023 06:06 AM)Tyann Wrote:  INC var and DEC var faster than var=var+1 and var=var-1
MUL var,expr and DIV var,expr faster than var=var*expr and var=var/expr
ADD var,expr and SUB var,expr faster than var=var+expr and var=var-expr

It seems to me that such an approach would require passing arguments by reference, which PPL does not support (function arguments are always passed by value). It could probably be simulated somehow, but it would certainly not be a standard approach for PPL. However, the machine code compilation mechanism I mentioned in another thread would be significantly better if there were the possibility of passing arguments by reference and modifying the object "in place" instead of creating a copy in memory. This would provide an additional performance boost.
09-23-2023, 11:15 AM
Post: #93
RE: How much memory may a HPPL program use on a G2?
Hello Komane
I've improved my search routine, and for ??e??r???? I'm now under 10s.
Until now I counted the '?' at the beginning of the word, so my search loop here went from 3 to 9.
Now I record in a list the positions of the known letters, here {3,6}, say in 'lx'.
My search loop goes from 1 to SIZE(lx) and the test on "?" is no longer necessary.
So here we have a loop from 1 to 2 for all the words in the sub-lists.
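In PPL, the idea might look something like this (a sketch with hypothetical names; 'pat' is the pattern and 'w' a candidate word of the same length):

Code:

// Collect the positions of the known (non-'?') letters once.
EXPORT KnownPos(pat)
BEGIN
  LOCAL k;
  LOCAL lx := {};
  FOR k FROM 1 TO DIM(pat) DO
    IF MID(pat, k, 1) <> "?" THEN
      lx := CONCAT(lx, {k});
    END;
  END;
  RETURN lx;
END;

// Test a word against the pattern at the known positions only.
EXPORT MatchWord(w, pat, lx)
BEGIN
  LOCAL k;
  FOR k FROM 1 TO SIZE(lx) DO
    IF MID(w, lx(k), 1) <> MID(pat, lx(k), 1) THEN
      RETURN 0;
    END;
  END;
  RETURN 1;
END;

KnownPos runs once per query; MatchWord then does only SIZE(lx) comparisons per word.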

Sorry for my english
09-23-2023, 11:29 AM
Post: #94
RE: How much memory may a HPPL program use on a G2?
Hi Tyann,

(09-23-2023 11:15 AM)Tyann Wrote:  I've improved my search routine, and for ??e??r???? I'm now under 10s.
Until now I counted the '?' at the beginning of the word, so my search loop here went from 3 to 9.
Now I record in a list the positions of the known letters, here {3,6}, say in 'lx'.
My search loop goes from 1 to SIZE(lx) and the test on "?" is no longer necessary.
So here we have a loop from 1 to 2 for all the words in the sub-lists.
Wow! Congratulations on this idea. It seems to be great! Today, I will verify what the benefit will be after its implementation.

Piotr (Komame) Wink
09-23-2023, 04:24 PM
Post: #95
RE: How much memory may a HPPL program use on a G2?
Sorry for the mistake Komame, I'm going to copy it 100 times so I don't get it wrong again Smile
After that idea, another came to me: why not build a string from the known positions and replace the loop with a single test?
So with the word in 'm' and the list words in 'u', I tested:
IF EQ(MID(m,lx,1),MID(u,lx,1)) THEN r(0):= ...... END;
IF ∑LIST(MID(m,lx,1))==∑LIST(MID(u,lx,1)) THEN r(0):= .....END;
But this turns out to be slower.

Sorry for my english
09-23-2023, 04:29 PM (This post was last modified: 09-23-2023 04:32 PM by komame.)
Post: #96
RE: How much memory may a HPPL program use on a G2?
Hi Tyann,
(09-23-2023 11:15 AM)Tyann Wrote:  I've improved my search routine, and for ??e??r???? I'm now under 10s.

The results after the modification:
??e??r??? => 386 words found in 5.03 - 5.81s (previously 5.6 - 7.6s)
??e??r??s => 130 words found in 4.72 - 5.63s (previously 5.7 - 7.7s)
??e?????s => 1031 words found in 5.32 - 6.38s (previously 5.8 - 7.3s)
p?e??r??? => 93 words found in 1.10 - 1.27s (previously 1.3 - 2.0s)
 a??????? => 3167 words found in 1.413 - 1.801s (previously 1.461 - 1.719s)
a???????? => 3839 words found in 2.270 - 2.597s (previously 2.363 - 2.725s)


In some cases, this method works slower:
     a??? => 123 words found in 0.031 - 0.043s (previously 0.021 - 0.023s)
   a????? => 1101 words found in 0.287 - 0.392s (previously 0.255 - 0.291s)
sc???r??? => 9 words found in 0.61 - 0.67s (previously 0.597 - 0.618s)
p????eurs => 29 words found in 1.25 - 1.50s (previously 0.975 - 1.171s)

I suspect that comparing a single letter now takes longer because we have to read an index twice, so in certain cases it may indeed take longer than simply comparing without retrieving the letter's index. This depends on the ratio of the number of letters to compare to the total number of letters in the word, and depending on the case, the old or the new method should be selected.

Piotr Kowalewski
09-24-2023, 05:46 AM (This post was last modified: 09-24-2023 07:18 AM by komame.)
Post: #97
RE: How much memory may a HPPL program use on a G2?
Tyann,

(09-23-2023 04:29 PM)komame Wrote:  In some cases, this method works slower: [...]
I suspect that comparing a single letter now takes longer because we have to read an index twice, so in certain cases it may indeed take longer than simply comparing without retrieving the letter's index. This depends on the ratio of the number of letters to compare to the total number of letters in the word, and depending on the case, the old or the new method should be selected.

When I noticed that the results are sometimes worse, I had an idea similar to yours about building a string, but I build it only from the pattern side. There's no simple comparison here, but this way the index is retrieved twice only for the dictionary word. It slightly improved the results (still comparing to my rc1 release as 'previously'):
??e??r??? => 386 words found in 5.483 - 5.798s (previously 5.6 - 7.6s)
??e??r??s => 130 words found in 5.506 - 6.236s (previously 5.7 - 7.7s)
??e?????s => 1031 words found in 5.104 - 5.725s (previously 5.8 - 7.3s)
p?e??r??? => 93 words found in 1.038 - 1.158s (previously 1.3 - 2.0s)
 a??????? => 3167 words found in 1.474 - 1.801s (previously 1.461 - 1.719s)
a???????? => 3839 words found in 2.371 - 2.597s (previously 2.363 - 2.725s)
sc???r??? => 9 words found in 0.589 - 0.601s (previously 0.597 - 0.618s)

     a??? => 123 words found in 0.029 - 0.037s (previously 0.021 - 0.023s)
   a????? => 1101 words found in 0.265 - 0.315s (previously 0.255 - 0.291s)
p????eurs => 29 words found in 1.146 - 1.376s (previously 0.975 - 1.171s)


Since comparing a single character is still somewhat more time-consuming, for long words with many known characters (and therefore many characters to compare) the total time is longer than with the old method. The same problem arises even when we have only one known letter but the word is short: in that case the old method wins, because the ratio of known letters to all letters is relatively high. However, for long words with relatively few known letters, the new method clearly prevails. It also depends on where the known letters are positioned and whether they form a sequence or are scattered.
In the case of short words, I wouldn't be concerned about slightly worse performance, because they are searched almost instantly. Taking the overall picture into account, the averaged results are better, especially for the more resource-intensive queries, so I believe it's worth implementing this solution permanently.
At this stage we also need to account for measurement error: it may not have been significant with longer times, but it becomes crucial with very short ones. The variability of measurements (common in PPL) requires multiple runs under different conditions for reliable results.

(09-23-2023 11:15 AM)Tyann Wrote:  I've improved my search routine, and for ??e??r???? I'm now under 10s.
Is this measurement really for ??e??r???? (10 characters), or rather for ??e??r??? (9 characters)? So far we haven't measured the 10-character case. Anyway, for 10 characters I managed to achieve a result ranging from 4.799 to 5.410 seconds.
EDIT: Now I've measured ??e??r??? (9 characters) again and achieved an even better result: 4.718s. In the list above it was 5.483 - 5.798s for this case, based on approximately 30 runs under different conditions. As you can see, these measurements are so unstable that it's difficult to conclusively assess the impact of a specific change on performance, and I think further optimization efforts may not make sense, especially considering that everything could change dramatically in the next firmware version.

Piotr
09-24-2023, 06:47 AM (This post was last modified: 09-24-2023 06:48 AM by Tyann.)
Post: #98
RE: How much memory may a HPPL program use on a G2?
Hello Komame

My search times are very close to yours; I still have one small improvement to test, and I'll show you the results.
On the other hand, for searches like a??? or a????? my response is almost instantaneous, because when the word is entered in the application I count the number of '?'. If that count equals the length of the word, or is 0, I display a "Wrong entry" error message. So I added a test: if the number of '?' equals the length of the word minus 1 and LEFT(word,1)<>"?", then I return the complete sub-list without any search.
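In PPL, that pre-check might look something like the following sketch (hypothetical names; it assumes, as in this program, that each sub-list holds words of one length grouped by first letter, so a pattern whose only known letter is its first one already matches the whole sub-list):

Code:

// Count the '?' in the pattern, then handle the trivial
// cases before any real search is started.
EXPORT PreCheck(pat, sub)
BEGIN
  LOCAL k;
  LOCAL nq := 0;
  FOR k FROM 1 TO DIM(pat) DO
    nq := nq + (MID(pat, k, 1) == "?");
  END;
  IF nq == 0 OR nq == DIM(pat) THEN
    RETURN "Wrong entry";    // nothing sensible to search for
  END;
  IF nq == DIM(pat) - 1 AND LEFT(pat, 1) <> "?" THEN
    RETURN sub;              // every word in the sub-list matches
  END;
  RETURN {};                 // empty: caller falls through to the search
END;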

Sorry for my english
09-24-2023, 06:36 PM
Post: #99
RE: How much memory may a HPPL program use on a G2?
When you enter a command and press ENTER, it gets executed. If you then press ENTER again immediately after the task completes (without entering a new command), it executes again, and each subsequent press of ENTER triggers another execution. This is not the same as repeatedly copying the command (in that case you see the command on the left and the result on the right for every line on the stack). I mention this because when I run the CROSSWORDS call with TEVAL measurement this way, it often happens that each successive execution produces a better result. The improvement continues up to around the 30th run. I don't think PPL has any caching, because in that case I would expect the second run to be faster but the third, fourth, and subsequent runs to have the same speed. Here, several consecutive calls produce a better outcome with each call.
This is repeatable for many different cases, but sometimes it takes more calls for Prime to enter this mode. Once it does, each subsequent call is faster.
@Jeff, do you have any idea what might be causing this?
09-29-2023, 04:55 AM (This post was last modified: 09-29-2023 04:58 AM by Tyann.)
Post: #100
RE: How much memory may a HPPL program use on a G2?
Hello komame

I'm finally giving you my search times.
Note that I ran these this morning with the battery at 100%; yesterday evening, with the battery at 25%, the times fluctuated a lot and were higher.

??e??r??? --> 6.5s
??e??r??s --> 6.5s
??e?????s --> 8.7s
p?e??r??? --> 0.6s
sc???r??? --> 0.6s
p????eurs --> 0.7s

Sorry for my english