HP Forums - Programming puzzles: processing lists!

Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

(05-28-2017 06:31 PM)John Keith Wrote: [ -> ]DOH!

I realized what the problem was: I had a program in my HOME directory called LREPL, a similar program I had been working on a few weeks ago. I renamed that program and everything is working fine now. ;-)

Sorry for wasting your time, I really appreciate your effort.

Homer Simpson (aka John)

Mystery solved! Thanks for keeping me from beating my head against the wall trying a bunch more tests in an effort to track it down. No time was wasted. Smile

Here's some brief descriptions of a few more commands I've created:

S→NL (String to Numeric List)

Converts string to a list of numbers. The numbers are the same as if you executed "NUM" on each character of the string in sequence.

Ex.: "ABCDEFG" => { 65. 66. 67. 68. 69. 70. 71. }

NL→S (Numeric List to String)

Reciprocal command to the above.

LCLLT (List Collate)

Given a list of lists, returns a single list with the contents of each sublist extracted one item at a time in sequence. A picture is easier to understand than the description:
{ { 1 1 1 } { 2 2 2 } { 3 3 3 } } => { 1 2 3 1 2 3 1 2 3 }

LDST (List Distribute)

Reciprocal of the above; needs a "groups" argument. For the above example, the group count is 3, so:
{ 1 2 3 1 2 3 1 2 3 } 3 => { { 1 1 1 } { 2 2 2 } { 3 3 3 } }

(you can also think of LDST as "List Deal", because it is analogous to dealing a deck of cards into "groups" hands)

LSPLT (List Split)

Splits a list into two lists where the length of the first sublist is given as a numeric argument. The result is a list of the two lists. The number given must be less than or equal to the length of the list, and also non-negative. If equal, the list and an empty list are returned. If 0, an empty list and the list are returned.

LSDIV (List Subdivide)

Given a list and a number, divides the list into the number of sublists indicated. Number must evenly divide the list contents.

The last commands seems pretty interesting, surely a possible new challenge is to compare your library with existing ones. Nice work!

(05-28-2017 08:03 PM)pier4r Wrote: [ -> ]The last commands seems pretty interesting, surely a possible new challenge is to compare your library with existing ones. Nice work!

I actually started looking at the GoferList library (finally) after completing these last few commands. It's got a very nice set of commands! I was intrigued to find the "Zip" commands there, which are somewhat similar to my LCLLT and LDST. There are some differences, but I think I could still compare them by combining some commands together.

I think I'll take a break from creating new commands for a bit and do some performance testing to compare similar GoferList commands with the ones I've already done.

(05-28-2017 08:42 PM)DavidM Wrote: [ -> ]I think I'll take a break from creating new commands for a bit and do some performance testing to compare similar GoferList commands with the ones I've already done.

If you do it, please post the results, otherwise I'll do eventually the same.

Inspired by the commands of DavidM (that still I have to compare to GoFerList or to use for solutions, they are meant for it) I added some more challenges in the first post.

If you find errors of whatever kind, please report them.

(05-31-2017 04:02 PM)pier4r Wrote: [ -> ]Inspired by the commands of DavidM (that still I have to compare to GoFerList or to use for solutions, they are meant for it) I added some more challenges in the first post.

I'm still in the process of doing some performance testing on GoferList commands as compared to similar commands from my in-progress library, so it may take a few days before I can start to dig into your newest challenges.

I can share some preliminary findings of the performance testing, with the promise of more details to come:

Commands that are functionally equivalent are mostly on par with each other, though almost all of my library commands have a very slight speed advantage over their GoferList counterparts. Commands in this category include Copy/LMRPT, Repeat/LNDUP, Split/LSPLT, Chars/S->SL, and Strcat/LSUM.

There are a couple of exceptions to that rule, though:

GoferList's Nub performed better on larger lists than LDDUP, at least on my test data (which was intentionally chosen to have 50% duplicates). The reverse was true for lists of 0-100 elements, where LDDUP had the advantage. At 1000 elements, Nub averaged 78.4 seconds vs. LDDUP at 135.7 seconds. Since LDDUP is essentially a wrapper around a ROM routine, I'm thinking it might be worth coming up with my own version for comparison.

GoferList's Copy is orders of magnitude slower than LMRPT, and the performance degradation is stark as the quantity of element duplication grows. When duplicating a list of 1 item 1000 times (the worst case in my tests), Copy required 528.8 seconds. LMRPT completed the same task in 0.11 seconds.

I'm still in the process of comparing some others, but it's taking a bit more time because the commands aren't exactly the same. To get a more complete picture, I do two separate test suites for each of these: For the first I use a sequence of UserRPL commands (based on my library) to compare to a single GoferList command, and then I reverse the perspective and use a single command from my library to compare with a sequence of GoferList/UserRPL commands. There's been some interesting results with these tests, and a couple of surprises. Commands in this category are Chars/S->NL, Strcat/NL->S, Zip/LCLLT, and Unzip/LDST.

Then there's all the remaining commands that have no counterparts. That's a large list from GoferList, but only a few from my library (LCNT, LEQ, LGRP, LREPL, LROT, LRPCT, LSDIV, LSHF, LSWP). That may grow over time, though. The results of testing so far have convinced me that there's enough of a value proposition to warrant further development effort for "yet another list library".

thanks for the info David!

(05-31-2017 05:44 PM)DavidM Wrote: [ -> ]GoferList's Copy is orders of magnitude slower than LMRPT, and the performance degradation is stark as the quantity of element duplication grows. When duplicating a list of 1 item 1000 times (the worst case in my tests), Copy required 528.8 seconds. LMRPT completed the same task in 0.11 seconds.

...

Then there's all the remaining commands that have no counterparts. That's a large list from GoferList, but only a few from my library (LCNT, LEQ, LGRP, LREPL, LROT, LRPCT, LSDIV, LSHF, LSWP). That may grow over time, though. The results of testing so far have convinced me that there's enough of a value proposition to warrant further development effort for "yet another list library".

That is exactly the point that I see over and over, in contrast with https://xkcd.com/927/ . One starts a project for his personal experience, looking at existing known projects like "oh, it will be surely be dismal compared to that result" (existing known because we never know how many others are there that are not shared, what a pity)

But instead the "yet another attempt" is great for many reasons. First and foremost for personal experience. Second, one never knows where a new development can lead over time. Third it is yet another example for someone that wants to learn. If the conditions around the attempt are not the same (i.e: a product that is commercial or closed source vs a product that is open source), the new attempt is surely fitting one condition than the older did not have. Maybe the new attempt is easier to expand since it is more clean, well documented, whatever. And I could continue mentioning many other factors. (with my poor grammar. I should proofread a bit more damn me)

In other words, if one can do a "yet another attempt", then he has all my support.

Already the library from David can be a complement to goferlist (even in the case it is restricted to a certain input type, it does not matter) and has inspired some more challenges that can help someone (at least me) getting better with RPL and processing lists.
Also, I'm pretty sure I will go through some challenges with the newRPL.

As promised, I've put together some timing data for the commands in the library I posted recently. The process of measuring and comparing their performance also helped to show where a couple of new commands would be very useful, so I added them to the mix.

For each test, the respective libraries were installed in Port 0 to minimize the impact of code relocation on the final results. The timing process included a forced garbage collection just prior to command invocation in order to "level the playing field" for all commands.

In all of the following tables, the entries in the "50-100-500-1000" columns represent the time required for a single invocation of the command with a list containing the quantity of elements in the column header. The times listed are in seconds. Most commands were tested multiple times (the count designated as "Cycles") and the reported time is an average of all of the tests.

Several of my commands are very similar to GoferList commands, and not surprisingly have similar performance characteristics:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Repeat} & 0.0188 & 0.0245 & 0.0670 & 0.1195 & 50 \\
\textbf{LNDUP} & 0.0161 & 0.0197 & 0.0477 & 0.0816 & 50 \\
\hline
\textbf{Split} & 0.0183 & 0.0239 & 0.0705 & 0.1257 & 50 \\
\textbf{LSPLT} & 0.0179 & 0.0208 & 0.0451 & 0.0755 & 50 \\
\hline
\textbf{Chars} & 0.0530 & 0.0911 & 0.5576 & 1.1568 & 50 \\
\textbf{S→SL} & 0.0488 & 0.0831 & 0.5237 & 1.0705 & 50 \\
\hline
\textbf{Strcat} & 0.1816 & 0.3429 & 1.6582 & 3.6404 & 50 \\
\textbf{LSUM} & 0.1635 & 0.3137 & 1.5390 & 3.3770 & 50 \\
\hline
\end{array}

A couple more are functionally equivalent, but there were significant differences in performance, especially with larger lists:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Copy} & 0.1954 & 0.6454 & 41.9425 & 528.8188 & 1 \\
\textbf{LMRPT} & 0.0274 & 0.0319 & 0.0663 & 0.1075 & 50 \\
\hline
\textbf{Nub} & 0.1848 & 0.5596 & 10.9966 & 78.4019 & 3 \\
\textbf{LDDUP} & 0.1368 & 0.4644 & 12.4826 & 135.7304 & 3 \\
\hline
\end{array}

LMRPT uses a Saturn code routine at its core to replicate the needed objects, so I wasn't surprised to see it perform well. LDDUP is simply a wrapper around a built-in ROM command (COMPRIMext). I will spend some time later in an effort to see if I can come up with a faster alternative.

I initially had created commands to convert strings to numeric lists and back (S→NL and NL→S), but saw that the GoferList library approached that concept differently (strings are converted to lists of characters instead). It seemed like both approaches had merit, so I created a similar string list command for my library (S→SL). I didn't see a need for the reverse function, as ΣLIST (which is used by LSUM) handles that job nicely. It's an easy thing to convert a list of characters to numbers by simply passing the list to the built-in NUM command, so I thought I'd see how the combination of GoferList's Chars command followed by NUM compared with S→NL:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Chars NUM} & 0.2399 & 0.4569 & 2.7059 & 6.1926 & 50 \\
\textbf{S→NL} & 0.0551 & 0.0951 & 0.5922 & 1.2080 & 50 \\
\hline
\end{array}

The reverse of the above functions can easily be tested with another combination (CHR Strcat), and the results took a surprising turn:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{CHR Strcat} & 0.3747 & 0.7166 & 3.8709 & 9.1431 & 10 \\
\textbf{NL→S} & 0.0879 & 0.1654 & 0.9203 & 9.4550 & 10 \\
\hline
\end{array}

Notice the timing of NL→S for a list of 1000 numbers. My suspicion is that a garbage collection is getting triggered during the processing of the large list, as the other list sizes showed much better performance. I will add that command to my list of things to check later for improvement.

The next category of commands could be described as "similar but not functionally equivalent", and is essentially based on GoferList's Zip/Unzip commands. I actually came up with the idea for my similar functions (LCLLT/LDST) before I ever saw the GoferList commands, so they aren't designed to do quite the same things. It's not too difficult to compare them, though, as long as you understand the following:

Zip is designed to handle a specific instance of the type of data that LCLLT can process, and will return results even when the sublists involved aren't the same size. LCLLT requires all sublists to be the same size, but can handle any number of lists. Zip returns its results as a list of sublists, but LCLLT returns all of the data in one single list. So conversions are required in both directions when comparing these complementary functions:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Zip} & 0.1134 & 0.1863 & 0.8451 & 1.8365 & 50 \\
\textbf{LCLLT DUP SIZE 2 / LSDIV} & 0.0705 & 0.1128 & 0.7510 & 2.0134 & 50 \\
\hline
\textbf{Zip ΣLIST} & 0.2651 & 0.6426 & 21.6099 & 319.9845 & 1 \\
\textbf{LCLLT} & 0.0322 & 0.0426 & 0.1271 & 0.2285 & 50 \\
\hline
\textbf{Unzip} & 0.1468 & 0.2423 & 1.1414 & 2.6013 & 10 \\
\textbf{ΣLIST 2 LDST} & 0.1780 & 0.4396 & 6.5892 & 24.4145 & 10 \\
\hline
\textbf{DROP DUP SIZE 2 / LSDIV Unzip 2 →LIST} & 0.1814 & 0.3041 & 1.7102 & 4.2611 & 5 \\
\textbf{LDST} & 0.0272 & 0.0346 & 0.0948 & 0.1667 & 5 \\
\hline
\end{array}

The above tests had two glaring standouts, and both of them involved the ΣLIST command. This caused me to look more closely at what was happening, and it should come as no surprise that I discovered that using ΣLIST to "explode" the contents of a list of lists was not a good idea. Whereas ΣLIST is good at many things, that particular purpose is not its strongest. And that's putting it nicely. So I opted to create my own version of a command for that purpose: LXIL (List eXplode Inner Lists). It won't win any awards for speed, but compared to ΣLIST it is much faster for its stated purpose:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{ΣLIST} & 0.3333 & 0.9277 & 53.9323 & 658.0647 & 1 \\
\textbf{LXIL} & 0.0527 & 0.1102 & 1.4303 & 4.8115 & 20 \\
\hline
\end{array}

Using LXIL instead of ΣLIST in the above tests gave much better results:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Zip ΣLIST} & 0.2651 & 0.6426 & 21.6099 & 319.9845 & 1 \\
\textbf{Zip LXIL} & 0.1388 & 0.2382 & 1.4557 & 3.7946 & 10 \\
\hline
\textbf{ΣLIST 2 LDST} & 0.1780 & 0.4396 & 6.5892 & 24.4145 & 10 \\
\textbf{LXIL 2 LDST} & 0.0530 & 0.0866 & 0.7026 & 2.1245 & 10 \\
\hline
\end{array}

...so the final results for the Zip/LDST comparisons became:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{Zip LXIL} & 0.1388 & 0.2382 & 1.4557 & 3.7946 & 10 \\
\textbf{LCLLT} & 0.0328 & 0.0422 & 0.1264 & 0.2272 & 10 \\
\hline
\textbf{Unzip} & 0.1467 & 0.2401 & 1.1347 & 2.5791 & 10 \\
\textbf{LXIL 2 LDST} & 0.0530 & 0.0866 & 0.7026 & 2.1245 & 10 \\
\hline
\end{array}

The final category of testing was simply the commands in my library that had no GoferList counterparts. I thought it would be nice just to see how each command performed with those same list sizes:
\begin{array}{|lrrrrc|}
\hline
\textbf{Command} & \textbf{50} & \textbf{100} & \textbf{500} & \textbf{1000} & \textbf{Cycles} \\
\hline
\textbf{LCNT - none matched} & 0.0245 & 0.0347 & 0.1113 & 0.2086 & 20 \\
\textbf{LCNT - 100\% matched} & 0.0334 & 0.0515 & 0.2954 & 0.6088 & 20 \\
\hline
\textbf{LEQ} & 0.0240 & 0.0346 & 0.1189 & 0.2249 & 50 \\
\hline
\textbf{LGRP - Sorted Data} & 0.0304 & 0.0568 & 0.3541 & 1.0933 & 10 \\
\textbf{LGRP - Shuffled Data} & 0.0456 & 0.1030 & 1.2533 & 3.6518 & 10 \\
\hline
\textbf{LRPCT - Sorted Data} & 0.0500 & 0.1068 & 0.7310 & 2.3372 & 10 \\
\textbf{LRPCT - Shuffled Data} & 0.0907 & 0.2513 & 3.1239 & 9.4632 & 10 \\
\hline
\textbf{LREPL - nothing replaced - 1 target} & 0.0716 & 0.1479 & 1.4741 & 5.0333 & 50 \\
\textbf{LREPL - 100\% replaced - 1 target} & 0.0724 & 0.1507 & 1.4747 & 5.0292 & 50 \\
\textbf{LREPL - nothing replaced - 20 targets} & 0.1283 & 0.3022 & 2.2166 & 6.4884 & 50 \\
\textbf{LREPL - 100\% replaced - 20 targets} & 0.1486 & 0.2664 & 2.0512 & 6.1830 & 50 \\
\hline
\textbf{LROT} & 0.0257 & 0.0338 & 0.1008 & 0.1819 & 50 \\
\hline
\textbf{LSDIV} & 0.0611 & 0.1145 & 1.1616 & 3.3086 & 20 \\
\hline
\textbf{LSHF} & 0.2523 & 0.5041 & 2.9164 & 5.9838 & 20 \\
\hline
\textbf{LSWP} & 0.0245 & 0.0309 & 0.0823 & 0.1436 & 50 \\
\hline
\end{array}

The process of going through all of this testing showed me that there was definitely some value in pursuing the development of these additional commands, and even the similar ones to GoferList's have varying degrees of performance improvement in most instances. As a result, I'll continue to refine my contributions and probably add a few more commands before I'm done with it.

Awesome!

I hope you'll put this also in the general software library!

(06-03-2017 08:00 PM)pier4r Wrote: [ -> ]I hope you'll put this also in the general software library!

It's still "in progress", and not quite ready for a GSL post.

Yesterday I took a look at how STREAM was implemented, and realized that it was utilizing an unusual method for biting off list elements one piece at a time for feeding to some process. Under the right conditions, this method can not only save loop time, but it also has benefits regarding memory fragmentation that can improve performance both immediately and later on.

I'm able to make use of this technique in several places in my code, so I've been going through the library and re-coding segments where appropriate. From the testing I've done, it's definitely worth the trouble.

#8 is easy with GoferList

Code:

Spoiler

.

.

.

.

.

.

.

.

.

 IF DUPDUP SIZE 2 / Split ≠ THEN DROP {} END

You can replace {} by "Invalid" but I dont like when a program dont return always the same type of object.

(06-04-2017 02:31 PM)DavidM Wrote: [ -> ]It's still "in progress", and not quite ready for a GSL post.

Yesterday I took a look at how STREAM was implemented, and realized that it was utilizing an unusual method for biting off list elements one piece at a time for feeding to some process. Under the right conditions, this method can not only save loop time, but it also has benefits regarding memory fragmentation that can improve performance both immediately and later on.

I'm able to make use of this technique in several places in my code, so I've been going through the library and re-coding segments where appropriate. From the testing I've done, it's definitely worth the trouble.

Ok, at least you post it here, better than nothing.

Anyway added some challenge to the first post. Now there are 30.

The 30th looks particularly evil. I need to go through it for some statistics returned by a program of mine.

edit:
@DavidM. One thing I noted on hpcalc.org is that people put there great work (even if still in progress) but with little documentation. You have your list package in some post before and that is great, but you missed the documentation in the package. That is, the user should copy your post to procide a "txt" on the 50g to mention how the commands works because one does not always have a link to your post.

So my request/tip is. Could you put your nice documentation together in the zip package, so one knows that he has an help guide when needed. For the rest, kudos! I hope you will share your most updated version when you can.

(06-03-2017 08:00 PM)pier4r Wrote: [ -> ]Awesome!

I hope you'll put this also in the general software library!

I agree ;D

What about a DOPERM command to handle list permutations.
Something like :

Code:

INPUT

 { 1 2 3 4 }             @ A list

 2                       @ Number of elements for permutation

 << My program >>        @ to handle each permutation

DOPERM

OUPUT (If MyProgram does nothing}

{ {1, 2}, {1, 3}, {1, 4},{2, 1}, {2, 3}, {2, 4},{3, 1}, {3, 2}, {3, 4},{4, 1}, {4, 2}, {4, 3} }

#30 with GoferList

Code:

Spoiler

.

.

.

.

.

.

.

.

.

.

.

.

 SWAP Zip SORT Unzip SWAP

#11

Code:

Spoiler 

.

.

.

.

.

.

.

.

.

.

.

.

  DUPDUP IF REVLIST ≠ THEN DROP "Inv." END

#22

Code:

Spoiler

.

.

.

.

.

.

.

.

.

.

.

DUP2 « ≠ » DOLIST * ADD

(06-05-2017 12:01 PM)Gilles59 Wrote: [ -> ]#30 with GoferList

Uh that is short. Clever!

Thanks for contributing Gilles59, but could be that you are Gilles Carpentier ? I remember that that user was an enthusiast of the goferlist. (I discovered it also through his posts)

(06-05-2017 12:55 PM)pier4r Wrote: [ -> ]
(06-05-2017 12:01 PM)Gilles59 Wrote: [ -> ]#30 with GoferList

Uh that is short. Clever!

Thanks for contributing Gilles59, but could be that you are Gilles Carpentier ? I remember that that user was an enthusiast of the goferlist. (I discovered it also through his posts)

Hi, Yes I'm ;D

(06-05-2017 11:48 AM)Gilles59 Wrote: [ -> ]I agree ;D

What about a DOPERM command to handle list permutations.
Something like :

Code:

INPUT { 1 2 3 4 } @ A list 2 @ Number of elements for permutation << My program >> @ to handle each permutation DOPERM OUPUT (If MyProgram does nothing} { {1, 2}, {1, 3}, {1, 4},{2, 1}, {2, 3}, {2, 4},{3, 1}, {3, 2}, {3, 4},{4, 1}, {4, 2}, {4, 3} }

actually that is one of the challenges. To write a program that gives you all the permutations of a list (sure, when the list is large it may hog the system).

But I did not find any DOPERM on the AUR. Where do you get this command from?

(06-05-2017 01:07 PM)pier4r Wrote: [ -> ]
(06-05-2017 11:48 AM)Gilles59 Wrote: [ -> ]I agree ;D

What about a DOPERM command to handle list permutations.
Something like :

Code:

INPUT { 1 2 3 4 } @ A list 2 @ Number of elements for permutation << My program >> @ to handle each permutation DOPERM OUPUT (If MyProgram does nothing} { {1, 2}, {1, 3}, {1, 4},{2, 1}, {2, 3}, {2, 4},{3, 1}, {3, 2}, {3, 4},{4, 1}, {4, 2}, {4, 3} }

actually that is one of the challenges. To write a program that gives you all the permutations of a list (sure, when the list is large it may hog the system).

But I did not find any DOPERM on the AUR. Where do you get this command from?

As far I know this command does not exist neither in the AUR nor in a Library :/

I wrote such a thing far ago in User RPL but it was awfully slow and very badly coded! It will be great if somebody can create this in sysRPL.

Pages: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15