Post Reply 
List Commands Library for 50g
09-11-2017, 05:14 PM
Post: #141
RE: List Commands Library for 50g
(09-11-2017 01:53 PM)David Hayden Wrote:  
(09-05-2017 09:33 PM)Joe Horn Wrote:  The problem of long garbage collections when there are a lot of exploded list items on the stack can be avoided by first storing the list into a global variable.
Brilliant!

This is indeed a great option for avoiding the list/GC issues. Now that I've had a chance to run some tests with it, I've realized that there's still a potential "gotcha", though. In particular, it's not necessarily as memory-friendly as I was originally thinking.

The reason is that you will end up with (at least) two copies of the original list in memory for almost all scenarios while you are processing the list contents (assuming the list wasn't already saved to HOME). The first copy is the one saved in TEMP by the CKx... step of each command. Then the second is in HOME (by definition). Any altered elements will also be allocated in TEMP. So memory constraints can still play into this for large lists.

Avoiding that first copy in TEMP is possible, but it gets messy. Keeping it out of TEMP means it can't already be there when the CKx&Dispatch step is performed. Saving it to HOME prior to CK&Dispatch requires validating argument types before the "official" argument checking occurs, and has to be done carefully to avoid any errors being raised.

The payoff is great in the situation where lots of exploded list items exist on the stack and a GC occurs. Deciding (programmatically) how and where to apply that technique is not always straightforward, though. I'm still trying to figure out the best approach.

Thoughts are appreciated!
Find all posts by this user
Quote this message in a reply
09-12-2017, 08:34 PM
Post: #142
RE: List Commands Library for 50g
If I am not mistaken, the following is so far a program of mine that uses ListExt the most. It seems that ListExt is tailored for it.

The problem that it solves may become a list processing challenge if I can formalize it in a simpler way.

The program is just an initial draft, although working
Code:

sampleSizeTest
    \<<
      100 "dbSize" DROP
      {} "dbList" DROP
      { 45 25 20 10} "choicesDistr" DROP
        @actually this set the dbSize as well
      DUP SIZE "choicesNum" DROP
      10 "sampleSize" DROP
      {} "sampleList" DROP
      {} "sampleResult" DROP
    
      \->
      @input
      dbSize
      
      @local var
      dbList
      choicesDistr
      choicesNum
      sampleSize
      sampleList
      sampleResult
      
      \<<
        @set the db
        choicesDistr LSUM 'dbSize' STO
        
        @setting in homogeneous blocks
        1 choicesNum
        FOR choiceInd
          @create 'n' times the entry choiceInd
          choiceInd
          choicesDistr choiceInd GET
          LNDUP
          'dbList' SWAP STO+
        NEXT
        
        @here we have the db
        
        @going to sample the db
        
        @creating a list of random positions to pick
        dbSize LSEQ LSHUF
        sampleSize
        LFRST
        dbList
        SWAP LPICK
        
        @got the sample on the stack
        'sampleList' STO
        
        @counting
        1 choicesNum
        FOR choiceInd
          sampleList choiceInd LCNT
          'sampleResult' SWAP STO+
        NEXT
        
        "choices in source db"
        choicesDistr dbSize /
        "sample result"
        sampleResult sampleSize /
      \>>
    \>>

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-13-2017, 04:07 AM
Post: #143
RE: List Commands Library for 50g
(09-12-2017 08:34 PM)pier4r Wrote:  If I am not mistaken, the following is so far a program of mine that uses ListExt the most. It seems that ListExt is tailored for it.

Are you testing LSHUF? Seems like many runs of this should cause the average of the results to approach the defined distribution percentages, if I'm understanding what you're doing correctly.

Nice to see that you're able to make good use of the library!
Find all posts by this user
Quote this message in a reply
09-13-2017, 05:21 AM (This post was last modified: 09-13-2017 05:38 AM by pier4r.)
Post: #144
RE: List Commands Library for 50g
No I'm not testing lshuf (but yeah I have it in the todos) .

Years ago I was contesting that a sample size of 3000 individuals could predict the voting results of way larger populations (tens of millions of individuals ). At that time I had less experience with statistics. Nevertheless I realized that I never tested the theory with a simulation (that normally helps consolidating the theory learned ).

So the idea is that I create a database of elements that reflects some choices and then I pick from it, to see if the sample reflects the real distribution.

Now I guess what affects the result is : how the population is distributed . In the code above the groups of elements that hold a certain choice are all packed together instead of being mixed .

I would like to test when the group are packed together , when the groups are completely mixed (the population is shuffled ) , when the groups are partially mixed (like people in Texas lean more for the Republicans rather than for the Democrats).

The main problem , though, is that to have a population of millions of entries I guess I have to do too much work on the 50g (I need to split the list on the sd).

I am not sure whether the HP prime can handle this as well even if on Android . I may need to use once again SQLite and bash / awk.

Anyway I was amazed that your library simplified my work a lot. And this is what I said before. The utility of a command , even the one that can be replicated with 3 userrpl commands , is equal to how much time is saved in the programming phase . After that the speed is important (due to the need of running the program and waiting for the result) and only as third concern is the size of the program.

Eyeballing the value of a command I would say:
The size of a command is 1 part of its value.
The speed of a command is 10 parts of its value (or even more, it depends how much times is used)
The usefulness of a command in simplifying coding work is 20 parts of its value (or even more if the program would be very complex otherwise).

So of course a library that packs more (useful and fast) commands, has more chances to have an higher total value for the user.

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-23-2017, 08:35 AM
Post: #145
RE: List Commands Library for 50g
David, is there any further work on the roadmap or can we consider the 1.1.0d pretty definitive? (for the moment at least, until the next interesting command suggestion)

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-23-2017, 03:57 PM (This post was last modified: 09-23-2017 04:35 PM by DavidM.)
Post: #146
RE: List Commands Library for 50g
(09-23-2017 08:35 AM)pier4r Wrote:  David, is there any further work on the roadmap or can we consider the 1.1.0d pretty definitive? (for the moment at least, until the next interesting command suggestion)

I've been quietly working on it, mostly in three general areas:

- Removing some code redundancy that developed as certain newer commands were added.
- Modifying the treatment of indices and counts to round them as opposed to IP'ing them when the user provides non-integer arguments. This brings consistency with most of the built-in commands.
- Looking for opportunities to apply the same technique that Joe identified in the built-in SORT command (save to USEROB, explode, process, implode, delete in USEROB).

I spent a chunk of time trying to find a reliable way to "redirect" all of the various pointers to the residual TEMPOB list over to the new USEROB one in an effort to maximize available memory. I could substitute the new one for the old one fairly easily in the user stack and LASTARG(s), but I never found a reasonable way to locate the reference that still remained in the "cached stack" (used by UNDO when you have "Last Stack" enabled in MODE). The aptly-named SysRPL command "CacheStack" will make a wholesale replacement of the current saved stack, but it also breaks the current TEMPENV in the process. By "break" I mean it can potentially crash your calculator.

For now, I've given up on that strategy and will simply live with the fact that this technique requires at least enough memory for 3 copies of the original list to coexist (the original TEMPOB one, the interim USEROB one, and the final processed list while it is being built). This can be remedied by the user if the list is saved into a global variable first before recalling and passing it as an argument. I'll add a comment about that to the documentation.

After I'm done going through the code on this pass, I've got one more command that I'd like to add before the next release (@John Keith: the sublist reversal we discussed). At that point I'm hoping the library will be in a "near final" state. I'm anticipating having that ready early next week.

As far as a roadmap is concerned, here's how I see it for now:
- 1.1.1d release next week
- final code scrubbing/annotation, fix any issues resulting from testing
- after suitable testing period (hopefully other people than just me Smile), an "RC" (release candidate) version will be posted
- if no issues found, final release made sometime in October

Looking toward the future, one of the thoughts I've had is to see how much of this library could be re-implemented as a 48g/x library. Some of the commands would just need a little substitution here and there, but others simply wouldn't be feasible on the 48. I'm not sure how much interest there would be in such a beast, though.
Find all posts by this user
Quote this message in a reply
09-23-2017, 04:51 PM
Post: #147
RE: List Commands Library for 50g
thanks for sharing.

I'd like to see daily/weekly "thoughts log" on such projects (in general) but I understand that it is better when other people ask. So in the case the thread will be quiet again, I'll ask again in some days/week.

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-23-2017, 07:34 PM
Post: #148
RE: List Commands Library for 50g
(09-11-2017 05:14 PM)DavidM Wrote:  
(09-11-2017 01:53 PM)David Hayden Wrote:  Brilliant!

This is indeed a great option for avoiding the list/GC issues. Now that I've had a chance to run some tests with it, I've realized that there's still a potential "gotcha", though. In particular, it's not necessarily as memory-friendly as I was originally thinking.

The reason is that you will end up with (at least) two copies of the original list in memory for almost all scenarios while you are processing the list contents (assuming the list wasn't already saved to HOME). The first copy is the one saved in TEMP by the CKx... step of each command. Then the second is in HOME (by definition). Any altered elements will also be allocated in TEMP. So memory constraints can still play into this for large lists.

Avoiding that first copy in TEMP is possible, but it gets messy. Keeping it out of TEMP means it can't already be there when the CKx&Dispatch step is performed. Saving it to HOME prior to CK&Dispatch requires validating argument types before the "official" argument checking occurs, and has to be done carefully to avoid any errors being raised.

The payoff is great in the situation where lots of exploded list items exist on the stack and a GC occurs. Deciding (programmatically) how and where to apply that technique is not always straightforward, though. I'm still trying to figure out the best approach.

Thoughts are appreciated!

Hi David,

I was wondering about your large lists with your functions. I had this problem with the radio program you helped with. All the senarios I could try, ended out with a temporary copy if I desired to modify a large list.

Ended out using directories with to save large amounts of data. This way the directory could be modified with little memory opposed to creating a temp copy of a large list. If you get the list functions to so they don't require a temp copy while modifying it I'd be very interested.

Thanks,
Scott
Find all posts by this user
Quote this message in a reply
09-23-2017, 08:26 PM
Post: #149
RE: List Commands Library for 50g
Hey Scott -

My earlier comments were about a specific technique that saves a list to USEROB so that its exploded elements don't bog down GCs that occur while processing. Unfortunately, I haven't been able to nail down an effective way to do that such that a list which starts out in TEMPOB is freed up during the processing phase of the command -- the O/S keeps an active reference to the TEMPOB version so that it can't be freed during a GC, thus taking up twice the space that is actually needed.

I only recall a little bit about the program you were working on, but isn't the bulk of your data in the form of strings? I'm wondering if you could avoid some of this by keeping your string data separate from other data you might want to save in a list. It's still likely that a copy of the string would have to be made each time you manipulate it, but GCs shouldn't be as much of a problem as they can be with lists. Would it be possible for you to combine multiple strings into a single string object by inserting separation characters?
Find all posts by this user
Quote this message in a reply
09-24-2017, 01:32 PM
Post: #150
RE: List Commands Library for 50g
(09-23-2017 03:57 PM)DavidM Wrote:  After I'm done going through the code on this pass, I've got one more command that I'd like to add before the next release (@John Keith: the sublist reversal we discussed). At that point I'm hoping the library will be in a "near final" state. I'm anticipating having that ready early next week.

As far as a roadmap is concerned, here's how I see it for now:
- 1.1.1d release next week
- final code scrubbing/annotation, fix any issues resulting from testing
- after suitable testing period (hopefully other people than just me Smile), an "RC" (release candidate) version will be posted
- if no issues found, final release made sometime in October

Thanks for the mention, I am glad that this command will be included. I humbly suggest the name LSSR (for List Subsequence Reversal).

I have been very busy lately but I have been playing around with various library commands and have come up with a few observations. I will try to collect my thoughts and PM you soon, but there is one suggestion I will make here.

Basically it is a mirror image of LPOP which returns the last item in the list on level 1, and the rest of the list on level 2. The Goferlists commands Last and Init together do this but Init is very slow. The User RPL sequence 1. OVER SIZE 1. - SUB is faster.

Thanks again for all your efforts, looking forward to the next version.

John
Find all posts by this user
Quote this message in a reply
09-24-2017, 01:36 PM (This post was last modified: 09-24-2017 01:38 PM by pier4r.)
Post: #151
RE: List Commands Library for 50g
Agreed with John Keith.

LHDTL is great, why not the same for the last element?

I know it could be done with "list REVLIST LHDTL" but once again, even those small commands 'already there' helps the developing time, especially if one is not focused and does not think about them.

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-24-2017, 01:43 PM
Post: #152
RE: List Commands Library for 50g
(09-23-2017 08:26 PM)DavidM Wrote:  Hey Scott -

My earlier comments were about a specific technique that saves a list to USEROB so that its exploded elements don't bog down GCs that occur while processing. Unfortunately, I haven't been able to nail down an effective way to do that such that a list which starts out in TEMPOB is freed up during the processing phase of the command -- the O/S keeps an active reference to the TEMPOB version so that it can't be freed during a GC, thus taking up twice the space that is actually needed.

I only recall a little bit about the program you were working on, but isn't the bulk of your data in the form of strings? I'm wondering if you could avoid some of this by keeping your string data separate from other data you might want to save in a list. It's still likely that a copy of the string would have to be made each time you manipulate it, but GCs shouldn't be as much of a problem as they can be with lists. Would it be possible for you to combine multiple strings into a single string object by inserting separation characters?

Hi David,
Sounded like a similar problem to what I was having. Not sure if below applies to your issue but will include incase it does.

When researching how to free up memory when manipulating large lists I learned the following. When you extract even one element from a list that has not been stored to a variable it keeps the entire list in memory as long as the extracted element is on the stack. I also found reference that Matrices/Vectors were more memory efficient for calculations if using numbers. (ref William Wicks 'HP48 Insights PART I: PRINCIPLES AND PROGRAMING' Section 11.6 Composite Objects and Memory').

On my program, yes that's exactly the problem I found each time I modified the list a copy of string I was tying to modify was being created. Could not find a way around this with USERRPL. Think I did what you're suggesting, now I store string messages as there own variables in a MESSAGE directory also limited the length to 300 characters and created searching tools to pull together messages with certain criteria. Was hoping there might be a way to explode a list into individual elements while destroying the original list so it's not taking up memory while your exploding it.
Find all posts by this user
Quote this message in a reply
09-24-2017, 02:27 PM
Post: #153
RE: List Commands Library for 50g
(09-24-2017 01:32 PM)John Keith Wrote:  I humbly suggest the name LSSR (for List Subsequence Reversal).

Makes sense to me!

(09-24-2017 01:32 PM)John Keith Wrote:  Basically it is a mirror image of LPOP which returns the last item in the list on level 1, and the rest of the list on level 2. The Goferlists commands Last and Init together do this but Init is very slow. The User RPL sequence 1. OVER SIZE 1. - SUB is faster.

(09-24-2017 01:36 PM)pier4r Wrote:  Agreed with John Keith.

"LPOPR" fits nicely into 5 characters. Does that work for a name?

John, I look forward to hearing your other thoughts/concerns whenever you have the time to put them together.
Find all posts by this user
Quote this message in a reply
09-24-2017, 02:35 PM
Post: #154
RE: List Commands Library for 50g
LPOPL (List POP Last) ?

or LPOPT (List POP tail)

LPOPR (list POP reverse?) is a bit counterintuitive to remember.

Anyway the name is a minor thing if there is a documentation.

Wikis are great, Contribute :)
Find all posts by this user
Quote this message in a reply
09-24-2017, 02:50 PM (This post was last modified: 09-24-2017 02:55 PM by DavidM.)
Post: #155
RE: List Commands Library for 50g
(09-24-2017 01:43 PM)snrowe Wrote:  When researching how to free up memory when manipulating large lists I learned the following. When you extract even one element from a list that has not been stored to a variable it keeps the entire list in memory as long as the extracted element is on the stack. ... Was hoping there might be a way to explode a list into individual elements while destroying the original list so it's not taking up memory while your exploding it.

The UserRPL command NEWOB will help you out in this situation. After extracting the list element, simply execute NEWOB on the object. This actually creates a separate version of the individual object in TEMP, drops the pointer to the embedded one, and replaces it with a pointer to the new one. So long as there are no other references to the list hanging around, the next GC will destroy the list.

Note that this only helps after the extractions have been done; there's no way to have the system discard part of a list while you are still in the middle of an extraction.

Depending on what your code is doing, you may find that you have to force the system to clear out the references that it saves in order to free the memory. One easy way to do this is to simply execute "0 DROP". This won't help if you still have any other references to the original list on the stack, though.
Find all posts by this user
Quote this message in a reply
09-24-2017, 03:02 PM
Post: #156
RE: List Commands Library for 50g
(09-24-2017 02:35 PM)pier4r Wrote:  LPOPL (List POP Last) ?

or LPOPT (List POP tail)

LPOPR (list POP reverse?) is a bit counterintuitive to remember.

Anyway the name is a minor thing if there is a documentation.

I thought briefly about LPOPT, actually. But then I realized that the concept of HEAD and TAIL is already established in RPL as "first element" and "all other elements" instead of first/last.

The "R" was meant to designate "right", which is similar to LRSPL (Right Split). I guess LRPOP would have been more consistent with that naming, but LPOPR just seemed to flow better. Smile

I'm flexible, and will go with whatever makes sense to the interested parties on this one. John, what do you think?
Find all posts by this user
Quote this message in a reply
09-24-2017, 03:03 PM
Post: #157
RE: List Commands Library for 50g
(09-24-2017 02:50 PM)DavidM Wrote:  
(09-24-2017 01:43 PM)snrowe Wrote:  When researching how to free up memory when manipulating large lists I learned the following. When you extract even one element from a list that has not been stored to a variable it keeps the entire list in memory as long as the extracted element is on the stack. ... Was hoping there might be a way to explode a list into individual elements while destroying the original list so it's not taking up memory while your exploding it.

The UserRPL command NEWOB will help you out in this situation. After extracting the list element, simply execute NEWOB on the object. This actually creates a separate version of the individual object in TEMP, drops the pointer to the embedded one, and replaces it with a pointer to the new one. So long as there are no other references to the list hanging around, the next GC will destroy the list.

Depending on what your code is doing, you may find that you have to force the system to clear out the references that it saves in order to free the memory. One easy way to do this is to simply execute "0 DROP". This won't help if you still have any other references to the original list on the stack, though.

Thanks for the suggestion I had tried NEWOB and this gave the best results, but was still problematic if the NEWOB was a large string since it would have to be recompiled in the original list was still in memory.
http://www.hpmuseum.org/forum/thread-7867.html

I was wondering if there were tools to work directly on the list in memory opposed to having to copy the list or any of its elements. For instance is it possible to delete the pointer to the list and dump pointers for each element to the stack (essentially destroying the original list object in memory and replacing it with a pointer to each element that gets dumped to the stack.)
Find all posts by this user
Quote this message in a reply
09-24-2017, 03:09 PM
Post: #158
RE: List Commands Library for 50g
(09-24-2017 03:02 PM)DavidM Wrote:  I thought briefly about LPOPT, actually. But then I realized that the concept of HEAD and TAIL is already established in RPL as "first element" and "all other elements" instead of first/last.

The "R" was meant to designate "right", which is similar to LRSPL (Right Split). I guess LRPOP would have been more consistent with that naming, but LPOPR just seemed to flow better. Smile

I'm flexible, and will go with whatever makes sense to the interested parties on this one. John, what do you think?

I like LPOPR personally.
Find all posts by this user
Quote this message in a reply
09-24-2017, 03:37 PM
Post: #159
RE: List Commands Library for 50g
(09-24-2017 03:03 PM)snrowe Wrote:  Thanks for the suggestion I had tried NEWOB and this gave the best results, but was still problematic if the NEWOB was a large string since it would have to be recompiled in the original list was still in memory.
http://www.hpmuseum.org/forum/thread-7867.html

Thanks for the reference to that other thread... I somehow missed that when you first posted it. It directly references the "exploded list" issues and would have saved me some time if I had seen it! Oh well.

(09-24-2017 03:03 PM)snrowe Wrote:  I was wondering if there were tools to work directly on the list in memory opposed to having to copy the list or any of its elements. For instance is it possible to delete the pointer to the list and dump pointers for each element to the stack (essentially destroying the original list object in memory and replacing it with a pointer to each element that gets dumped to the stack.)

Not in the way that I think you mean. Part of this stems from the fact that the actual memory footprint of a series of discrete objects in TEMP is not the same as the memory layout of a list containing those same objects in TEMP. So there's not a direct way to simply "remove the list prologue and SEMI" and put pointers to all the elements on the stack.

I suppose it may be possible using Saturn code to isolate a given string within a list in TEMP, remove the surrounding (extraneous) list structure, compress the TEMP slot, move all slots above it to keep the chain intact, etc., but that would be a massive undertaking.

One other thought I've had for this: have you considered storing the string to port memory, dropping references, forcing a GC, then recalling/purging? If you have room in one of the ports, that might be a workable alternative.
Find all posts by this user
Quote this message in a reply
09-24-2017, 04:13 PM
Post: #160
RE: List Commands Library for 50g
(09-24-2017 03:37 PM)DavidM Wrote:  I suppose it may be possible using Saturn code to isolate a given string within a list in TEMP, remove the surrounding (extraneous) list structure, compress the TEMP slot, move all slots above it to keep the chain intact, etc., but that would be a massive undertaking.

Actually, this may not be as far-fetched as I thought. The Saturn routine Shrink$Any might be very useful for something like this. I don't have any direct experience with using that type of approach, but perhaps others here might have more insight.

While it would only help with a single list element, I could see an approach that does the following:

- MOVEDOWN the string into the starting position of the list (wiping out the list prologue)
- Adjust the string length field to encompass the entire list size (including the SEMI at the end)
- Shrink$Any the string to the original length, which may in fact handle the rest of the "massive undertaking" that I mentioned above.

Anyone know enough about Shrink$Any to know if this is a sensible approach?
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 1 Guest(s)