HP Forums

Full Version: Modification of Linear Congruential Generator
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
The LCD RNG is commonly used, mostly because it's deficiencies are well-understood. The modulus does need to be large to be useful. For example, in practice, the RNG X=X*5^19 mod 2^48 was popular (and probably still is) in much scientific work. Fischman has a paper comparing this generator (which is not a very good 48 or 46-bit generator) with all possible 32-bit generators. This 48-bit (actually cycle length 2^46) generator is better than the best 32-bit generator (correlation, distribution, cycle length).

One problem with a power-of-2 modulus is that the last bits have a simple structure. The bottom bit has a cycle of 2, the next bit has a cycle of 4, etc. There has been some discussion (on the internet) of using 128-bit generators (I've been using a 256-bit but not a very good one) and discarding the lowest few bits. Actually, the upper bits have a structure but it's not particularly harmful. The leading bit has a cycle of 2^m for an m-bit generator so the leading bit of a 32-bit generator has a cycle of 2^32. This cycle consists of a string of length 2^31 followed by the complementary string, also of length2^31. The second from the leading bit has a similar structure.

To make bits appear less structured, a large prime modulus can be used. (This generator is never mixed and only misses the number 0.) The disadvantage is that more arithmetic is needed and the parameters are not as easy to find. In the binary case, any multiplier congruent to 3 or 5 modulo 8 gives a full cycle (thus the popularity of powers of 5). For a prime modulus, the multiplier has to be a primitive root which takes a bit of work to find.

With binary moduli, the low order bits between different multipliers (and different moduli) are related. A cycle of 2^k (bit k from the bottom of any binary generator) from 2 different generators will have the same difference thus possibly introducing correlation into a computation. The length of the sequence produced by using 2 binary generators is the same as that of the longer generator. With primes, the length of combined cycles is the LCM of the two generators (the LCM of P-1 and Q-1 for prime generator using P and Q). The prime modulus generators are "independent" in that two of them will fill a square with more random-looking patterns than binary generators do. Thus prime-moduli generators often use Sophie-Germain primes (or "safe" primes) (primes of the form 2*P+1 with P prime.)

In all cases, the continued from A/M where A is the multiplier and M is the modulus, should have all "small" partial quotients. (A large partial quotient means that the entire cycle seems to be made of a few long cycles.)

Now, the "new" (at least I've not seen this idea and I've searched.) It's well-known (among number theorists anyway) that the quadratic residues of a prime are "randomly" distributed (the exact distribution isn't known.) A quadratic residue is a number A for which x^2=A mod P has a solution.The pattern of residues of different primes are "independent" in general. Also, with primes of form 4*P+1, there are P quadratic residues less than 2*P and P above (half of the numbers modulo a prime are residues and half are not.) For primes of for 2*P+1 this is not true (sometimes by a lot). (There's no name for 4*P+1 primes like that for 2*P+1; I'll call them 41-primes and the other 43-primes.)

A quadratic residue cannot be a primitive root. The product of two residues or two non-residues is a residue and the product of a residue and a non-residue is a non-residue.

So here's my method (tested since I thought of it this morning): let the modulus be a prime of form 4*P+1 where P is a prime (I have a nice H50g program to do this). Go through the primitive roots (I think there are P of them so they are common) and for a primitive root B, compute A = B^2 mod 4*P+1. This A will be the multiplier for modulus M-4*P+1. Screen the possible values of A so that A/M developed as a continued fraction has "small" (1,2,3,4...) partial quotients. Then X=X*A mod M is the generator. It has a period of either 2*P and will cycle through the quadratic residues or the non-residues depending on the initial seed. The idea is to use the "randomness" of the quadratic residues to further mix the cycles. Basically, one is "randomly" cycling something more "random" that the sequence 1,2,3,4,....


Short example: M=61493, A=27614 Cycle length: 32746 (fits in 32 bits)

Long example: M=274877910173, A=52905939427 Cycle length: 137438955086 (still small enough that the HP50g doesn't screw up computations)

None of these are cryptographically secure. Lattice reduction (among others) gives parameters that produce the same cycle. More anon; the stuff I have been working on gave the above as a side trip.
(10-13-2020 03:34 AM)ttw Wrote: [ -> ]In all cases, the continued from A/M where A is the multiplier and M is the modulus, should have all "small" partial quotients. (A large partial quotient means that the entire cycle seems to be made of a few long cycles.)

Can you explain what it meant ? An example would be nice.

Quote:Short example: M=61493, A=27614 Cycle length: 32746 (fits in 32 bits)

Long example: M=274877910173, A=52905939427 Cycle length: 137438955086

Cycle length should be n = order(A, M), i.e. smallest powers n, such that A^n ≡ 1 (mod M)

>>> pow(27614, 15373, 61493)
1L
>>> pow(52905939427, 68719477543, 274877910173)
1L

Perhaps you may be interest in PCG


Correlations between successive terms of an LCG are bounded by a complicated expression with the cycle length in the denominator and the numerator is dominated by either the largest partial quotient or (a bit more accurately) the sum of partial quotients of the continued fraction of A/M. Here A is the multiplier and M is the modulus. Thus to get a complete cycle, I need the square of a primitive root. The usual LCG (with prime modulus) generates A*x0, A^2*x0, A^3*x0.... I am walking through the number 1,...M-1, by A^2*x0, A^4*X0, A^6*x0,.... A is a primitive root so the sequence is half the sequence that would be generated in the usual manner. Of course, A/M and (A^2 mod M)/M will have different continued fraction representations. I use M=4*P+1 to avoid the problem of quadratic excess. I noticed (a couple of days ago) that the odd and even terms of the usual LCG alternate between quadratic residues and non-residues. Because of the quadratic excess effect (Wikipedia on quadratic residues gives a good explanation), the odd and even terms can have a different distribution for primes of form 4*k+3. There is no quadratic excess for primes of form 4*k+1.
Could you share the tests/programs?

Maybe others (me) can thinker a bit with them as well. A description is crucial but it is difficult to reproduce what you saw (in terms of numbers, sequences, tests) from it.
Many years ago I was looking at how the HP-41CX is able to generate pseudo-random numbers using it's clock (using date and time whose combination is unique). This approach absolved the user from providing an initial seed.

I implemented a Visual Basic (for Excel) class that implemented a clock-based random number. Rather than using the PC's own clock (the first obvious choice), I devised a virtual clock. Each time you requested a random number, the clock ticked one virtual second (and update the number of minutes, hours, days, etc. if need be). I also realized that I can design a custom time system where, for example, a virtual minute could be 100 second, and a virtual hour be 100 virtual minutes, and so on. You can replace the 100s with other numbers (such as prime numbers) if they give you better results.

Bottom line, my conviction that using a virtual clock with a custom time/date system can help generate good random numbers.

Namir
Two small examples of a Sophie Germain Prime (47=2*23+1=4*10+3) which are primes of the form 2*P+1 (or maybe that's a "safe" prime as authors differ in their notation). I've used these for decades as have others. One reason is that the cycle length of such a prime is 2P and if combined (either as independent variables or just to make a longer cycle) with a prime of form 2*Q+1 (where Q is another prime), the combined length is 2*P*Q. One loses little in cycle length by such combinations.

Another reason to use primes in and LCG is that the low order bits of such a generator do not follow any obvious pattern. Powers of two for M are fast to generate but the lower order bits have obvious patterns.

Definitions:A quadratic residue modulo M (usually a prime) is a number A such that X^2=A modulo M has a solution. A non-residue has no such square root. A primitive element (or primitive root) for M is a number A so that the powers of A modulo M generator all numbers 1 to M-1 (in some order). Primes, powers of primes, and twice powers of primes have primitive elements. The reason is that A must be a primitive element modulo M (so that powers of A generate all numbers from 1 to M-1). Aprimitive element cannot be a quadratic residue. The product of two quadratic residues is a residue; the produce of two non-residues is a residue; and the product of a residue and non-residue is a non-residue.So an LCG with prime modulus M and multiplier A will generate the longest cycle if A is a primitive root. The cycle generated by any element must divide M-1. The residues and non-residues have a parity relationship.

The (actually slight in practice) problem is that the distriburtion of even steps and odd steps of an LCG of the form X=A*X mod M may differ in distribution. The residues of primes of form 4*k+1 are distributed evenly in 1 to 2*k and 2*k+1 to M-1. Likewise the quadratic residues are distributed evenly among the odd and even integers below M. The residues of primes of form 2*k+1 are not evenly distributed in these cases. Here are a couple of examples.

For the prime 17, the residues are: (1, 2, 4, 8; 9. 13. 15, 16) and the non residues are: (3, 5, 6, 7; 10, 11, 12, 14) Half below and half below the midway point. One can count that the odd and even residues are also evenly distributed. (Still don't get half even and half odd in the quarter divisions.)

For the prime 19, the residues are: (1, 4, 5, 6, 7, 9, 11, 16, 17) There are 5 numbers below 9 and 3 above. I haven't seen this mentioned before in the context of pseudo-random number generation.

For the toy generator base 19, using 3 as a multiplier, the numbers generated from x0=1 are: ( 3, 9, 8, 5, 15, 7, 2, 6, 18, 16, 10, 11, 14, 4, 12, 17, 13, 1) The even entries are the residues and the odd entries the non-residues. The even entries average too small; I don't know if this makes much difference in simulations.

For comparison, I'll use the prime 37 (it's 4*k+1 but k isn't prime as I would suggest for real work). Also there are 18 quadratic residues so comparisons are simple between 19 and 37. The number 13 is a primitive root of 37 so for (example only) generator one uses 13^2 mod 37=169 mod 37=21. This is no longer a primitive root but is the square of a primitive root, and thus a quadratic residue. If x0 is a residue, this multipler will walk through the residues; if x0 is a non-residue the the sequence will consist of non-residues.

Using the recurrence X = 21*X mod 37. (21, 34, 11, 9, 4, 10, 25, 7, 36, 16, 3, 26, 28, 33, 27, 12, 1) 9 numbers 18 or below and 8 above, 9 odd, 9 even. For primes for form 4*k+1, the sums of the residues and non-residues are equal; for those of form 2*k+1, this isn't true.

My first suggestion for a LCG is to take M as a prime equal to four times another prime. The multiplier A should be the square of a primitive element mod M.

For either case, the continued fraction of A/M as a fraction should have a small sum of partial quotients. Perhaps having good lattice structure would also be helpful but in practie, I have found that having good partial quotients leads to "fairly good "other properties.

A larger test case (small enough for me to generate today on the HP50g), is M=274877969669 = 4*68719492417+1. The quickest nice primitive root I got was: 72502532330 whose square mod M and thus A = 246657239535.
(10-16-2020 01:18 AM)Namir Wrote: [ -> ]Many years ago I was looking at how the HP-41CX is able to generate pseudo-random numbers using it's clock (using date and time whose combination is unique). This approach absolved the user from providing an initial seed.

I implemented a Visual Basic (for Excel) class that implemented a clock-based random number. Rather than using the PC's own clock (the first obvious choice), I devised a virtual clock. Each time you requested a random number, the clock ticked one virtual second (and update the number of minutes, hours, days, etc. if need be). I also realized that I can design a custom time system where, for example, a virtual minute could be 100 second, and a virtual hour be 100 virtual minutes, and so on. You can replace the 100s with other numbers (such as prime numbers) if they give you better results.

Bottom line, my conviction that using a virtual clock with a custom time/date system can help generate good random numbers.

Namir

I mostly worked on Crays which had a clock that updated at system speed. The Cray-1 had a clock that updated (64 bits) every 12.5 nanoseconds. It made a good seed source. Some micros have "real-time clocks" that update too slowly. (Though the Hp50g is good, the whole thing runs slowly.) When teaching, I always suggested using the clock, writing it out to re-use for debugging if things go wrong and to ignore when things go right.
I implemented the Mersenne Twister on the Prime:
https://www.hpmuseum.org/forum/thread-7716.html
Implicit in all the stuff I wrote is a simpler suggestion. (It does not take advantage of the "independence" of quadratic residues though.) When using a prime modulus and primitive element for multiplier X = X*A mod (M), M should be taken to be a prime of the form 4*k+1 rather than 2*k+1. (This means not using Sophie Germain primes and similar). In the case of 4*k+1 primes, the odd and even calls to the RNG are both uniformly distributed.
I have since discovered that, although there are lots of "local" theorems about the independence of quadratic residues, there are still problems. For primes of the form 4*k+3, the number of residues in (1,2k) is greater than the number in (2k+1,4k); I have been able to detect these even with 64+bit primes. For primes of the form 4*k+1, the intervals, (1,k) and (3k,4k) have more quadratic residues than do (k,2k) and (2k,2k). Also with primes of the form 3*k+1, the number of primes in (2k, 3k) is k, but the other intervals are not equal.

I did find a common computation that may suffer from this; if one computes Gaussian (or normal) variates using the Box-Muller or similar forms, one uses the random numbers u(n) and u(n+1) differently; one is for the radius and the other for the angle in the formula. While the u (Uniform random numbers) are uniformly distributed, the even or odd subsets are not necessarily uniform.

What's happening is that in a purely multiplicative generator (which has lots of good features), the multiplier is a "primitive root" of the base (so all powers of the multiplier cover the entire range). The square of a primitive root cannot be a primitive root (it's a perfect square) but will "walk" along the quadratic residues (or non-residues depending on the initial value) and covers half the range.

This problem does not occur with the usual pseudo-random number generators based on powers of two.

My fancier RNGs do combine various processes with LGCs and shift register generators. This removes all the "usual" problems. I'll post something later after I compute some parameters.
Reference URL's