# HP Forums

Full Version: (42, all flavours) Integer Division - how?
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Pages: 1 2 3
(12-13-2020 06:41 PM)Albert Chan Wrote: [ -> ]It would be nice if Free42 exposed FMA(a,b,c) to the user ...

An alternative to FMA is an accurate cross product: ab - cd (or equally ab + cd). This can be used for a number of things including FMA.

Pauli
(corrected)
using floor(q) now, and the 'correct or not' approach:
(replace all NOPs with LBL 00 on a real 42S, or the DM42 for the time being)

Code:
 { 45-Byte Prgm } @     X       Y       Z       T  LBL "DIV" @            b       a  RCL ST Y  RCL ST Y  /  IP @                  q        b       a       a  LASTX  X<Y?  DSE ST Y  NOP @                        floor(q)  b       a  R^  STO ST Y @            a        a       q       b  R^  MOD @          b      a%b      a       q       q  ISG ST L @     c  NOP  X<>Y  LASTX @                c       a       a%b     q   MOD  STO- ST Y @    c       a%c     a'      q       q  X<> ST T @     c       q       a'      q  LASTX @                c       q       a'      q  MOD @          c       b'      a'      q       q  X#Y?  RCL- ST L  X#Y?  DSE ST T  NOP  R^  END

Cheers, Werner
(12-14-2020 05:16 AM)Paul Dale Wrote: [ -> ]An alternative to FMA is an accurate cross product: ab - cd (or equally ab + cd). This can be used for a number of things including FMA.

Yes, but an FMA is built into the Intel library, and extended precision is not..
Werner
(12-14-2020 06:59 AM)Werner Wrote: [ -> ]using floor(q) now, and the 'correct or not' approach:

I started with q = floor(a/b), then correct, but it is not necessary.
You can use round(a/b), ceil(a/b), IP(a/b), ... it does not matter.

The correction, r is still limited to 0, -1

Example, say we use ceil(a/b) for initial q

a/b = 10 - ε ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → q = 10 ﻿ ﻿ ﻿ ﻿ ﻿ → r = -1
a/b = 10 +ε ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ → q = 11 ﻿ ﻿ ﻿ ﻿ ﻿ → r = -1

What if a/b = 10 ± ε, because of round-to-nearest, get rounded to exactly 10 ?
It is OK. For interger, round/ceil/floor/IP does nothing to it.

Since edge case work, so is the rest ...
That simplifies things:

Code:
 { 37-Byte Prgm } @     X       Y       Z       T  LBL "DIV" @            b       a  RCL ST Y  RCL ST Y  /  IP @                  q        b       a       a  Rv  MOD @          b      a%b      a       q       q  ISG ST L @     c  NOP  X<>Y  LASTX @                c       a       a%b     q   MOD  STO- ST Y @    c       a%c     a'      q       q  X<> ST T @     c       q       a'      q  LASTX @                c       q       a'      q  MOD @          c       b'      a'      q       q  X#Y?  RCL- ST L  X#Y?  DSE ST T  NOP  R^  END

Werner
(12-13-2020 06:41 PM)Albert Chan Wrote: [ -> ]Bonus: it removed the integer arguments requirement.

At the risk of being proven wrong again..
as long as b+1 is exact, (and q is not too large), the code for idiv3 works for fractional a and b, too, no?

function idiv3(a,b) -- assumed b > 0
local q, c = floor(a/b), b+1
a, b = a%b - a%c, q%c -- a-b = [2-2c, c-2]
if a==b or a==b-c then return q end
return q-1
end

Werner
(12-14-2020 10:10 AM)Werner Wrote: [ -> ]At the risk of being proven wrong again..
as long as b+1 is exact, (and q is not too large), the code for idiv3 works for fractional a and b, too, no?

function idiv3(a,b) -- assumed b > 0
local q, c = floor(a/b), b+1
a, b = a%b - a%c, q%c -- a-b = [2-2c, c-2]
if a==b or a==b-c then return q end
return q-1
end

Yes, fractions still work!

a = (a//b)*b + (a%b) = q*(c-1) + (a%b)

﻿ ﻿ → a%b - a%c - q%c ≡ 0 (mod c)

If c=b+1 is exact, and q is correct, above identity holds.

And, optimization for (mod c) by 2 comparisons also hold.
Checking range of each terms (note, I use opened end intervals here):

a1 = (a%b) - (a%c) = [0, c-1) + (-c, 0] = (-c, c-1)
b1 = (q%c) = [0, c)

a1 - b1 = (-c,c-1) + (-c,0] = (-2c, c-1)

a1 - b1 ≡ 0 (mod c)﻿ ﻿ ﻿ ﻿ ﻿ → a1 - b1 = 0, or -c

But, a1 - b1 = -c implied we need room, such that c+ulp(c) ≠ c
So, we switched it around, and test: (a1==b1) or (a1==b1-c)

---

c=b+1 (exactly) is equivalent to test: (b+1-1-b == 0) and (b+1-b-1 == 0)
With b>0, ulp(b) = ulp(c) = some multiples of machine epsilon.

This is lua 5.4 implmentation for floor-mod, from llimits.h
PHP Code:
#define luai_nummod(L,a,b,m)  \  { (void)L; (m) = l_mathop(fmod)(a,b); \    if (((m) > 0) ? (b) < 0 : ((m) < 0 && (b) > 0)) (m) += (b); }#endif

For our situation, m and b have the same sized ulp.
Even if it needed sign correction, (m) += (b), there no no cancellation errors.

ULP consideration is needed. Unlike truncated fmod, floor-mod might not be exact.
Eli Bendersky's post showed why fmod is exact: Computing Remainders by Doubling

We need to watch out for cancellation errors, when floor-mod fix the sign.

lua> eps = 1e-10
lua> x, y = eps%1, -eps%1
lua> x, y
1e-10 ﻿ ﻿ ﻿ ﻿ ﻿ 0.9999999999﻿
lua> x+y, 1-y-x ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ -- x+y rounded to 1, but not exactly equal 1
1 ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ ﻿ 8.274037096265818e-018
(12-14-2020 04:12 PM)Albert Chan Wrote: [ -> ]a = (a//b)*b + (a%b) = q*(c-1) + (a%b)

﻿ ﻿ → a%b - a%c - q%c ≡ 0 (mod c)

If c=b+1 is exact, and q is correct, above identity holds.

I forget to show what happened if q is not correct.
In other words, if q is correct, show:

a%b - a%c - (q±1)%c ≠ 0 (mod c)

Assumed what is to proof is actually true, subtract, we get:

±1 ≠ 0 (mod c)

Since b > 0, c = b+1 > 1, this is always true ⇒ assumption is true too.
(12-14-2020 04:12 PM)Albert Chan Wrote: [ -> ]ULP consideration is needed. Unlike truncated fmod, floor-mod might not be exact.

For idiv3(a, b), with tiny negative a, indeed we may hit with cancellation bug.

lua> a = -0.09
lua> b = 1.1 - 1 -- so c = b+1 is exact
lua> idiv3(a , b) -- should be -1
-2

For negative tiny a = -ε

a1 = a%b - a%c = (b-ε) - (c-ε) = -1 + error
b1 = q%c = q = 0, or -1

It is this error that make idiv3 failed.

If the problem scaled to integers, the problem goes away.

lua> A, B = a*0x1p56, b*0x1p56
lua> C = B+1
lua> A%B - A%C
-1
lua> idiv3(A,B)
-1
Yes, but that is only with binary floating point, not with decimal, as in HP calcs ;-)
Werner
(12-15-2020 12:43 PM)Werner Wrote: [ -> ]Yes, but that is only with binary floating point, not with decimal, as in HP calcs ;-)

Not true. When floor-mod "fix" sign, subtraction cancellation may occur, binary or decimal.

But, the fix is easy. (your code already fixed this, by using IP)

For a<0, |a|<b: min(a/b) = min(ulp(b)-b)/b) = min(ulp(b)/b) - 1 > -1

Confirm:
Free42 Binary: 2 53 X^Y 1 - 1/X 1 -﻿ ﻿ → -9.999999999999999e-1
Free42 Decimal: 1E34 1 - 1/X 1 - ﻿ ﻿ ﻿ ﻿ ﻿ → -9.999999999999999999999999999999999e-1

(12-15-2020 10:37 AM)Albert Chan Wrote: [ -> ]For negative tiny a = -ε

a1 = a%b - a%c = (b-ε) - (c-ε) = -1 + error
b1 = q%c = q = 0, or -1

It is this error that make idiv3 failed.

If error = 0, this always work, nothing to fix.
If error ≠ 0, and q = 0, we get this:

a1 == b1 test : -1 + error == 0 ﻿ ﻿ ﻿ ﻿ → error == 1
a1==b1-c test: -1 + error == -c ﻿ ﻿ ﻿ → error == -b

Since error is tiny, error(b-ε) ≤ ½ulp(b), error(c-ε) ≤ ½ulp(c), both tests failed.
This force correction, returning corrected q-1 = -1

This is revised idiv3 should work with non-integer inputs, if c = b+1 is exact.
I also added a fast-path, to speed things up.

PHP Code:
function idiv3c(a,b)        -- assumed b > 0    local q, c = ceil(a/b), b+1    if q*b > a then return q-1 end    a, b = a%b - a%c, q%c   -- a-b = (-2c, c-1)    if a==b or a==b-c then return q end    return q-1end
(12-13-2020 07:43 PM)Thomas Okken Wrote: [ -> ]
(12-13-2020 06:41 PM)Albert Chan Wrote: [ -> ]It would be nice if Free42 exposed FMA(a,b,c) to the user ...

Noted...

Free42 FMA already work in progress !
https://github.com/thomasokken/free42

With FMA, idiv simplified to this. (code optimized for speed, and removed b>0 requirement)
PHP Code:
function idiv5(a,b)    local c = a/b    local q = floor(c)    if b > 0 then b=-b else a=-a end    if q == c and fma(q,b,a) < 0 then return q-1 end    return qend
(12-15-2020 02:51 PM)Albert Chan Wrote: [ -> ]I also added a fast-path, to speed things up.

PHP Code:
function idiv3c(a,b)        -- assumed b > 0    local q, c = ceil(a/b), b+1    if q*b > a then return q-1 end    a, b = a%b - a%c, q%c   -- a-b = (-2c, c-1)    if a==b or a==b-c then return q end    return q-1end

That fast path wouldn't be useful in my case: I wanted to use integer division to correctly split
Cc00 = Qq*Xx + Rr, where each letter is a half-length integer and Cc<Xx

For instance, in a 2-digit calculator:
2300 = 57*40 + 20, but 2300/40=58
2700 = 37*72 + 36

What I did up till now is determine Q and q separately with
Cc0 = Q*Xx + Aa
Aa0 = q*Xx + Rr

and each split Yy0=Q*Xx+Rr is carried out as
Q := Yy0/Xx;
Rr := MOD(Yy0,Xx);
if FRC(Q) >0
then Q := INT(Q);
else
if Rr>0 then
Q := ROUND(Q - Rr/Xx,0);
end;
end;

sadly, that still is faster than the integer division - mainly because the rounding part is almost never reached, certainly not in the 34-digit Free42..

Werner
the rounding part I do as

if R>X-R then Q:= Q-1;

for inputs Cc0 and Xx, and Cc<Xx, if FP(Cc0/Xx)=0, R # X-R (round-to-even does not occur, while in Cc00/Xx it does)

Werner
Actually that code works for Cc00 = Qq*Xx + Rr, too, save for the round-to-even cases, that are very rare.
So I can do

Code:
Qq := Cc00/Xx; Rr := MOD(Cc00,Xx); if FRC(Qq) >0 then Qq := INT(Qq); else   if Rr>0 then     if Rr>Xx-Rr then Qq:= Qq - 1;     elsif Rr = Xx-Rr then     c := Xx+1;     b := MOD(Qq,c);     a := Rr - MOD(CC00,c);     if NOT(a=b or a=b-c ) then Qq:=Qq - 1;   end; end;

and the test for Rr>0 is not necessary either

Werner
(12-17-2020 12:42 PM)Werner Wrote: [ -> ]Actually that code works for Cc00 = Qq*Xx + Rr, too, save for the round-to-even cases, that are very rare.
Code:
    Half-way-case correction ...     elsif Rr = Xx-Rr then     c := Xx+1;     b := INT(Cc00/c);     a := Rr - MOD(CC00,c);     if NOT(a=b or a=b-c ) then Qq:=Qq - 1;   end; end;

Your half-way-case had a typo. It should be b := Qq % c
If Qq is correct, we have:

Cc00 = Qq*Xx + Rr = Qq*c - Qq + Rr
Qq = Qq*c + Rr - Cc00

We take (mod c) for both side, b = LHS, a = RHS, and do the test.

---

If remainder() is available, we can replace (mod c) test with it.
PHP Code:
function remainder(x, y)            -- assumed x >=0, y > 0    local r, halfy = fmod(x, y+y), y/2    if r <= halfy then return r end -- q = (x-r)/y must be even    r = r - y                       -- q is odd, r = (-y/2, y)    return r < halfy and r or r - y -- halfway case: r = -y/2, q is evenend

remainder() guaranteed that halfway case have even quotient.

For half-way case, Qq must be even (example, 11.5 -> 12, 12.5 -> 12)
We can use sign of remainder, for the half-way case correction:

→ if remaider(Cc00, Xx) < 0 then Qq := Qq - 1;

---

If Cc00/Xx fractional part is 0.5, it implied Cc000 / Xx is integer, ends in 5.

To "remove" 3 0's, and turn last digit to 5, Xx = 8*k
(Cc00 ± 100) / Xx will *not* be half-way case, since 100/8 = 12.5, not an integer.

We can use gap test to check for half-way case. This work on a 2-digits calculator:

→ if (Cc00+100)/Xx - Qq < Qq - (Cc00-100)/Xx then Qq := Qq - 1;

Example, if Cc00 = 900, Xx = 40, Qq = round(22.5) = 22

LHS = high gap = 25 - 22 = 3
RHS = low gap﻿ ﻿ = 22 - 20 = 2

This meant Qq were already rounded-down, and no correction needed for it.
(12-17-2020 06:41 PM)Albert Chan Wrote: [ -> ]Your half-way-case had a typo. It should be b := Qq % c

Thanks, I corrected it.

Quote:If Cc00/Xx fractional part is 0.5, it implied Cc000 / Xx is integer, ends in 5.

No.. each 0 is the length of a half integer, so for a 12 digit machine it is 000000 and for Free42/DM42 it's 17 zeroes ;-)

Cc00 = Qq*Xx + Rr is the first part of dividing a multiprecision number stored in sequential registers by a number not exceeding a single register:

R1 R2 R3 R4 R5
Aa Bb Cc Dd Ee ... / Xx

At each step you have the carry from the previous step, Cc, and so you have to do
CcYY = Qq*Xx + Rr

with Rr next step's Cc.

We can only work with 2 letters at the same time, so first

Cc00 = Qq*Xx + Rr (as previously covered)
t := Yy-Xx+Rr; (because Yy+Rr may overflow)
Cc := MOD(t,Xx);
IF t>0 then
Qq := Qq + (t-Cc)/Xx + 1;
end;
Yy := Qq;

Werner
(12-17-2020 06:41 PM)Albert Chan Wrote: [ -> ]If Cc00/Xx fractional part is 0.5, it implied Cc000 / Xx is integer, ends in 5.

To "remove" 3 0's, and turn last digit to 5, Xx = 8*k
(Cc00 ± 100) / Xx will *not* be half-way case, since 100/8 = 12.5, not an integer.

We can use gap test to check for half-way case. This work on a 2-digits calculator:

→ if (Cc00+100)/Xx - Qq < Qq - (Cc00-100)/Xx then Qq := Qq - 1;

(12-17-2020 07:50 PM)Werner Wrote: [ -> ]No.. each 0 is the length of a half integer, so for a 12 digit machine it is 000000 and for Free42/DM42 it's 17 zeroes ;-)

I misunderstood. "0" = h zeroes, not 1 zero (*)
But, the logic holds. Just replace "hidden" 1 with h

$$\large{CcOO \over Xx}$$ is half-way cases → $$Xx = 2^{2h+1} \cdot k$$

$$\large{CcOO\;±\;IOO \over Xx}$$ will *not* be half-way case, since $$\large{10^{2h} \over 2^{2h+1}} = {5^{2h} \over 2}$$, not an integer.

Gap test work with with "half-integers" too.

(*) I had assumed "0" = 1 zero, on a 2-digit calculator, and checked how the code perform.
Code:
Cases      %   Types   404     8.2  FRAC(Qq) > 0, return Qq  2420    48.9  Mod test, return Qq  2050    41.4  Mod test, return Qq - 1    76     1.5  Half-way cases (evenly split 38/38) -------------  4950     100
remainder() based idiv is very elegant.
PHP Code:
function idiv6(a, b)    -- assumed b > 0    local r = a/b    local q = floor(r)    if q == r and remainder(a, b) < 0 then q=q-1 end    return qend

Above is similar to this (6a version possibly faster, only tried to get remainder sign)
PHP Code:
function idiv6a(a, b)   -- assumed b > 0    local r = a/b    local q = floor(r)    if q ~= r then return q end    r = a % (b+b)    local R = r < b and r or r-b    local h = 0.5*b    -- halfway    if R < h or r == h then return q end    return q - 1end

All the code for mod/half-way test is the last if line.

R < h:

if residual is less than half of b, q had rounded-down. For floor divide, no need for correction.

r == h:

if residual is half of b, q (because of round-to-even) is even.
But, r also guaranteed even q -> no need for correction.

Update: slightly faster version 6b, removed R, and r shifted, with range = [-2h, 2h)

PHP Code:
function idiv6b(a, b)   -- assumed b > 0    local r = a/b    local q = floor(r)    if q ~= r then return q end    r = a % (b+b) - b    local h = 0.5*b     -- halfway    if r <= -h or (r < h and r >= 0) then return q end    return q - 1end
The chance of hitting half-way cases are small.
We can do the remainder check by then. This made the code simpler too.

PHP Code:
function idiv6c(a, b)   -- assumed b > 0    local r = a/b    local q = floor(r)    if q ~= r then return q end    r = (a%b) * 2    if r == b then r = a % (b+b) end    return r > b and q-1 or qend

Example, for 2-digit calculator, above/below halfway decides how to correct q

a/b = 700/22 = 31.81..., rounded to q = 32
ulp = (a%b)/b = 18/22 ≈ 0.81 ULP, above halfway, return q-1 = 31

a/b = 700/23 = 30.43..., rounded to q = 30
ulp = (a%b)/b = 10/23 ≈ 0.43 ULP, below halfway, return q = 30

---

For halfway case, r is updated, r = a % (b+b)
Previously, r = b, this implied updated r = b/2, or 3b/2
To maintain even q, and remainder range of ±b, 3b/2 got shifted down 2b, to -b/2

Only -b/2 remainder require correction: a/b = q - 1/2, rounded-up to q, floor-divide should be q-1

Example, for 2-digit calculator

a/b = 460/40 = 11.5, rounded to q = 12 (half-way to even)

r = (a%b)*2 = 40 = b --> signal halfway case
r = a%(b+b) = 60 --> signal remainder = -20, floor-divide should return q-1 = 11

Reuse the code for non-halfway: return r > b and q-1 or q

For above example, r=60, b=40, thus it return q-1 = 11, as expected

Update: removed variable h=b/2. This made the code more robust.

Previously, we had r = a%b, followed by r = a%(b+b) if halfway.
Last line was return r > h and q-1 or q

For halfway case, we may hit with *slight* floor-mod rounding.

If a is negative, a % (b+b) = (b+b) - |a| % (b+b) = 4h - 3h = h
But, 3h term might not be exact, thus last h may have *slight* rounding error.

Update 2: fix a typo for halfway test, r == 0 should be r == b
Pages: 1 2 3
Reference URL's
• HP Forums: https://www.hpmuseum.org/forum/index.php
• :