[Search for users] [Overall Top Noters] [List of all Conferences] [Download this site]

Conference rusure::math

Title:Mathematics at DEC
Moderator:RUSURE::EDP
Created:Mon Feb 03 1986
Last Modified:Fri Jun 06 1997
Last Successful Update:Fri Jun 06 1997
Number of topics:2083
Total number of notes:14613

637.0. "Inverse modulo 2**32 -- a puzzle" by CLT::GILBERT (eager like a child) Tue Dec 30 1986 16:30

The following routine computes the multiplicative inverse (modulo 2**32)
of any odd longword.  It's written in Macro-32.

How does it work?  Can you prove that it works?

invsub:	subl3	#1, r0, r2	; subtract 1 from the input (r0) giving r2
	movq	#1, r0		; set r0 to 1; set r1 to zero
10$:	bbc	r1, r0, 20$	; branch to 20$ if r0 has bit r1 clear
	subl2	r2, r0		; subtract r2 from r0
20$:	incl	r1		; increment r1
	addl2	r2, r2		; double r2, and ...
	bneq	10$		; if r2 is not zero, go to 10$
	rsb			; the result is in r0


To make the problem accessible to those with scant knowledge of the
VAX instruction set (even though I added comments to the above code),
the following Pascal function is roughly equivalent.

function invsub (arg: integer) : integer;
type
    int_or_bits =	{ the variant record lets us to treat x (below)	}
	record		{    as both an integer and an array of bits	}
	case boolean of
	    true:   (int: integer);
	    false:  (bits: packed array [0..31] of boolean);
	end;
var
    x: int_or_bits;
    y,z: integer;
begin
z := arg - 1;
x.int := 1;
y := 0;
repeat
    if x.bits[y] then		{ if the y-th bit of x is set then ... }
	x.int := x.int + z;
    y := y + 1;
    z := z + z;
until z = 0;
invsub := x.int;
end;
T.RTitleUserPersonal
Name
DateLines
637.1ENGINE::ROTHWed Dec 31 1986 08:195
    See also note 170;  the multiplicative inverse should satisfy

	X * (INV X) = 1 MOD 2**32

    - Jim
637.2ENGINE::ROTHWed Dec 31 1986 09:2140
    It's easy to see how this works by writing out the multiplication by
    hand (here only 8 bits are shown):

		 xxxxxxxx
	       * yyyyyyyy
	       ----------
		 xxxxxxxx	add if y0 is 1
	        xxxxxxxx	add if y1 is 1
	       xxxxxxxx		add if y2 is 1, ...
	      xxxxxxxx
	  +     etc
	 ----------------
	 zzzzzzzz00000001	desired form of product

    Starting at the lsb we find that X0 and Y0 must be 1.
    Set an accumulator, A, to X.

    shift X left 1 place.
    Now we know X0 is 1, so if A1 is 1 we must add X into the accumulator
    and set Y1 to 1, else we must clear Y1 and not add X to the accumulator.

    Repeat until X has been shifted left 8 bits and cannot affect the low
    bits of the product; at this point A will equal 00000001 and Y will be
    the multiplicative inverse mod 2^8.

    Your routine has some nice embellishments; for example it shifts the
    Y bits in place of the A bits saving a register.  It does not matter
    if you add or subtract X, since only testing the next bit in A counts
    at each step because adding or subtracting xor's the lsb in any case.
    Also, the LSB has been toggled at startup so that the process of
    updating A simultaneously stashes the next correct bit in the Y
    register.

    Nice code. But it seems that there must be a faster way than simply
    looping thru the bits.

    Can anyone think of faster way which uses existing multiply/divide
    instructions?

    - Jim
637.3CLT::GILBERTeager like a childWed Dec 31 1986 09:5812
re .1
	Oops.  My memory fails me.

re .2
	I think it does matter whether you add or subtract or XOR.

	There must be some loop invariants, some equations relating
	the input argument, R0, R1, and R2 that is true on each iteration
	of the loop, and which imply that R0 holds the correct result
	when we fall out of the loop.  For example, at 10$, we know
	(by induction) that R2 = (input-1)*2^R1 mod 2^32, and that
	0 <= R1 <= 30. 	But what is R0?
637.4error---><---answerVINO::JMUNZERWed Jan 07 1987 17:1731
Let   X    = input
      Y(j) = inverse of X with respect to 2^j  (for j = 1, 2, 3, ...)
  but Y(0) = 0

At 10$:  R1(j) = j
	 R2(j) = 2^j * (X - 1)
	 R0(j) = Y(j) + (1 - X * Y(j))

The secret is that (Y) uses low-order bits, and (1 - X*Y) uses high-order
bits.  Someone saw that they could be squeezed together!

There are two cases going from R0(j) to R0(j+1):

		A				
	Y(j+1) = Y(j)
	R0(j) = Y(j)  mod 2^(j+1)
	R0 bit j is zero
	R0(j+1) = R0(j)
		= Y(j) + (1 - X * Y(j))
		= Y(j+1) + (1 - X * Y(j+1))

		B
	Y(j+1) = Y(j) + 2^j
	R0(j) = Y(j) + 2^j  mod 2^(j+1)
	R0 bit j is one
	R0(j+1) = R0(j) - R2(j)
		= Y(j) + (1 - X * Y(j)) - 2^j * (X - 1)
		= Y(j) + 2^j + 1 - X * (Y(j) + 2^j)
		= Y(j+1) + (1 - X * Y(j+1))

John
637.5Thanks!CLT::GILBERTeager like a childFri Jan 09 1987 20:390