I'm trying to understand Lehmer's GCD algorithm but I found two conflicting descriptions about when to end the reduction step.
One comes from Lehmer, and is what Knuth, Jebelean and Wikipedia describe, i.e. to perform the Euclidean algorithm "until the quotients differ".
The other is found in Modern Computer Arithmetic by Brent and Zimmermann: "HalfBezout takes as input two 2-word integers, performs Euclid’s algorithm until the smallest remainder fits in one word".
The problem is that these two methods give slightly different results. For example, for the two 64-bit words 10499958131665514997 and 14799178230035213023, the first method computes the following coefficients:
[0, 1, 1, 0]
[1, 0, -1, 1]
[-1, 1, 3, -2]
[3, -2, -7, 5]
[-7, 5, 24, -17]
[24, -17, -31, 22]
[-31, 22, 148, -105]
[148, -105, -179, 127]
[-179, 127, 4623, -3280]
[4623, -3280, -4802, 3407]
[-4802, 3407, 9425, -6687]
[9425, -6687, -23652, 16781]
[-23652, 16781, 80381, -57030]
[80381, -57030, -104033, 73811]
[-104033, 73811, 496513, -352274]
[496513, -352274, -600546, 426085]
[-600546, 426085, 2298151, -1630529]
[2298151, -1630529, -2898697, 2056614]
[-2898697, 2056614, 10994242, -7800371]
[10994242, -7800371, -24887181, 17657356]
[-24887181, 17657356, 85655785, -60772439]
[85655785, -60772439, -110542966, 78429795]
[-110542966, 78429795, 196198751, -139202234]
[196198751, -139202234, -306741717, 217632029]
[-306741717, 217632029, 1116423902, -792098321]
while the second method computes one additional step:
[1116423902, -792098321, -2539589521, 1801828671]
So my question is, which method is the "correct" one? Or perhaps more precisely, does the second method always produce a correct reduction, i.e. does the first method terminate a bit too early? Is the first method too "cautious"?
This is the Python code I used to generate the sequences:
def first(x, y):
    # Quotient-comparison rule (Lehmer/Knuth): run Euclid on the word-sized
    # values while tracking the cofactors A, B, C, D, and stop as soon as
    # the two candidate quotients disagree (or a test divisor hits zero).
    A, B, C, D = 1, 0, 0, 1
    while True:
        if y + C == 0 or y + D == 0:
            break
        q = (x + A) // (y + C)
        if q != (x + B) // (y + D):
            break
        A, B, x, C, D, y = C, D, y, A - q*C, B - q*D, x - q*y
        print([A, B, C, D])
    return A, B, C, D
def second(x, y):
    # HalfBezout rule (Brent & Zimmermann): plain extended Euclid, stopping
    # once the smaller remainder fits in half a word (here, 32 bits).
    s, old_s = 0, 1
    t, old_t = 1, 0
    r, old_r = y, x
    while r >= 2**32:
        quotient = old_r // r
        old_r, r = r, old_r - quotient*r
        old_s, s = s, old_s - quotient*s
        old_t, t = t, old_t - quotient*t
        print([old_s, old_t, s, t])
    return old_s, old_t, s, t
x, y = 10499958131665514997, 14799178230035213023
first(x, y)
second(x, y)
Answer
As far as the validity of either algorithm goes, we have considerable freedom. At each step, we have two pairs of coefficients $(A,B)$ and $(C,D)$ that are:
- Computed from $x$ and $y$, the leading bits of long integer inputs $X$ and $Y$.
- Intended to be useful for $X$ and $Y$: we want $AX+BY$ and $CX+DY$ to be significantly smaller than $X$ and $Y$.
- Valid GCD simplifications for any inputs: we have $\gcd(Ax'+By',Cx'+Dy') = \gcd(x',y')$ for any $x'$ and $y'$, because we are multiplying by an integer-invertible matrix at each step, so we can express $x'$ and $y'$ as integer linear combinations of $Ax'+By'$ and $Cx'+Dy'$.
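That last property is easy to check directly. Here is a minimal sanity check (my own sketch; the helper name check_reduction is not from either source), reusing first and second from the question:

from math import gcd

def check_reduction(A, B, C, D, X, Y):
    # An integer matrix [[A, B], [C, D]] is invertible over the integers
    # exactly when its determinant is +1 or -1.
    assert abs(A*D - B*C) == 1
    # Such a transformation preserves the GCD of any pair of inputs.
    assert gcd(A*X + B*Y, C*X + D*Y) == gcd(X, Y)

X, Y = 10499958131665514997, 14799178230035213023
check_reduction(*first(X, Y), X, Y)   # stopping earlier: still valid
check_reduction(*second(X, Y), X, Y)  # stopping one step later: also valid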
In both of the slightly-different implementations of this algorithm, we use $x$ and $y$ to compute pairs of coefficients until some stopping point. Then we replace $X$ and $Y$ with $AX+BY$ and $CX+DY$ and repeat the algorithm. If we stop sooner or later than necessary, we can't really go wrong; we will always compute the correct GCD in the end. So what happens if we change the stopping point?
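To make that outer loop concrete before answering the question, here is a rough sketch (my own illustration, not code from any of the cited sources) that uses first from the question as the inner reduction and assumes 64-bit words; note that first prints its intermediate matrices, so this is noisy but illustrative:

def lehmer_gcd(X, Y):
    if X < Y:
        X, Y = Y, X
    while Y >= 2**64:
        # Take the leading 64 bits of both operands at the same shift.
        shift = X.bit_length() - 64
        A, B, C, D = first(X >> shift, Y >> shift)
        if B == 0:
            # The leading words allowed no progress at all; fall back to
            # one full-precision Euclidean step.
            X, Y = Y, X % Y
        else:
            X, Y = abs(A*X + B*Y), abs(C*X + D*Y)
            if X < Y:
                X, Y = Y, X
    # The operands now fit in a word; finish with plain Euclid.
    while Y:
        X, Y = Y, X % Y
    return X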
If we put the stopping point too soon, then we won't have made as much progress, so $AX+BY$ and $CX+DY$ will be larger than necessary. We will have to do more iterations of the outer loop.
If we put the stopping point too late, then the progress we make on $\gcd(x,y)$ will have stopped reflecting progress on $\gcd(X,Y)$. The values $Ax+By$ and $Cx+Dy$ will keep shrinking, but $AX+BY$ and $CX+DY$ will behave essentially randomly: they may shrink or they may start growing again. We may have to do more iterations of the outer loop.
Lehmer's algorithm as outlined in Wikipedia puts the stopping point at exactly the point where optimal steps for $(x,y)$ are no longer necessarily optimal for $(X,Y)$. We know that $\frac{AX+BY}{CX+DY}$ and $\frac{Ax+By}{Cx+Dy}$ are both always between $q_1 = \frac{A(x+1)+By}{C(x+1)+Dy}$ and $q_2 = \frac{Ax+B(y+1)}{Cx+D(y+1)}$, so if there is no integer between $q_1$ and $q_2$, the calculation for $(X,Y)$ is unambiguously the same as the calculation for $(x,y)$. Once that stops being true, the algorithm stops.
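That test can be phrased directly in code with exact rational arithmetic. A sketch (the helper name is mine, and it glosses over the corner case where a bound is itself an integer):

from fractions import Fraction
from math import floor

def quotient_is_unambiguous(A, B, C, D, x, y):
    d1 = C*(x + 1) + D*y
    d2 = C*x + D*(y + 1)
    if d1 <= 0 or d2 <= 0:
        return False  # a divisor could reach zero, so the quotient is ambiguous
    q1 = Fraction(A*(x + 1) + B*y, d1)
    q2 = Fraction(A*x + B*(y + 1), d2)
    # If both bounds have the same floor, no integer lies strictly between
    # them, so every input with leading words (x, y) takes the same quotient.
    return floor(q1) == floor(q2)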
The HalfBezout algorithm in Modern Computer Arithmetic doesn't bother calculating the upper and lower bounds. Instead, it knows that the stopping point will come approximately when the remainders are half the size of a word, so it just stops when that condition is satisfied.
This heuristic is sometimes too careful and sometimes too reckless. In particular, when $x$ or $y$ is smaller than expected, it will actually perform fewer steps than it could; in other cases, such as your example, it performs more steps than the quotient-comparison method.
However, it only needs to compute one quotient at each step rather than two, and it's a heuristic that guarantees a rate of progress (by approximately one half-word) in $\gcd(X,Y)$, so it's probably fine.
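If you want to see the trade-off numerically, you can compare how much each rule actually shrinks the full inputs (a throwaway sketch, assuming the functions above):

def remaining_bits(reduce_fn, X, Y):
    A, B, C, D = reduce_fn(X, Y)
    # Bit length of the larger of the two reduced values.
    return max(abs(A*X + B*Y), abs(C*X + D*Y)).bit_length()

X, Y = 10499958131665514997, 14799178230035213023
print(remaining_bits(first, X, Y), remaining_bits(second, X, Y))

Here the extra step taken by the HalfBezout rule leaves strictly smaller values, but as argued above, that is not guaranteed in general.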