[Hand-Written Paper Implementation] Asymptotically Fast Factorization of Integers
Step-by-Step hand-written Guide to Dixon's Algorithm Based on the Original 1981 Paper : Asymptotically Fast Factorization of Integers
Quick intro
LeetArxiv is Leetcode for implementing Arxiv papers. We offer weekly, hands-on, step-by-step coding guides to programmers who want to transition into careers in computational research.
In LeetArxiv fashion, we’ll implement this paper step-by-step, page-by-page, semicolon-by-semicolon.
*We provide a hand-written implementation that can be translated to different programming languages.
1.0 Introduction
In 1981, while working at Carleton University, John D. Dixon invented a simple but remarkably fast integer factorization algorithm. He titled the paper Asymptotically Fast Factorization of Integers [1].
In this article, we shall code Dixon’s original algorithm in C, and establish the algorithm’s implementation nuances.
The original paper is 6 pages long. In typical LeetArxiv style, we’ll go through the entire paper step-by-step. We recommend you open this paper link in a separate tab while coding alongside this guide.
Why is this paper important?
Big O complexity : Dixon’s algorithm was the first ever integer factorization algorithm with proven sub-exponential complexity.
Historical significance : The Quadratic Sieve and the General Number Field Sieve [2] are optimized versions of Dixon’s algorithm.
Paper simplicity : The original paper is only 6 pages long and super easy to follow.
Page 1 : The Problem to Solve
On page 1, Dixon introduces an identity at the heart of many integer factorization algorithms [3] :
\(x^2 \equiv y^2 \pmod{n}\)
where n is a composite number (not a prime), and x is congruent to neither y nor negative y modulo n.
This identity is reported to yield factors about 67% of the time. That is, roughly 2 out of every 3 (x, y) pairs work [4].
*We dove deep into the theory of modulos here in our Super-friendly Guide to Finite Fields for programmers.
Using the identity above and the GCD algorithm, we can factor any integer, n, into its prime factors.
Here is a hand-written LaTeX example that factors the integer 35 using the information on page 1.
Let’s factor the number 35 using the identity above.
Step 1 : We set n to 35 to get the equation below.
\(x^2 \equiv y^2 \pmod{35}\)
Step 2 : We brute force x and y values that satisfy the equation above.
\(\begin{array}{ll} \text{When } x \ \text{or} \ y = 1: & 1^2=1 \equiv 1 \pmod{35} \\ \text{When } x \ \text{or} \ y = 2: & 2^2 = 4 \equiv 4 \pmod{35} \\ \text{When } x \ \text{or} \ y = 3: & 3^2 = 9 \equiv 9 \pmod{35} \\ \text{When } x \ \text{or} \ y = 4: & 4^2 = 16 \equiv 16 \pmod{35} \\ \text{When } x \ \text{or} \ y = 5: & 5^2 = 25 \equiv 25 \pmod{35} \\ \text{When } x \ \text{or} \ y = 6: & 6^2 = 36 \equiv 1 \pmod{35} \\ \end{array}\)
Step 3 : We choose values for x and y. From brute-force, we observe
\(\begin{array}{ll} 6^2 \equiv 1 \pmod{35}\\ \text{and}\\ 1^2 \equiv 1 \pmod{35}\\ \\ \text{Taking}\\ x = 6 \ \text{and} \ y = 1\\ \text{we get}\\ 6^2 \equiv 1^2 \pmod{35}\\ \end{array}\)
Step 4 : Find x-y and x+y, where x = 6 and y = 1.
\(\begin{array}{ll} \text{(i).} \ x-y = 6 - 1 = 5 \\ \text{(ii).} \ x+y = 6 + 1 = 7 \end{array}\)
Step 5 : Compute GCD(x-y, n) and GCD(x+y, n), where n = 35, x-y = 5 and x+y = 7
\(\begin{array}{ll} \text{GCD}(5, 35) = 5 \\ \text{GCD}(7, 35) = 7 \\ \end{array}\)
Step 6 : If both GCDs are nontrivial (neither 1 nor n), then we have our answer : 35 = 5 * 7
Page 2 : Dixon’s Algorithm
This section illustrates Dixon’s approach to solving the identity introduced on page 1. On page 2, Dixon provides pseudocode for his algorithm, then he proves the Big O complexity of his algorithm.
2.0 Brief Introduction to n-smooth Numbers
In number theory, an n-smooth number is an integer whose prime factors are all less than or equal to a given number n.
For example, a 7-smooth number has all prime factors ≤ 7.
2.1 Dixon’s Trick
Dixon observed that brute-forcing squares is somewhat cumbersome.
He proposed a more efficient search : brute-forcing values z for which z² (mod n) is a smooth number.
The screenshot at the beginning of the section formally describes the algorithm. We provide a plain-English example of the algorithm, annotated with C.
Let’s factorize n = 84923 using Dixon’s Algorithm
Initialization :
Define a list of numbers L, ranging from 1 to 84923.
L = {1, …, 84923}
Define a value v. This is the smoothness factor, the upper bound on the allowed prime factors.
v = 7
Define a list P containing all the prime numbers less than or equal to v.
P = {2, 3, 5, 7}
Define B and Z, two empty lists. B is a list of powers while Z is a list of accepted integers.
B = { } and Z = { }
Step 1 : Write a for loop that indexes the list L. Each element in L is indexed as z. The for loop exits at the end of the list.
int n = 84923; for(int i = 1; i <= n; i++) { int z = i; }
Step 2 : Find z² (mod n). Then find the prime factorization.
\(\begin{array}{c} 1^2 \bmod 84923 = 1 = 2^0 \cdot 3^0 \cdot 5^0 \cdot 7^0 \\[8pt] \vdots \\[8pt] 513^2 \bmod 84923 = 8400 = 2^4 \cdot 3^1 \cdot 5^2 \cdot 7^1 \\[8pt] \vdots \\[8pt] 537^2 \bmod 84923 = 33600 = 2^6 \cdot 3^1 \cdot 5^2 \cdot 7^1 \\[8pt] 538^2 \bmod 84923 = 34675 = 5^2 \cdot 19^1 \cdot 73^1 \end{array}\)
Step 3 : If z² (mod 84923) is 7-smooth, then append its powers to list B and append z to list Z.
Z = {1, 513, 537}
B = { [0, 0, 0, 0], [4, 1, 2, 1], [6, 1, 2, 1] }
Step 4 : This step is split into two parts.
Part 1 : Find B modulo 2.
\( B = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 4 & 1 & 2 & 1 \\ 6 & 1 & 2 & 1 \end{bmatrix} \equiv \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & 1 \end{bmatrix} \pmod{2} \)
Part 2 : Check whether any combination of rows of B sums to a vector of even numbers.
For example, summing Row 2 and Row 3 gives us a vector of even numbers.
\(\begin{array}{c} \text{Let } R_2 = \{0,1,0,1\} \text{ and } R_3 = \{0,1,0,1\}, \\[8pt] \text{then } R_2 + R_3 = \{0,1,0,1\} + \{0,1,0,1\} \\[8pt] \hspace{55pt} = \{0,2,0,2\}. \end{array}\)
Step 5 : This step is split into four parts.
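Part 1. (Finding x): Multiply the corresponding z values for the rows found in Step 4 and reduce modulo n. This gives us x, because x² is then the product of the corresponding z² values.
For Row 2 (z = 513), we had \(2^4 \cdot 3^1 \cdot 5^2 \cdot 7^1\).
For Row 3 (z = 537), we had \(2^6 \cdot 3^1 \cdot 5^2 \cdot 7^1\).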
Thus, we find x :
\(\begin{array}{ll} (513 \cdot 537)^2 \equiv y^2 \pmod{84923} \\ \\ \text{where}\ x^2 \equiv (513 \cdot 537)^2 \pmod{84923} \\ \\ \text{thus} \ x \equiv 513 \cdot 537 \pmod{84923} \\ \\ \text{so} \ x \equiv 275481 \pmod{84923}\\ \\ \text{Finally} \ x \equiv 20712 \pmod{84923}\\ \end{array}\)
Part 2. (Finding y) : Multiply the corresponding smooth factorizations for the rows found in Step 4. Then find the square root. This gives us y.
\( \begin{array}{ll} y^2 = (2^4 \cdot 3^1 \cdot 5^2 \cdot 7^1) \times (2^6 \cdot 3^1 \cdot 5^2 \cdot 7^1) \\ \\ \text{By the multiplication law of exponents,} \\ y^2 = 2^{(4+6)} \cdot 3^{(1+1)} \cdot 5^{(2+2)} \cdot 7^{(1+1)} \\ \\ \text{Thus,} \\ y^2 = 2^{10} \cdot 3^2 \cdot 5^4 \cdot 7^2 \\ \\ \text{Taking square roots on both sides gives} \\ y = 2^5 \cdot 3^1 \cdot 5^2 \cdot 7^1 \\ \\ \text{Therefore,} \\ y = 32 \times 3 \times 25 \times 7 \\ \\ \text{Finally,} \\ y = 16800 \end{array} \)
Part 3. (Find x + y and x - y) where x = 20712 and y = 16800.
x + y = 20712 + 16800 = 37512
x - y = 20712 - 16800 = 3912
Part 4. Compute GCD(x+y, n) and GCD(x-y, n), where n = 84923, x+y = 37512 and x-y = 3912
\(\begin{array}{ll} \text{GCD}(37512, 84923) = 521 \\ \text{GCD}(3912, 84923) = 163 \\ \end{array}\)
A quick check shows 84923 = 521 * 163.
We provided a step-by-step implementation of Dixon’s algorithm. We illustrated all the nuances hidden within the pseudocode. Finally, we gave a concrete example.
Citations
Dixon, J. D. (1981). Asymptotically fast factorization of integers. Mathematics of Computation, 36(153), 255–260.
Lenstra, A. K., & Lenstra, H. W. (1993). The Development of the Number Field Sieve. Lecture Notes in Mathematics. Springer.
Case, M. (2003). A Beginner’s Guide to the General Number Field Sieve. Oregon State University.