Factoring large numbers with the continued fraction method

Recently I mentioned a general method for factoring a large number, not by searching explicitly for factors of the number, but searching instead for congruent squares. This method lies at the heart of most contemporary algorithms for factoring large numbers.
Let’s look at the “continued fraction method” for finding such congruent squares. This method was state of the art in the 1970s, and although Schneier writes that this method is “not even in the running” compared to other more recent methods (e.g. quadratic sieve, elliptic curve method, and the number field sieve), it is worth a look for a few reasons:

  • The continued fraction method has about 90% in common with the more powerful quadratic sieve
  • There’s a certain joy in computing continued fractions, and some sort of magic involved in why it works
  • This method blows trial division out of the water, and if your only factoring experience is with trial division, you’ll be suitably amazed at its power.

The algorithm gets its name from the method of finding congruent squares \pmod n. Namely, we compute the continued fraction convergents of \sqrt{n}:
\sqrt{n} = a_0 + \cfrac{1}{a_1+ \cfrac{1}{a_2+ \cfrac{1}{a_3+\dotsb }}}
The basic idea is that at each step, the continued fraction representation of \sqrt{n} is in some sense the ‘best’ rational approximation so far, and so the residual is a relatively small number. As an example, if we stop at step i, then we can work out the convergent (i.e. the simplified continued fraction) from standard recurrence relations:
\sqrt{n} \approx h_i/k_i
Squaring this, we get a ‘residual’ r_i
nk^2_i - h_i^2 =(-1)^i r_i
So that
h_i^2 \equiv (-1)^{i+1} r_i \pmod n.
Which is relatively small compared to n and, as such, is relatively easy to factor. The next part of the algorithm involves factoring r_i over a previously chosen ‘prime base’, which may be the first few primes for small numbers, and 100s of primes for larger numbers. Since r_i is relatively small by design of the algorithm, it can be quickly factored over the prime base.

As an example, let’s apply this method to the number 2257 with the prime base of [2,3]. The first two congruences that fully factor over the prime base are:
47^2 \equiv -2^4 3^1 \pmod {2257}
95^2 \equiv -2^0 3^1 \pmod {2257}
Although the right-hand side of both of these equations are not square, combining the two equations yields a pair of congruent squares, namely:
(47 \cdot 95)^2 \equiv 2^4 3^2 \pmod {2257}
Which reduces to
2208^2 \equiv 12^2 \pmod {2257}
Which is a non-trivial congruent square, and as discussed previously, taking the \gcd of 2208 \pm 12 and 2257 gives proper factors of 2257. Try it!

 
Automating this algorithm requires a way of combining congruences such as the above in order to obtain perfect squares. This is achieved using row reduction of a matrix to find a linear dependence in the prime base exponents \pmod 2. There are also some subtleties in the choice of primes for the factor base; replacing n by kn with k square-free allows for a different (possibly better) factor base to be chosen. More details are provided in the references below.
 

If you’d like to have a go at implementing this algorithm, here are a few test cases. All are semiprimes (i.e. the product of two similarly sized prime numbers). The second last is infeasible using trial division (or at least my implementation of it), and the last broke Pollard’s rho method (e.g. through python’s sympy.ntheory.factorint() function). I estimate that my implementation of the continued fraction factorisation algorithm can factor integers of the order of 40 decimal digits. When n gets large, diligence is required to ensure all calculations occur on arbitrary precision integers rather than floating point numbers!

# digits semiprime
4 9271
6 418679
10 6426786497
15 665059573553159
21 729461128210276840421
30 194551432662383450877400470563

 

Further reading

Cohen, H. (1993) A Course in Computational Algebraic Number Theory. Springer-Verlag, Berlin Heidelberg

Crandall, R. and Pomerance, C. (2005) Prime Numbers: a computational perspective. Second Edition, Springer, New York.

Morrison, M.A. and Brillhart, J. (1975) A method of factoring and the factorization of F7. Math. Comp. 29, 183-205

Schneier, B. (1996) Applied Cryptography: protocols, algorithms, and source code in C. Second Edition, Wiley, New York.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s