# Factoring large numbers with the continued fraction method

Recently I mentioned a general method for factoring a large number, not by searching explicitly for factors of the number, but searching instead for congruent squares. This method lies at the heart of most contemporary algorithms for factoring large numbers.
Let’s look at the “continued fraction method” for finding such congruent squares. This method was state of the art in the 1970s, and although Schneier writes that this method is “not even in the running” compared to other more recent methods (e.g. quadratic sieve, elliptic curve method, and the number field sieve), it is worth a look for a few reasons:

• The continued fraction method has about 90% in common with the more powerful quadratic sieve
• There’s a certain joy in computing continued fractions, and some sort of magic involved in why it works
• This method blows trial division out of the water, and if your only factoring experience is with trial division, you’ll be suitably amazed at its power.

The algorithm gets its name from the method of finding congruent squares $\pmod n$. Namely, we compute the continued fraction convergents of $\sqrt{n}$:
$\sqrt{n} = a_0 + \cfrac{1}{a_1+ \cfrac{1}{a_2+ \cfrac{1}{a_3+\dotsb }}}$
The basic idea is that at each step, the continued fraction representation of $\sqrt{n}$ is in some sense the ‘best’ rational approximation so far, and so the residual is a relatively small number. As an example, if we stop at step $i$, then we can work out the convergent (i.e. the simplified continued fraction) from standard recurrence relations:
$\sqrt{n} \approx h_i/k_i$
Squaring this, we get a ‘residual’ $r_i$
$nk^2_i - h_i^2 =(-1)^i r_i$
So that
$h_i^2 \equiv (-1)^{i+1} r_i \pmod n$.
Which is relatively small compared to $n$ and, as such, is relatively easy to factor. The next part of the algorithm involves factoring $r_i$ over a previously chosen ‘prime base’, which may be the first few primes for small numbers, and 100s of primes for larger numbers. Since $r_i$ is relatively small by design of the algorithm, it can be quickly factored over the prime base.

As an example, let’s apply this method to the number 2257 with the prime base of [2,3]. The first two congruences that fully factor over the prime base are:
$47^2 \equiv -2^4 3^1 \pmod {2257}$
$95^2 \equiv -2^0 3^1 \pmod {2257}$
Although the right-hand side of both of these equations are not square, combining the two equations yields a pair of congruent squares, namely:
$(47 \cdot 95)^2 \equiv 2^4 3^2 \pmod {2257}$
Which reduces to
$2208^2 \equiv 12^2 \pmod {2257}$
Which is a non-trivial congruent square, and as discussed previously, taking the $\gcd$ of $2208 \pm 12$ and $2257$ gives proper factors of 2257. Try it!

Automating this algorithm requires a way of combining congruences such as the above in order to obtain perfect squares. This is achieved using row reduction of a matrix to find a linear dependence in the prime base exponents $\pmod 2$. There are also some subtleties in the choice of primes for the factor base; replacing $n$ by $kn$ with $k$ square-free allows for a different (possibly better) factor base to be chosen. More details are provided in the references below.

If you’d like to have a go at implementing this algorithm, here are a few test cases. All are semiprimes (i.e. the product of two similarly sized prime numbers). The second last is infeasible using trial division (or at least my implementation of it), and the last broke Pollard’s rho method (e.g. through python’s sympy.ntheory.factorint() function). I estimate that my implementation of the continued fraction factorisation algorithm can factor integers of the order of 40 decimal digits. When $n$ gets large, diligence is required to ensure all calculations occur on arbitrary precision integers rather than floating point numbers!

 # digits semiprime 4 9271 6 418679 10 6426786497 15 665059573553159 21 729461128210276840421 30 194551432662383450877400470563