Factoring large numbers is a time-consuming problem. RSA cryptography, and secure communication over the internet, depends on this fact. Algorithms for factoring numbers are also fascinating in their own right.
The elementary factoring algorithm is trial division, in which we test whether our number is divisible by all prime numbers less than the square root of the number. Alternatively, if we don’t have access to a table of prime numbers beforehand, we can try all potential divisors of the form (which is equivalent to sieving out all multiples of 2 and 3).
Trial division works well up to a point, and in fact trial division up to a small number (say 1 million) is often the first step in factoring an unknown composite number, which can easily be done in under a second these days. However, when a number has only large divisors (say two divisors similar in size), trial division becomes impractical.
Two ideas: can we improve trial division in this situation by starting from the largest potential divisor (e.g. the first number of the form above ) and working down? What about testing randomly generated potential divisors?
Fermat (mid-1600s) implemented a method based on the first idea as follows:
Let’s assume where and are similar in size (i.e. close to ). Then by defining and we can write as . As such, start as the first square number above , and check if is a perfect square. If it is, we can easily work out the two factors and by the definitions of and above; if not, increment to the next square number.
This approach has considerable benefit over trial division starting at in that perfect squares have special forms (e.g. a perfect square can never end in a 2,3,7 or 8), and the number of times we’ll actually need to test if is square is relatively few. While this trick helps with hand calculation, it doesn’t really speed things up in the computer age.
The second idea (i.e. randomly searching for factors) doesn’t help at all. For really large numbers, the probability of randomly selecting a factor basically becomes zero. However, there are ways for looking for certain numbers (congruent squares) that aren’t factors, but allow us to find the factors. While these methods are not exactly random, they cleverly generate a set of numbers that are likely to contain congruent squares. Three of these methods are the continued fraction method, the quadratic sieve, and the number field sieve, the last of which holds the current record for factoring numbres and can work on numbers of around 150 decimal digits.
The generalisation of Fermat’s ideas is as follows. Instead of searching for , we look for and such that
Ignoring trivial cases means that we also require the following two conditions:
Together, these conditions give us a way of finding a proper (i.e. non-trivial) divisor of by finding the greatest common divisor () of and . Why? First of all, the is less than since if then is a multiple of , which contradicts condition (3). Also, we can see that since if then by Euclid’s algorithm, we can find integers and such that
Now, multiply both sides by and we see that
The left-hand side is congruent to zero by condition (1), so that
Which contradicts condition (2).
In the next post I plan to discuss the continued fraction method, which uses the above to factor composite numbers.
Knuth, D.E. (1969) The Art of Computer Programming, Volume 2: Seminumerical algorithms. Addison-Wesley, Reading, Massachusetts.