```
Z<-FACTORS N;X
Z<-(Z>0)/Z<-(0=N #mod X)*X<-1+iN
Z<-PRIMES N
|Generate prime numbers less than N
Z<-(2=+/y0=N #outer #mod 1+N)/N<-iN
Z<-PRIMES_IN LST
|Extract primes in list
Z<-(2=+/y0=LST #outer #mod LST)/LST
```

2. Establish absence of factors.
Stop wasting time.
3. Quadratic sieve.
This is really a development of Fermat's method.
Based on the expression N=X^2-Y^2
Search for perfect squares in the set X^2-N
By clever modular arithmetic it should be possible to eliminate
most values of X^2-N without ever having to calculate a square root. In
practice, starting from a given X it should be possible to compute new
values X^2-N by simple additions assuming X goes up in unit steps. This
allows the computation of large mostly zero, binary arrays to eliminate
testing. Despite the ease of filtering the search space, the search
space is much larger than testing up to sqrt(N). The quadratic sieve
may be started at sqrt(p*N) to test for pN=X^2-Y^2. This method works
best with large arrays of mostly zero elements, but it seems slow
compared with trial division.
There are many methods that rely on finding solutions to
x^2=y^2 mod N or x^2-Y^2=kN. Clearly the fastest method would be to
guess a pair of such numbers immediately. Both the continued fraction
method and the quadratic sieve attempt to find small quadradic residues
of N in the hope of building an equation
kN= x^2-y^2 => x^2=y^2 (mod N)
If for example residues x^2 = pq y^2=qr z^2=pr then (xyz)^2= (pqr)^2
and we find a solution s^2=t^2 mod N, and hopefully this gives a non
trivial factorisation. There is much literature on this concept and
there is a long history of splitting this trajectory chasing amongst a
large number of machines. With numbers of the RSA challange size the
search for quadratic residues which can factorize may take almost
forever.
4. Trial GCD
The GCD of N and a large number X will nearly always be 1. When it
is not, you get a factor of N, or possibly N itself. The algorithm is:
for (X<-RANDOM;1=R<-GCD(X,N);X<-RANDOM)
return (R)
Here RANDOM guesses numbers, and a lucky guess gives either a factor
or the number N itself. Suppose N=pq then the number X could lie on one
of the lines x=pk or x=qk or possibly x=pqk, where k is a positive
integer.
5. Curve enumeration
Enumerate points on the hyperbola xy=N. Normally (N,1) and (1,N) are
there, but other points are hard to find. Math proofs of the average
order of functions such as d(n) and phi(n) depend very much on these
arguements. The simplest factor search would look for min {x^2+y^2 | xy
= N} with x, y integer. The smallest values could be as near as x=y=
sqrt N. Fermat's test can tell if the number of points is more than
two, but does not say where these points are.
6. Follow trajectories.
(Z/NZ) is a ring and (Z/NZ)* is a semi-group with unit. It is not a
group, and this is easily verified for composite N. Finding a
particular non invertible element is equivalent to the factorization
problem. The idea is to generate sequences of numbers X so that gcd(X,
N) gives a factorization. The elliptic curve method does a group
addition algorithm many times in order to find a non invertible element
of the semi-group (Z/NZ)*. The group inversions are all those of point
distances in some geometrical structure imposed on Z/NZ.
Pollard's p-1 method is based on this logic. This essentially
iterates the sequence X<-(X^p) #mod N where p runs through a list of
primes and prime powers up to a limit. There is a way of reducing the
number of GCD tests by only checking after a hundred iterations or so,
but it is possible to miss out factors this way.
Pollard's p-1 method depends on the fact that one of the values p-1
where p runs through prime factors of N has a factorisation in terms of
very small primes: in other words p-1 is a round number. If the number
N does not have this property then the method will not work. A more
general method uses properties of elliptic curves.
An elliptic curve is defined by an equation such as
Y^2=X^3+aX+b (mod N).
An elliptic curve over N is the set
EC={ (x,y) | y^2=x^3+aX+b (mod N)} U { O=(N,N) }
Here O=(N,N) is the point at infinity.
Given points P=(s,t) and Q=(u,v) it is possible to define a third
point R=P+Q by the chord-tangent process. Draw the line through P,Q to
meet the curve in a third point R'. Draw a vertical line through R' to
intersect the curve at its reflection about the X axis at the point R.
This operation is commutative and associative: to add together points
X,Y and Z then order in which the points are added does not affect the
result of the calculation.
(X+Y)+Z = X+(Y+Z)=(X+Z)+Y
Proof of this fact is not trivial.
If you prove that points on the curve can be paramaterized by values
of the Weirstrass P-function (pronounced pay) then associativity is a
trivial consequence, but this depends on algebra and analysis that took
about 80 years to get worked out.
Given points P=(s,t) and Q=(u,v) the coordinates of the point
R=(x,y) which is equal to P+Q may be calculated by the following
formulae.
Firstly if s-u is none zero define theta=(t-v)/(s-u)
Then x = theta^2 -(s+u)
y' is defined by (y'-t)/(x-s) = theta
y' = t + theta*(x-s)
y = -y' = -t - theta*(x-s)
Secondly if s-u and v-t are both zero, then point addition gives
R=P+P = [2]P. In this case the value theta is the slope of the tangent
to the curve at P=(s,t). theta = (a+3s^2)/2t.
x = theta^2- 2s; y'= t+theta*(x-s) and y= -t-theta*(x-s).
All calculations are done modulo N. Expressions such as a/b are
converted to values modulo N by finding the number b' such that bb' = 1
(mod N) then replacing a/b by a*b' (mod N). b', the multiplicative
inverse of b mod N is found by an process known as the Euclidian
algorithm. If N is composite then there are some values for which the
inversion process fails. These numbers are multiples of a factor of N.
In case N=pq is the product of two prime factors then there are p+q
such points. Division by zero is pay-dirt in factorisation challanges.
In the case that P and Q have equal values for x, i.e. s-u=0, but
the y values are distinct i.e. t+v=0, then we say the vertical line
meets the curve at infinity (O), and P+Q=O.
For a given number N it is easy to construct an elliptic curve EC
passing through any pair of points P,Q in (Z/NZ)x(Z/NZ). The set of
points O,P,P+P,P+P+P, ... etc. forms the orbit of the point P in EC.
For convenience the sum P+P...+P taken m times is written [m]P.
Point multiplication starts off as a dynamic system on
(Z/NZ)x(Z/NZ). Elliptic curves introduce an extra point at infinity.
For any point a on the curve E the mapping F(x) = x+a is a function
which is complicated enough to be used in cryptography. The easy to
write function in fact gives a set of points which jump all over the
box [0,N-1]x[0,N-1].
Elliptic curve works as follows. Select elliptic curves over Z/NZ
then work out the values [m]P for a point P on the curve with a
specially selected value m. If calculation of [m]P requires a division
by zero somewhere along the line, then a factor has been found. In most
cases the calculation will proceed without a hitch, so it is necessary
to chose a new curve and a new point to restart the calculation.
In practice this method much tuning in order to avoid intolerably
sluggish performance.
A function long studied by mathematicians is the bilinear
transformation T(Z)<- (AZ+B)/CZ+D. This can be used for factorisation.
The arithmetic is simpler than the elliptic curve method. Given a non
zero point Z[0] in Z/NZ and a 2x2 matrix
T = |A B| with AB-CD non zero then
|C D| it is possible to define the orbit of
Z[0] under T by the rule
Z[n+1] = (A*Z[n]+B)/C*Z[n]+D)
This could be called the 'blit' method of factorisation and it is
undoubtedly well known. Because the successive powers of the matrix T
may be computed by a squaring and multiplication method it should be
possible to compute T^m * Z[0] for large values of m with only order
K*log(m) operations. Whether this increases the likelyhood of finding
divisors of zero is another question.
In practice the blit method with short orbits (less than 100) seems
more effective than the elliptic curve method in finding small
divisors.
Special choices of the matrix M give lucas sequences.
(X[n+1], Y[n+1]) <- M * (X[n],Y[n])
The fractal curve trajectory z<-z^2+c is used in Pollard's rho
method of factorisation. This gives a very simple method of finding
large factors of a number. The Pollard Rho method was invented in the
1980s, and the idea is extremely clever. The running time is of order
sqrt(p) where p is a factor of the composite number N. Getting the
formula from a book [1] took some time. The local library eventually
provided an inter library loan. For RSAC numbers with from 200-700
digits the effort would still require 10^50 - 10^200 steps.
The method is called the rho method because it looks at a trajectory
of points and seeks a place where the trajectory degenerates into a
closed loop, visiting a set of points in sequence.
._________________________________.
| |
| | x3=x15
| 89 |
| 7 10 | The trajectory x1,x2,x3...
| 6 11 | looks like the greek letter
| 5 12 | rho.
| 4 13 |
| 314 |
| 2 |
| 1 |
._________________________________.
```
Z<-P_FACTOR STR;CNT;A;N;Q;QV;SEED;T;X;Y |Pollard's rho-method. |See
Reisel [2] pp 180-181 CNT<-12000 & A<-SEED<-"" & QV<-0 N<-"-cnt
CNT<-#fi SHIFT|-a A<-SHIFT|-seed SEED<-SHIFT|-v QV<-1" GETOPTS STR IF
0=rA; A<-YRAND N IF 0=rSEED; SEED<-YRAND N Q<-"1" & X<-Y<-SEED & Z<-N
WHILE 0
```0 |Q<-"1" WEND
Z<-A EA B;C;D;R |Euclidean algorithm with A>B R<-A #mod B & D<-B WHILE
~R #equiv "0"; C<-D & D<-R & R<-C #mod D & WEND Z<-D
Z<-YRAND X;N |Long random 0<=ZN<-rX; Z<-z ? #fi X & ->0 Z<-(z ?
#fi 5tX),"0123456789"[?(N-5)r10]
Z<-X YMOD M |Modulus function including negative X IF "-" #ne 1tX; Z<-X
#mod M & ->0 Z<-M-("0"-X) #mod M

6. Analytic expression.
The number theoretic functions phi(N) and sigma(N) have well known
generating functions. The values of these number theoretic functions
can be extracted by contour integration. Numerical integration
techniques could give a method of evaluating these functions. In
practice these integration techniques would be rather flakey and depend
very much on guesswork. People call this stuff a 'Monte Carlo Method'.
There are also options of finding really rapid evaluations of
numbers such as the 2^n binomial coefficient (mod N) for really high
values of n in time polynomial in n. These can be resolved as integrals
of (large) powers of trignometrical functions.
7. Summary
The Rho method was the first breakthrough in hundreds of years. Many
methods have been adapted to factorising numbers of distinct forms such
as 2^n+-1. The RSA challange numbers are so large that elementary
arithmetic operations are significantly slow.
The Number Field Sieve (NFS) [4] seem most promising of the present
methods because there seems plenty of opportunity to bring in results
from other areas of mathematics. There is plenty of choice. Much of
algebraic number theory was motivated by people looking for a proof of
Fermat's conjecture.
The book by Crandell and Pomerance [3] outlines factorisation
methods to be used by the as yet to be invented quantum computer. In
this area researchers are busy teaching the computer to factorize tough
numbers such as 6 and 15.
[1] Prime Numbers and computer methods for factorisation
(2 nd edition)
Hans Reisel. Pub Berkhauser. Boston 1994
[2]
www.rsasecurity.com/rsalabs/challenges/factoring/challengenumbers.txt
www.rsasecurity.com/rsalabs/challenges/factoring/RSA-640.txt
[3] Prime Numbers. A computational Perspective
Crandell & Pomerance
Springer
[4] http://nfsnet.org
Website devoted to number field sieve.
[5] Maths Holy Grail could bring disaster to internet.
Tim Radford , Guardian 7/9/2004