THE POLITICAL ECONOMY OF MODULAR ARITHMETIC
An account of the attempted factorisation of the RSA-numbers
3 OCTOBER 2004
I have been trying to get $600000 for four months or so. Du Sautoy's
book [1] must have had some influence on the decision to invest the
time. The RSA website looks reasonable. They are supposed to be a
company with some money, but anyway the interesting thing about the
factorisation challenge is the way that e-commerce would feel some
tremors. The money offered is real money. The public domain source code
for d4maths is widely avaialable, and cached in many places, but of
course someone else could produce factorisations running the software
being used here.
It is also possible that large teams operationg number field sieve
will get through to these numbers, but they are still stuck with an
#exp(#sqrt(#ln(n))) complexity which is definately sub exponential
growth, but not much. It is still worse than any power of #log(n), so
factorisation is still an NP problem, provided that this is the best
method.
The same sort of estimate comes up in the prime number theorem.
pi(x)= integral (1/ ln x) + error(n).
The error term found over one hundred years ago is the quantity
n * #exp(-#sqrt #ln n). To this day no one has pushed exact
estimations of pi(x), the number of primes less than x, to beyond x in
the twenty to thirty digit range.
There are ethical concerns in being involved in the security
industry. When someone wants to keep something secret they aften want
to deceive. Policing the communication channels has always been
important in the growth of the state. There are many popular books on
ciphers and code breaking not to mention reminiscences of those
directly involved. The basic steps of public-private key encryption and
key exchange were worked out in the late 1970s well before the growth
of the internet.
Electronic security has good sales pitch. The expert could argue
that Germany had lost two world wars because of poor communications
security. Next came the Cold War and all those stories of public school
educated cypher clerks subourned into sexual adventures and treachery.
Both Ludlum and Le Carre represent many of the organisations which
invest heavily in electronic security gadgetry as ridddled with
incompetance and beaurocracy as well as conspiracy and interservice
rivalry. The salesman simply has to tell the army that the navy has
ordered something better and the army will want the latest.
The trouble about electronic security is that it is not fool proof.
The mere fact that the industry is so big confirms that the world is
run by fools. Michael Moore may call them stupid white men, but of
course some of the underlying stupidity has been adapted by all races.
The class group which Moore describes includes many with high
electronic security budgets. They didn't have enough Arabic translators
to read internet chatter about 9/11 until well after the event.
In the UK there have been demonstrations outside an American
listening station in Yorkshire. Places like the listening station, and
GCHQ in west England are part of an eavesdropping infrastructure which
is meant to defend us all from terrorism and hostile foreign powers.
These defenders of our liberty can be compared with the crew of a
sinking ship manning hand pumps to stop the wreck from sinking. They
are suffering information overload, while given yet more targets to
monitor.
I once drove up to the Langley headquarters of the CIA although in a
somewhat furtive fashion. I was not sure if I would be confronted by
armed guards or something. I just took the turn off that said CIA and
drove up to a turning place and went back to the freeway which follows
the Potomac, going North out of the city, hoping not to be persued.
Later on I read that someone drove up near there to fire shots at
employees as they arrived to work.
The 1990s saw the privatisation of the internet. Internic went from
the national science foundation to a private company. Very quickly the
world has adapted a center periphery topology with the USA at the
center, but a strong periphery along the Pacific, and North Atlantic.
China and India are already competing for clerical or manufacturing
jobs in the center.
By the time I visited Washington I thought the place a dump. I had
seen the Excorcist with its visions of incompetant but expensive
doctors and general pent up mayhem willing to be unleashed on the mind
of an innocent young girl whose mind had probably been emptied by TV
advertising. There seemed to be a large black underclass which might
get jobs sweeping up in the Pentagon or some other institution, if they
were lucky.
Dulles airport had seemed like a seedy run down bus station on the
outskirts of a Third World city. In the capitalist USA they don't go in
for lavish subsidies to airports and airlines as in most of the rest of
the world. Airports around the Pacific, in Europe, and in Saudi Arabia
had seemed much better than Dulles. In 2004 the business of subsidising
airlines has become less fashionable world wide,
Dulles was rare on potted plants and convenient restaurants on
simulated terraces. The disembarkation trucks seemed old fashioned. I
went on and ended up staying the first night in the Ramada Hotel. This
was the same hotel which had seen a Washington politician (Marion
Barry))busted for sniffing cocaine and cnsorting with prostitutes.
The hotel also had a bible in the drawer. Before I had seen the
Koran or perhaps the sayings of Buddha, in the hotel bedrooms. Welcome
to America, land of the Gideon Bible and the porn channel. It's no
surprise that Sayyid Qutb was willing to mock the place.
Before I arrived in America I only knew about it from books and
films. Philip K Dick had written about places run by a ferocious
military industrial complex willing to enslave more and more of the
people in a relentlessly stupid persuit of unattainable objectives.
American presidents came out of the laboratories of commercial
technology companies that could shrink time and space, and satisfy
certain levels of consumption. These commercial companies could distort
reality in many ways including the use of drugs, electronic stimulation
or ordinary use of the mass media.
Other influences were the French connection and Clint Eastwood. The
French Connection showed people living in poverty. Many drug dealing
films such as Superfly showed poverty. Noam Chomsky wrote about poverty
and deception.
Washington was a center of deception. Lobbying firms would try to
get the government to spend money on subsidised projects ranging from
anti-malaria pills to x-ray lasers and space based defense systems.
Running the world was part of this delusion. The technology is so
good that people think they can run wars from video consoles in the
leafy suburbs, but if the technology is really that effective they
could outsource and run the wars from cheaper countries.
Philip K Dick mentions running wars from video consoles, but the
idea has also appeared in one of Dennis Wheatly's books. A group of
satanists ensconced in Antarctica were trying to help Hitler and
Imperial Japan, and their efforts were effected with the help of
telepathic zombies sitting at consoles. The satanists were foiled
because a British man and his wife got washed up in Antarctica by
accident and they eventually got the upper hand over the satanists.
The idea of popular data encryption is to allow people to spend
money on the internet. Public-private keys may be created on demand.
The first round of RSA style encryption was widely available by the
early 1980s. The Rivest, Shamir, Adelmen (RSA) method was published in
the late 1970s. The basic concept was trapdoor functions. It is very
easy to fall through, but it is almost impossible to get out. You can
very easily write your secret in a code that no one can read, and then
put it in a public letter box to get read by the person who would
either benefit from that secret, or just use it to authorise a payment.
All of this may come as a surprise to those who regard the banking
system of America as one of the most inefficient in the world. Long
after the invention of the RSA system, interstate checking accounts
were very rare. Many banks were very local affairs with often
ramshackle financial foundations. Many banking functions were performed
by credit card companies and these handled most transactions.
American commerce has lead to the demands for secure communications
to validate payments. Key distribution had always been a disadvantage
in earlier forms of encryption but trapdoor functions got round this
difficulty. The first trapdoor functions were brought from the theory
of optimisation. This line of thought was quickly abandoned when the
RSA method came along. A newer version of encryption is the Elliptic
Curve method. Both of these methods give trapdoor functions which are
designated jumps in a box. You can jump somewhere, but you cannot get
back to where you were last move without a magic key. In practice the
RSA algorithm takes significantly more time for decryption than
encryption. Decrypting the RSA algorithm is equivalent to solving the
discrete logarithm problem (DLP).
Back in the 1970s people were working towards a Data Encryption
Standard (DES). This only gave 64-bit security and there was already a
debate as to whether this was good enough. The government had already
put encryption software on its list for export controls. IBM was
involved. Control of electronic communications was important and
costly.
Closely linked with the discrete logarithm problem (DLP) is the
science of pseudo random number generators. D.E. Knuth devotes half a
book to the topic in his treatise on computer programming [5]. Pseudo
random number generators are quite good for commercial encryption. In
spy stories the secret agent encodes the message as numbers, and then
adds a number from a code book to each digit before transmission. The
recipient, who has access to the code book is able to recover the
original message by subtraction. The length of the encryption key is as
long as the message. The code books can easily be replaced by pseudo
random number generators. These are alorithms which churn out streams
of random looking numbers. Most early random number generators used
modular multiplacation: X[n+1] = X[n] * A (#mod M) where A and M are
suitable chosen constants. The modern commercial application of the one
time pad is television. Before transmission of the program each scan
line is rotated by a certain amount, depending on the particular scan
line. The receiver is able to subtract the offset because the user
installed an electronic card which knows the decoding algorithm.
5 OCTOBER 2004
Modern cryptography texts always talk of Bob and Alice. This is
because the letters a and b can be used in algebra. RSA works like
this. Alice keeps a triple (E,D,N) identified with a suffice as
belonging to Alice. Bob has a similar triple (E,D,N).
Bob wants to send a secret message x to Alice, using Alice's public
key (EA,NA). Bob computes y = x^EA (mod NA). Alice decrypts by
computing y^DA (mod NA) and gets back x, the plaintext. Here the symbol
x^n (mod NA) stands for modular exponentiation; that is to say, x is
raised to the n-th power modulo NA, where NA is Alice's public key.
Normally these keys are hundreds of digits long. Computations are
tedious. Some of the technical details can be found as an internet
document called rfc2313.txt. This document contains references going
back to the 1970s. RSA laboriatories is somewhere in Massachussetts.
The people who write this sort of stuff are trying to fashion
industrial grade algorithms. Industrial grade prime numbers are not
really discussed in rfc2313, although implementation details are there.
The British claim that government sponsored researchers came up with
pretty much the same thing back in the 1970s but were prevented from
publishing by government secrecy regulations. The essential RSA
algorithm was fairly hard to patent at the time because lawyers
striking comical poses to try and patent large prime numbers could be
compared with previous idiotic attempts to patent incorrect values of
pi. Nowadays the patent lawyers are casting out for government and
legislature support in patenting algorithms including many with a
history of thousands of years. Software patents are likely hinder
opportunities of developing new technology to all except those in the
employ of large corporations [6].
The RSA algorithm was put in the public domain via the ACM and then
the Internet. It found its way to industrial standardisation documents.
The details were communicated via open channels. Rivest, Shamir and
Adelman got people to listen to them because of some sort of trust
factor. This trust factor was essentially verification that their work
was correct by use of well known theorems going back to Fermat in the
1600s. Legendre and Euler had also contributed to the mathematics.
6 OCTOBER 2004
The RSAC prizes.
name value digits digit sum
RSA-640 $20,000 N 193 806
RSA-704 $30,000 N 212 1009
RSA-768 $50,000 N 232 1018
RSA-896 $75,000 N 270 1222
RSA-1024 $100,000 N 309 1369
RSA-1536 $150,000 N 463 2153
RSA-2048 $200,000 N 617 2738
Total $625,000
Crandell writes that the Avagadro number, about 10^24 represents a
good count of all basic computer operations used in recorded human
history. Going on from this he suggests a brute force attempt to
factorize numbers larger than 10^50 is equivalent to replicating all
computer operations carried out up till 2000. He notes that the
Disney/Pixar animation, "Toy Story" was worth about 10^17 processor
operations which is comparable with the 10^18 processor operations
required for epic factorisations: numbers with around two hundred
digits.
8 OCTOBER 2004
Crandall and Pomerance must go back to the library.
Internet research shows that America's National Security Agency is
better guarded than the CIA. The NSA also has it's own turnoff from the
highway system, but there is a notice warning casual visitors not to
take that exit. The initials NSA apparently mean Never Say Anything,
and the NSA is supposed to be behind attempts to dumb down the Data
Encryption Standard (DES) from 128-bits to 56-effective bits. The idea
was that the NSA wanted the right to eavesdrop on commercial traffic
and break industrial security codes by trying out every possibility
with its banks of supercomputers.
The NSA machinary was too slow to pick up the student references to
the targets of the nine-eleven attacks. Apparently they had been more
busy preventing the export of encryption expertise from the USA rather
than checking out what people were saying in student slang.
How different from the British. We expect our intelligence experts
to finish off the Times crossword (composition now outsourced to South
Asia) in two minutes, and then go on and pick out plots from all parts
of the world which can threaten national security. Meanwhile the top
managers will be told what to dig up on the internet about Saddam's
weapons of mass destruction. What's more the British won't try and ban
the export of cryptography software, because they will not show the
existance of something by trying to ban it.
later ....
The view from my flat is different today. They have demolished the
last of the Claywood Towers. Previously I had looted furniture and
fittings from these derelict blocks. A breadknife, coffee table, wine
glasses and chinese bamboo blinds rewarded my efforts.
later ...
There is much to do. The main factorisation methods of today are the
Quadratic sieve, and the Number Field Sieve: I have yet implement
either of these. The programs seem tricky to write, and sieving time is
to be counted in months for 200 digit numbers.
REPUNITS OR NUMBERS MADE UP OF ONES
One of the most stupid looking methods of factorisation of the
number N involves finding the greatest common divisor (GCD) of N with
the numbers 1, 11, 111, 1111 etc. These numbers are called repunits and
eventually every odd prime number will divide these things. N itself
will be a factor of a repunit. If you could work out how to get the
product of 2^k repunits mod N in polynomial time in k then the
factorisation problem would be solved. You can certianly find the 2^k
th repunit mod N in time proportional to k. The problem arises in
accumulating the product.
9 OCTOBER 2004
Yesterday I visited lowtech media lab. Alex, a placement worker
seemed interested in the idea of continuously running a program in an
attempt to get money. I also looked at the weblogs and saw a couple of
enquiries from the Rochester Institute of Technology (RIT) concerning
the random squares method. During the night I started the matrix
generation phase for this method. The GF(2) linear algebra is the next
step. There are already a couple of 80x80 matrices for small 10-15
digit numbers.
RANDOM SQUARES METHOD
Fri Oct 15 10:15:44 BST 2004
This is simply stated. Given a number N to be factored collect a set
of quadratic residues mod N which are themselves the product of small
primes. Use the set of factorisations X[i]^2 = product q[i,j] to
generate an equation X^2 = Y^2 mod N, and then test gcd(X+Y,N) and
gcd(X-Y,N) for factors of N. The cases X=Y are called trivial
factorisations and they are not so interesting.
Sat Oct 16 09:34:39 BST 2004
Today I must return Silverman's Arithmetic of Elliptic Curves to the
library. It is a day late. The book has obviously been kept in the damp
in the past. It is on interlibrary loan from York University. York is
on a flood plain, and over the last few years there have been
spectacular floods in many neighbourhoods of York. Pundits say it may
be a sign of global warming, while others say the floods are related to
overstocking of animals on the Pennine watersheds.
I have constructed an implementation of the random squares method,
but the method of picking out smooth numbers is very slow. The program
would take about 100 days continuous running to factorise a 20 digit
number. 11 digit numbers can be done rather faster.
LINEAR PROGRAMMING AND MAXIMUM COST MIX
Mon Oct 18 11:48:33 BST 2004
During the 1970s I worked as a computer programmer in Sheffield. The
employer was a Sheffield steel group which leased space in a building
now functioning as the Bristol Hotel. The task was to write a system
which would run locally as a substitute for a networked system which
was already in use, back in 1975. A computer in America worked out how
much special ingredients had to be added to molten steel in order to
meet specification and minimise costs. The American computer was able
to access melt specifications via a link to instrumentation on the shop
floor. In two years working for the company I visited the melting shop
about twice. The company already had a working program available, but
this program was written for the wrong machines.
Most of the working day was spent at the office and during
lunchtimes there were drinking sessions at the pub over the road.
Sometimes I would vary the routine and slip off the the Anglican
Cathedral, to lounge about in the pews and pretend to be meditating.
More recently I have found solace in the Catholic Church on Norfolk
Row. It's possible to go in and buy candles, light them, and then
place them on holders in front of an altar, and meditate for people's
souls. It's like temples over the world: the Sacre Couer of Montmartre,
some Shiite mosques in Iraq and Iran, and Hindu and Buddhist temples
anywhere. The link between religion, candles, and ritual comes from
before the days of the electric light.
The least cost mix project was dubbed 'Maximum Cost Mix' by many in
the office. I tried to make some of the old programs work, but the
source code produced a six inch thick printout of assembly language,
and the first part of the user dialogue was the intriguing question
"file name". The programmer was a brilliant refugee who had fled
Communist cppression in Czechoslovakia, and had just recently fled the
steel group to work in public service for some marine dockyard
authority in the North East. The old programs were written for ICL 1900
computers, and designed to run via MAXIMOP, the then favoured network
operating system. MAXIMOP was an acronym for Maximum Online
Programming. The ICL compatible programs were written in PLAN. The
actual computing work was extremely tedious and boring. The theory was
rather more interesting.
Linear programming was developed in the 1940s as part one of the
scientific management techniques later oversold as the recipe for
winning the second world war. Personnel selection could allow the army
to allocate the best men to the most suitable posts. A German speaking
intellectual such as Henry Kissinger would get a job as chauffeur for
the top brass while businessmen would be assigned to jobs suitable for
personal enrichment in logistics and scientists might get jobs
designing and building mighty weapons. America certainly was building
the information infrastructure necessary for a Marxist-Leninist style
command economy.
Linear programming can be used to devise ways of keeping the right
proportion of amino acids in a scientific diet. You have different
ingredients with different costs and different ratios of amino acids.
Certain thresholds should be exceeded for a balanced diet, but it is
also possible that some quantities should not be exceeded. Pet cats and
dogs benefit from this science. Humans in the richer countries tend to
use irrational criteria or perhaps market forces just go for 'maximise
junk, maximise profit' as a model.
Suppose you have ingredients J, with cost c[j] for ingredient j.
Each ingredient contains A[i,j] percent of amino acid I. The constraint
values for each amino acid i are that the total amount b[i] must lie in
some specified range.
The problem can be written in matrix algebra terms.
Maximise (c.x) given Ax <= b
Here c.x is sum over J of c[j]x[j] while the expression Ax is the
matrix vector product whose row sums are sum { A[i,j]x[j] ; j in J}
Linear programming is tedious and tricky to put on a computer, although
it is possible to work from scratch and develop a system in a few days
now.
The the main difficulty behind linear programming is matrix
generation. The story of amino acids and balanced diet took several
decades to work out. It is just the same as estimating marginal costs
in business. Nevertheless, linear programming is still used in fields
such as portfolio selection or product mixing. There is a large
literature in the field. Integer linear programming with the
constraints that some variables are integer is still a recurrent
problem. The first trapdoor functions were envisaged as a solution to a
class of knapsack problems which are integer linear programs with a
single constraint: maximise c.x given a.x < b.
Exactly the same considerations apply to some factorisation
algorithms. Both Dixon's Random Squares algorithm, and the Continued
Fraction Algorithm use techniques of linear programming. Matrix
generation may take several years, while the set of equations is so
large that calculations are daunting for all but the most powerful
computers. The matrices involved have millions of rows and columns, and
team efforts usually end up with the linear programming phase done one
some monster with access controlled by advance booking.
How does linear programming get involved in factorisation methods ?
The answer goes back to Fermat and Gauss. Fermat determined that some
numbers could easily be factored by writing 0=x^2-y^2 mod N. If
different x and y which can be found which satisfy this equation then
x+y and x-y will be small multiples of factors of N. If N is odd for
example N=ab, then (a+b)^2-(a-b)^2 = 4ab = 4N, so x=a+b and y=a-b are
non trivial solutions of the congruence. Fermat knew this, and he
started testing values x^2-N to see if they are perfect squares. These
values could start at just after #sqrt N and a factorisation might be
quickly found for numbers with almost equal factors. Waiting to find a
perfect square in the series x*x mod N can be tedious, and usually far
slower than trial division. Gauss improved Fermat's method by
suggesting that a solution to x^2=y^2 mod N could be found in parts.
All the numbers x*x (mod N) could be factorised as they were generated
and then some of the equations could be multiplied together to get a
couple of numbers X and Y so that X^2=Y^2 mod N.
Instead of factorising all numbers of the form x*x mod N, or
quadrating residues as they are called, only those numbers made up of a
small number of factors are stored. These numbers are usually prime
numbers up to some limit B. This is called a factor base. Numbers made
up only of factors less than B are called B-smooth numbers.
Dixon's Random Squares algorithm requires the building of a table
with the following columns: x, a=x*x mod N, and the exponents of the
factors of a in the factor base. If some of the prime factors of a
occur in even powers, then these can be divided out from x and a. You
then get a list with items:-
M= 422941
b= 40 [ 12]
X X*X binary-vector
62767 874 o......oo...
247763 1547 ...o.oo.....
249270 1767 .o.....o..o.
421348 3 .o..........
6031 35 ..oo........
33730 1610 o.oo....o...
70881 22 o...o.......
74605 2465 ..o...o..o..
1 1 ............
412908 1131 .o...o...o..
74950 138 oo......o...
306144 195 .oo..o......
216434 19 .......o....
220946 15873 .o..oo.....o
195995 759 .o..o...o...
squares 307 matrix 15 12
Next the program has to seek relations between rows. The trivial
factor x=1 y=1 is found in row eight, so the program goes and tries
this four times by including most of the other rows and attempting to
perform an easy to program matrix inversion. The transcript of the full
factorisation looks like this:
nr= 15 nc= 12 dm= 0
dependency vector 0 0 0 0 0 0 0 0 1 0 0 0
select rows 8
Z= 422941 1 1 1
nr= 15 nc= 12 dm= 1
dependency vector 0 0 0 0 0 0 0 1 0 0 0 0
select rows 8
Z= 422941 1 1 1
nr= 15 nc= 12 dm= 2
dependency vector 0 0 0 0 0 0 1 0 0 0 0 0
select rows 8
Z= 422941 1 1 1
nr= 15 nc= 12 dm= 3
dependency vector 0 0 0 0 0 1 0 0 0 0 0 0
select rows 8
Z= 422941 1 1 1
nr= 15 nc= 12 dm= 4
At this stage the computer has only identified x^2=y^2 with x=y=1.
The matrix generation phase should filter out such trivial
factorisations. Next the program shuffles the rows using a normal
random shuffle algorithm. This time a relation is found almost
immediately.
nr= 15 nc= 12 dm= 0
dependency vector 0 1 1 0 0 1 1 0 0 0 0 0
select rows 1 2 5 6
Z= 422941 122110 4830 577
In fact the full factorisation of 422941 is 577 733. The whole
algorithm seems thousands of times slower than trial division, but the
experts say that it is better for larger numbers. The business of
randomising the rows of the matrix is something which could be avoided
by linear programming. The random squares method is so slow to find
good quadratic residues that it would be worth integrating the business
of matrix generation with the generation of dependencies as early as
possible.
After the random squares method comes the continued fraction method.
Here some ingeneous tricks allow for the generation of many quadratic
residues mod N with only half the digits of N. The ease of getting
quadratic residues comes at some cost; sometimes the continued fraction
sequence will start to repeat or more likely there will be many trivial
factorisations. There is every difference between a good and a bad
matrix. The website for nfsnet.org speaks of factorisation efforts on
some numbers being abandoned, or restarted after the matrix generation
phase failed to produce a useful set of relations between the rows
(that is, a set of rws whose sum is zero). All matrix operations are
done over the field GF(2), the field with two elements 0 and 1 and the
arithmetic defined modulo 2.
PAYROLL CONVERSION PROJECT
Mon Oct 18 19:47:09 BST 2004
Henry called. Henry sells the Big Issue in town. He gets hassled by
the local smack & crack dealers because his face is known. He is a
victim more than a criminal, but he has spent time in jail. He was in
jail during the 1991 Gulf War and he claims that the jailers called out
for recruits to join the struggle against Saddam Hussein. Despite his
good qualities Henry did not get early release that time.
Henry saw me behaving really stupidly. I had just installed CFRAC
and Random Squares on the laptop, but the programs did not work. During
most of the time Henry was there he saw me looking at the screen and
heard me complaining about compatibility problems between Linux and
Windows. After Henry left the house I examined the problem, and saw
that there are both 16 and 32-bit versions of the run time system for
the factorisation programs. Since a month ago I have just been running
the programs with the 16-bit version. The difference between 16 and 32
bits goes back to the 1950s. It's similar to the byte-sex problem which
can cause data to be lost, or misinterpreted. All these problems relate
to the representation of data in a computer, and its addressing.
16-bit computers store most numerical data as two bytes. These bytes
are contiguous in computer memory, and store the data at locations a
and a+1 where a is the location of an integer, or number. These numbers
go from 0 to 2^16-1 with a special convention that numbers greater than
2^15 are considered negative. 32 bit computers use 4 bytes to store
integers. The range is 0 to 2^32-1 with nearly half of these numbers
being negative. Textual data is just stored as arrays of bytes. Modern
computers tend to start storing data at locations whose addresses are
powers of two. These powers of two may go up to values as high as 64.
Address space in memory tends to stay below 2^32 but remember that a
66000 x 66000 matrix will be tricky to handle in this context.
Things go wrong when assumptions are made about the location of
particular bytes in arithmetic. A 16 bit number such as 258 is stored
over two bytes:
LO HI
.-------.
Intel | 2 | 1 | 258 = 2+2^8
.-------.
HI LO
.-------.
Motorala | 1 | 2 | 258 = 2^8+2
.-------.
Getting round this problem was one of the most messy parts of
developing the internet. The business was worse with 32 bit machines.
Intel kept its standard, so a 32-bit number would get stored something
like N = HI,LO,HI,LO where the most significant bits of an integer may
not lie close to its byte address.
Byte sex problems occurred during the time I was trying to convert a
payroll in Paris. The company was called SLIMOS, and the payroll
conversion started in an office belonging to Credit Lyonnais near La
Defonce in the western part of the city. During one morning going to
work my journey was briefly interrupted by a death by suicidal jumping,
or pushing from the crowded metro platform into the tracks of an
oncoming train. I was to squeamish to look at the mangled body. The
original payroll was written for a french machine, and the idea was to
convert it to run on a PDP-11 running a clone of the RSX operating
system. The salesmen had told the customer that the machine could run
programs written in COBOL. I was hired to move about 50 or 60 COBOL
programs over the hardware platforms and operating systems. The man who
hired me was a former French paratrooper who had obviously been in
Algeria at some time. France was full of ex-colonial people who had had
to come back from the countries of the former empire. Of course that
number included many Africans and Asians who had directly benefitted
from French colonialism and they came along to enjoy the benefits of
francophonism as long as possible. During the 1980s I thought the 13
arrondissment tower blocks full of Viet-namese very civilised places.
Monsieur Gerain, the former paratrooper, had written true bloatware.
The COBOL programs were so long and complicated that the PDP-16
compiler would often give up and vomit streams of spurious errors.
Unfortunately the screen coordinates were stored as pairs of bytes
because no-one thought a screen would have more than 256 x 256
characters. Indeed current reality is closer to 10 x 40 or perhaps 100
x 250 pixels on a mobile phone. The whole business of byte sex caused
the cursor positioning routine to fail because COBOL compiled
structures differently on different machines. In the end the project
was abandoned, and I lingered a short while in the office before
getting fired. The other programmers would sometimes call me
Shakespeare because of the way I spoke. Monsieur Gerain was a barbouze,
a man with a beard who had been in the French military in Algeria. The
company for which he worked was part of a network of companies set up
to serve the French military industrial complex. I remember once going
to the offices of another company in the same economic niche. The place
was in a suburb of Paris and on the way there I noticed many police and
military standing around for some V.I.P. convoy. France seemed very
militarised during the 1970s and 80s. The top officials seemed to
command fleets of cars just like Saudi princes. The top assets of the
state become personal fiefdoms of the rich and later infamous.
A payroll program should be simple to implement. Later on in
Thailand I saw students write payroll programs for laptop computers in
days rather than months. These students could program the computers in
BASIC. That was much easier than COBOL, and byte sex problems would
generally not happen unless you wanted to use image dumps or write
machine code video drivers.
A payroll program could be done with any decent table handling
language. The check on the growth of a payroll program needs to be
achieved by a sensible negociation with the client. In practice most
clients are treated as morons by professional software companies. They
add bits and pieces to a program if the client wants more, even though
the client may end up getting an unworkable product.
XEDNI CALCULUS
Mon Oct 18 22:04:58 BST 2004
Some Internet cryptography groups talk about XEDNI calculus. This is
a term used to describe certain attempts to solve the discrete
logarithm problem, but the general methods may be applicable to other
problems including the recognition of certain classes of numbers. XEDNI
is Index spelled backwards. The essential idea is to trade off time
against space. Use huge arrays for hash tables and save time. It's
quite likely that many computer algorithms can be speeded up a lot by
storing common results and looking them up from tables. Modular
inversion with respect to an integer is a good example of potential
speedups. Other approaches would include just in time evaluation of
algebraic expressions. The ordinary random squares algorithm is a case
in point. New values of X and X*X mod N can be entered in a hash table
so that repeated values are found quickly, and matrix algebra could be
made uneccessary.
CFRAC TEST
Tue Oct 19 03:14:02 BST 2004
H<-MK_PQ 30
G<-CFRAC"-v -b 2000 ", H
Construct the number N = 27446761927390629967746984179 as a product
of two primes, then attempt factorisation. The continued fraction
algorithm ran for twenty four hours before producing a 308 x 303
matrix. After about twenty minutes or so it found the following
factorisation, as a difference of squares, mod N. The first six
eliminations just gave the same trivial factorisations. Randomization
torned up a dependency with 42 rows. These rows gave two different
values x, y with x^2=y^2 mod N.
x=14393427734175655991914869795, y=24596980067594104366400046908
gives a factor GCD(N, x+y) = 13513279008157.
The x value comes from multiplying all values in the X column for
the selected rows, while the y value was computed by multiplying out
all non zero elements of the factor base from the selected rows and
then taking a square root, and finally reducing this mod N. The square
root extraction is totally uneccessary: it is possible to collect a
prime factorisation of y^2, and then write y as a product of prime
powers taken from the factor base.
The whole process of solving linear equations by subtracting rows is
called Gaussian elimination. Gauss made his way into the English
language as a unit of measurement for magnetic flux, and later as a
negative verb: to degauss. Gauss was a brilliant mathematician and a
hard working scientist in astronomy and physics. One of his most famous
discoveries was N=delta+delta+delta; any number is the sum of three
triangular numbers. As a young man Gauss proved the Law of Quadratic
Reciprocity, which concerns numbers which are squares modulo some other
number.
ATTACK BY ZOMBIES
Tue Oct 19 20:41:13 BST 2004
Today I was assaulted by a couple of women.
Arrived at Park Hill at about 18:00. Said hello to Jane Croft and
her son as I passed along the balcony to my flat. I commented on the
weather. As I turned the corner of the balcony I noticed two women
waiting in an alcove. One was blond, but unattractive. The other was
more well built with darker hair, but also unattractive. I went to my
door. The blond woman propositioned me for a 'good time'. I said I was
happy enough without her, unlocked the door, and pushed the bike into
the lobby. The blond woman tried to follow me and when I resisted she
said she was pregnent, and accused me of hitting her. We went onto the
balcony in a sort of scuffle, and she called her friend who by this
time had started to walk away down the balcony. The friend ran back
down the balcony and they assaulted me. The blond one started demanding
money. while they both tried to stop me going into my flat and closing
the door. I started shouting "Help ! Help !". No one was around. Then I
pretended to be dying, shouting "My heart, My heart !", -- but I also
continued calling for assistance. After that I was rolling on the floor
while they were trying to access my pockets. At some point the blond
woman shouted that I had stolen money from her, and kept demanding
money, so I gave her 50 pence.
While I was on the floor being assaulted an Asian man passed, but
did nothing. Eventually my shouting and struggling put the women off
continuing the assault. At one point they tried to push me into the
flat to continue the assault there, but that did not happen. They ran
off when it appeared that others were approaching the balcony.
At around 6:15 pm Jane Croft and her son arrived at the balcony.
Jane is head of the tenants association, and we have briefly spoken to
her at tenant meetings. She had been moving out furniture for her
mother who is leaving the estate. She and her son had heard my shouts
from the car park, and had come to see what was happening. After I
picked myself I found my money and mobile phone and everything else was
intact. After speaking with Jane I rang the police at 999 and after
saying it was police emergency and police service I wanted as opposed
to fire or medical emergency the police switchboard did a very
competant job in logging the incident. Shortly after two armed police
with flak jackets, and guns arrived, along with a big Alsation. During
this time I was phoned up by Henry who I had been expecting, and
mentioned the assault. During this time the police pointed to various
people walking in the distance, and asked if I recognised them as the
assailants. They also tried to get accurate descriptions of the
assailants: accurate decriptions which I could not give.
The first police to arrive had obviously been diverted from more
serious business. I mentioned the bite on my hand, and asked if
forensics could investigate the wound to get a DNA sample from the
assailants saliva, but one of the policemen told me that such tests
were too expensive for this sort of case. Eventually they told me to go
in and wait, so I invited Mark and Henry down and I played backgammon
with Henry until the next lot of police arrived. This time there were
two men and a woman. We discussed things standing in the kitchen while
Mark and Henry sat in the front room. One of the men wanted to downplay
the incident and told me that it was the lowest level assault, and not
really an arrestable offence. The fact that I was standing there, had
called for assistance on my own mobile phone, and able to walk and talk
showed that no harm had really been done. I had lost nothing. The fair
haired policeman had made some issue as to why I had given the 50 pence
to the women. In the end I said I had given it to insult them. He had
written something down in his notebook and asked me to sign it. Here I
sat down to read it, which he did not really want me to do, and then I
started arguing about one of the words: 'confused' to be replaced with
'surprised', and also I insisted that they logged the biting in the
incident. The dark haired policeman had been quite reassuring to me. He
told me that the bite would not give me AIDS when I had mentioned the
possibility. At the end stage the dark haired policeman let me look at
the incident report sent out from the control room, and this indeed did
mention blood.
The police left, and later on I saw Henry and Mark out of the door.
Henry spotted a 50 pence piece on the ground, so in the end I had lost
nothing at all.
The whole incident is a good example of the lack of respect for
people which afflicts large areas in contemporary Europe. The
lumpenproletariat are always around, and there are many networks of
ignorance and stupidity where people learn abusive behaviour from each
other. Of course the rot goes right to the very top.
DEMOLITION JOB
Fri Oct 22 11:40:12 BST 2004
This morning I tried putting the factorisation programs into cgi-bin
so that they can be run from a web page. The routine takes a seed
number and the number to be factored. Unfortunately the script works
incredibly slowly when run via the apache server and the loopback
address at 127.0.0.1. Small numbers are OK, but numbers that require
time seem to cause the whole screen to freeze, and the disk drive seems
to engage in a spectacular flurry of activity, almost as though trying
to self destruct. It seems that the Apache server itself may need some
extra things in the configuration file to give enough priority to the
factorisation. The alternative would be to have factorisation running
as java applets on the client machine. This is the best way really
since it uses time on the client machine. Another method is trial
division in a javascript. I am sure these things have already been
done.
In the meantime our great leaders are doing their own types of
demolition jobs in Iraq, and condoning similar operations in Gaza. The
spectre if 'International Terrorism' excites the flatterers and toadies
who try to control access to the leaders therevy ensuring that they
will hear only what they want to hear. Salam Pax, an Iraqi, describes a
visit to Washington. He met Americans who knew his own country as well
as Salam himself. These people gave him their time, although the
leaders very seldom spoke to mere experts low down in the
intelligence-business-government heirarchy. Iraq is a morass for the
high expectations of the neo-cons.
Software is similar. Many don't work government projects are given
the go ahead despite technical infeasability because expectations are
raised between the programmer's prototypes and the salesman's perfect
solution.
LAPTOP COMPUTER
Sun Oct 24 13:04:57 BST 2004
Saifon has just got a Toshiba Laptop to help her in her business. I
just received the first call. It's hard to explain some things over the
phone. The problem is the upper case numeric keys. It's here that the
UK keyboard differs most from the American. Dao put me onto another
person who was trying to help her. I told him that the American
keyboard was the most useful for electronic commerce and not to bother
setting the thing up with a UK keyboard. This whole document is being
typed with the use of American layout because most UK layouts put the
programming symbols on keys that may be completely omitted on UK
keyboards. The vertical bar (|) is particularly annoying. It is used as
a pipe symbol in MS-DOS commands, and the bar represents 'logical or'
in languages such as C. The 'tilda' (~) is also hard to find on UK
keyboards. The man had been talking about reinstalling the operating
system, but that sounds like massive overkill. The trouble is that many
consequences follow from keyboard selection, and certainly with many
flavours of linux the first question in the installation dialog is the
selection of keyboard. The safest choice is always the American layout.
It is hard to persuade people the truth of this when many have become
hostile to America because of its invasion of Iraq and its continued
support of the Zionists. People have to be taught that anti-American
feelings can be counter-productive. The Americans invented personal
computers and the internet, and most of these work best with the
American keyboard layout.
More difficulties are to come. Saifon has offered to give me money
to come and help her. There are many networking issues. I want to see
if I can get Erica Klarreich to help her. My mobile phone is weak on
battery and top up credit. Long range communication is difficult.
.... later
I have been running comparisions between CFRAC and the Lenstra's
Elliptic Curve method. The ECM just worked quite well on a 26 digit
number, but CFRAC would also have done it. The Pollard Rho method would
also generally work on these problems with an iteration count of
between 180000 and a million. The ECM would be excellent if you could
select the best elliptic curve at the beginning. The current ad hoc
method is to select a random point P=(x,y) and a random value A. The
elliptic curve y^2=x^3+Ax+B is chosen by computing B, the constant term
as B= (Y*Y)-X*A+X*X mod N, for X,Y corresponding to the start value P.
Next the program computes further values in the orbit of P, using the
elliptic curve chord tangent process. The values [m]P are chosen
carefully. The value m is a product of low prime numbers and prime
powers. If the order of the elliptic curve is a smooth number, then a
factorisation will be found in time dependent on the largest prime
factor of the order of the curve.
The method employed at the moment is naive. Modular inversions are
done for each addition. The algorithm is quite slow. The advantage is
that a lucky choice may be made first time. An algorithm such as CFRAC
is deterministic, and if it fails, there is no real fall back position:
it is necessary to use the more complicated quadratic sieve. At this
stage I see the project should be abandoned. None of the clever methods
can tackle thirty digit numbers at all, because the factor base becomes
too large, or else smooth numbers become vanishingly uncommon. It is
possible to program a blinking bar which shows the search for smooth
numbers. A long bar shows that a number is not smooth while a vanishing
bar shows a hit, or a smooth number. With large numbers and small
factor bases the bar seems to jump backwards and forwards, but it never
becomes very short. With larger factor bases the bar may well have
smaller lengths, but the movement becomes very sluggish, and for the
RSA numbers it takes minutes to do just a few operations.
A lucky choice in ECM is the best on offer at the moment. This
itself is a problem. Different people should try using different random
seeds or different starting points. Anyone who wanted a lot of users to
crack RSA would like to have some ways of dealing out personal starting
values which were all different. In particular an 'out-of-the-box'
installation might set everyone running the same script.
BENCHMARKS
Mon Oct 25 13:48:16 BST 2004
32-digit numbers.
H<-MK_PQ 32
G<-YEC_FACTOR"-b 20000 -v ", H | trivial factorisation
G<-YEC_FACTOR"-b 50000 -v ", H | trivial factorisation
G<-YEC_FACTOR"-b 14000 -v ", H | factor G found at p ~ 2837.
H 24997047150583041152436806085413
G 2537480016458323
F 9851130644753831
F*G 24997047150583041152436806085413
Last night I visited a friend at the other end of town. He has a
fast computer with windows ME. I tested out CFRAC on this computer
hoping to see if it could get to factor 32 bit numbers. The program
took an our to run. In the meantime my friend got Al-Jazeera on the TV
but because he wanted to listen to his music system I watched in
silence. I was reduced to watching the images of well dressed Arabs
enjoying a Ramadan party in some style with lavish tables and hosts of
cooks and servants. Underneath scrolled latest headlines in Arabic with
easily recognisable stories about the unrolling atrocities in Iraq and
the posturings of our great leaders Bush and Karzai. With multi channel
TV you can just as easily get bored in Arabic as English. In the panel
above the Ramadan partying went ahead at full steam. It's a bit like
Hello! and OK! invading the TV; so much like 1950s British TV.
While all this was Going on CFRAC produced its stuff.
M= 18040037218868180536114750512331
b= 2500 [ 367]
3953626515616998883657277575040 4479648267541 ......
7814935169012739718808651489397 3223013085115545 .oo...
14780204662769888368422512357774 20627908493 ......
..................
7808498673022541507208761507427 1838382837713985 .oo...
15054810785997722544892373305899 5710046366072406 oo....
870084151699329010770992499130 562107293438030 o.o...
squares 828495 matrix 372 367
H<-MK_PQ 32
G<-CFRAC"-v -b 2500 ", H
H 18040037218868180536114750512331
E<-H%F<-G YDX _1
E 661523668787282869
F 27270433500799
E*F 18040037218868180536114750512331
The program used a factor base of 2500 and collected 372 rows. The
correct factorisation emerged during the linear algebra. This is done
in two steps. First the program tries to find a relationship amongst
the first 367 rows, then deletes the first row and tries again until
there are less then 367 rows. If no factor is found all the rows are
shuffled, and the the first step is repeated. If no factor is found at
the stage it is possible to keep running the linear algebra step on the
CFRAC log file until a factor is found.
Epic factorisations involve collecting up to ten million rows and
then getting the linear algebra step to be run on supercomputers. The
advantages of all these linear algebra style methods is that the
progress of factorisation can be monitored, and an estimated completion
date can be worked out from the number of rows found in a given time.
My friend had MathCad on his computer. He wanted to show me some
stuff he had been doing on linear transformations of sextic equations
but after seeing pages and pages of algebraic formula rolled out in a
couple of minutes by MathCad he asked me to validate his method of
eliminating variables a,b,c,d,e,f,g,L,M,N,R,S,T by eye. He breaks
equations into two, adds brackets and gets MathCad to solve each part
in turn, but does not really explain how these things are meant to fit
together. I tried to explain that he was just collecting expressions
that defined algebraic varieties, and told him he should try and study
something about rings and ideals. Any way this friend has great
enthusiasm for what he is doing and it is hard to be discouraging. He
did manage to get a rather complicated method of solving the quartic
out of MathCad. I pointed out that I had seen much more concise methods
worked out by beginners on sci.math.
ACCESS CONTROL DATA BASE
Tue Oct 26 15:24:10 BST 2004
Saifon wants help with her business. She needs to know how to get
the computer to help her, rather than learning how to nurse the
computer. This is difficult because computers are often so sensitive
that it is impossible to get much out of them without going to
expensive courses: courses which cost either time, money, humiliation
or all of these. Learning from a friend can be the most huniliating.
You are made to seem really stupid unless you know how to 'crack' the
Microsoft registry and install stolen software or whatever. About 90%
of 'friends' are unlikely to have heard of the Free Software
Foundation. Parents who teach their children may still be effective.
The children are not necessarily so keen to work business style
programs. The children have a teacher who is interested in the subject,
or the pupil, or both.
Saifon may need an access control database. I keep one. When I get a
phone call I sometimes look for the originating number in the phone
book files, before answering. Most people who call at my flat are
somewhere on the computer. There are numbers for the police and the
housing authority there.
At the moment I am re-examining the Elliptic Curve Factorisation
Method (ECM). I have tried to put in the second stage where an
algorithm computes points [Q][M]P where P is a point on an elliptic
curve, M a highly composite number with factor base S and Q is a prime
number not in S. The number M is just the product of all suitable
primes and prime powers in S: namely the least common multiple of
numbers in S. Even though the algorithm is slow it does not seem to
slow down quite so much as CFRAC for large numbers. For the RSA
challenge numbers the factor base is in millions, and recognising
smooth numbers gets too tedious, but for ECM you just compute a single
large smooth number. In the factorisation business it's always easier
to construct large products, rather than break them up into factors.
SALON SAIFON
Thu Oct 28 03:42:03 BST 2004
CFRAC got installed on Saifon's Toshiba Sattelite Pro.
I tried to teach Saifon and eventually got her to edit an HTML file
with notepad and change the greeting, and then go back to the browser
and see the changes. Windows XP is rather good at this because IE5
brings the page source into notepad which was my first lesson. There
was an interesting group of people there. Saifon's secretary answered
the phone enough for me to get Saifon to run her fingers around the
keyboard. That itself is excessively complicated because the machine
was originally intended for Spain and the owner's handbook is in
Spanish. I spent the first part of the visit reading that, but with
great difficulty. Because a laptop has less keys it's always hunt and
find. There was an interesting group of people there. Monique, the new
girl from Korea seems to be travelling round the world. She visited
Thailand and Laos already. Sex may cost $70 for half an hour but
talking about Buddhism and Christianity may set the customer back even
more if the time runs and runs ... until such a limit that the customer
may end up even marrying the woman. Another girl was interested in the
'server' program which digs up all the images in the cache. The
internet had only been connected for a few days, but there was already
stuff from porn sites. The thai girl went and sat by the computer to
look when I got the refresh rate right.
I had forgotten to take the CD-ROM down on the train with me in the
morning, but Saifon's secretary showed me how to connect the computer
to the Internet. That woman is really smart. She also is already using
the machine for e-mail, but I guess that her telephone answering must
keep her to busy to be able to take time to teach Saifon.
In fact there are plenty of hardware guys amongst the customers.
It's quite likely that someone will even be able to put Linux on the
machine, although not necessarily well.
TETRIS GAME
The 'factor base' algorithms remind me of Tetris, the computer game
of the 1980s. Random shapes, usually quadrominos, fall down and the
player must manipulate these by rotations and translations to fill in
lines without leaving gaps. CFRAC and Quadratic sieve also fill in
lines, and when you get enough lines you win. The trouble is always the
excessive size of the factor base. The quadratic sieve looks tricky to
implement, because sieving takes place over very large numbers. The
formula x^2-n for x > [#sqrt N] will involve numbers with hundreds of
digits, and so they are not easily stored. Quadratic sieve tries to
speed up the recognition of smooth numbers. It is hard to see how to do
this with small block sizes of just a few hundred thousand.
CHINESE REMAINDER THEOREM
The 'Chinese Remainder Theorem' is important in the theory of rings
and ideals. As it happens any of the RSA numbers can be characterised
by just a few dozen remainders modulo P[j] for P[j] in a set of prime
numbers.
RSA_NUM
123018668453011775513049495838496272077285356959533479219732245215172640
05072636575187452021997864693899564749427740638459251925573263034537315
4
82685079170261221429134616704292143116022212404792747377940806653514195
9 7459856902143413
The above number can be represented modulo the following primes
2 1
3 1
5 3
7 5
....... intermediate terms omitted
547 401
557 338
563 368
569 7
571 570
Arithmetic can be done on numbers in the range 0 to product of
primes by treating long numbers as vectors and doing arithmetic modulo
P[j] for each component X[j].
SOCIAL CLEANSING
Mon Nov 1 14:59:24 GMT 2004
The middle classes in the UK vote for social cleansing. The estate
where I live is to be redesigned and all existing tenants have marching
orders. There is no longer a system of waiting lists with points
allocated to people based on need: the powers that be have embarked on
an expensive and complicated system called 'property shop'. The old
system could be brought back with agent programs called 'bid-bots' or
bidding robots, but that would be hard to put over to the idiots that
commission I.T. systems in this country. Some major pharmaceutical
companies are now moving their expensive clinical trial work to Asia.
They will get better statistics in India, home of Ramanujan.
Some of the tenants have lived here most of their lives. They are
involuntary immigrants to a future which seemed a nightmare back in the
1960s. Wholesale, surveillance, and police criminal record checks for
even the most menial jobs, but there is still crime and inequality. One
of the more sensible people at tenants meetings is a previous
construction worker who has been site foreman on several large
projects. He was able to explain the tree planted slopes nearby. Many
tenants want these trees removed to give a clear sight zone for
security cameras but the engineer explained that the trees serve as
soil stabilisation. Of course none of the architect plans seem to take
into account global warming and the possibility of increased frequency
of really heavy rain and mudlslides. Sheffield has many hillsides which
could start moving in real storms.
Returning from the tenant meeting I went up the stairs. The lifts
were unattractive because of the racist graffiti that had been daubed
there that morning. About four levels up I came across a marmalade
coloured cat that I had not seen before. The cat was too shy to be
approached, but it did not run away when I sat down on the corner above
and watched it.
The American elections are tomorrow. Saturation media coverage makes
them boring. Even Osama missed his chance. A definite endorsement of
Ralph Nader would at least have had its tragi-comic value. IMHO Ralph
Nader deserves full marks for perseverence. Noam Chomsky and Soros urge
people to vote for Kerry, but Kerry is from practically the same
machine as Bush. 'We Can Build You' [6] has much to say about the
American system.
For the last few days I have been running ECM (Elliptic Curve
Method) on test numbers with two prime factors. The numbers have only
24 or 25 digits but at least I can test out strategies. It seems better
to try several elliptic curves with small factor bases up to about 4000
or so, then 4000-40000 in phase two. Quite a lot of the numbers have
given factorisations very early on with a lucky choice of the curve.
ECM has a good recommendation: try and win $200000 but if you fail you
might win a million dollars by shedding light on the Birch
Swinnerton-Dyer conjecture !
Eureka! While writing this entry I tried a 30 digit number. It just
stopped shortly into phase two.
H<-MK_PQ 30
G<-YEC_FACTOR"-b 7500 -v -cnt 5 ", H
F<-H%E<-1dG
H 893661884227811292215883301721
F 99552159916210609
E 8976820643369
F*E 893661884227811292215883301721
REFERENCES
[1] Marcus du Sautoy
The Music of the Primes
Fourth Estate / Harper Collins
ISBN 1-84115-579-9
[2] RFC2313 RSA Laboratories
[3] jobzap.txt
[4] rsq.df Source code of CFRAC, Random Squares.
[5] D.E.Knuth The Art of Computer Programming. (c 1969 --)
multiple volumes. Addison Wesley
[6] See Frequently asked questions. Freedevelopers.net
[7] We Can Build You. Philip K. Dick.