rgc
This is generated from a plaintext file, rgc.doc.
This page is dedicated to all those who will never really be able to
work out the differences between pointers and a subset of the positive
integers. C was first developed at AT&T, and many new features and
programs were created at the University of California, Berkeley.
A compiler translates a human readable language to machine
instructions. Perhaps the first compiler was Ada, Countess of Lovelace.
She compiled the instructions for calculating Bernoulli numbers on
Babbage's analytical engine. These machine instructions comprised
addition, multiplication, subtraction, division and moving numbers
between registers. It's not so much different today, except that
compilers are programs and the human translators are called 'systems
analysts'.
Nowadays the process is that the systems analyst translates user
specifications into orders for hardware and software, often involving
bribery and back-hand payments, and also translates ill-considered wish
lists into action plans, which are then handed to programmers to
translate into machine-readable instructions with the assistance of
compilers and interpreters.
         Analyst        Programmer + compiler
Wishlist    ->    Action plan     ->     Computer instructions.
Sometimes the problems arising from ill considered specifications
will be treated in a systematic fashion by people working for the love
of it. Many useful image conversion programs and other software tools
arise this way.
There are cases when the writing of software can go badly wrong and
inadequate and erroneous systems or malicious code can proliferate onto
many machines. Viruses and delayed or cancelled projects are both
symptoms of the inherent contradictions in monopoly capitalism.
Media elites are often afraid of those with the skills and
dedication required to work on file and format conversions. American
lawyers tried to get criminal proceedings and confiscations against a
Norwegian schoolboy who wrote a program to convert the DVD format to
ordinary video.
A compiler translates a source language into machine code. The
machine code must be installed and run by another program, normally the
operating system. Nowadays compilers are used to write about 99.9
percent of the operating system.
An interpreter reads a source language, translates statements and
runs them on the fly. Interpreted languages usually run much slower
than compiled languages. In the past interpreted languages were much
more interactive than compiled languages. Nowadays some compilers come
with interpreted versions for development use.
Algol 60   Committee                         1960
CPL        Cambridge & London Universities   1963
BCPL       Martin Richards                   1967
B          Ken Thompson, Bell Labs           1970
C          Dennis Ritchie, Bell Labs         1972
C++        Bjarne Stroustrup, Bell Labs      1983
C#         Microsoft                         2000
Algol is an acronym for algorithmic language while CPL stands for
combined programming language. BCPL is CPL prefixed by Basic. C++ and
C# are self describing.
Algol 60 introduced Backus-Naur metalanguage for specifications.
This was a great advance because the full language definition only took
about five pages. The important thing is that the Backus-Naur
metalanguage makes extensive use of circular references. This
description was used extensively by Japanese PC manufacturers in the
early 1980s.
Most recent languages such as Perl, Java and JavaScript borrow
extensively from the syntax of C.
The first C program that most people learn is quite short.
#include <stdio.h>
int main()
{
    printf("Hello World\n");
    return 0;
}
C is an ASCII language; that is to say it is a written language with
95 possible printable characters. C has features in common with many
other languages.
* Comments can be distinguished from the program.
* Spaces are needed to separate words of the program.
* Strings enclosed in quotes
* Numbers are digit strings including optional decimal point
* Identifiers are alpha-numeric strings starting with a letter.
* Operators are symbols such as '+','-','*','/', etc.
* Brackets such as '(',')','[',']' come in pairs.
Comments are introduced by '/*' and terminated with '*/'. White
space consists of blanks, horizontal tab characters and newlines. C
uses the backslash character at the end of a line to continue it on
the next line, but this is seldom necessary outside macro definitions.
In fact identifiers or names may also include the underscore
character '_'. There are also widely observed conventions about names.
Names starting with underscore are often used in system libraries, and
names consisting exclusively of upper case letters are often used to
denote symbolic constants or macros defined in #include files. The
backslash character is special; its main use is the creation of
non-printable characters. Typical examples, with their ASCII codes in
hexadecimal, octal and decimal, are:-
\b   back space        0X08   010    8
\n   newline           0X0A   012   10
\r   carriage return   0X0D   015   13
\t   horizontal tab    0X09   011    9
Shortened Backus-Naur definition of C. This is an environmentally
friendly means of presenting definitions because circular definition
saves paper. The metacharacter '|' means 'or'. The string '::=' means
'is defined as'. A definition such as "B::= 0 | 1 | B 0 | B 1" means a
string consisting of the characters 0 and 1 with at least one
character.
Digits
digit ::= '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
          also written as 0-9.
empty ::=                  Empty string. No characters.
Letters:
A::= A-Z and a-z           Upper or lowercase letter
C::= A | '_'               Letter or underscore.
X::= any printable character
e::= 'E'|'e'               Exponent character
.::= '.'                   Decimal point
I::= C | I C | I digit     C-symbol
S::= empty | S X
Operators
f::= '+' | '-' | '*' | '/' | '%' | '<' | '>' | '&' | '~' | '?'
'^' | '!' | '|' | '=='| '<=' | '>=' | '!=' | '||' | '&&'
Separators. Like punctuation.
P::= '(' | ')' | '[' | ']' | ',' | ';' | '{' | '}'
T Text
T::= "S" Double quotes
N Numeric literal
UI::= 'X' Character constant
UI::= digit | UI digit Unsigned integer
int::= + UI | - UI Signed integer
R::= int . digit | R digit Decimal number
N::= int | R | R e int General type
Expression
E::= D | T | I | I (E-list) | (E) | + E | E + E | L=E
E-list::= empty | E | E-list , E
+::= f | I Function or C-symbol
-::= ++ | -- Prefix/Postfix operator
f::= Primitive function, or operator
D::= N | T Data
E::= I Identifier
E::= E.E | E->E Structure selection
E::= +E Monadic operator
E::= E+E Dyadic operator
E::= (E) Priority of evaluation
E::= E[E] Indexing.
E::= L=E Assignment
E::= L f= E L = L f E
E::= - L | L - Prefix / postfix operator
L::= I | I[E] L-Value
An identifier consists of a sequence of letters, digits, and the
underscore (_). The starting character must be either a letter or
underscore.
A numeric token is a sequence starting with a digit or sign and
including an optional decimal point and an exponent string consisting
of the letter 'E' or 'e' followed by a number which may have a sign (+ or
-). Because the minus sign, '-', is used as an arithmetic operator the
Text literals consist of any string between double quotes ("). If
the double quote is itself to be included in a string then it must be
escaped with backslash (\).
Statements consist of one or more expressions, separated by commas
and ended by a semi-colon (;). These expressions are evaluated in
sequence, from left to right, and the result of the statement is the
last value calculated. Within an expression there are rules for
priority of evaluation; assignments, for example, group from right to
left. A compound statement is a sequence of statements enclosed in
braces. A series of statements may be made into a function. Functions
must have an argument list consisting of zero or more identifiers
enclosed in parentheses. The first line of the function acts as a
template.
Functions are defined in source files. There is a way of sharing
variables between some source files but excluding them from others.
A function definition consists of two parts: the function header,
and the function body. Notice that <body> has itself become a keyword
in HTML.
Here stands for a null text string, that is to say a text
string with possibly no characters at all. The symbol 'I' stands for an
identifier as defined in the section on syntax.
C is a typed language. That means that the programmer must specify
in great detail every item of data to be manipulated by the program. If
the programmer did not want to do this, older C compilers would make
the assumption that the data is of integer, or number, type; the C99
standard removed this rule.
Primitive data types are normally chunks of from 1 to 16 bytes, and
these reflect the underlying computer hardware. To the programmer the
most frequent are 'char', 'unsigned char', 'int', 'unsigned int',
'long', 'short','float','double', 'long double' and 'long long'. These
data types are meant to be machine independent. The same C source code
should work irrespective of the byte-sex of the target machine. Logical
values, representing true or false may be represented as almost any of
these data types, with the convention that zero means false.
Many text books show drawings of memory and memory locations to
describe how things are stored in the computer.
Bits are numbered in a byte. There are two conventions.
.--.--.--.--.--.--.--.--.
| 0| 1| 2| 3| 4| 5| 6| 7|
.--.--.--.--.--.--.--.--.
.--.--.--.--.--.--.--.--.
| 7| 6| 5| 4| 3| 2| 1| 0|
.--.--.--.--.--.--.--.--.
Bytes are numbered within a sixteen bit word. When the word is used
for arithmetic there are two distinct conventions. The most significant
byte may come either before or after the least significant byte in the
computer's address space. These two conventions are both used. Intel
chips store the least significant byte first. Motorola chips generally
do the opposite.
Intel MC68000
.--.--. .--.--.
|LO|HI| |HI|LO|
.--.--. .--.--.
32-bit numbers may be stored in several formats. If a number N is
expressed as four bytes N= b0 + b1*256 + b2*256^2 + b3*256^3 then the
two most common storage forms are:
Intel         Motorola
.--.--.--.--. .--.--.--.--.
|b0|b1|b2|b3| |b3|b2|b1|b0|
.--.--.--.--. .--.--.--.--.
.--.--.--.--.
|b2|b3|b0|b1| Occasional alternative form.
.--.--.--.--.
These different arrangements are called 'byte-sex', or more commonly
'endianness'. For most applications it is not necessary to know the
byte sex, but there are some graphic file formats which are not 100%
convertible between different machines.
Example: Storage of the string "Hello\n" in 16-bit computer.
Address Data
.--.--. .--.
|01|00| |H | Letters are used instead of ascii values.
.--.--. .--.
| |01| |e |
.--.--. .--.
| |02| |l |
.--.--. .--.
| |03| |l |
.--.--. .--.
| |04| |o |
.--.--. .--.
| |05| |0A| Line feed.
.--.--. .--.
| |06| |00| Strings end with a zero byte.
.--.--. .--.
These drawings may be useful, but they don't always show that there
is never really enough memory for the cutting edge state of the art
programs that your competitors are running with no problems. Byte sex
comes from how individual bytes make up longer integers used in
arithmetic.
We also know that publicly funded projects are always running their
software on computers without enough memory because of budget cuts or
occasionally outright corruption.
The model for C and for any other programming language designed for
a Von-Neumann machine is straightforward. Address space is a subset of
the positive integers. Address space is conceptually divided into space
taken by program instructions and space taken for the storage of user
data. Instructions and data look the same, and share this address
space.
A pointer type represents the location of something in the
computer's memory. While the language syntax of using pointers is very
precise the results of using pointers are extremely problematic. There
is no effective way of finding errors in programs which misuse
pointers. The best thing to do is to write programs which are always
correct. C has many facilities to help people do this. Most compilers
will print warnings when pointers are incorrectly used and many will
terminate after a certain threshold of warnings is printed. Sometimes
excessive error checking is quite annoying, but compiler checking is
even worse in Pascal or Modula 2.
Data types are recognised by printf through the letter that follows
'%'. Most common at top of list.
'c'       A single character
'd'       A signed integer
's'       A NUL-terminated string.
'p'       A pointer. This is usually printed in hexadecimal, as with 'x'.
'f'       Fixed float. Use "%w.pf" for width w and p decimal places.
'e' 'E'   Floating point number with exponent
'g' 'G'   General. Use exponent if necessary.
'i'       A signed integer.
'o'       Octal. Integer printed in base 8 instead of base 10.
'D'       A signed long integer (old usage; "%ld" is the standard form)
'u'       An unsigned integer.
'U'       An unsigned long integer (old usage; "%lu" is the standard form)
'h'       Modifier for short, 'l' for long ints, or 'L' for long doubles.
'x' 'X'   Hexadecimal, integer in base 16 instead of base 10.
Storage classes are important. In C these are given the names
'auto', 'extern', 'register' and 'static'. Auto variables are those
which represent on-stack storage. Global variables, which are defined
outside any function and need no keyword, are those which can be
accessed by all routines in the program. By the time C was invented
large programs were always dying because one routine inadvertently
corrupted data used by another. C generally requires programmers to
name variables which may be used by other routines. A variable is
defined as normal in just one source file, and other routines which
require this variable must prefix its name and type with the keyword
'extern'. Almost all other programming languages have a mechanism for
using global variables on a 'need to know' basis just like a series of
terrorist cells only divulge minimum necessary knowledge to the
acolytes.
In C it is possible to define arrays of data elements. Both the
definition and use require the square brackets '[' and ']'. The length
of an array is the number of elements. C keeps things simple. Only one
dimension is allowed, although the elements of an array may themselves
be arrays, and subscripts start with zero.
Array declarations can be defined in Backus-Naur notation.
type ::= 'enum' | 'char' | 'int' | 'float' | 'double'
qualifier ::= | 'unsigned' | 'short' | 'long'
qualifier ::= qualifier qualifier
dimension ::= | [] | [Numeric-literal]
dimension ::= dimension dimension
tabulator ::= dimension | '*'
declaration ::= qualifier type name dimension;
The size of an array must be known at compile time. If it is not
known then the pair of square brackets may be used to indicate an
array.
Example:
char greeting[] = {'H','e','l','l','o','\0'};
This declares an array of six characters, with initial values. In
fact a shorter way of getting a similar definition is char*
greeting="Hello";
The use of arrays or tables is necessary in many programming
applications. The maths technique of selection without replacement is
perhaps one of the most common mass market applications. Many people
want algorithms to select lottery numbers or even generate pin numbers.
Full worked example.
A 'National Lottery' requires a user to select N different numbers
from 1 to M without replacement. The English National Lottery requires
six numbers from 49 to win the jackpot but the gambler must select
seven numbers when filling in the form. The program is to be called
select.c and it accepts two numbers on input, given on the command
line. The program makes an apparently random selection. The program may
seem long and complicated but there are many people who want to know
how such processes work. For a start the model of a jar with coloured
balls and random selection is common in applications such as quantum
theory, cryptography and monitoring the stock market. It is also an
application where there is no simple method of achieving the result
without using arrays and subscripting.
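select.c
/* select n numbers from {1,2,...m} based on time() */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define MAX_BALLS 1000
/* hold numbered balls in an urn or jar */
int urn[MAX_BALLS];
int main(int argc, char **argv)
{
    int n, m;
    int i, k;
    if (argc < 3) {
        printf("Usage %s number-to-select number-of-lottery-balls \n", argv[0]);
        return -1;
    }
    n = atoi(*++argv);
    m = atoi(*++argv);
    /* sanity check */
    if (n > MAX_BALLS || m > MAX_BALLS || n > m || n < 0 || m < 0) {
        /* brush off user with a cryptic error message */
        printf("argument error. Don't bet today\n");
        return -1;
    }
    printf("Selecting %d balls from %d without replacement\n", n, m);
    /* deterministic randomize */
    srand((unsigned)time(NULL));
    /* The listing breaks off after "for (i=0; i"; the rest is one
       possible completion, not the author's original code. */
    for (i=0; i<m; i++) urn[i] = i+1;    /* fill the urn with balls 1..m */
    for (i=0; i<n; i++) {
        k = rand() % (m-i);              /* pick one of the m-i balls left */
        printf("%d ", urn[k]);
        urn[k] = urn[m-i-1];             /* move the last ball into the gap */
    }
    printf("\n");
    return 0;
}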
The first set of operators can be found on any calculator. These are
the four operations of arithmetic, plus '+', minus '-', times '*', and
divide '/'. The next most common is the equals sign '=', for moving
things about registers. In C the operators consist of one or two
special symbols. Operators can be strung together with names rather
like the symbolic expressions taught in algebra classes at schools.
These are infix operators: that is they are used in a context such as
x+y where the operator is written between the data to which the
operator is applied. There is another important operator called
assignment written '='. The expression a=b; means store b into a.
The most common arithmetic operators work on characters, integers,
floating point values, and sometimes they work in mixed expressions
with pointers and numeric values. The value of the result depends on
the types of the operands.
+ add
- subtract
* multiply
/ divide
% remainder
When these operators are used in assignment statements such as "x =
x + a", then the assignment operator pair may be shortened to "x += a",
etc. The very first C compilers accepted "x =+ a" but this was only in
the 1970s.
Relational operators are similar to arithmetic operators. They are
written between a pair of data values and the result is a logical value
true, or false. The most important relational operators are
< less than
> greater than
<= less than or equal
>= greater than or equal
!= not equal
== equal
All of these work on characters, numbers and pointers.
C also has operators that work on the individual bits of bytes and
integers. These need to be used with care because there are many places
for the compiler writer to go wrong when dealing with mixed data types.
& bitwise and
| bitwise or
^ exclusive or
~ unary not. Invert each bit.
C has the powerful concept of letting some operators stand for
complex logical constructions. '&&', '||' and '?' represent control
structures. Truth values represented by zero and non-zero may be
manipulated by logical operators.
&& logical and
|| logical or
! logical not
The designers of C were very clever in their specifications. The
expression 'A && B' is true if both A and B are true, but if A is
false then B is not evaluated at all. Similarly in the condition 'A ||
B' the result is the logical or of A and B. If A is true then B is not
evaluated and the value 'true' is returned for the whole expression.
There are several control keywords. The most important are 'if',
'else', 'for', 'while', 'switch', 'return'. The symbols '{' and '}' are
used to mark blocks of statements. Every C program must also contain
the word 'main'. In fact many C programs may be written using only the
words 'int', 'main', 'for', 'printf' and single letter names and
operational symbols.
The 'if' statement has the syntax:
if (condition) action;
Here condition is an expression and action either a single statement or
a compound statement enclosed in braces '{' and '}'. The keyword 'if'
is just about the most common keyword used in C-programs. The condition
part of the expression is evaluated and if non zero the action part of
the statement is evaluated. It is also possible to specify an
alternative course if the condition is false: use the keyword 'else'.
if (condition) action1;
else action2;
A series of conditions and actions can be chained together with if
and else. Conditional statements may be embedded into the actions by
use of braces.
Example:
if (a==0) {if (b>0) c=b; else c=-b;}
else if (a==1) b="foo";
else b = NULL;
It is important to distinguish between the operator '=' for
assignment and '==' as a comparison operator. The effect of a line
such as:-
if (a=1) die();
is to set the value a=1 and use the value 1 or 'true' in the
conditional expression. This means that the function die() is _always_
invoked in this code fragment. This misuse of '=' is one of the most
common sources of error.
It is also possible to write simple conditionals with the '?' and
':' operators. The syntax is:-
condition ? action_1 : action_2;
Here condition, action_1 and action_2 are expressions.
Example: Print out a random maze.
int a[1817];main(z,p,q,r) {for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)
q=3&(r=time(0)+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79
:0:p>158?-79:0,q?!a[p+q*2]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)
printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}
For iteration the most important is the 'for' statement.
for (start; condition; iteration) {body;}
Here any of the three statements 'start', 'condition', and
'iteration' may be empty. 'body' may also be omitted. Since statements
may be expressions separated by commas it is easy to see that a single
for loop may involve substantial programming effort. To print numbers 1
to 10 use the loop
for (i=1; i <= 10; i++) printf("%d\n", i);
Example: 800 digits of Pi.
int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;
for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,
f[b]=d%--g,d/=g--,--b;d*=b);}
It is also possible to use a simple form of iteration with 'while'.
The syntax is:-
while (condition) body
Here body is a single statement, or a compound statement within
braces. There is also the 'do ... while' construction, which tests the
condition after the body instead of before it (C has no 'until' or
'repeat' keywords), but in practice it is not used so often.
The body of a loop is often a series of statements and there are
keywords to indicate whether all of these statements are performed on
each iteration. Use 'break' to force early exit from the loop and use
'continue' to skip a series of statements.
Example: find the position of an integer k in an array A of size N.
If the value is not present return N.
int A[N], i, k;
for (i=0; i<N; i++)
    if (A[i] == k) break;
return i;
Functions are logical divisions of the program. Normally functions
return a value. A function consists of a header line, then statements
between braces '{' and '}'. Parameter passing sometimes looks very
obscure. This is especially so with varargs functions. Right from the
beginning C contained some very powerful functions: 'fork' would clone
a running version of the same program.
Example:
add(a,b){return a+b;}
printf("2+5=%d\n", add(2,5));
2+5=7
The word int is supplied by the compiler for both the return value
and argument types.
The function header has evolved since C was invented. At the
beginning the function header assumed a return type of 'int', and the
parameter types were specified in a list after the closing parenthesis.
The function add would be written in full:
int add (a, b)
int a;
int b;
{
return a+b;
}
Later it was decided to allow type specifiers within the
parentheses.
int add(int a, int b)
{ return a+b;}
Within the body of a function it is possible to exit by using the
'return' statement. 'return' is normally followed by an expression,
which is the value used by the calling program. It is not always
necessary to use the return value, and sometimes it is undesirable to
do so.
Function definition tells the compiler the number and the types of
the parameters. The use of a function consists of writing its name
followed by an argument list enclosed in parentheses. The arguments
are separated by commas.
Example:
#include <stdio.h>
int add(int a, int b);   /* declare add() before it is used */
int main()
{
    int x = 2;
    int y = 3;
    printf ("x=%d y=%d x+y=%d\n", x, y, add(x, y));
    return 0;
}
int add(int a, int b) {return a+b;}
Here the function add() is defined in the program, and printf() is a
library function. Printf itself is written in C and different C
compilers provide slightly different versions of this function which
vary in functionality. The most important thing about printf is that it
has a variable number of arguments.
When a program is compiled it may not always be possible to tell if
the number of arguments used in a function call is the same as that
provided for in the function definition. One of the first ever programs
written in C did this sort of checking. It is called 'lint', with the
idea that lint is the sort of fluff that accumulates on bedding etc.
and is host to potentially lethal dust-mites, pathogenic fungi and
animal droppings. Lint scans C source files and reports possible misuse
of functions and pointers. Nowadays hardly anyone uses it. Lint often
produces voluminous output with many spurious warnings. Compilers have
also improved over time and asking for all warnings will give lots of
hints about potential causes of difficulty in the program.
C was designed to use certain conventions in parameter passing.
* single bytes are converted to 'int'.
* floating point values are converted to 'double'.
* arrays are passed by reference.
These concepts are always hard to learn for the first time. It
usually takes several weeks to really understand what is going on.
There are many widely used programming languages that have none of the
flexibility of C while other languages have far more elaborate schemes
for defining and calling functions.
In practical terms the conversion of 'float' to 'double' at
interfaces means that a programmer can always use 'double' for large
numbers and forget about the 'float' data type. Similarly routines
which take single characters as arguments can be coded to expect an
'int'.
In C pointers are references to memory locations, and therefore are
not meant to be the same as integers. In fact pointers share many
properties with unsigned integers.
* Zero is special. It is called NULL.
* Pointers are totally ordered.
* Some pointers have successors and predecessors.
* Incorrect operations on pointers crash the program.
A set S is totally ordered if for any two members a,b in S at least
one of the statements a>=b or a<=b is true. The NULL value of a pointer
can be used as 'false' in logical expressions.
The '&' operator gets the pointer corresponding to a data item,
while the '*' operator gets the value corresponding to a pointer. The
double use of '&' and '*' can be parsed because in the context of
pointer reference '*' and '&' are unary operators.
Almost everyone finds pointers hard to understand at first.
Sometimes a programmer wants to use a pointer which can serve any
data type. To do this use the keyword 'void'.
The address space for a program is modelled as a set of numeric
ranges representing positions of instructions and data in the computer.
Normally this is not contiguous: there is more than one range.
Instructions that reference sensitive parts of address space such as
input-output buffers or a memory mapped video are usually hidden in
library functions.
Structures are ways of building up more complex data structures from
the atomic classes and pointers. They are introduced by the keyword
'struct'. Unions represent areas of memory which may contain different
structures at different stages of processing. The length of every
structure may be determined by the 'sizeof' operator.
A class is a collection of definitions of data and functions.
A structure is introduced by the keyword 'struct'. The syntax is:-
struct name_of_structure {
type_1 member_1;
type_2 member_2;
.....
type_n member_n;
};
C programs assumed three channels: standard input, standard output
and standard error. These things are normally fairly complicated so
details may be hidden in <stdio.h> where some things which look like
functions are really macros. In particular getting single keystrokes
from the user is a nightmare because it may be necessary to time the
user out, or accept keyboard input when the program wants to do
something else. Getting mouse pointer and button signals is even worse.
Writing C programs with bulk input and output is not difficult.
'printf' serves for almost all output while 'getc' or 'fgets' serve
to provide for nearly all input possibilities. Highly structured data
such as the output of programs can be read with the 'scanf' function.
ioctls, termios and sockets provided all necessary tools for
building the internet.
UNIX introduced C and the idea of input-output redirection and pipes.
UNIX was made to be an interactive operating system where every user
could call for operating system resources. Most management was very
hostile to this idea at the time.
Simple sequences of programs can be entered from the command line.
Example:
vi hello.c
cc hello.c -o hello
chmod a+x hello
./hello
Management thought the data on the computer might get corrupted if
programmers actually got access to the computer. Remember that Ada
actually died before she got access to the machine. Worse still, her
name has become a language ADA mostly used in sinister military
applications.
UNIX was developed so that AT&T could print better manuals and also
to allow many people to access the same computer simultaneously. Three
levels of privacy were guaranteed right from the beginning.
This is reflected in the file system. The permissions hierarchy is
divided into owner, group, and all. The unwary user will find that most
problems are caused by getting the permissions wrong.
Most large systems such as UNIX, LINUX, X-Windows, have hundreds of
include files and more or less standard libraries with subtle and
insidious differences between different versions of operating systems
and compilers. Many of these differences are not documented and only
discovered when someone is trying to convert code from one machine to
another, and working to a very strict deadline.
A particularly shocking example is memcpy(dst, src, size) where
certain Microsoft Compilers caused a 64 kilobyte block move when a size
of zero is used.
Libraries are collections of functions in files which may be run by
several programs. Libraries need linking to these programs and the
links may be either static or dynamic. With the internet the libraries
may reside on a different machine to the program and their correctness
cannot be deduced until the program is actually run. There are big
problems with new programs written for old libraries and vice versa.
The edit, compile and run cycle has been described for a LINUX user
with access to a shell window. When more complicated programs are
developed it may be important to keep a record of how programs are
compiled and linked. This is done with Makefiles. MAKE is a program
that checks the sources of a given program and whether any of them are
more recent than the program itself. If this is the case then make
rebuilds the program by compiling the necessary routines. With large
systems, including almost all window managers the maintenance of
Makefiles became increasingly tedious and many new languages have been
invented to generate Makefiles. Another tendency has been the
replacement of Makefiles by Project files with INTEGRATED DEVELOPMENT
ENVIRONMENT (IDE).
The IDE requires a very thick book with hundreds of screen diagrams
to describe. It also eats up space and it is not essential for the
operation of the compiler.
Now you have read this far go out and buy Donald E Knuth's opus, The
Art of Computer Programming (Addison Wesley). Read at least three
volumes and send your resume to Bill Gates. Bill will be pleased to
hear from you.
Most computers don't come with a C-compiler. It's necessary to
install the software via the internet, or from a CD-Rom. Most
distributions allow for the installation of C, C++ and Objective-C.
It's possible to pay for a compiler, or get one for free. The free
versions are generally well tested. The compiler and necessary software
will usually take over 10Gb, and comprise hundreds of files. Generally
the C-compiler itself was used to write all of the software that gets
installed here. The main components are the compiler and assembler,
library manipulation utilities, include files and documentation. A
minimal working set will include packages such as gcc3, libc, kernel
headers and man-pages.
Here is a list of keywords sorted by frequency from C source for
the editor which was used to type this document.
if 1250|#include 102|long 23|#ifndef 5
int 975|switch 102|sizeof 21|ifndef 4
return 880|default 90|do 20|short 4
break 522|double 78|register 13|signal 4
case 521|char 61|union 11|#if 3
#define 496|#endif 60|continue 10|global 2
for 400|#ifdef 53|ifdef 10|void 0
else 334|unsigned 40|exit 9|static 0
extern 206|while 34|#elif 8|float 0
struct 145|#else 26|goto 6|enum 0
Part of this section is taken from Sunil Rao's work. The FAQ also
gives details of where to download free C compilers. The DJGPP compiler
for DOS/WINDOWS contains very good documentation including a complete
specification of all of the standard library functions.
[0] Sunil Rao Learn C, C++ FAQ
http://www.raos.demon.co.uk/acllc-c++/faq.html
Steve Summit http://www.eskimo.com/~scs/cclass/cclass.html
Ted Jensen http://pweb.netcom.com/~tjensen/ptr/cpoint.htm
Tom Torfs http://members.xoom.com/tomtorfs/cintro.html
[1] Kernighan & Ritchie The C Programming Language (c 1978)
http://cm.bell-labs.com/cm/cs/cbook/
[2] D.E.Knuth The Art of Computer Programming. (c 1969 --)
Addison Wesley
K N King "C Programming: A Modern Approach"
http://knking.com/books/c/
H M Deitel and P J Deitel "C - How to Program", 2nd Edition
http://www.deitel.com/products_and_services/publications/chtp2.htm