UNIX shell

This is generated from a plaintext file, zsgs.txt.

ZEN SUFI GUIDE TO THE SHELL

INTRODUCTION

If you plan on owning a computer with Linux installed, it is useful to know how to control programs via a command line interface, called the 'shell' on UNIX systems. If you don't like the world, change it! Knowledge of the shell allows you to make simple changes quickly. With a logged-in console it is possible to kill programs which appear to freeze the screen.

Using the shell gives a record of the things you have done. This is useful in remembering changes made to your files. Typing commands is often faster than navigating through many different screens. If the windowing system is not correctly tuned it may be impossible to configure the screens to do drag-and-drop copying. Changing file permissions for dozens of files may be another problem. A single command may save you from navigating dozens of screens.

Many system administration commands were designed for people who worked at terminals with no GUI. Many of these have been adapted to the different windowing systems (GNOME, KDE etc.) by means of wrappers. An icon on the desktop has a properties menu, and with this it is possible to navigate the different subwindows to a command line option. These commands are run by the shell interpreter.

Scripting tools are better suited for communicating ideas about improving the computer interface than drawings of screens and windows. A script is a one-dimensional representation of ideas which may manipulate data with multiple dimensions. Many thick computer books have a low information content because of the large number of screen diagrams.

Moving towards using command lines is like changing your habits to improve sustainability. By keeping track of the useful commands you have used, you can build your own resources for more efficient use of the computer.

INSTRUCTIONS FOR BEGINNERS

[1] Don't use complicated regular expressions in scripts. If the script does not work it is almost impossible to debug.

[2] Don't try to put any sort of interactive user interface in a shell script. A GUI is better. Scripts may be adjusted to have options and a usage message. The user should learn the interface by reading the documentation rather than by trying to navigate a user interface by filling in forms. There are methods for reading user input within scripts, but these methods have changed many times over the years to meet the demands of globalisation and foreign language processing.

[3] Don't try to use 'if' statements, 'for' statements or any other types of control statements in your script. Just write single commands on separate lines to be performed sequentially.

[4] Don't use the 'cd' command to visit every directory where you may want to look at a file. Use the command line shortcut of <TAB> for file name completion to build up long pathnames.

[5] Don't be afraid to save long output to temporary files. Although utilities such as 'more' and 'less' allow you to scroll through data, saving data to a temporary file is the first step towards recycling this data as a script. It is very easy to generate scripts to work on lists of files with a text editor such as 'vi'.

[6] Don't waste time. If you work out some complicated expression which does what you want, pipe the history list to a file and save that expression in a file with a name that you can remember. Build up your own resources as you go. Don't even listen to people who say scripting is easy.

[7] Don't write scripts which modify themselves.

[8] Don't write malicious scripts.

[9] Don't try to learn everything at once.

[10] Don't expect a script to work first time.
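Rule [5] can be tried out in a scratch directory. The following is only a sketch, assuming a POSIX shell with 'mktemp' available; the file names (a.txt, list.tmp, protect.sh) are invented for illustration, and 'sed' stands in for the text editor.

```shell
# Work in a scratch directory so that nothing real is touched.
work=$(mktemp -d)
touch "$work/a.txt" "$work/b.txt" "$work/c.txt"

# Save the long output to a temporary file instead of scrolling through it.
find "$work" -name '*.txt' | sort > "$work/list.tmp"

# Turn every line of the list into a command: one chmod per file.
sed 's/.*/chmod a-w "&"/' "$work/list.tmp" > "$work/protect.sh"

# Read the generated script before trusting it, then run it.
cat "$work/protect.sh"
sh "$work/protect.sh"
```

The generated script contains only single commands on separate lines, which is all that rule [3] asks for.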

USEFUL SHELL FUNCTIONS

alias                Enquire or set alias
awk                  Deal with tabular data
cat file*            Concatenate files
cd dir               Change working directory
chmod opts file*     Change attributes and permissions
cp src* dst          Copy source files to destination
diff                 Difference between files
display file*        Display images
file file*           List possible type/origin of files
find node types      Find files from node matching types
ftp host             File transfer protocol
gcc                  GNU C Compiler (system build tool)
grep string file*    Find lines containing string in files
gzip file*           Compress files
history              Command history
host str             DNS lookup utility
kill N               End task number N
less file            Look at file
ls {dir..}           List files in directories
ln {options}         Link files
lp file*             Print files
make                 Update series of dependent files
man topic            Online help on topic
mdir                 Floppy disk directory
mcopy                Floppy disk copy
more file            Scroll through file or stdin
mount                Mount filesystems
od                   Look at non-ASCII file
ping host            Test connection with host
ps                   Process status
pwd                  Print working directory
rxvt                 Open a console on X windows
sed                  Batch stream editor
strings file         Look at ASCII strings in a file
su {name}            Switch user to root, or given name
tar opts dst file*   Bundle or unbundle files
telnet host          Interactive connection to another computer
touch file*          Change timestamps of files
vi file*             Edit files. ':n' in vi gives next file
wc file*             Word count files
wget options         Download web pages or ftp server files
xset options         X-windows behaviour
xterm                Set up X-terminal

THEORETICAL AND HISTORICAL DEVELOPMENT

Unix programs have a standard input and a standard output. There is also a standard error output. Programs work on symbol strings. The set of symbol strings S and the set of all possible programs P are both semigroups.

A program can be represented as a partial function f:S->S. A partial function from S is a function which, given an input a in S, will either give an output f(a) in S or else some sort of result to indicate that a correct output cannot be defined. On push button calculators, functions such as reciprocal (1/x) or square root are good examples of partial functions.

A semigroup S has a law of composition: given any two elements a,b there is a product ab. The law of composition is associative: (ab)c = a(bc). A semigroup has an identity if there exists an element e such that ea = ae = a for all a in S.

The set of symbol strings S may be made into a semigroup by the operation of concatenation. If a and b are strings then ab is the string consisting of the symbols of a followed by the symbols of b. The empty string is the identity element. A semigroup may have subsets which are also semigroups. Such a subset may be defined by a set of generators, so that every element of the subset is a composition of members of the generator set. A 'regular expression' is meant to be a concise notation for a (possibly infinite) list of generators. 'a.*' could represent all strings starting with a. Regular expressions are often used as 'filters', hopefully to separate sense from nonsense.

Note: in UNIX shell language these laws of composition are realised with the functions 'cat' and 'touch'. 'cat a b > c' creates a new file c which is the symbols of a followed by those of b. 'touch e' will either create a new zero length file, or update the timestamp of an existing file.

Partial functions from a domain S to a range S can be made into a semigroup by mapping composition. If f:S->S and g:S->S are two functions, fg(a)=f(g(a)). First apply g, then apply f. There is an identity function i:S->S such that i(x)=x for all x in S. Do nothing!

The fact that both programs and strings form semigroups is no surprise. A Von Neumann machine is a machine where both programs and data are stored in different parts of the same memory. The UNIX operating system was developed by people who knew all about semigroups and sets of functions.

Most computer related jobs are tedious, boring, repetitive, and all too often stressful. Although many of the inventors of UNIX went to 'Ivy League' colleges such as those shown in the biopic 'A Beautiful Mind', their objectives were the same as those of the pioneers who tried to make the Industrial Revolution less tedious by inventing devices such as the 'Scoggin' or the 'Cotton Gin'. Von Neumann machines were OK, but programming was incredibly tedious and it often involved setting up switches to represent a series of binary numbers. With UNIX you just use a text editor to create a string of symbols, with spaces and newlines to improve readability, and then you can convert that string to a program which really enhances the functionality of the computer. The text editor is usually designed for people who can't type particularly fast. Long words are abbreviated to one or two letters. UNIX was produced with no design input from accountants, lawyers or salesmen. In UNIX symbol strings are stored as files, with filenames.

The formal description of programs and strings as semigroups may appear tedious, but this formalism has a long history. When Babbage and Ada Lovelace set about the first theoretical computations in the first half of the 1800s they were thinking about evaluating certain values of the mysterious Zeta function in order to impress potential commercial backers. The Bernoulli numbers are, up to rational factors, the values Zeta[2*n] divided by even powers of pi. Of course everyone knows that Babbage never got enough commercial backing to finish the project.

Babbage and later Boole were interested in the formalism of function composition. By the early 1900s Bertrand Russell and others were working on systems whereby all mathematical proofs could be written as symbol strings. By the 1930s Kurt Godel and Alan Turing produced important theoretical proofs on the limitations of these formal systems. Godel proved that a consistent system of this kind can never be made complete, however many new axioms are added; Turing proved that there were uncomputable functions. The people who designed UNIX knew most of this stuff even if the salesmen had never heard of it.

The UNIX shell is designed for people who can't type, and it is also best to have an American style keyboard layout. If you want to combine programs in the form of function composition, with a program f taking the output of a program g, then you need to type 'g|f' rather than just 'f g'. The vertical bar character is essential. If you want to run two programs f and g in succession you just type 'f ; g'. The names of the programs are separated by a semicolon (;).

UNIX was invented at AT&T's Bell Research Laboratories in New Jersey, quite close to Princeton University. Most of the people involved wanted to use the computers in interesting theoretical projects such as enumerating points on elliptic curves or computing the zeros of the Zeta function. Industry and public utilities provided other interesting problems such as the optimisation of transport and telephone networks, the scheduling of meetings, or the smart management of a planned economy. The brutal management of planned economies was already well established.

UNIX was designed to be fault tolerant. Logical paradoxes or poor design features could be tolerated through use of the standard error output. In the age of windowing systems you can always see the spurious errors turned up by media players or web browsers by running them from a command line. Writing programs is a sort of 'hands on' experience of mathematical logic.
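The semigroup laws and the pipe notation described in this section can be checked at the prompt. What follows is a sketch only, assuming a POSIX shell with 'mktemp' and 'cmp'; the one-letter file names follow the text, and the two filters chosen to play g and f ('tr' and 'sed') are arbitrary choices made for illustration.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

printf 'a' > a
printf 'b' > b
printf 'c' > c
touch e                 # the zero length file plays the part of the identity

# Associativity: (ab)c and a(bc) are the same string.
cat a b > ab
cat ab c > left
cat b c > bc
cat a bc > right
cmp left right && echo "associative: $(cat left)"

# Identity: ea = ae = a.
cat e a > ea
cat a e > ae
cmp ea a && cmp ae a && echo "identity holds"

# Composition f(g(x)) is typed as g | f: first apply g, then apply f.
# Here g turns letters into capitals and f appends an exclamation mark.
composed=$(printf 'hello\n' | tr '[:lower:]' '[:upper:]' | sed 's/$/!/')
echo "$composed"
```

'cmp' prints nothing when two files are identical, so silence followed by the echoed message means that the law held for these particular strings.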

SYMBOL STRINGS AND LEXICAL ANALYSIS

The UNIX shell allows people to add value to the product by turning symbol strings into programs. Just write the program into a file 'x' and then type a command such as 'chmod a+rx x'. Invoke the program by just typing 'x', or if that does not work try typing './x'. Naturally not every symbol string will result in a working program, but the UNIX shell will still try to run any file as a program if requested. Just try running an empty file created by 'touch' and made runnable by 'chmod'. No problem.

Many of the early functions in the UNIX shell were oriented towards creating programs and instruction manuals. Explanations were provided in an online manual, invoked with the 'man' command. One of the first tasks of the UNIX development team was the writing of the UNIX operating system in a high level language, C.

The lines of a program, or commands typed by the user, are subject to lexical analysis. Characters in the symbol set are divided into classes such as alphabetic characters, digits, spaces and tabs, non-printable control codes and characters with a special meaning. In the UNIX shell the following characters have a special meaning:

' '   Blank. Separates words (functions and their parameters).
'>'   Redirect standard output to a file.
'<'   Take input from a file.
'$'   Substitute from variable name.
';'   Sequence programs.
'('   Start subshell.
')'   End subshell.
'&'   Run preceding command in background.
','   General sequencing operation (inside arithmetic expressions and brace lists).
'*'   Wildcard character.
'#'   Take following text as a comment.
'?'   Wildcard for a single character. It also turns up in URLs, which is why they need quoting.
'\'   Escape character.
'/'   Filename separator.

Other special characters include the three types of quote symbol ('"`). The meaning of a character often depends on its position in a string. For example 2>&1 redirects standard error to the standard output.

Exercise: Modify the attributes of an HTML file and then invoke it as a shell script. Are there any precautions you should take before doing this?
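The steps described in this section can be sketched as follows, assuming a POSIX shell and a scratch directory on a filesystem that permits execution. The file names 'x' and 'empty' and the deliberately missing file 'no-such-file' are only illustrations.

```shell
work=$(mktemp -d)

# Write a one line program into a file called x and give it the 'x' attribute.
printf 'echo hello from x\n' > "$work/x"
chmod a+rx "$work/x"
out=$("$work/x")
echo "$out"

# An empty file made runnable also runs, and reports success.
touch "$work/empty"
chmod a+rx "$work/empty"
"$work/empty"
empty_status=$?

# 2>&1 sends standard error to wherever standard output is going,
# so the complaint from ls can be captured like ordinary output.
err=$(ls "$work/no-such-file" 2>&1) || true
echo "captured: $err"
```

Neither file starts with a '#!' line, so it is the shell itself that decides to read them as scripts.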

SYNTAX: FUNCTION AND PARAMETERS

The syntax of a UNIX command reflects the function and parameter notation of mathematics. The first word of a command line is either a program or a built-in function. The second and subsequent words are parameters, which may be options or file names.

Example: ls path-name

ls lists the files on the specified path. If the pathname is omitted, it lists the files in the current directory. There are various options, such as -1 for a simple listing with one name per line or -l for a long and verbose listing. 'ls $HOME' will give the files in your home directory.

Options to a program often control the amount of output. If you want to use the output of one program as input to another it is best to use a minimal output option.
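The advice about minimal output can be illustrated with a small sketch, assuming a POSIX shell; the three file names are invented. The extra line that 'ls -l' prints is the usual behaviour of common ls implementations rather than a guarantee.

```shell
work=$(mktemp -d)
touch "$work/one" "$work/two" "$work/three"

# Function, option, parameter: ls is the function, -1 the option, the path the parameter.
ls -1 "$work"

# The minimal -1 form gives one name per line, which wc -l can count directly.
count=$(ls -1 "$work" | wc -l | tr -d ' ')

# The verbose -l form usually adds a summary line and extra columns,
# so its line count is not a reliable count of files.
long_count=$(ls -l "$work" | wc -l | tr -d ' ')

echo "files: $count, lines of long listing: $long_count"
```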

FILE GLOBBING

In UNIX 'globbing' means the expansion of file or path names containing special characters. The most important of these is '*' which stands for any possible string (within limitations). This feature is most often useful when files have systematic names. Some sort of 'globbing' was a feature of operating systems which preceded UNIX.

One of the first stages of command line analysis is to scan a line for words containing the '*' symbol, and to replace each of these words with a blank separated list of file names. You can try typing the line 'a=*; echo $a' to see the effect. The assignment stores only the single character '*'; the list of file names appears when the unquoted $a is used by echo. What you see is not necessarily what you get!
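The 'a=*' experiment can be repeated safely in a scratch directory. This is a sketch assuming a POSIX shell; the three file names are made up so that the expansion is predictable.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
touch img1.jpg img2.jpg notes.txt

a=*                      # no expansion here: a holds the single character *
unquoted=$(echo $a)      # the expansion happens when the unquoted variable is used
quoted=$(echo "$a")      # double quotes suppress globbing
jpgs=$(echo *.jpg)       # systematic names make the pattern useful

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "jpgs:     $jpgs"
```

Quoting the variable is the usual way to make sure that what you see is what you get.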

FREQUENTLY ASKED QUESTIONS

Q. ascii integer conversion in shell
A. This is done by 'soft coercion'. A string remains a string until the context of the expression indicates that an arithmetic operation is required. To get this context use square brackets. If 'soft coercion' reminds you of stories about Guantanamo Bay or whatever, then remember [...] as the prison boundary and that most things inside get 'soft coerced'. To get the real value do not forget the dollar sign. In bash $[2+3] gives 5. The newer and portable spelling of the same thing is $((2+3)).

Q. how to multiply two numbers in shell scripting and get a float data type?
A. Don't waste time.

Q. how to divide 2 float value in shell script in unix?
A. Don't waste time.

Q. how to use decimal fraction in shell script?
A. Don't waste time. Of course you can use a shell script to pass strings between numeric calculator engines. UNIX started off with engines such as 'awk', 'dc' or desk calculator, and 'bc' which was a more friendly version of dc. Nowadays awk is the most common calculation engine used in scripts.

Q. sed join two lines?
A. In vi, use ':j' as the command line. In sed itself, the 'N' command appends the next line to the current one, and 's/\n/ /' then replaces the newline between them with a space.

Q. sed [ ^J] tab?
A. ^I is the tab character.

Q. Shell script numerical comma?
A. Programmers should add/remove commas as required using their own functions wherever necessary.

Q. shell script read next line?
A. Use the built-in command 'read'. 'read x' would read the next line into the string variable 'x'.

Q. unix command line passing parameters?
Q. shell scripting input parameters variables?
A. Input parameters are generally called 0,1,2,3 etc. To get the value of a variable put the $ in front of its name. $0 is normally the name of the script or program, as invoked. It is possible for the same script to test for different values of $0. Use 'ln -s' to create soft links which correspond to the different names, or 'cp' to make copies under those names.

Q. unix arrows A B C D?
A. UNIX terminal emulators use ANSI escape sequences to encode cursor movement on a video screen. These take the form ESC [ {number} {A|B|C|D} where the number indicates the steps to move, and A=North (Up), B=South (Down), C=East (Right), D=West (Left).

Q. UNIX sed global subs?
A. Try s/str1/str2/g

Q. unix subscripting?
A. Use square brackets: name[index]=value. Note that square brackets are also used for 'soft coercions'. This is an example of 'overloading' where a word or symbol has many different meanings depending on the context. It is easy to make mistakes when a scripting language makes great (promiscuous?) use of this feature.

Q. Why are manual pages so verbose and opaque?
A. Linux and UNIX have grown over the years. Many new options have been added to standard UNIX functions. The different functions are written by many different people who want to see UNIX/Linux as a system for the third millennium. The manual pages for the 'bash' shell comprise over four thousand lines of text. These describe shell invocation, command substitution and built-in commands. To understand the commonly seen scripts which use 'awk' and 'sed' it is necessary to read the man pages for these two programs.

Q. I don't have much time. Why should I spend so much time learning about so many different shell commands?
A. The 'shell' was designed to speed up the user's interaction with the computer. This is especially true for people who can't type very fast. A pipeline is a series of programs connected by vertical bars: a|b|c... . A pipeline invokes the first program and passes the output to the second program and so on. Pipelining makes it easy to combine several functions to fulfil a complex data processing task. It is easy to save and run canned scripts, so that it is usually unnecessary to type repeated commands. Just write the commands in a file and give the file the 'x' attribute. You can then recall all of these commands by typing the file name.

Q. What is the PATH variable?
A. When someone types a command the shell interpreter usually performs a dictionary lookup on the first word of the command. This is done in several stages: first the word is checked against the table of reserved words defined as part of the shell language. There are only a couple of dozen of these. If no match is found, the interpreter attempts to match the word with the name of a program which the user is allowed to run. This involves a search of specified directories where a user may expect to find programs or scripts. As systems grow, so the places where programs are to be found become more diverse. To see the directories scanned type the command 'echo $PATH'.

Q. Why is the command interface called the Shell?
A. I don't really know why, but for some reason the program which runs from machine startup is often called the kernel. In the distant past some salesman or promoter must have drawn a series of concentric circles to describe layers of software complexity. The drawing could represent a nut. Dante's Inferno also comes to mind, with concentric layers of Hell. Nowadays the situation has changed and windows usually obscures this layer completely.

Q. Why are there different shells such as sh, bash, csh, and ksh?
A. Bourne wrote an early and influential version of the UNIX shell at AT&T, known as the Bourne shell. 'bash' is the Bourne-Again Shell, the GNU rewrite of it. 'csh' is the C shell which was common on Sun workstations. 'ksh' is the Korn shell. 'sh' is meant to be a common reliable version which can be used in startup scripts.

Q. What is the Linux Console?
A. When the windows freeze over, a Linux user should be able to press Control-Alt-Fn and get a login prompt on an ASCII screen. It is good practice when embarking on some operations to make sure that a logged-in console is available so that you can stop runaway programs with 'kill'. It is also possible to search the process list with the 'ps' command and look for unresponsive programs, or undead programs called 'zombies'. You can then try the 'kill' command with an appropriate process number. You will find that some programs are very hard to kill and you may need to send a whole lot of signals to a process before it finally quits.
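Several of the answers above can be tried directly. The lines below are a sketch assuming a POSIX shell with 'awk' and 'sed' on the PATH; the variable names and the sample numbers are invented for illustration.

```shell
# Integer arithmetic: $(( )) is the portable spelling of the older $[ ].
sum=$((2 + 3))

# The shell has no floating point, so the numbers are handed to awk
# as a calculation engine.
product=$(awk 'BEGIN { printf "%.2f", 1.5 * 2.5 }')
quotient=$(awk -v a=7 -v b=2 'BEGIN { printf "%.2f", a / b }')

# read takes the next line of standard input into a variable.
first=$(printf 'line one\nline two\n' | { read -r x; echo "$x"; })

# sed: N appends the next line, then the newline is replaced by a space.
joined=$(printf 'alpha\nbeta\n' | sed 'N;s/\n/ /')

# sed: the g flag substitutes every match on the line, not just the first.
subbed=$(echo 'a-b-c' | sed 's/-/+/g')

echo "$sum $product $quotient"
echo "$first"
echo "$joined"
echo "$subbed"
```

Passing values to awk with -v keeps the shell quoting simple, because the awk program itself can stay inside single quotes.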

HOW TO FIND OUT MORE

Look at the manual pages for some of the commands. Study the manual pages for 'bash'.

EXAMPLE. DISPLAY IMAGES IN DIRECTORIES

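#!/bin/bash
# Collect every JPEG under /home/user, then show them as a slide show.
# $a is left unquoted on purpose so that each file name becomes a separate word;
# this breaks on file names which contain spaces.
a=$(find /home/user -name '*.jpg')
display -delay 100 $a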

EXAMPLE. LOTTERY SELECTION SCRIPT

#!/bin/bash
#
# Use of random number generator
# Pick $1 numbers from 1 .. $2
# Default to UK National lottery selection
#
function shuffle() # randomly permute elements of a list
{
    # Prefix each element with a random key, sort on the key, then strip the key.
    awk 'BEGIN {RS=" "; srand()} {printf "%8.6f%s\n", rand(), $0}' | \
    sort -n | \
    awk '{print substr($0, 9)}'
}

#defaults
URL="http://www.d4maths.co.uk"
n=6
m=49

if [ $# = 0 ]; then
    echo Lottery selection by $URL
elif [ $# = 1 ]; then
    echo usage: $0 n m
    echo pick n numbers from m choices at random without replacement
    exit 1
else
    n=$1
    m=$2
fi

echo $n from $m
if [ $n -gt $m ]; then
    echo domain error
    exit 1
fi

# Build the list 1 2 3 ... m
a=
i=0
while [ $i -lt $m ]; do
    i=$((i + 1))
    a="$a $i"
done

# Shuffle the list and print the first n elements.
c=$(echo $a | shuffle)
echo $c | awk -v n=$n 'BEGIN {RS=" ";ORS=" "} NR<=n {print}'
echo " "
exit 0