*******************************************************************************
***************                   A N T                        ***************
***************          (Artificial Neural Tool)              ***************
***************                                                ***************
***************            Ignacio Labrador,                   ***************
***************            Ricardo Carrasco and                ***************
***************            Luis Martinez_Laso.                 ***************
***************           CIEMAT (Spain)--1995                 ***************
*******************************************************************************

INTRODUCTION.
-------------

This is the directory containing the software called ANT for neural net
computations. These programs include a Multilayer Perceptron utility with a
training algorithm based on Back-propagation (nowadays there is an abundant
bibliography about this kind of neural net).

One of the major advantages of this software is its simplicity and ease of
use: you can implement your own application using an Artificial Neural Net in
a short span of time. Although no previous knowledge is needed to use this
software, we strongly recommend some basic reading on Artificial Neural Nets,
in particular on Multilayer Perceptron structures, in order to improve the
performance of your applications.

These programs have successfully been used to develop real-world applications
such as:

 * Heating process control of the vacuum vessel prototype for the Spanish
   stellarator Heliac TJ-II.
 * Magnet current analysis in a fusion machine.
 * Plasma position and control in a fusion device.
 * Solving of differential equations.
 * Shape analysis of growing HgI2 crystals for radiation detectors.

In any case, we would appreciate it if you let us know about your application
and about any problem or comment at the following address:

 Authors: Ignacio Labrador, Ricardo Carrasco, and Luis Martinez.
 Address: CIEMAT, Av. Complutense 22, 28040 MADRID-SPAIN,
          Fusion Nuclear Unit. Ed.6.
 e-mail : ignacio@loki.ciemat.es
 Phone  : 34 +(9)1+3466644
 Fax    : 34 +(9)1+3466124

REQUIREMENTS.
-------------

This software has been developed on an OS9 V2.4 system in its native C
language. A simple VT100-compatible alphanumeric text terminal is enough to
run any application.

SOFTWARE STRUCTURE.
-------------------

ANT:
  readme        this file, with the documentation.
  ant.c         the utility to train, improve and execute the neural net,
                with a menu interface.
  makefile      used to make all relocatable files and the executable.
  data_files    (indicative names for the xor and square recognition
                examples):
    ant_init.xor, ant_init.square    include the initial data of the net
                configuration (read the comments) and the weights calculated
                by the algorithm.
    ant_examp.xor, ant_examp.square  include the examples of inputs and
                outputs used to train or improve the net.
    ant_in.xor, ant_in.square        include the inputs to be processed by
                the net, in order to obtain the net response.
    ant_out.xor, ant_out.square      store the output results if desired,
                including the inputs given by the user too.

ANT/DEFS:
  ant.h         the header with defs and typedefs.
  antlib.h      the header with the function definitions.

ANT/LIB:
  antlib.c      the source library with the routines, including the examples.
  antlib.l      the linkable module of the library.
  makefile      to make the library only.

ANT/RELS:
  where all relocatable modules are stored.

The configuration values of the net (number of layers, number of neurons,
slope of the sigmoids per layer, error, ...) are stored in a structure of
type 'ant_net' defined in 'defs/ant.h'. For each hidden and output layer an
array of 'ant_neuron' structures (defs/ant.h) is allocated. Each neuron
includes its weights (one per dendrite), its error and its output.
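As an orientation, a minimal sketch of how these two structures could look is
shown below. The real definitions live in 'defs/ant.h' and may use different
field names, sizes and extra members; only 'init', 'l' and 'n' appear in the
application template at the end of this file, the rest is our assumption.

#define ANT_MAX_LAYERS 4            /* hypothetical bound, not from ant.h   */

typedef struct {
    double *w;                      /* one weight per dendrite (input)      */
    double  error;                  /* local error used by back-propagation */
    double  out;                    /* neuron output after the sigmoid      */
} ant_neuron;

typedef struct {
    char    init[128];              /* init file name (e.g. ant_init.dat)   */
    int     l;                      /* number of layers (hidden + output)   */
    int     n[ANT_MAX_LAYERS];      /* units per layer, n[0] = inputs       */
    double  s[ANT_MAX_LAYERS];      /* sigmoid slope S(l) per layer         */
    double  pol;                    /* polarisation (bias) potential        */
    double  eta;                    /* weight perturbation coefficient      */
    double  noise;                  /* weight noise, 0.0 to 1.0             */
    double  limit;                  /* convergence limit (quadratic error)  */
} ant_net;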
INSTALLATION.
-------------

To install ANT:

 1- OS9$ chd                      [to home directory]
 2- OS9$ ftp loki.ciemat.es       [or ftp 130.206.40.61]
 3- ftp> login as anonymous
 4- ftp> binary                   [to set the transmission mode to binary]
 5- ftp> cd pub/os9/ant
 6- ftp> mget ant.ytar.Z compress ytar
 7- On your machine, uncompress with the commands:
      OS9$ load -d compress ytar  [be sure the Microware version of compress
                                   is not used]
      OS9$ compress -dc ant.ytar.Z ! ytar x -
 8- OS9$ chd ANT                  [to the neural directory]
 9- Edit the makefile and change the paths to the libraries (if necessary).
10- OS9$ make
    Installation is now completed, ready to run the application.
11- Two examples are provided: a simple xor neural net and a simple square
    recognizer. Otherwise:
12- Write an ant_init.dat file (any name is valid) for your application.
13- Write an ant_examp.dat file (any name is valid) with the training
    examples.
14- OS9> load -d user_root/cmds/ant
15- OS9> ant (train or improve the net with the examples); the weights are
    stored in the init file. Without options you get the menu interface for
    interactive processing; if you type 'ant -?' you will get the help list
    of options.
16- Write an ant_in.dat file (any name is valid) with the inputs to be
    processed by the net.
17- OS9> ant (and process the inputs; in menu mode you are asked whether you
    want the outputs stored in an ant_out.dat type file or not).

CONFIGURING THE NET.
--------------------

Configuration of the net is provided by the init file, e.g. 'ant_init.dat'.
The fields that you can change in this file are:

  ************** NET STRUCTURE ************
  Comments:
  Sigmoid type(DIG(1) or SYM(2)) : 1
  Number of Layers (hidden+out)  : 3
  Number of inputs               : 3
  Neurons in next layer          : 5
  Neurons in next layer          : 4
  Neurons in next layer          : 1
  Polarisation potential         : 1.0000
  Weight pertur. coef.           : 0.8000
  Weight noise (zero to one)     : 1.0000
  Convergence limit              : 0.0010
  ************** Sigmoid Slopes ***********
  S(1) : 0.5000
  S(2) : 0.5000
  S(3) : 1.0000
  *****************************************

With 'Sigmoid type' you can select the kind of sigmoidal function used in all
nodes. We consider two types: DIGital (DIG), with horizontal asymptotes at
y=0 and y=1, and SYMmetrical (SYM), with asymptotes at y=1 and y=-1. The
first one is defined by the function:

    y(x) = 1/(1+exp(-Sx))

and is specially indicated for the cases in which you want to use the
perceptron to classify input patterns. The second one is the hyperbolic
tangent function:

    y(x) = tanh(Sx)

and it is used when you want to obtain positive and negative analog values in
the output units of the net. This second function has an added advantage: you
can find it implemented in silicon integrated electronic circuits; even more,
for nets of small size you can implement it electronically with commercial
discrete components.

In both functions the 'S' factor allows you to select the slope of the
function. This factor is defined per layer, and you can change it by
modifying the S(l) fields in the 'ant_init.dat' file, where 'l' is the layer
label in the net. Obviously, you have to include the same number of 'S(l)'
fields as the number of hidden layers plus the output layer. The effect of
the 'S' factor is larger in the output layer than in the hidden layers. From
our experience, values of 'S' around 1.0 are good for hidden layers, while
for the output layer 'S' must be larger when you want digital-like outputs,
to obtain an abrupt sigmoid, and smaller when you want analog outputs, to get
a soft slope.
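The two sigmoid choices can be coded directly from the formulas above. The
sketch below is only an illustration (the function names are ours, not the
ones used inside antlib.c), with 'S' as the per-layer slope:

#include <math.h>

/* DIG sigmoid: asymptotes at y=0 and y=1, suited to pattern classification */
double sigmoid_dig(double x, double S)
{
    return 1.0 / (1.0 + exp(-S * x));
}

/* SYM sigmoid: hyperbolic tangent, asymptotes at y=-1 and y=1, suited to
   positive and negative analog outputs                                     */
double sigmoid_sym(double x, double S)
{
    return tanh(S * x);
}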
The 'Number of Layers' field in the init file corresponds to the total number
of layers in the net (output plus hidden layers). The maximum value that we
have used is 3. Usually most problems can be solved with two layers (one
hidden layer); however, when convergence was impossible with two layers we
have used 3 layers. In the program the input layer is indexed as layer 0 and
only the inputs are considered (no neurons are allocated for it).

To set the number of neuron units in each layer, you must add (or delete) the
corresponding number of fields of the form 'Neurons in next layer : 2',
according to the number of layers of your net; so, in the example above, we
have a net with three layers, with 5 neurons in the first layer, 4 in the
next one and 1 unit in the output layer. As you know, there is no exact rule
to find the number of units in each layer. From our experience, we think that
the best is to start with only two layers and with a small number of units in
the hidden layer (the number of units in the output layer is defined by the
number of components that you want to obtain in the output vector). If under
these conditions there is no convergence, you can increase the number of
hidden units and try again from the beginning. If the convergence is
impossible with 2 layers you ought to increase the number of layers to 3.

Each layer includes a fake neuron (labelled with the index 0) acting as a
polarization (bias) unit. You can change the output value of this node by
modifying the 'Polarisation potential' field. We solved most of our
convergence problems with a value of 1.0; only sometimes do we set it to 0.0
in order to eliminate the fake neuron.

The 'Weight pertur. coef.' field is the factor that determines the proportion
of variation of the weights in each propagation of the examples across the
net. In order to reduce the training time, it is advisable to start the
training with a high value (a value around 1.0 is a high value) and to reduce
it progressively towards values near 0.0. If the convergence time is not an
important factor for you, it is better to fix a small value (0.2, for
example) from the beginning and wait patiently.

With the 'Weight noise (zero to one)' field you can fix the noise
perturbation in the update of the weights at each iteration. With 'one' this
perturbation is maximum and with 'zero' there is no perturbation at all. From
a practical point of view we have not needed this noise perturbation (zero
value) in most of the problems solved until now.

With the 'Convergence limit' field you can fix the maximum allowed quadratic
error in the training process. It is advisable to train the net in different
phases: first with a high error margin and then, once the net has learnt all
the given examples perfectly, reducing this parameter and training again (we
call this phase 'improve') until the error of the net is smaller than the new
value.
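As an illustration of how these three parameters interact, the sketch below
shows a standard back-propagation update for a single weight. This is only
our reading of the init file fields, not a copy of the code in antlib.c; the
function name and the exact formula are assumptions.

#include <stdlib.h>

/* Hypothetical update of one weight, once back-propagation has computed the
   local error 'delta' of the neuron; 'input' is the value on the dendrite,
   'eta' the 'Weight pertur. coef.' and 'nu' the 'Weight noise' field.      */
double update_weight(double w, double delta, double input,
                     double eta, double nu)
{
    /* random factor in [-nu, nu]: no perturbation when nu = 0.0,
       maximum perturbation when nu = 1.0                                   */
    double r = nu * (2.0 * rand() / (double)RAND_MAX - 1.0);

    /* eta fixes the proportion of the correction applied at each step      */
    return w + eta * (1.0 + r) * delta * input;
}

/* Training stops when the quadratic error of every example falls below the
   'Convergence limit' given in the init file.                              */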
RUNNING THE UTILITY
-------------------

The 'ant' utility can be invoked without options in the command line; the
init file name is then requested explicitly. When you run the program in this
way the following menu will appear on the screen:

==============================================================================
 Monday, April, 25, 1995.  1:09:52 pm
 ___________________________________________________________

    [T] TRAIN NET
    [I] IMPROVE NET
    [P] PROCESS AN INPUT
    [E] EXIT
 ___________________________________________________________

 Select an option :
==============================================================================

1.-[T] TRAIN NET.

This option allows you to train the net from the very beginning; therefore,
the weight matrix is initialized with small random values. Before selecting
this option it is necessary to:

 * Select the architecture of the net and all its parameters in the
   initialization file (i.e. ant_init.dat).
 * Edit a file with the examples used to train the net (i.e. ant_examp.dat),
   containing the input vector and the corresponding desired output vector
   for each example (a hypothetical xor examples file is sketched after the
   option descriptions below). It is advisable to normalize all input and
   output vectors in this file; so, if you are using the hyperbolic tangent
   function (option SYM) you must give all the examples with values in the
   interval [-1,1], while if you use the digital (DIG) sigmoid this interval
   must be [0,1].

When you select this option and the net is training, something like:

]]]]]]]]]]]]]]]]]]e0
]]]]]]]]]]]]e1
]]]]]]]]]]]]]]]]]]]]]]]]]]e2
]]]]]]]]]]]]]]e3
 .
 .
 .
]]]]]]]]]]]]]]]]]]]]]]en

will appear on the screen, where e0, e1, e2, ... represent each example in
the examples file 'ant_examp.dat', and the number of ']' characters in each
line is the number of times that it has been necessary to run the
back-propagation in order to minimize the error for that example. If, for
example, you are training the net with 50 examples (n=50), the sequence from
e0 to e49 will be repeated until all the ']' characters disappear from all
lines. When this happens, your net is able to recognize all the examples
given in 'ant_examp.dat' and, probably, it is able to compute properly some
generalized examples that were not in the training set.

The weight values generated by the algorithm are written to the
'ant_init.dat' file periodically during the whole learning process, and also
when convergence is achieved and the learning process finishes. To stop the
learning process there are two procedures:

 a) stop, saving the current weights.
 b) stop without saving.

2.-[I] IMPROVE NET.

This option must be selected only when the option [T] has been selected
previously. It is used when you want to start the learning process from a set
of previously known weights. The operation mode of the program is the same as
in the [T] option.

3.-[P] PROCESS AN INPUT.

This option is useful when it is necessary to test the response of your net
once it has been trained. To use it, it is necessary to previously edit a
file 'ant_in.dat' with the inputs to the net to be tested. The output is
readable on the screen, or via an output file selectable by the user.

4.-[E] EXIT.

Finishes the session with ANT.
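As an orientation only, a training examples file for the xor case might look
like the listing below. We have not verified this layout against the
ant_examp.xor file shipped with the distribution, so check that file for the
real format; here we simply list, for each example, the input vector followed
by the desired output vector, normalized to the DIG interval [0,1]:

0 0   0
0 1   1
1 0   1
1 1   0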
The ant utility can also be run as a command line tool; to do that just type
'ant' followed by the appropriate options. The list of options can be
obtained by typing 'ant -?', and the following printout is obtained:

$ ant -?

Syntax:   ant [<options>]
Function: Run the Artificial Neural Tool (see readme file)
Options:
    None            Run in menu mode
    -examp=<path>   get net examples file from path (for -i and -t options)
    -i              improve a trained net (examp file needed)
    -in=<path>      get net input file from path (needed with -p option)
    -init=<path>    get net init file from path (always needed)
    -out=<path>     get net output file from path (needed with -p option)
    -p              process an input (an input and an output file needed)
    -t              train the net (example file needed)
Hint: Try first to run the menu mode just typing ant

As examples of command lines we have:

# to train a net, type:
$ ant -t -init=ant_init.dat -examp=ant_examp.dat

# to improve a trained net, type:
$ ant -i -init=ant_init.dat -examp=ant_examp.dat

# to process an input with the trained net, type:
$ ant -p -init=ant_init.dat -in=ant_in.dat -out=ant_out.dat

The input values are taken from the ant_in.dat file and the results are
stored in the ant_out.dat file. Running the ant utility in command line mode
is faster than in menu mode, and it is advisable when the convergence is
slow.

APPLICATION.
------------

Once a net is trained, to use its 'knowledge' in an application you just have
to use the init file with the net information and the function ant_process()
included in antlib.l. The program ant_xor.c shows how such an application can
include the computation of a trained net: the code includes the computation
of a user defined input vector, and a simple call to the ant_process()
function returns a pointer to a vector with the net results. The template of
a program in which the ant net is used is as follows:

/*************************************************************************
 *
 *  Application program: PROGRAM_NAME
 *
 *  To compile this program use the command:
 *
 *      $ cc -v=/defs -l=/lib/antlib.l program_name
 *
 */

#include <stdio.h>
#include <stdlib.h>               /* calloc()                              */
#include <string.h>               /* strcpy()                              */
#include "ant.h"                  /* neural net header                     */
#include "antlib.h"               /* function prototypes                   */

main (argc, argv)
int argc;
char **argv;
{
    ant_neuron **ant_neuronD;     /* declare the net structures            */
    ant_net      ant_netD;
    double      *input_vector,    /* user input and output vectors         */
                *output_vector;
    char        *file_name = "ant_init.dat";  /* init file of the net      */

    strcpy( ant_netD.init, file_name );       /* get the init file name    */

    /* fill the parameters of the net data structure:
     * layers, neurons, weights ...                                        */
    ant_neuronD = ant_init( &ant_netD );

    /* allocate the input vector, size = number of inputs of the net
     * (layer 0)                                                           */
    input_vector = (double *) calloc( ant_netD.n[0], sizeof(double) );

    /* .                                                                   */
    /* .              user code                                            */
    /* .                                                                   */

    /* fill the input vector with the values to be processed;
     * ant_netD.n[0]-1 is the number of net inputs
     * (neurons in the input layer l=0)                                    */

    /* propagate and compute the net output */
    output_vector = ant_process( input_vector, &ant_netD, ant_neuronD );

    /* ant_process() returns the pointer to the results vector;
     * ant_netD.n[ant_netD.l - 1]-1 is the number of net outputs
     * (neurons in the output layer)                                       */

    /*                                                                     */
    /*              user code where the ant result is applied              */
    /*                                                                     */
}

This application program has to be compiled with the command:

$ cc -v=/defs -l=/lib/antlib.l program_name
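As a hypothetical continuation of the template, the results returned by
ant_process() could be consumed as shown below. We assume, as the comments in
the template suggest, that ant_netD.n[ant_netD.l - 1] - 1 holds the number of
output units; check defs/ant.h to confirm this before relying on it.

    /* print the net results (continuation of the template above;
       the upper bound on k is our assumption, see the note above)         */
    {
        int k;
        int n_out = ant_netD.n[ant_netD.l - 1] - 1;

        for (k = 0; k < n_out; k++)
            printf("net output %d = %f\n", k, output_vector[k]);
    }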