Algorithm 11.1 - The following algorithm can be used to train an NN.

1. Encoding

Randomly assign values to the network weights: assign random values between -1 and +1 both to the weights between the input and hidden layers and to the weights between the hidden and output layers.

2. Training

A. Forward Pass

i.  Compute the hidden layer neuron activations:
h = F(iW1)
h = an array (vector) containing the activation signals associated with the hidden layer neurons
i = an array containing the activation signals associated with the input layer neurons
W1 = a two-dimensional array (matrix) containing the weight values for the synapses between the input and hidden layer neurons
ii.  Compute the output layer neuron activations:
o = F(hW2)
o = an array containing the activation signals associated with the output layer neurons
h = an array containing the activation signals associated with the hidden layer neurons
W2 = a two-dimensional array containing the weight values for the synapses between the hidden and output layer neurons
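As a concrete illustration of the encoding step and the forward pass, a minimal sketch in Python with NumPy follows. The sigmoid form of F, the layer sizes (3 inputs, 4 hidden, 2 outputs), and all names are assumptions made for this example; the text does not commit to a particular activation function.

    import numpy as np

    def F(x):
        # Assumed activation function: the logistic sigmoid.
        return 1.0 / (1.0 + np.exp(-x))

    # Step 1 (Encoding): random weights between -1 and +1.
    # Layer sizes (3 inputs, 4 hidden, 2 outputs) are illustrative only.
    rng = np.random.default_rng(0)
    W1 = rng.uniform(-1.0, 1.0, size=(3, 4))   # input-to-hidden weights
    W2 = rng.uniform(-1.0, 1.0, size=(4, 2))   # hidden-to-output weights

    def forward(i):
        h = F(i @ W1)    # hidden layer activations: h = F(iW1)
        o = F(h @ W2)    # output layer activations: o = F(hW2)
        return h, o

A call such as h, o = forward(i) then yields both activation arrays for a given input vector i.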

B. Backward Pass

i.  Compute the error associated with the neural network:
d = o(1 - o)(t - o)
d = an array of error values associated with each output neuron
o = an array containing the activation signals associated with the output layer neurons
t = an array of values that represent the “correct answers”; these are target values the neural network is striving to replicate
ii.  Compute the hidden layer error:
e = h(1 - h)W2d
e = an array of error values associated with each hidden layer neuron
h = an array containing the activation signals associated with the hidden layer neurons
d = an array of error values associated with each output neuron
W2 = a two-dimensional array containing the weight values for the synapses between the hidden and output layer neurons
iii.  Adjust the hidden-to-output layer weights (W2):
W2 = W2 + ΔW2
ΔW2 = a two-dimensional array containing changes to be made to the weight matrix. These changes are computed according to the formula:
ΔW2(t) = αhd + φΔW2(t-1)
where t here indexes the training iteration (not the target vector), and
α = the learning rate, which controls the speed of convergence (increasing it speeds training, but also increases the chance of the network converging to a local, sub-optimal solution)
φ = the momentum term, which allows the weight changes made in the previous step to influence the changes made in the current step, as a way of discouraging convergence to a sub-optimal solution
iv.  Adjust the weights for the first layer of synaptic connections:

W1 = W1 + ΔW1

where

ΔW1(t) = αie + φΔW1(t-1)

Repeat the forward and backward passes on all pattern pairs until the output layer error (the vector d) is within the specified tolerance for every pattern pair and for every output neuron.
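Continuing the sketch above, the backward pass and the repeat-until-tolerance loop might be written as follows. The default α and φ values, the tolerance, the epoch cap, and the use of module-level W1 and W2 are all assumptions made for the illustration.

    def backward(i, h, o, t, dW1_prev, dW2_prev, alpha=0.5, phi=0.9):
        d = o * (1.0 - o) * (t - o)                      # output error (step B.i)
        e = h * (1.0 - h) * (W2 @ d)                     # hidden error (step B.ii)
        dW2 = alpha * np.outer(h, d) + phi * dW2_prev    # weight changes (step B.iii)
        dW1 = alpha * np.outer(i, e) + phi * dW1_prev    # weight changes (step B.iv)
        return dW1, dW2

    def train(patterns, tol=0.05, max_epochs=10000):
        # patterns: list of (input vector, target vector) pairs.
        global W1, W2
        dW1_prev, dW2_prev = np.zeros_like(W1), np.zeros_like(W2)
        for _ in range(max_epochs):
            worst = 0.0
            for i, t in patterns:
                h, o = forward(i)
                dW1_prev, dW2_prev = backward(i, h, o, t, dW1_prev, dW2_prev)
                W1, W2 = W1 + dW1_prev, W2 + dW2_prev    # apply the changes
                worst = max(worst, float(np.max(np.abs(t - o))))
            if worst < tol:    # every output neuron within tolerance on every pair
                break

Here train(patterns) stops once the largest output error across all pattern pairs falls within the tolerance.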

Once the NN has been sufficiently trained, it can be used to model the system: it describes the relationship between input data and output values. To do this, a forward pass is made through a stagnant network (one in which the backward pass is no longer active).
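In terms of the sketch above, using the trained network as a model is simply a call to the forward pass with the weights held fixed; the input vector below is hypothetical.

    h, o = forward(np.array([0.2, 0.7, 0.1]))   # forward pass only; no weight updates
    print(o)                                    # the modeled output values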

Neural networks can learn any arbitrarily complex nonlinear mapping, thanks to the introduction of the middle, or hidden, layer. There are, however, potential drawbacks to NNs. First, and most often cited, they can require extremely long training times. In modeling situations where the relationships between the parameters are subtle or elusive, it is not unreasonable for an NN to require weeks of training on computer workstations. But when the alternative is having no computer model to drive your adaptive control system, this training period is not an insurmountable obstacle. Second, NNs can “memorize” the training data yet fail to generalize to situations for which they have not been specifically trained. This often occurs when the NN has too many hidden nodes, which makes sizing the hidden layer a delicate problem: with too few hidden nodes the NN will not train, while with too many it memorizes. Third, NNs give no indication of why they do what they do. When an NN is used as a computer model, it receives input data and produces an output value, but there is no easy way to understand the complex relationships between input values and output values that are encoded in the NN’s weight matrices. NNs are thus the quintessential “black box” device.

Neural networks are powerful tools for modeling data. In principle, they are easy to use: simply show the NN some training sets (data collected from the system being modeled) and watch it “learn to model the data.” In practice, however, training an NN to model a physical system is more of an art than a science. Several aspects of the NN’s architecture (most notably the number of middle nodes) and of its training (the learning rate, α, and the momentum parameter, φ) affect the NN’s ability to model a system, and these must be set by the user largely by trial and error.

Thus, NNs should probably no longer be thought of as “the be-all and end-all” of computer modeling, as they once were. Neither, however, should they be dismissed as more trouble than they are worth. NNs are powerful modeling tools that can be applied to a variety of systems, but their effective use in a particular situation requires some knowledge both of the system being modeled and of NNs themselves. In the remainder of this chapter we show three specific instances in which an NN has been used to effectively model a physical system.

