4.6 Reverse-Cross and Cross Entropy Concepts

In all of the network training schemes considered, simulated annealing is a stochastic strategy for searching the ground state by minimizing the energy or cooling function. Pertinent to the Boltzmann machine, Ackley et al. [57] proposed an alternative learning theory based on minimizing the reverse cross-entropy or the cross-entropy function, as outlined below.


Figure 4.3  A multilayered perceptron with hidden layers HL1, ..., HLN

A typical neural net architecture is structured macroscopically as layers or rows of units which are fully interconnected, as depicted in Figure 4.3. Each unit is an information-processing element. The first layer is a fanout of processing elements intended to receive the inputs x_i and distribute them to the next layer of units. In this hierarchical architecture, each unit in a given layer receives the output signal of every unit in the row (layer) below it. This continues up to the final row, which delivers the network's estimate o′ of the correct output vector o. Apart from the first row, which receives the inputs, and the final row, which produces the estimate o′, the intermediate rows or layers consist of units designated as the hidden layers.

Denoting the probability of the vector state of the visible neurons (units) as P′(V_α) under free-running conditions (with the network having no environmental input), and the corresponding probability determined by the environment as P(V_α), a distance parameter can be specified as an objective function for the purpose of minimization. Ackley et al. [57] employed the reverse cross-entropy (RCE) defined below to depict this distance function:
$$G_{RCE} = \sum_{\alpha} P(V_\alpha) \ln \frac{P(V_\alpha)}{P'(V_\alpha)}$$
The machine adjusts its weights W_ij to minimize the distance G_RCE. That is, it performs gradient descent on G_RCE by means of an estimate of the derivative ∂G_RCE/∂W_ij. In reference to the Boltzmann machine, this gradient is specified by:
$$\frac{\partial G_{RCE}}{\partial W_{ij}} = -\frac{1}{T}\left(p_{ij} - p'_{ij}\right)$$
where p_ij is the average probability of two units (i and j) both being in the on state when the environment is clamping the states of the visible neurons, and p′_ij is the corresponding probability when the environmental input is absent and the network is free-running on its own internal mechanism as a cybernetic system. To minimize G_RCE, it is therefore sufficient to observe (or estimate) p_ij and p′_ij under thermal equilibrium and to change each weight by an amount proportional to the difference between these two quantities. That is:
$$\Delta W_{ij} = \eta \left(p_{ij} - p'_{ij}\right)$$
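A minimal sketch of this learning rule in Python, assuming the two co-activation probabilities have already been estimated by sampling at thermal equilibrium in the clamped and free-running phases; the proportionality constant η (here eta), the function name, and the toy numbers are illustrative rather than part of the original formulation.

```python
import numpy as np

def boltzmann_weight_update(W, p_clamped, p_free, eta=0.1):
    """One step of the update rule Delta W_ij = eta * (p_ij - p'_ij).

    W         : (n, n) symmetric weight matrix
    p_clamped : (n, n) estimated probability that units i and j are both on
                while the environment clamps the visible neurons (p_ij)
    p_free    : (n, n) the same probability in the free-running phase (p'_ij)
    """
    W = W + eta * (p_clamped - p_free)
    np.fill_diagonal(W, 0.0)      # no self-connections
    return 0.5 * (W + W.T)        # keep the weight matrix symmetric

# Toy usage with made-up co-activation estimates for a 3-unit machine
rng = np.random.default_rng(0)
p_c = rng.uniform(size=(3, 3)); p_c = 0.5 * (p_c + p_c.T)
p_f = rng.uniform(size=(3, 3)); p_f = 0.5 * (p_f + p_f.T)
W = boltzmann_weight_update(np.zeros((3, 3)), p_c, p_f)
```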
Instead of the reverse cross-entropy, a cross-entropy parameter (G_CE), defined below, has also been advocated by Liou and Lin [58] as an alternative strategy for the aforesaid purposes:
$$G_{CE} = \sum_{\alpha} P'(V_\alpha) \ln \frac{P'(V_\alpha)}{P(V_\alpha)}$$
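Both distance measures are straightforward to compute once the two distributions over visible states are available. The sketch below, with illustrative array names and a small smoothing constant eps added to avoid division by zero, evaluates G_RCE and G_CE as reconstructed above; both vanish exactly when P = P′.

```python
import numpy as np

def reverse_cross_entropy(P_env, P_free, eps=1e-12):
    """G_RCE = sum_a P(V_a) ln[ P(V_a) / P'(V_a) ]  (Ackley et al. [57])."""
    P_env, P_free = np.asarray(P_env), np.asarray(P_free)
    return float(np.sum(P_env * np.log((P_env + eps) / (P_free + eps))))

def cross_entropy(P_env, P_free, eps=1e-12):
    """G_CE = sum_a P'(V_a) ln[ P'(V_a) / P(V_a) ]  (Liou and Lin [58])."""
    P_env, P_free = np.asarray(P_env), np.asarray(P_free)
    return float(np.sum(P_free * np.log((P_free + eps) / (P_env + eps))))

P_env  = np.array([0.5, 0.3, 0.2])   # P(V_alpha), fixed by the environment
P_free = np.array([0.4, 0.4, 0.2])   # P'(V_alpha), free-running network
print(reverse_cross_entropy(P_env, P_free))   # ~0.025
print(cross_entropy(P_env, P_free))           # ~0.026
```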
4.7 Activation Rule

In the neural network, the relation between the net input NET_i of a neuron and its output value o_i is written in a simple form as in Equation (4.5). When the neuron is activated by the input NET_i, the activation value a_i of the neuron changes with respect to time according to a relation written as:
$$\tau \, \frac{da_i}{dt} = -a_i + \mathrm{NET}_i$$
where τ is the time constant of the neuronal activation. By specifying a reference activation level a_o, the output value o_i of the neuron can be determined from the graded response of the neuron. Written in functional form:
$$o_i = F\!\left(\frac{a_i + e_n}{a_o}\right)$$
where F is a monotonic function which limits the output value between upper and lower bounds; it is, therefore, a squashing function which is S-shaped or sigmoidal. The reference activation level a_o is termed the gain factor and constitutes the first system parameter; the random error e_n is a noise term whose variance is dictated by the (pseudo) temperature, which can be regarded as the second system parameter.
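A minimal numerical sketch of this activation rule, under the assumptions made in the reconstruction above: Euler integration of the leaky-integrator dynamics, a logistic choice for the squashing function F, and Gaussian noise e_n whose variance is set by the pseudo-temperature T. The time step, parameter values, and function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def step_activation(a, net, tau=10.0, dt=0.1):
    """One Euler step of tau * da/dt = -a + NET."""
    return a + (dt / tau) * (net - a)

def output(a, a0=1.0, T=0.5):
    """Graded response o = F((a + e_n)/a0) with a logistic squashing F.
    The noise e_n has variance proportional to the pseudo-temperature T."""
    e_n = rng.normal(scale=np.sqrt(T))
    return 1.0 / (1.0 + np.exp(-(a + e_n) / a0))

# Relax one neuron toward a constant net input, then read its noisy output
a, net = 0.0, 1.0
for _ in range(500):          # long enough for a to approach NET
    a = step_activation(a, net)
print(a, output(a))
```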

4.8 Entropy at Equilibrium

Relevant to combinatorial optimization problems, the simulated annealing algorithm is specified by the conjecture that the acceptance distribution p_ai of Equation (4.8) is a stationary or equilibrium distribution, which guarantees asymptotic convergence towards globally optimal solutions. The corresponding entropy at equilibrium is defined as:
$$S_T = -\sum_{i \in S} p_{a_i}(T) \ln p_{a_i}(T)$$
which is a natural measure of disorder: high entropy values correspond to chaos, and low entropy values to order. Pertinent to the neural net, entropy also measures the degree of optimality. The energy E_i of state i, with acceptance probability p_ai, has an expected value <E_i>_T which refers to the expected cost at equilibrium. By the general definition through the first moment:
$$\langle E_i \rangle_T = \sum_{i \in S} E_i \, p_{a_i}(T)$$
Likewise, the second moment defines the expected square cost at equilibrium. That is:
$$\langle E_i^2 \rangle_T = \sum_{i \in S} E_i^2 \, p_{a_i}(T)$$
and a variance of the cost can be specified as:
$$\sigma_T^2 = \langle E_i^2 \rangle_T - \langle E_i \rangle_T^2$$
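The sketch below computes these equilibrium quantities for a toy set of state energies, assuming the Boltzmann form p_ai ∝ exp(-E_i/T) for the acceptance distribution of Equation (4.8); the energies and temperatures are made up for illustration.

```python
import numpy as np

def equilibrium_stats(E, T):
    """Acceptance distribution p_ai ~ exp(-E_i/T) and the equilibrium
    entropy S_T, expected cost <E_i>_T, and cost variance sigma_T^2."""
    E = np.asarray(E, dtype=float)
    w = np.exp(-(E - E.min()) / T)     # shift by E.min() for numerical stability
    p = w / w.sum()
    S = -np.sum(p * np.log(p))         # S_T = -sum_i p_ai ln p_ai
    mean = np.sum(p * E)               # <E_i>_T, first moment
    var = np.sum(p * E**2) - mean**2   # sigma_T^2 = <E_i^2>_T - <E_i>_T^2
    return p, S, mean, var

E = np.array([0.0, 1.0, 1.0, 2.0])     # toy state energies
for T in (10.0, 1.0, 0.1):
    _, S, mean, var = equilibrium_stats(E, T)
    print(f"T={T:5.1f}  S_T={S:.3f}  <E>_T={mean:.3f}  var={var:.3f}")
```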
Considering the neural complex as a large physical ensemble, from the corresponding principles of statistical thermodynamics, the following relations can be stipulated:
$$\frac{\partial}{\partial T} \langle E_i \rangle_T = \frac{\sigma_T^2}{T^2}$$
and
$$\frac{\partial}{\partial T} S_T = \frac{\sigma_T^2}{T^3}$$
These relations indicate that in simulated annealing the expected cost and the entropy decrease monotonically with decreasing T, provided equilibrium is reached at each value of the control parameter T, towards their final values E_i^opt and S_0, respectively.
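Both relations can be checked numerically by comparing central finite differences of <E_i>_T and S_T against σ_T²/T² and σ_T²/T³; the sketch below does this for the same toy energies, with an illustrative helper recomputing the Boltzmann statistics.

```python
import numpy as np

def equilibrium_stats(E, T):
    """Toy Boltzmann statistics: entropy, expected cost, and cost variance."""
    E = np.asarray(E, dtype=float)
    w = np.exp(-(E - E.min()) / T)
    p = w / w.sum()
    S = -np.sum(p * np.log(p))
    mean = np.sum(p * E)
    var = np.sum(p * E**2) - mean**2
    return S, mean, var

E = np.array([0.0, 1.0, 1.0, 2.0])
T, h = 1.0, 1e-5
S_lo, m_lo, _ = equilibrium_stats(E, T - h)
S_hi, m_hi, _ = equilibrium_stats(E, T + h)
_, _, var = equilibrium_stats(E, T)
print((m_hi - m_lo) / (2 * h), var / T**2)   # d<E>_T/dT  vs  sigma_T^2 / T^2
print((S_hi - S_lo) / (2 * h), var / T**3)   # dS_T/dT    vs  sigma_T^2 / T^3
```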

Further, the limiting values of the entropy function at the extremes of T are specified as follows:
$$S_\infty = \lim_{T \to \infty} S_T = \ln |S|$$
and
$$S_0 = \lim_{T \to 0} S_T = \ln |S_{opt}|$$
where S and S_opt are the sets of states and of globally optimal states, respectively. (In combinatorial problems, S and S_opt denote the sets of solutions and of globally optimal solutions, respectively.)

In statistical physics, a unique ground state corresponds to S_0 = ln(1) = 0, which is a statement of the third law of thermodynamics.
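The two limits can be seen numerically with a toy energy landscape of |S| = 4 states, two of which are degenerate ground states (|S_opt| = 2); the entropy helper below assumes the same Boltzmann acceptance distribution as in the earlier sketches.

```python
import numpy as np

def entropy(E, T):
    """S_T for the Boltzmann acceptance distribution p_ai ~ exp(-E_i/T)."""
    E = np.asarray(E, dtype=float)
    w = np.exp(-(E - E.min()) / T)
    p = w / w.sum()
    return -np.sum(p * np.log(p))

E = np.array([0.0, 0.0, 1.0, 2.0])    # |S| = 4 states, |S_opt| = 2 ground states
print(entropy(E, 1e6), np.log(4))     # T -> infinity :  S_T -> ln|S|     = ln 4
print(entropy(E, 1e-2), np.log(2))    # T -> 0        :  S_T -> ln|S_opt| = ln 2
```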

When the annealing algorithm follows the equilibrium distribution, the probability of finding an optimal solution (or state) increases monotonically with decreasing T. Further, for each suboptimal solution there exists a positive value T_i of the pseudo-temperature (or the control parameter) such that for T < T_i, the probability of finding that solution decreases monotonically with decreasing T. That is,
$$\frac{\partial}{\partial T} \, p_{a_i}(T) > 0 \quad \text{for all } T < T_i, \; i \notin S_{opt}$$
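A small numerical illustration of this behavior, again assuming the Boltzmann acceptance distribution over a made-up energy set: the acceptance probability of the suboptimal state with E_1 = 1 peaks near T_i ≈ 1/ln 2 ≈ 1.44 and then falls monotonically as T decreases below that value.

```python
import numpy as np

def accept_prob(E, T):
    """Boltzmann acceptance distribution p_ai ~ exp(-E_i/T)."""
    E = np.asarray(E, dtype=float)
    w = np.exp(-(E - E.min()) / T)
    return w / w.sum()

E = np.array([0.0, 1.0, 2.0, 2.0, 2.0, 2.0])   # state 1 is suboptimal
for T in (5.0, 2.0, 1.44, 1.0, 0.5, 0.1):
    print(f"T={T:4.2f}  p_1={accept_prob(E, T)[1]:.4f}")
# p_1 rises toward ~0.20 near T ~ 1.44 and decreases monotonically below it
```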
