8.2 Information-Theoretics of Neural Networks
Application of information-theoretic analysis to the neural complex has so far been limited largely to information-capacity heuristics for structures such as Hopfield's net model of associative memory, used to quantify the performance of such a network as a memory. In simple models, the information capacity of a standard memory is determined explicitly by the number of memory bits; but where the complex dynamics of activation patterns are encountered, probabilistic estimates of the information capacity are deduced on the basis of certain simplifying assumptions. For example, estimates of the capacity of Hopfield's network via statistical methods have been formulated and extended with more rigorous statistical techniques [97-100]. Further, Lee et al. [101] devised an approach using Brown's Martingale Central Limit Theorem and Gupta's Transformation to analyze the complex dynamics of Hopfield's memory model rigorously through information-theoretic considerations. As an extended effort, Gardner and Derrida [102,103] considered the relevant aspects of the storage capacity of neural networks in search of optimal values. Pertinent to neurons, memory and the thought-process are decided by the associative properties of the collective neuronal aggregates and the underlying entropy considerations. In this perspective, Caianello [64] outlined a theory of thought-processes and thinking machines in terms of neuronic equations, which depict the instantaneous behavior of the system, and mnemonic equations, which represent the permanent or quasi-permanent changes in neuronal activity. Such permanent changes in neural functioning (caused by experience) have been modeled as a degree of plasticity which decides the fixation of memory, or memory storage, involving time-dependent processes.
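As a concrete point of reference for the capacity discussions that follow, the sketch below (in Python, not drawn from the cited works; the function and variable names are illustrative assumptions) shows a Hopfield-style associative memory in its simplest form: bipolar patterns stored by a Hebbian outer-product rule and recalled by iterated threshold dynamics.

```python
# Minimal sketch of a Hopfield-style associative memory: patterns are stored
# via a Hebbian outer-product rule and recalled by iterating the threshold
# (sign) dynamics until a stable state is reached.
import numpy as np

def store(patterns):
    """Build the symmetric weight matrix W from bipolar (+1/-1) patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0.0)          # no self-connections
    return W / n

def recall(W, probe, max_iters=50):
    """Asynchronously update units until the state stops changing."""
    s = probe.copy()
    rng = np.random.default_rng(0)
    for _ in range(max_iters):
        prev = s.copy()
        for i in rng.permutation(len(s)):
            s[i] = 1 if W[i] @ s >= 0 else -1
        if np.array_equal(s, prev):   # fixed point: a stored (or spurious) memory
            break
    return s

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    patterns = rng.choice([-1, 1], size=(3, 64))      # 3 random bipolar patterns
    W = store(patterns)
    noisy = patterns[0].copy()
    noisy[:8] *= -1                                   # corrupt 8 of 64 bits
    print("recovered:", np.array_equal(recall(W, noisy), patterns[0]))
```

Because the stored patterns become fixed points of the dynamics, a corrupted probe relaxes back to the nearest memory, which is precisely the behavior whose capacity the statistical estimates above seek to bound.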
Shannon's concept of information, or entropy [104,105], in stochastic systems has also been extended to neuronal associative memory, leading to information-theoretic models of neural activity manifested as a train of neuronal potential spikes. Accordingly, the temporal and spatial statistics of this aggregate of neuronal signal elements featured dominantly in the information-theoretic approaches formalized in the 1970s [106]. For example, Pfaffelhuber in 1972 [107] used the concept of entropy to describe the learning process as one in which the entropy is a decaying function of time.
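To make the entropy notion concrete, the following sketch (an illustrative assumption, not Pfaffelhuber's formulation) bins a spike train into short binary "words", estimates their empirical probabilities, and evaluates the Shannon entropy; as a firing distribution sharpens toward fewer preferred patterns, as one might expect with learning, this estimate falls.

```python
# Illustrative sketch: Shannon entropy of a binned spike train.
# As the distribution over binary firing words sharpens, the estimated
# entropy H = -sum p*log2(p) decreases.
import numpy as np
from collections import Counter

def spike_train_entropy(spikes, word_len=4):
    """Estimate entropy (bits) of length-`word_len` binary firing words."""
    words = [tuple(spikes[i:i + word_len])
             for i in range(0, len(spikes) - word_len + 1, word_len)]
    counts = Counter(words)
    total = sum(counts.values())
    probs = np.array([c / total for c in counts.values()])
    return float(-(probs * np.log2(probs)).sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    random_spikes = rng.integers(0, 2, 4000)              # unstructured firing
    sparse_spikes = (rng.random(4000) < 0.1).astype(int)  # sparse, structured firing
    print("entropy (random):", round(spike_train_entropy(random_spikes), 3))
    print("entropy (sparse):", round(spike_train_entropy(sparse_spikes), 3))
```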
The concept of entropy has also formed the basis for elucidating the information capacity of neural networks. For example, treating the associative memory as a plausible model of biological memory with collective dynamic activity, the information capacity of Hopfield's network has been quantitatively elucidated.
The classical characterization of information in a neural network via stochastic-system considerations (as conceived by Shannon and Weaver) simply defines a specification protocol for message transmission pertinent to the occurrence of neuronal events (firings) observed (processed) from a milieu of such possible events across the interconnected network. Thus, in the early 1970s, the information-processing paradigms referred to the neural system were derived from stochastic considerations of (or the random occurrence of) neural spike trains; relevant studies also considered the probabilistic attributes of the neural complex; and, considering the randomly distributed cellular elements, the storage capacity of information in those elements (or the associated memory) was estimated. That is, the information-theoretic approach was advocated to ascertain the multiple-input, single-output transfer relation of information in neuronal nets. The information capacity of a neural network was then deduced on the basis of the dynamics of activation patterns. Implicitly, such a capacity is a characteristic ability of the neural complex viewed as a collective system. Represented as a memory (that stores information), it was quantified via the Hartley-Shannon law as the logarithm of the number of distinguishable strings of address lines (consisting of memory locations or units). Memory locations here refer to the number of distinguishable threshold functions (state-transitional processes) simulated by the neurons.
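In the Hartley-Shannon sense, then, the capacity in bits is simply the logarithm of the number of distinguishable stored configurations. The toy calculation below is an illustrative sketch, not a result taken from the cited works; it applies that count and, as an assumed point of comparison, the asymptotic estimate of roughly N/(4 ln N) retrievable patterns often quoted for an N-neuron Hopfield memory.

```python
# Hartley-Shannon style count: capacity in bits as log2 of the number of
# distinguishable stored configurations, plus a commonly quoted asymptotic
# estimate of the number of retrievable patterns for an N-neuron Hopfield net.
import math

def hartley_capacity_bits(num_distinguishable_states):
    """Capacity (bits) = log2 of the number of distinguishable states."""
    return math.log2(num_distinguishable_states)

def hopfield_pattern_estimate(num_neurons):
    """Asymptotic estimate: about N / (4 ln N) retrievable patterns."""
    return num_neurons / (4.0 * math.log(num_neurons))

if __name__ == "__main__":
    print("1024 distinguishable states ->", hartley_capacity_bits(1024), "bits")
    for n in (100, 1000, 10000):
        print(f"N = {n:5d}: ~{hopfield_pattern_estimate(n):6.1f} patterns")
```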
In the elucidation of the optimal (memory-based) storage capacity of neural networks, the following basic queries are posed:
- What is the maximum number of (pattern) examples that the neural complex can store?
- For a given set of patterns (less than the maximum value), what are the different functions that could relate the network inputs to the output?
- How do the statistical properties of the patterns affect the estimation of the network information capacity?
Further, the notions of information capacity of the neural complex (associated memory) are based on:
- Binary vector (dichotomous) definition of a neuronal state.
- Matrix representation of the synaptic interconnections.
- Identifying the stable state of a neuron.
- Defining the information capacity of the network as the quantity for which the probability that the stored state-vector patterns (stored as per Hebbian learning) are stable is maximum (a stability check of this kind is sketched after this list).
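The sketch below (an illustrative assumption, not a prescribed procedure from the text) implements that stability criterion directly: a stored bipolar pattern x is stable if sign(Wx) = x for the Hebbian weight matrix W, and storing progressively more random patterns shows the fraction of stable memories falling off as the load approaches capacity.

```python
# Stability criterion behind the capacity definition above: a stored bipolar
# pattern x is stable if sign(W x) == x for the Hebbian weight matrix W.
# Storing more random patterns shows the fraction of stable memories dropping.
import numpy as np

def hebbian_weights(patterns):
    """Hebbian outer-product weights with zero diagonal."""
    n = patterns.shape[1]
    W = (patterns.T @ patterns).astype(float) / n
    np.fill_diagonal(W, 0.0)
    return W

def fraction_stable(patterns):
    """Fraction of stored patterns that are fixed points of sign(W x)."""
    W = hebbian_weights(patterns)
    stable = [np.array_equal(np.where(W @ p >= 0, 1, -1), p) for p in patterns]
    return sum(stable) / len(stable)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_neurons = 200
    for m in (5, 15, 30, 60):
        pats = rng.choice([-1, 1], size=(m, n_neurons))
        print(f"{m:2d} stored patterns: {fraction_stable(pats):.2f} stable")
```

Running such an experiment for increasing numbers of stored patterns gives an empirical handle on the first of the queries listed earlier, namely the maximum number of pattern examples the network can hold.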