Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation.
ABSTRACT: We introduce Equilibrium Propagation, a learning framework for energy-based models. It involves only one kind of neural computation, performed in both the first phase (when the prediction is made) and the second phase of training (after the target or prediction error is revealed). Although this algorithm computes the gradient of an objective function just like Backpropagation, it does not need a special computation or circuit for the second phase, where errors are implicitly propagated. Equilibrium Propagation shares similarities with Contrastive Hebbian Learning and Contrastive Divergence while solving the theoretical issues of both algorithms: our algorithm computes the gradient of a well-defined objective function. Because the objective function is defined in terms of local perturbations, the second phase of Equilibrium Propagation corresponds to only nudging the prediction (fixed point or stationary distribution) toward a configuration that reduces prediction error. In the case of a recurrent multi-layer supervised network, the output units are slightly nudged toward their target in the second phase, and the perturbation introduced at the output layer propagates backward in the hidden layers. We show that the signal "back-propagated" during this second phase corresponds to the propagation of error derivatives and encodes the gradient of the objective function, when the synaptic update corresponds to a standard form of spike-timing dependent plasticity. This work makes it more plausible that a mechanism similar to Backpropagation could be implemented by brains, since leaky integrator neural computation performs both inference and error back-propagation in our model. The only local difference between the two phases is whether synaptic changes are allowed or not. We also show experimentally that multi-layer recurrently connected networks with 1, 2, and 3 hidden layers can be trained by Equilibrium Propagation on the permutation-invariant MNIST task.
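The two-phase procedure described in the abstract can be sketched on a toy network. The snippet below is a minimal illustration under stated assumptions, not the paper's full model: a small symmetric recurrent network with a hard-sigmoid activation relaxes to a free fixed point, is then weakly clamped toward the target with a nudging strength beta, and the weights receive the contrastive, purely local update. The sizes, learning rate, and beta are arbitrary, and the input-weight update is omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(s):                        # hard-sigmoid activation
    return np.clip(s, 0.0, 1.0)

n_in, n_st, n_out = 3, 4, 2        # toy sizes; last n_out state units are outputs
W_in = 0.1 * rng.standard_normal((n_st, n_in))
W = 0.1 * rng.standard_normal((n_st, n_st))
W = (W + W.T) / 2                  # energy-based model: symmetric lateral weights
np.fill_diagonal(W, 0.0)

def relax(s, x, y=None, beta=0.0, steps=300, dt=0.1):
    """Leaky dynamics toward a fixed point; beta > 0 nudges outputs toward y."""
    for _ in range(steps):
        ds = -s + W_in @ x + W @ rho(s)
        if beta:
            ds[-n_out:] += beta * (y - s[-n_out:])
        s = s + dt * ds
    return s

x, y, beta, lr = rng.random(n_in), np.array([1.0, 0.0]), 0.5, 0.05

s0 = relax(np.zeros(n_st), x)              # phase 1: free phase (prediction)
s1 = relax(s0, x, y=y, beta=beta)          # phase 2: weakly clamped phase

# contrastive local update ~ (1/beta) * (Hebbian term at s1 - Hebbian term at s0)
dW = (lr / beta) * (np.outer(rho(s1), rho(s1)) - np.outer(rho(s0), rho(s0)))
np.fill_diagonal(dW, 0.0)
W += dW
```

Note that the second phase reuses exactly the same dynamics as the first; the only difference is the small nudging term on the output units, which is the point of the method.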
Project description:Among recent innovative technologies, the memristor (memory-resistor) has attracted researchers' attention as a fundamental computation element. It has been experimentally shown that memristive elements can emulate synaptic dynamics and are even capable of supporting spike-timing dependent plasticity (STDP), an important adaptation rule that is gaining particular interest because of its simplicity and biological plausibility. The overall goal of this work is to provide a novel (theoretical) analog computing platform based on memristor devices and recurrent neural networks that exploits the memristor device physics to implement two variations of the backpropagation algorithm: recurrent backpropagation and equilibrium propagation. In the first learning technique, the use of memristor-based synaptic weights permits the error signals to be propagated through the network by means of its nonlinear dynamics via an analog side network. This makes the processing non-digital and distinct from current procedures. However, the need for a side analog network to propagate error derivatives keeps this technique highly biologically implausible. To overcome this limitation, an alternative to the side network is proposed by introducing a learning technique used for energy-based models: equilibrium propagation. Experimental results show that both approaches significantly outperform conventional architectures used for pattern reconstruction. Furthermore, given how well the equilibrium propagation learning rule suits VLSI implementation, additional results on the classification of the MNIST dataset are reported here.

Project description:The brain processes information through multiple layers of neurons. This deep architecture is representationally powerful, but complicates learning because it is difficult to identify the responsible neurons when a mistake is made. In machine learning, the backpropagation algorithm assigns blame by multiplying error signals with all the synaptic weights on each neuron's axon and further downstream. However, this involves a precise, symmetric backward connectivity pattern, which is thought to be impossible in the brain. Here we demonstrate that this strong architectural constraint is not required for effective error propagation. We present a surprisingly simple mechanism that assigns blame by multiplying errors by even random synaptic weights. This mechanism can transmit teaching signals across multiple layers of neurons and performs as effectively as backpropagation on a variety of tasks. Our results help reopen questions about how the brain could use error signals and dispel long-held assumptions about algorithmic constraints on learning.
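The random-feedback mechanism described above can be illustrated in a few lines. The sketch below is a hedged toy version (assumed two-layer network, synthetic regression task, arbitrary learning rate and sizes): when assigning blame to hidden units, it multiplies the output error by a fixed random matrix B instead of the transpose of the forward weights.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy regression task: 10 inputs, 2 outputs, linear targets
X = rng.standard_normal((200, 10))
T = X @ rng.standard_normal((10, 2))

W1 = 0.1 * rng.standard_normal((10, 20))
W2 = 0.1 * rng.standard_normal((20, 2))
B = rng.standard_normal((2, 20))           # FIXED random feedback weights (not W2.T)

def loss():
    H = np.tanh(X @ W1)
    return np.mean((H @ W2 - T) ** 2)

lr, losses = 0.01, []
for _ in range(500):
    H = np.tanh(X @ W1)
    E = H @ W2 - T                         # output error
    dH = (E @ B) * (1 - H ** 2)            # blame via random B, not backprop's W2.T
    W2 -= lr * H.T @ E / len(X)
    W1 -= lr * X.T @ dH / len(X)
    losses.append(loss())
```

In exact backpropagation the hidden delta would be `(E @ W2.T) * (1 - H**2)`; the striking empirical finding is that the fixed random `B` suffices because the forward weights gradually align with it.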
Project description:As offshore reservoirs are developed in ever deeper and colder waters, gas hydrates are increasingly becoming a significant factor in the profitability of a reservoir, owing to flow disruptions, equipment damage, and safety hazards arising from hydrate plug formation. With low-dosage hydrate inhibitors such as kinetic inhibitors competing with traditional thermodynamic inhibitors such as methanol, accurate information on hydrate equilibrium conditions is required to determine the optimal hydrate control strategy. Existing thermodynamic models can prove inflexible with respect to parameter adjustment and the incorporation of new data. Developing a multivariate regression model capable of generalizing hydrate equilibria over a wide range of conditions, with results that compete with thermodynamic models, is therefore worthwhile. A multilayer perceptron neural network with three hidden layers has undergone supervised training by means of backpropagation to accurately predict uninhibited hydrate equilibrium pressure for a range of gas mixtures with nine input features, excluding hydrogen sulfide and electrolytes, from a dataset of 1209 equilibrium points, 670 of which are multicomponent gases, sampled in a rigorous data sampling campaign from existing experimental studies. Statistical significance of the results has been emphasized: models were validated using 10-fold cross-validation and holdout validation, facilitating hyperparameter optimization without overfitting, while stratified holdout ensures that a wide range of conditions is tested. The developed model outperforms two popular thermodynamic models. Various scoring metrics are used, with an average cross-validated R2 of 0.987 ± 0.003. An R2 of 0.993 and a mean absolute percentage error of 5.56% are obtained for holdout validation. Auxiliary models are included to determine the multicomponent prediction capability and the dependency on individual data sources.
Multicomponent data prediction has proven successful; the results indicate that the model accurately generalizes hydrate equilibria and is well suited to predicting unseen data. Positive results are largely insensitive to exact model parameters, indicating a robust, replicable methodology.
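The validation protocol described above can be illustrated independently of the specific network. The sketch below uses synthetic data and a closed-form ridge regression as a stand-in for the three-hidden-layer MLP (all sizes, the noise level, and the regularizer are illustrative assumptions, not the study's values); it shows how a 10-fold cross-validated R2 with its spread is computed.

```python
import numpy as np

rng = np.random.default_rng(2)

# synthetic stand-in for the hydrate dataset: 9 features -> equilibrium pressure
X = rng.standard_normal((1209, 9))
y = X @ rng.standard_normal(9) + 0.1 * rng.standard_normal(1209)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1 - ss_res / ss_tot

def kfold_r2(X, y, k=10):
    """k-fold cross-validation: each fold is held out once for testing."""
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # ridge closed form as a cheap stand-in for the trained MLP
        A = X[train].T @ X[train] + 1e-3 * np.eye(X.shape[1])
        w = np.linalg.solve(A, X[train].T @ y[train])
        scores.append(r2(y[test], X[test] @ w))
    return np.mean(scores), np.std(scores)

mean_r2, std_r2 = kfold_r2(X, y)
```

Reporting the mean and standard deviation across folds, as the study does (0.987 ± 0.003), is what guards the hyperparameter search against overfitting to a single split.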
Project description:Sensory systems constantly compare external sensory information with internally generated predictions. While neural hallmarks of prediction errors have been found throughout the brain, the circuit-level mechanisms that underlie their computation are still largely unknown. Here, we show that a well-orchestrated interplay of three interneuron types shapes the development and refinement of negative prediction-error neurons in a computational model of mouse primary visual cortex. By balancing excitation and inhibition in multiple pathways, experience-dependent inhibitory plasticity can generate different variants of prediction-error circuits, which can be distinguished by simulated optogenetic experiments. The experience-dependence of the model circuit is consistent with that of negative prediction-error circuits in layer 2/3 of mouse primary visual cortex. Our model makes a range of testable predictions that may shed light on the circuitry underlying the neural computation of prediction errors.
Project description:Cine Phase Contrast (CPC) MRI offers unique insight into localized skeletal muscle behavior by providing the ability to quantify muscle strain distribution during cyclic motion. Muscle strain is obtained by temporally integrating and spatially differentiating CPC-encoded velocity. The aim of this study was to quantify CPC measurement accuracy and precision and to describe error propagation into displacement and strain. Using an MRI-compatible jig to move a B-gel phantom within a 1.5 T MRI bore, CPC-encoded velocities were collected. The three orthogonal encoding gradients (through-plane, frequency, and phase) were evaluated independently in post-processing. Two systematic error types were corrected: eddy current-induced bias and calibration-type error. Measurement accuracy and precision were quantified before and after removal of systematic error. Through-plane- and frequency-encoded data accuracy was within 0.4 mm/s after removal of systematic error, a 70% improvement over the raw data. Corrected phase-encoded data accuracy was within 1.3 mm/s. Measured random error was between 1 and 1.4 mm/s, which followed the theoretical prediction. Propagation of random measurement error into displacement and strain was found to depend on the number of tracked time segments, time segment duration, mesh size, and dimensional order. To verify this, theoretical predictions were compared to experimentally calculated displacement and strain error. For the parameters tested, experimental and theoretical results aligned well. Random strain error approximately halved with a two-fold mesh size increase, as predicted. Displacement and strain accuracy were within 2.6 mm and 3.3%, respectively. These results can be used to predict the accuracy and precision of displacement and strain in user-specific applications.
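The reported scaling of random strain error with mesh size can be reproduced with a small Monte Carlo sketch. The values below (velocity noise, time-segment duration, mesh sizes, trial count) are illustrative assumptions, not the study's parameters; the point is only that integrating independent velocity errors over time and then spatially differencing them makes random strain noise inversely proportional to mesh size, so doubling the mesh roughly halves it.

```python
import numpy as np

rng = np.random.default_rng(3)

sigma_v = 1.2e-3      # assumed random velocity error, m/s (order of the ~1.2 mm/s reported)
dt = 0.02             # assumed time-segment duration, s

def strain_noise(mesh_mm, n_seg, trials=20000):
    """Monte Carlo std of random strain error for a given mesh size."""
    # displacement error at two mesh nodes: integral of n_seg independent velocity errors
    disp = np.sum(rng.normal(0.0, sigma_v * dt, (trials, n_seg, 2)), axis=1)
    # strain error: finite difference of the two displacement errors over the mesh size
    return np.std((disp[:, 0] - disp[:, 1]) / (mesh_mm * 1e-3))

s_fine = strain_noise(mesh_mm=2.0, n_seg=10)
s_coarse = strain_noise(mesh_mm=4.0, n_seg=10)   # two-fold mesh size increase
```

Because displacement error grows with the square root of the number of integrated time segments while the spatial difference is divided by the mesh size, the ratio `s_fine / s_coarse` comes out close to 2, matching the halving the study observed.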
Project description:A visual world experiment examined the time course of pragmatic inferences derived from visual context and contrastive intonation contours. We used the construction It looks like an X pronounced with either (a) an H* pitch accent on the final noun and a low boundary tone, or (b) a contrastive L+H* pitch accent and a rising boundary tone, a contour that can support contrastive inference (e.g., It LOOKS(L+H*) like a zebra(L-H%)… (but it is not)). When the visual display contained a single related set of contrasting pictures (e.g. a zebra vs. a zebra-like animal), effects of LOOKS(L+H*) emerged prior to the processing of phonemic information from the target noun. The results indicate that prosodic processing is incremental and guided by contextually supported expectations. Additional analyses ruled out explanations based on context-independent heuristics that might substitute for online computation of contrast.
Project description:Neural networks are currently implemented on digital Von Neumann machines, which do not fully leverage their intrinsic parallelism. We demonstrate how to use a novel class of reconfigurable dynamical systems for analogue information processing, mitigating this problem. Our generic hardware platform for dynamic, analogue computing consists of a reciprocal linear dynamical system with nonlinear feedback. Thanks to reciprocity, a ubiquitous property of many physical phenomena such as the propagation of light and sound, error backpropagation, a crucial step for tuning such systems toward a specific task, can happen in hardware. This can potentially speed up the optimization process significantly, offering important benefits for the scalability of neuro-inspired hardware. In this paper, we show, using one experimentally validated and one conceptual example, that such systems may provide a straightforward mechanism for constructing highly scalable, fully dynamical analogue computers.
Project description:Pig heart lactate dehydrogenase was studied in the direction of pyruvate and NADH formation by recording rapid changes in extinction, proton concentration, nucleotide fluorescence and protein fluorescence. Experiments measuring extinction changes show that there is a very rapid formation of NADH within the first millisecond and that the amplitude of this phase (phase 1) increases threefold over the pH range 6-8. A second transient rate (phase 2) can also be distinguished (whose rate is pH-dependent), followed by a steady-state rate (phase 3) of NADH production. The sum of the amplitudes of the first two phases corresponds to 1 mol of NADH produced/mol of active sites of lactate dehydrogenase. Experiments that measured the liberation of protons by using Phenol Red as an indicator show that no proton release occurs during the initial very rapid formation of NADH (phase 1), but protons are released during subsequent phases of NADH production. Fluorescence experiments help to characterize these phases, and show that the very rapid phase 1 corresponds to the establishment of an equilibrium, E(NAD)(Lactate) ⇌ H(+)E(NADH)(Pyruvate). This equilibrium can be altered by changing lactate concentration or pH, and the H(+)E(NADH)(Pyruvate) species formed has very low nucleotide fluorescence and quenched protein fluorescence. Phase 2 corresponds to the dissociation of pyruvate and a proton from the complex with a rate constant of 1150 s⁻¹. The observed rate constant is slower than this and is proportional to the position of the preceding equilibrium. The E(NADH) formed has high nucleotide fluorescence and quenched protein fluorescence. The reaction, which is rate-limiting during steady-state turnover, must then follow this step and be involved with dissociation of NADH from the enzyme or some conformational change immediately preceding dissociation.
Several inhibitory complexes have also been studied including E(NAD+) (Oxamate) and E(NADH) (Oxamate') and the abortive ternary complex E(NADH) (Lactate). The rate of NADH dissociation from the enzyme was measured and found to be the same whether measured by ligand displacement or by relaxation experiments. These results are discussed in relation to the overall mechanism of lactate dehydrogenase turnover and the independence of the four binding sites in the active tetramer.
Project description:In the recent decade, deep eutectic solvents (DESs) have occupied a strategic place in green chemistry research. This paper discusses the application of DESs as functionalization agents for multi-walled carbon nanotubes (CNTs) to produce novel adsorbents for the removal of 2,4-dichlorophenol (2,4-DCP) from aqueous solution. It also focuses on the application of the feedforward backpropagation neural network (FBPNN) technique to predict the adsorption capacity of DES-functionalized CNTs. The optimum adsorption conditions required for maximum removal of 2,4-DCP were determined by studying the impact of the operational parameters (i.e., the solution pH, adsorbent dosage, and contact time) on the adsorption capacity of the produced adsorbents. Two kinetic models were applied to describe the adsorption rate and mechanism. Based on the correlation coefficient (R2) value, the adsorption kinetic data were well described by the pseudo-second-order model. The precision and efficiency of the FBPNN model were confirmed by calculating four statistical indicators, with the smallest mean square error being 5.01 × 10⁻⁵. Moreover, further accuracy checking was implemented through a sensitivity study of the experimental parameters. The competence of the model for the prediction of 2,4-DCP removal was confirmed with an R2 of 0.99.
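The pseudo-second-order kinetic model mentioned above has the linearized form t/qt = 1/(k2·qe²) + t/qe, so its two parameters can be recovered from a straight-line fit of t/qt against t. The sketch below generates synthetic uptake data with assumed k2 and qe values (not the paper's) and recovers them from the fit.

```python
import numpy as np

# assumed parameters for synthetic uptake data (illustrative, not from the study)
k2, qe = 0.05, 80.0                       # rate constant and equilibrium capacity
t = np.linspace(1, 120, 30)               # contact times
qt = (k2 * qe**2 * t) / (1 + k2 * qe * t) # pseudo-second-order uptake curve

# linearized form: t/qt = 1/(k2*qe^2) + t/qe  ->  slope = 1/qe, intercept = 1/(k2*qe^2)
slope, intercept = np.polyfit(t, t / qt, 1)
qe_fit = 1.0 / slope
k2_fit = 1.0 / (intercept * qe_fit**2)
```

On real kinetic data the quality of this straight-line fit (its R2) is exactly what the study uses to conclude that the pseudo-second-order model describes the adsorption best.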
Project description:Physical reservoir computing approaches have gained increased attention in recent years due to their potential for low-energy high-performance computing. Despite recent successes, there are bounds to what one can achieve simply by making physical reservoirs larger. Therefore, we argue that a switch from single-reservoir computing to multi-reservoir and even deep physical reservoir computing is desirable. Given that error backpropagation cannot be used directly to train a large class of multi-reservoir systems, we propose an alternative framework that combines the power of backpropagation with the speed and simplicity of classic training algorithms. In this work we report our findings on a conducted experiment to evaluate the general feasibility of our approach. We train a network of 3 Echo State Networks to perform the well-known NARMA-10 task, where we use intermediate targets derived through backpropagation. Our results indicate that our proposed method is well-suited to train multi-reservoir systems in an efficient way.
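The idea of training a chain of reservoirs against intermediate targets can be sketched as follows. This is a heavily simplified illustration: the intermediate target is produced here by a plain linear readout on the first reservoir rather than by backpropagation through a trained surrogate model as in the paper, and the sizes, seeds, and toy task are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

def esn(u, n=50, seed=0):
    """Minimal echo state network: state sequence for a 1-D input signal u."""
    r = np.random.default_rng(seed)
    W_in = r.uniform(-0.5, 0.5, n)
    W = r.standard_normal((n, n))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1
    x, X = np.zeros(n), []
    for ut in u:
        x = np.tanh(W @ x + W_in * ut)
        X.append(x.copy())
    return np.array(X)

def ridge(X, Y, lam=1e-6):
    """Classic reservoir training: regularized linear readout."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

# toy temporal task for a chain of two reservoirs
u = rng.uniform(-0.5, 0.5, 500)
y = np.convolve(u, np.ones(5) / 5, mode="same")       # 5-tap moving average target

X1 = esn(u, seed=1)
# intermediate target z for layer 1; in the paper this is derived via backpropagation
# through a differentiable model, here a linear readout stands in for that step
z = X1 @ ridge(X1, y[:, None])
X2 = esn(z[:, 0], seed=2)
W_out = ridge(X2, y[:, None])                          # classic training of the top readout
mse = np.mean((X2 @ W_out - y[:, None]) ** 2)
```

The appeal of the scheme is visible even in this toy: once intermediate targets exist, every reservoir's readout is trained with the fast closed-form ridge solve, and backpropagation is never run through the physical (here simulated) reservoirs themselves.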