MIT and Harvard Researchers Explore the Possibility of Certain Decision Making in Neural Networks
Evidential distribution allows the neural networks to crunch the data and display confidence in the data’s quality.
With the invasion of artificial intelligence across the industry, deep learning neural networks are extensively applied by organizations to make informed decisions. First proposed by Warren McCullough and Walter Pitts, two researchers at the University of Chicago, neural networks enable learning of the system to perform some task by analyzing the training examples.
Due to their robust behavior of analyzing patterns in complex datasets, neural networks are heavily deployed across sectors that demand informed decisions. The Healthcare sector and the automotive industry have widely deployed neural networks to diagnose diseases and the safety of autonomous vehicles, respectively. So far, the results have been positive, but some instances reveal that decisions directed by the neural network cannot be trusted completely. This indicates one of the most frequent limitations encountered in deep learning neural networks. To remedy this limitation, researchers at MIT University explore the possibility of learning uncertainty in the neural behaviours through an approach where the neural networks can crunch the data and display confidence in the data’s quality.
The study titled “Deep Evidential Regressional” is a collaborative work by MIT’s Artificial Intelligence Lab and Harvard Graduate Programme in Biophysics and will be presented next month at the NeurIPS conference.
Alexander Amini, who is the principal researcher of the approach, states, “We need the ability to not only have high-performance models but also to understand when we cannot trust those models,”
Researchers proposed a method for training non-Bayesian neural networks so that a continuous target and its associated evidence for aleatoric and epistemic uncertainty can be estimated. Aleatoric uncertainty implies uncertainty in the data, whereas epistemic uncertainty means uncertainty in the prediction. The network designed by the researchers contributes to distributional capturing of the evidence for certain decision making.
Researchers imposed priors (initial beliefs about an event in terms of the probability distribution) during the training of neural networks so that the model can get regularized when its predicted evidence is not aligned with the correct output. Since the researchers have not applied sampling of neural networks during reasoning or on out-of-distribution (OOD) examples training, the model is capable of efficient and scalable uncertainty learning. The study cited that training a neural network to output the hyperparameters of the higher-order evidential distribution will aid in learning about epistemic and aleatoric uncertainty without the need for sampling
Out of Distribution testing
Research states that uncertainty estimation is required to understand whether the model has received out of distribution (OOD) datasets or when the model’s output cannot be trusted.
Researchers examined models’ ability to capture increased epistemic uncertainty on OOD data by testing the images from ApolloScape, an OOD dataset of diverse outdoor driving. The results show that evidential distributions without training on OOD data, capture increased uncertainty on OOD input data and model’s final decision on standard with epistemic uncertainty estimation baseline.
Researchers conclude that evidential distribution is widely applicable across regression tasks including temporal forecasting, property prediction, and control learning. Uncertainty estimation for neural networks can be deployed in safety-critical domains, such as autonomous vehicle control, medical diagnosis, or in settings with large dataset imbalances and bias such as crime forecasting and facial recognition.
Professor Daniela Rus at the MIT Computer Science and Artificial Intelligence Lab says “This idea is important and applicable broadly. It can be used to assess products that rely on learned models. By estimating the uncertainty of a learned model, we also learn how much error to expect from the model, and what missing data could improve the model.”