Explaining how any of us arrived at a particular conclusion or decision, verbally detailing the variables, weights, and conditions our brains navigate to reach an answer, is complex enough. But what happens when we ask a neural network, itself based on the loose architecture of our own brains, how it arrived at a particular conclusion?
In neural network and artificial intelligence research, the difficulty of getting a system to provide a suitable explanation for how it arrived at an answer is referred to as “the black box problem.” Humans face a version of it too: fully explaining the complex sets of values we mentally burn through to reach a conclusion could take hours to do thoroughly. For us, this is an inconvenience, but for the future adoption of neural networks it creates a tricky problem beyond natural language processing, image recognition, and other consumer-sided services. In research, medicine, engineering, and elsewhere, there needs to be a trace between the input and the output; a validation beyond the fact that a “smarter” system has provided an answer.
For areas where the practical applications of neural networks are clear, the problem of backtracking the answer can be a deal-killer. For instance, if researchers in chemistry use a neural network to determine the relative predictive benefits of a particular compound, it will be necessary to explain how the net came to its conclusions. So far, doing that is next to impossible; at best, it is a time-consuming task that still cannot provide a detailed breakdown fit for human comprehension. There are other shortcomings at this point, too, including the fact that neural nets are great at classifying but not as good as one might hope for mission-critical, finite decision-making. Even with added capabilities there, without a traceable route to how a decision was made, adoption will stagnate in some key areas.
So never mind, for a moment at least, that we are building systems that can arrive at complex answers and spin off their own questions, yet never detail how they evaluated so many weights and measures. Consider instead the value of an approach that could extract that decision-making process, isolate it, and explain it. According to a group of researchers in China who are working on this “black box” problem, “The availability of a system that would provide an explanation of the input and output mappings of a neural network in the form of rules would be useful and rule extraction is one system that tries to elucidate to the user how the neural network arrived at its decision in the form of if/then rules.”
The group from Chongqing University has extended the capabilities of the rule-extracting TREPAN algorithm to pluck the decision trees from neural networks and refine them down to their constituent parts. This is no simple process, considering the scope of large neural nets and the fact that each of the distinct variables carries many more coupled interactions. At their most basic, neural networks create their own example-based rules as they go; isolating those rules and their companion effects is the difficult part.
“Rule extraction attempts to translate this numerically stored knowledge into a symbolic form that can be readily comprehended,” explains Awudu Karim, one of the lead researchers behind the X-TREPAN effort. There have been other efforts at rule extraction, but they required special neural network architectures or training paradigms and cannot be applied to in situ neural nets, he explains. By viewing the extraction of these rules as its own learning task, X-TREPAN can sidestep the complexity of examining all the individual weights and instead “approximate the neural network by learning its input-output mappings. Decision trees are this graphical representation and this combination of symbolic information and graphical representation makes decision trees one of the most comprehensible representations of pattern recognition knowledge.”
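The general idea behind this family of techniques can be sketched in a few lines: train a decision tree on the network's own predictions rather than the ground-truth labels, then read the tree off as nested if/then rules. The sketch below uses scikit-learn's generic MLP and decision tree classes as stand-ins; it illustrates the surrogate-tree concept, not the X-TREPAN algorithm itself, and the dataset, network size, and tree depth are arbitrary assumptions chosen for brevity.

```python
# Minimal sketch of TREPAN-style surrogate rule extraction (not X-TREPAN
# itself): approximate a trained network by learning its input-output
# mappings with a decision tree, then print the tree as if/then rules.
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# The "black box": a small feed-forward network trained on the true labels.
net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
net.fit(X, y)

# The surrogate is fit to the network's outputs, not the ground truth, so
# the tree approximates the net's decision function rather than the data.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, net.predict(X))

# Fidelity: how often the tree agrees with the network on the same inputs.
fidelity = (surrogate.predict(X) == net.predict(X)).mean()

# The tree prints as nested if/then rules over the input features.
print(export_text(surrogate))
print(f"fidelity to the network: {fidelity:.2f}")
```

Fidelity, rather than accuracy against the true labels, is the natural yardstick here: the surrogate's job is to mimic the network, and the if/then rules are only as trustworthy as that agreement.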
While the researchers were able to show significant improvement over the native algorithm in a few representative benchmarks, they note that growing the problem with more nodes does, as one might easily imagine, continue to compound the black box problem, even with a reasonable improvement in decision tree extraction. This comes at a time when the neural networks we know about at companies like Google, Facebook, Baidu, Microsoft, Yahoo, and elsewhere are growing at a staggering pace. For things like image or speech recognition, this creates internal challenges that center on user experience. But for neural networks to deliver on their real promise for the future of medicine, chemistry, nuclear physics, and elsewhere, it is difficult to imagine how scientists and researchers can live without either reproducible results (as with supercomputer simulations, which are already limited by scale and its associated problems of power, data movement, and so on) or at least an understandable map to explain those results.
For now, it’s safe to say that neural networks are finding their development hub in areas where fuzzier approaches are reasonably acceptable. But until the black box is truly opened, and a standard set of systems and tools exists for backtracking to an answer (from a system that has generated its own rules and questions), neural networks will be limited to research and consumer-driven services. That’s not to say the day won’t come, so we will continue tracking how neural networks can reach the point of reproducibility, or at least traceability.