Morgan Vawter, Chief Analytics Director, Caterpillar Inc
Prescriptive analytics driven by natural language processing and graph architecture.
Any leader following technical trends is aware that the world is abuzz with the latest wave of techno-hyperbole. Big data and prescriptive analytics are being overshadowed as artificial intelligence (AI) enters the mainstream conversation. While enthusiasm for what is next is certainly a necessary element of technology leadership, this enthusiasm could use some tempering. Otherwise, we may experience AI’s Little Ice Age or (less creatively) AI Winter 2.0.
This is not to say that AI technology has no merit or will not enjoy success and the value generation that comes with it. It just means that expectations should be in line with reality. Organizations should not expect short-term investments in hardware and software to deliver on the promise of AI. They should expect to invest in people and to develop new approaches to challenges in their domain over years. These new initiatives require a shift in thinking and new data models.
As an example, our organization set out to apply machine learning to a large repository of technical documents written by service technicians while diagnosing issues and repairing large industrial machines. “The data was especially good by the standards of the computational linguist,” notes Ryan Chandler, Senior Data Scientist at Caterpillar, “in that the text was short, 3-6 sentences per transaction. Many of these sentences had labels. The technician might write ‘Customer complaint: engine knocking’, ‘Analysis: performed oil test’, ‘Result: iron found in oil’, ‘Correction: replaced broken rocker arm, oil filter and oil’. There was a lot of data, nearly 27 million sentence fragments, and the text was describing a small domain of interest, namely the diagnosis and repair of equipment. This type of large-scale, labeled, domain-specific collection of short utterances does not come along every day.”
In many ways, this collection provided the ideal data source upon which to perform computational linguistics.
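To make the shape of this data concrete, here is a minimal Python sketch of how one transaction's labeled fragments might be split into a structured record. The four labels come from the technician example above; the function name, the record layout and the "Unlabeled" bucket are illustrative assumptions, not a description of the system we actually built.

```python
# Labels taken from the technician example in the text.
LABELS = ("Customer complaint", "Analysis", "Result", "Correction")


def parse_fragments(fragments):
    """Split 'Label: text' fragments into a dict keyed by label.

    Fragments without a recognized label are kept under 'Unlabeled'
    (a hypothetical bucket, since only many, not all, sentences had labels).
    """
    record = {}
    for fragment in fragments:
        label, separator, text = fragment.partition(":")
        label = label.strip()
        if separator and label in LABELS:
            record[label] = text.strip()
        else:
            record.setdefault("Unlabeled", []).append(fragment.strip())
    return record


transaction = [
    "Customer complaint: engine knocking",
    "Analysis: performed oil test",
    "Result: iron found in oil",
    "Correction: replaced broken rocker arm, oil filter and oil",
]
record = parse_fragments(transaction)
print(record["Result"])
```

Even this trivial structure hints at why labeled fragments are so useful: each record already pairs a complaint with the analysis, result and correction that followed it.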
Just as importantly, there was a clear sense that this resource could be valuable in guiding the future diagnosis of equipment. That is, we could take action based on this knowledge. Over six years, we developed increasingly sophisticated methods of analyzing this data: from word frequency distributions (the descriptive statistics of language), through machine learning classification and statistical parsing, to the holy grail of natural language processing, natural language understanding.
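The first rung of that ladder, a word frequency distribution, can be sketched in a few lines of Python. The fragments below reuse the technician example quoted earlier; the simple lowercase, letters-only tokenization is an assumption chosen for brevity, not the method we used in practice.

```python
import re
from collections import Counter


def word_frequencies(fragments):
    """Count how often each lowercase word appears across all fragments."""
    counts = Counter()
    for fragment in fragments:
        # Naive tokenization: runs of letters only (an illustrative choice).
        counts.update(re.findall(r"[a-z]+", fragment.lower()))
    return counts


fragments = [
    "engine knocking",
    "performed oil test",
    "iron found in oil",
    "replaced broken rocker arm, oil filter and oil",
]
counts = word_frequencies(fragments)
print(counts.most_common(3))
```

Counts like these describe the language of a domain but say nothing about meaning, which is why the later, more sophisticated steps were needed.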
What it means for a computer to understand text is a topic for debate. However, a couple of measures are meaningful. If, upon being presented with a new utterance, the computer can break the sentence down into a structure indicating the correct relationships between actors and objects (parsing), and then tie each word to its proper dictionary definition (word sense disambiguation), it has comprehended the sentence correctly. Two sentences with the same content, “Priya’s shirt is blue” and “Priya has a blue shirt”, should parse and disambiguate to the same logical meaning. Half of the problem of natural language processing is mapping the sentence’s surface structure to its logical structure. The state of the art in this respect is acceptable: perhaps 70 to 85 percent of sentences can be mapped correctly in some domains. This does not mean, however, that the computer can now act on this knowledge in any way.
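The idea of two surface structures sharing one logical structure can be shown with a deliberately tiny Python sketch. It uses two hand-written regular expressions in place of a real statistical parser and skips word sense disambiguation entirely, so it is an illustration of the goal rather than of the technique; the `Fact` type and the pattern list are hypothetical names introduced here.

```python
import re
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """A toy logical form: who owns what, and one attribute of it."""
    owner: str
    item: str
    attribute: str


# Two hand-written surface patterns standing in for a real parser.
PATTERNS = [
    # "Priya's shirt is blue" (straight or curly apostrophe)
    re.compile(r"^(?P<owner>\w+)['’]s (?P<item>\w+) is (?P<attribute>\w+)$"),
    # "Priya has a blue shirt"
    re.compile(r"^(?P<owner>\w+) has an? (?P<attribute>\w+) (?P<item>\w+)$"),
]


def to_logical_form(sentence: str) -> Optional[Fact]:
    """Map a sentence to a Fact, or None if no pattern applies."""
    text = sentence.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            fields = {key: value.lower() for key, value in match.groupdict().items()}
            return Fact(**fields)
    return None


first = to_logical_form("Priya’s shirt is blue")
second = to_logical_form("Priya has a blue shirt")
print(first == second)
```

Both sentences reduce to the same `Fact`, which is the property a real parser plus disambiguation step is meant to deliver across far more varied language.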
It’s the other half of the equation that is the real issue. How do we represent this knowledge and leverage it in a meaningful way? What architecture does one employ to integrate and contextualize knowledge and then reason from it? A natural question is, “How do we store knowledge in our brain?” If we knew how the brain worked at a hardware level, perhaps the question would be closed; however, we don’t. What we do know is that the brain appears to be a highly interconnected network. Alfred Korzybski, the man who coined the phrase “the map is not the territory”, went on to say, “If words are not things, or maps are not the actual territory, then, obviously, the only possible link between the objective world and the linguistic world is found in structure, and structure alone.”
Taking this aphorism literally, we employ a graph database (Neo4j) to embody this logical form of knowledge. This NoSQL alternative to traditional relational databases allows us to build ontologies (shared structural conceptualizations of real-world phenomena) and perform deduction (if the engine was removed, so was the piston, because it is a subpart). Beyond that, we can connect cause and effect and further set expectations (semantic frames) through which we can induce answers to the more important questions, such as why a technician is taking a specific course of action. This structure, which represents the experience of thousands of individuals over years, can now be integrated and leveraged for value by traversing the graph. Today we can read documents at scale and perform natural language understanding in specific areas of concern. This allows us to answer questions like, “Are we seeing a trend in repair issues and, if so, do they have the same or related causes?” Then we can prescribe action (“If an engine is knocking, what is the right test to perform?”) and predict (“What might be the problem?”). These types of solutions are the furthest thing from off-the-shelf AI. They embody the mind of the organization and its domain knowledge and, therefore, they are the product of the painstaking translation from (wo)man to machine.
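The kind of reasoning described above can be sketched without a database at all. The Python example below uses plain dictionaries and lists rather than Neo4j, and every name in it (the part hierarchy, the relation labels `SUGGESTS_TEST`, `CAN_REVEAL` and `INDICATES`, and the function names) is a hypothetical stand-in built from the engine-knocking example earlier in this article, not our production ontology.

```python
from collections import deque

# Toy ontology: each part points to the assembly that contains it.
PART_OF = {
    "piston": "engine",
    "rocker arm": "engine",
    "oil filter": "engine",
    "engine": "machine",
}

# Toy cause-and-effect edges: (source, relation, target).
EDGES = [
    ("engine knocking", "SUGGESTS_TEST", "oil test"),
    ("oil test", "CAN_REVEAL", "iron found in oil"),
    ("iron found in oil", "INDICATES", "broken rocker arm"),
]


def subparts(component):
    """Deduce every part contained, directly or indirectly, in component."""
    children = {}
    for part, whole in PART_OF.items():
        children.setdefault(whole, []).append(part)
    found, queue = set(), deque([component])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


def follow(node, relation):
    """Return the targets reachable from node over one edge of this relation."""
    return [dst for src, rel, dst in EDGES if src == node and rel == relation]


def prescribe_and_predict(symptom):
    """Traverse symptom -> test -> finding -> cause."""
    tests = follow(symptom, "SUGGESTS_TEST")
    causes = []
    for test in tests:
        for finding in follow(test, "CAN_REVEAL"):
            causes.extend(follow(finding, "INDICATES"))
    return tests, causes


removed = subparts("engine")
tests, causes = prescribe_and_predict("engine knocking")
print(sorted(removed), tests, causes)
```

In a graph database the same questions would be expressed as queries over stored nodes and relationships; the point of the sketch is only that deduction, prescription and prediction all reduce to traversals of one connected structure.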