By Rodika Tollefson, Contributor
If you’ve asked your smart speaker to check the weather or play your favorite tunes, then you’re already familiar with natural language processing (NLP).
This subfield of artificial intelligence (AI) helps humans interact more naturally with computerized systems. Think Siri, Alexa, Google Assistant, or that friendly chatbot that greets you when you visit an online shop or a B2B website (like Domino’s pizza-ordering Dom or HubSpot’s sales assistant HubBot).
There’s much more to this technology, however, than helping you find a nearby pizzeria, turning off your lights, or gathering information to make an online clothing return. Natural language processing enables machines to read and understand human language, synthesize data, and derive meaning.
While Westworld-level science fiction may be decades away—and even truly conversational AI will likely not arrive for another decade—organizations across industries have been adopting practical applications, as well as tackling breakthrough ones lurking around the corner.
Just ask Liam Kaufman, co-founder and CEO of Winterlight Labs. His 5-year-old startup, based in Toronto, Canada, is commercializing proprietary AI technology that detects and monitors dementia from a person’s voice. The technology creates disease-specific biomarkers based on 500 variables it extracts from speech and language. Among other things, the algorithm identifies long pauses, repetition, and heavy use of pronouns, all cues that could indicate cognitive impairment.
Before the company’s launch, Kaufman administered the traditional pen-and-paper dementia assessments, which he found subjective. (For example, the test administrator for the Alzheimer’s Disease Assessment Scale-Cognitive Subscale—or ADAS-Cog—would need to rate the patient’s “word-finding difficulties” on a scale of 1 to 5 during an open-ended conversation.)
“Generally speaking, neurology and psychiatry lack tools to objectively measure behavior and cognition,” Kaufman says. “There’s a huge need.”
“People with Alzheimer’s have word-finding difficulties, and we can use natural language processing to quantify those difficulties.”
—Liam Kaufman, co-founder and CEO, Winterlight Labs
The demand is growing, too. In the United States alone, one in 10 Americans age 65 and older—an estimated 5.8 million people—lives with Alzheimer’s (the most common cause of dementia), according to the Alzheimer’s Association. That number is expected to escalate quickly as younger baby boomers reach age 65.
“People with Alzheimer’s have word-finding difficulties, and we can use natural language processing to quantify those difficulties,” Kaufman says.
Word-finding difficulties include using pronouns instead of nouns (“she” vs. “child”), giving a description of the word’s meaning instead of using the word, and pausing or hesitating when answering a question. Difficulty with language is a common symptom of early Alzheimer’s, though it can also be caused by other conditions.
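Two of the cues described above, pausing and pronoun-heavy speech, are simple to quantify in principle. The sketch below is purely illustrative and is not Winterlight Labs’ proprietary pipeline (which extracts some 500 variables from audio and language); it assumes a transcript in which pauses are marked as “...”, and uses a made-up pronoun list.

```python
import re

# Illustrative only: a tiny pronoun list; a real system would use a
# part-of-speech tagger rather than a hand-picked set of words.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "this", "that"}

def word_finding_cues(transcript: str) -> dict:
    """Count marked pauses and measure how pronoun-heavy the speech is."""
    pauses = transcript.count("...")
    words = re.findall(r"[a-z']+", transcript.lower())
    pronoun_ratio = sum(w in PRONOUNS for w in words) / max(len(words), 1)
    return {"pauses": pauses, "pronoun_ratio": round(pronoun_ratio, 2)}

# "She took it to ... the place where ... they keep books" shows both cues:
# pauses mid-sentence, and pronouns standing in for concrete nouns.
cues = word_finding_cues("She took it to ... the place where ... they keep books")
```

A clinical-grade system would combine hundreds of such features with trained models and validated thresholds; the point here is only that speech patterns can be turned into numbers a machine can track over time.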
The Winterlight Labs software analyzes human speech, including word choice, grammar, and syntax. It also works for monitoring how well patients respond to treatment and for assessing psychiatric conditions such as depression.
About half a dozen pharmaceutical companies in the U.S. and Europe are already using the technology. Kaufman expects more to follow suit by the end of 2020, including companies in other countries, such as Japan.
Solving the Problem of Unstructured Data
Structured data—data in relational databases such as inventory systems or travel-reservation platforms—is highly organized and formatted, making it easy to search and analyze. Digitization, however, has created vast amounts of unstructured data, which doesn’t fit into a predefined format.
Unstructured data is anything from plain text and images to scientific documents and sensor telemetry. Every application, whether it’s email, collaboration software, or an enterprise resource planning platform, produces this data that’s difficult to deconstruct.
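The contrast between the two kinds of data can be shown in a few lines. In this hypothetical example, the same fact lives in a structured record, which a machine can query directly, and in a free-text note, which must be parsed before it is useful.

```python
# Structured: a record with named fields, ready to query or aggregate.
structured = {"patient_id": 104, "diagnosis": "dementia", "visit_year": 2020}

# Unstructured: the same information buried in free text.
unstructured = "Patient 104 came in again; memory seems worse than last year."

# The structured record answers a question directly...
diagnosis = structured["diagnosis"]

# ...while the note requires language processing; even this naive keyword
# check is already a (very crude) form of text analysis.
has_memory_concern = "memory" in unstructured.lower()
```

Real NLP systems replace that naive keyword check with models that handle synonyms, negation, and context, but the underlying problem is the same: extracting queryable facts from text.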
Within an organization, each person could generate thousands of words per day. A 2018 study by Igneous found that data-centric organizations hold massive amounts of unstructured data: 59 percent of the IT leaders surveyed said their organizations managed more than 10 billion files.
Imagine a physician who has to sift through scores of clinical notes or a human resources recruiter scanning thousands of candidate applications for a single job. For humans, these tasks are time-consuming, error-prone, and sometimes insurmountable. That’s where natural language processing comes in: it automates tasks that require understanding human language.
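The recruiter’s task hints at what even simple automation looks like. The toy screen below (an assumption for illustration, not any vendor’s product) matches free-text applications against a required-skill list; production systems use far richer models, but the shape of the automation is the same.

```python
# Hypothetical job requirements for the illustration.
REQUIRED_SKILLS = {"python", "sql", "machine learning"}

def matched_skills(application_text: str) -> set:
    """Return which required skills appear in a free-text application."""
    text = application_text.lower()
    return {skill for skill in REQUIRED_SKILLS if skill in text}

apps = [
    "Five years of Python and SQL experience building data pipelines.",
    "Background in graphic design and branding.",
]

# Shortlist anyone matching at least two required skills.
shortlist = [a for a in apps if len(matched_skills(a)) >= 2]
```

Scaled to thousands of applications, this kind of first-pass filtering is exactly the repetitive reading that NLP takes off human hands, leaving people to judge the shortlist.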
Josh Miramant, CEO of data science company Blue Orange in New York City, uses compliance as an example. Global organizations do business in a regulatory environment that has multiple compliance agencies across the world and non-standardized documents in different languages.
“To scale out a human team and unify across multiple compliance agencies, the task becomes untenable because of the breadth and the size of documentation,” says Miramant, whose company focuses on improving business operations through machine learning and NLP.
The technology, however, is not without challenges.
It’s hard for humans to teach a machine, says Sameer Maskey, CEO and founder of Fusemachines, a machine learning company that offers AI education programs, as well as talent and consulting, also headquartered in New York City.
“Language is hard, and we, as a research community, still don’t fully understand how to make machines understand language…Humans are able to generalize, understand ambiguities and construct utterances very well, but machines still have a hard time.”
—Sameer Maskey, CEO and founder, Fusemachines
“Language is hard, and we, as a research community, still don’t fully understand how to make machines understand language,” says Maskey, who also teaches courses about NLP at Columbia University.
Consider a child who learns how to speak a language in two to three years—a computer may have hundreds of years’ worth of data and still not be fluent or understand as much as a 3-year-old does.
“Humans are able to generalize, understand ambiguities and construct utterances very well, but machines still have a hard time,” Maskey explains.
Current and Future Adoption Across Industries
Fusemachines’ educational platform has an AI tutor that acts as a teacher’s assistant. As students learn AI concepts, they teach them back to the AI tutor and answer its questions to demonstrate knowledge.
“It doesn’t replace the human teacher, but it would lead toward a point where AI can provide significant help to students to learn new topics,” Maskey says.
These kinds of examples—of AI enhancing rather than replacing human abilities—exist across many industries; for instance:
– Airlines are integrating NLP and other AI technologies into aircraft predictive-maintenance processes. This shifts the technical maintenance team’s focus to validating data rather than aggregating and analyzing it.
– Mortality from sepsis—a deadly infection that kills 270,000 Americans every year—increases by up to 8 percent for every hour diagnosis is delayed. Healthcare researchers have found that NLP can improve early prediction of septic shock.
– Retailers are using NLP and text-mining to analyze customer sentiment in order to improve brand loyalty and optimize marketing and sales.
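The retail sentiment example in the list above can be sketched in its most basic form: a lexicon-based scorer that counts positive and negative words. The word lists here are made up for illustration; deployed systems use trained models that handle negation, sarcasm, and context, but the core idea of turning a review into a score is the same.

```python
# Tiny illustrative lexicons; real systems learn these from labeled data.
POSITIVE = {"love", "great", "fast", "recommend"}
NEGATIVE = {"slow", "broken", "refund", "disappointed"}

def sentiment(review: str) -> int:
    """Score a review: positive-word count minus negative-word count."""
    words = [w.strip(".,!?") for w in review.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

happy = sentiment("Love this store, fast shipping, highly recommend!")
unhappy = sentiment("Slow delivery and a broken item, very disappointed.")
```

Aggregated over thousands of reviews, even crude scores like these let a retailer spot which products or campaigns are driving loyalty up or down.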
One particular concept Maskey is excited about is “analyst in a box,” which he believes could become a productive tool in the next five years. Businesses from many sectors use human analysts to conduct research and answer questions of interest to executives, but the research is time-intensive. NLP could be applied to scan through data, synthesize reports, and generate findings much faster, reducing the research time from weeks to hours.
“The concept of an analyst in a box could be quite useful and transformative for businesses,” Maskey says.
Miramant of Blue Orange is optimistic about a day when NLP helps remove linguistic barriers in society, and accurate and seamless translation makes the transfer of human knowledge more efficient and ubiquitous.
“We’re about to solve the communication problem with technology in the next few years,” he says. “Sharing of knowledge, [identifying] misunderstandings that are lost, [extracting] understanding from complex and large data sources in multiple languages—that’s what we are on the edge of.”