The Learning-Knowledge-Language Innovation Hot Zone
Last month the Nobel Prize-winning physicist Steven Weinberg passed away at the age of 88. Weinberg was one of the greatest physicists of our time. He is noted for two things:
- unifying the fundamental forces of physics and
- being able to explain physics to non-technical people.
This got me thinking, what am I doing to help unify the forces around building intelligent machines? How can I explain this in a single diagram?
This blog will summarize a new theory on where AI and innovation are emerging in organizations. When three emerging technologies come together (the red triangle labeled Innovation Hot Zone), we can deliver high-value innovation against many of the wicked challenges organizations face today. I call this the LKL Innovation Hot Zone, where the first L is for automated Learning (both machine learning and symbolic learning), the K is for Knowledge management and knowledge graphs, and the last L is for large Language models such as BERT and GPT-3.
This summer, three events shaped how I perceive the value of innovation in my work areas. First, I was one of the organizers and judges of a large hackathon in which over 500 five-person teams competed for recognition. Teams had just 48 hours to come up with new ideas on ways to combine emerging technologies. For example, can we use Jupyter Notebook with FPGAs to build a working lightning-fast doctor recommendation system that compares millions of healthcare providers and returns the results in just a few milliseconds? It turns out you can! More on that in a future blog post.
Second, I managed a brilliant group of summer interns who did groundbreaking work on building course recommendation engines for my organization’s internal technical training division. This is also a work-in-progress blog that will be out soon.
And third, I was fortunate to be a co-author of several patent applications, almost all of which got PROCEED evaluations from our internal patent review boards. The process forces us to explain to a review board, in under 10 minutes, which technologies we combined and why the combination is peer-review-worthy innovation.
All three of these events helped me crystallize a theory that had been forming in my mind since I saw the first OpenAI GPT-3 demos in 2020. I hope that the LKL Innovation Hot Zone theory will help people new to computer science decide exactly what courses to take and help people in the corporate world decide what their organization's continuing education systems should recommend.
But first, let's give each of these topics clear definitions, and then we will see how they work together to create innovation.
The Learning Dimension
Learning systems are systems that continually improve as they are exposed to more data. They are continuous learning systems, characterized by feedback loops built into their architecture. They are not one-and-done systems where a machine learning researcher throws a model over the wall to a business unit and says "good luck!" We need continuous education and systems thinking to solve hard business problems.
Learning includes multiple types of learning: machine learning, symbolic learning, generative learning (such as generative adversarial networks), and genetic algorithms. Central to learning is the process of inference, where, given partial data, we can construct the missing links in our knowledge.
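To make inference concrete, here is a minimal, hypothetical sketch: given partial "is_a" facts, a simple transitive rule derives a link that was never stated explicitly. The facts, names, and rule are illustrative only, not a production inference engine.

```python
# Toy knowledge base of (subject, relation, object) triples.
facts = {
    ("GPT-3", "is_a", "language_model"),
    ("language_model", "is_a", "machine_learning_system"),
}

def infer_transitive(facts, relation="is_a"):
    """Repeatedly apply the rule: (a r b) and (b r c) => (a r c)."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(derived):
            for (b2, r2, c) in list(derived):
                if r1 == r2 == relation and b == b2 and (a, relation, c) not in derived:
                    derived.add((a, relation, c))
                    changed = True
    return derived

# The missing link is constructed from the two partial facts.
print(("GPT-3", "is_a", "machine_learning_system") in infer_transitive(facts))  # True
```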
And one of my key insights is that AI is a LOT more than just machine learning and deep neural networks. Granted, in the last several years these tools have become proficient at recognizing patterns in images, speech, and language. But they are not business solutions. Showing state-of-the-art progress on recognizing cats in videos is cool, but it seldom drives down the costs of your customer service centers.
The Knowledge Dimension
Large-scale Enterprise Knowledge Graphs (EKGs) are at the top of the triangle because they have become the "uber solutions" for many of the fastest-growing organizations leveraging AI. Enterprise-scale knowledge graphs are at the heart of companies like Google, Facebook, LinkedIn, Twitter, Uber, Amazon, and Pinterest.
Knowledge management and knowledge graphs are how we store information as we learn. Learning must continually feed our knowledge graphs. EKGs are not a set of spreadsheets buried ten folders down on your hard drive. Their content is always available for anyone to query, and it has high-quality connections to other knowledge sources. Great knowledge management tools go beyond the hundreds of little silos of information we see in many organizations today. Those silos trap little isolated fragments of knowledge in tabular structures such as relational databases. To be successful, organizations need to move away from flatland and toward building integrated central nervous systems that quickly respond to changes in their environment and notify interested parties.
The Language Dimension
Language is the ever-growing stack of easy-to-use tools that allow us to work with unstructured data, including the impressive large language models for Natural Language Processing (NLP). We need to remember that an estimated 80% of the information within organizations is locked up in documents, not in our tabular relational database management systems.
If you have not had a chance to see the ever-growing model zoo at Hugging Face, you should spend some time there.
This week, the Hugging Face website had over 14 thousand models you can quickly download into your NLP stack. Many of them can do simple tasks like classifying a document for sentiment in just five lines of Python code. NLP is accessible to everyone who can write "Hello World" in Python. Feel free to contact me if you need help getting started. I mentor kids as young as 10 years old who write fairly good Python code.
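As a sketch of how little code that takes, here is sentiment classification using the Hugging Face transformers library. This assumes `transformers` (plus a backend such as PyTorch) is installed, and the first call downloads a default sentiment model, so it needs a network connection; the example sentence is my own.

```python
# Sentiment classification in a few lines of Python with Hugging Face.
# Assumes: pip install transformers torch
from transformers import pipeline

# The first call downloads a default sentiment-analysis model.
classifier = pipeline("sentiment-analysis")
result = classifier("This course on knowledge graphs was excellent!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```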
I also want to point out that we are quickly entering a world where NLP technologies and large language models like GPT-3 are starting to impact our daily productivity directly. Companies like Tabnine and Kite started this trend a year ago by providing NLP-driven extensions to your software Integrated Development Environments (IDEs) based on language models like BERT. Within the last 30 days, we have seen GitHub release its Copilot extension to Visual Studio Code, and OpenAI demonstrated its stunning video on using NLP to write a space game using OpenAI Codex, which is based on GPT-3. Watching these videos should inspire us all to ask the question: "How can we combine this power with other technologies to create innovative solutions?"
The LKL Innovation Hot Zone
Now let's look at how these topics come together to power modern intelligent systems and where innovation is happening in our companies. We will try to take a holistic Graph Systems Thinking approach when we look at these topics.
The first concept we want to embrace in our Graph Systems Thinking approach is the idea of the AI flywheel pattern.
The general idea is that as we make recommendations, we collect feedback on them and use that feedback to make better recommendations. This is called a positive feedback cycle. Now the question is, where do we store this feedback? In a spreadsheet buried ten folders deep? No — we store the feedback in our EKG! There we can continually use this data and even generate natural language explanations about WHY we made a specific recommendation. Learning, Knowledge, and Language come together to form a valuable business solution.
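Here is a toy, purely illustrative sketch of the flywheel in Python. All names are invented, and a real system would persist the feedback in the EKG rather than in an in-memory dictionary:

```python
# Toy AI flywheel: recommendations generate feedback, stored feedback
# improves the next round of recommendations.
from collections import defaultdict

# Stand-in for the EKG: per-item impression and click counts.
feedback_store = defaultdict(lambda: {"clicks": 0, "shown": 0})

def record_feedback(item, clicked):
    """Store one round of user feedback on a recommended item."""
    stats = feedback_store[item]
    stats["shown"] += 1
    stats["clicks"] += int(clicked)

def recommend(candidates, top_n=2):
    """Rank candidates by smoothed click-through rate from stored feedback."""
    def ctr(item):
        s = feedback_store[item]
        return (s["clicks"] + 1) / (s["shown"] + 2)  # Laplace smoothing
    return sorted(candidates, key=ctr, reverse=True)[:top_n]

items = ["course-a", "course-b", "course-c"]
record_feedback("course-b", clicked=True)   # positive feedback boosts course-b
record_feedback("course-a", clicked=False)  # negative feedback lowers course-a
print(recommend(items))  # ['course-b', 'course-c']
```

The Laplace smoothing gives unseen items a neutral prior so the flywheel can still explore new candidates instead of only reinforcing past winners.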
Multidimensional Systems Thinking Wins Hackathons
When I was judging all the hackathon entries, one of the questions we had to ask was, "Does this project show promise for new technology patents?" Each team was rated on a scale of 1 to 10. But many of the hackathon entries applied only a single dimension of LKL theory. For example, a team did ML with no persistent knowledge graph and no NLP. Their work was trapped in a Jupyter Notebook that only a data scientist would understand. There were not many entries like this, but I suspect they didn't score highly.
On the other hand, some teams brilliantly combined Learning, Knowledge, and Language. They used ML and language models to build chatbots to answer questions and retrieve information quickly. They used knowledge graphs to look for clusters of out-of-network referrals. And they combined FPGAs and graph embeddings to create provider recommendations in a tiny fraction of the time it used to take. These were the winning teams.
Our summer intern team also showed how combining NLP, knowledge graphs, and machine learning solved problems we could not solve in the past. We had to classify both course descriptions and our technology preferences with NLP to link them together. These links boosted the scores of courses that used preferred or acceptable technologies and lowered the scores of those that used discouraged or unacceptable technologies. Yes, Microsoft Access is a popular database, but it is discouraged in teams doing scale-out databases. You get the idea.
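As a hypothetical sketch of that boosting step (the preference labels come from the text above, but the weights, function names, and examples are invented for illustration):

```python
# Scale a course's base relevance score by the preference labels of the
# technologies NLP has linked it to. Weights are illustrative only.
PREFERENCE_WEIGHTS = {
    "preferred": 2.0,
    "acceptable": 1.0,
    "discouraged": 0.25,
    "unacceptable": 0.0,
}

def score_course(base_score, linked_technologies, tech_preferences):
    """Boost or lower a course score based on its linked technologies."""
    weight = 1.0
    for tech in linked_technologies:
        label = tech_preferences.get(tech, "acceptable")  # unknown techs stay neutral
        weight *= PREFERENCE_WEIGHTS[label]
    return base_score * weight

prefs = {"PostgreSQL": "preferred", "Microsoft Access": "discouraged"}
print(score_course(0.8, ["PostgreSQL"], prefs))        # 1.6 — boosted
print(score_course(0.8, ["Microsoft Access"], prefs))  # 0.2 — lowered
```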
LKL Innovation Hot Zone theory tells us that we need to carefully include Learning, Knowledge, and Language in our curriculums at colleges and universities. It means we need to encourage our staff to take these classes in their continuing education. It means that if we want a vibrant patent portfolio, we need to find people who can combine all three of these key dimensions to create new patents that drive innovative product development. And it means that if we want to retain staff in the Innovation Hot Zone, we need to recognize these individuals and help them mentor other staff in these critical technologies.