What went wrong with artificial intelligence? This transformative technology was supposed to change everything. I’ve seen first-hand the incredible potential it has—both as a professor of computer science at the University of Michigan and as the founder of Clinc, ZeroShotBot, Myca.ai, a non-profit called ImpactfulAI, and several other AI-focused companies.
So, why has it devolved into overhyped solutions, marketing noise, and an endless spin of the same, tired ideas? Into poor user experiences, embarrassing bugs, and countless other misfires?
The answer is pretty clear when you consider how every business has been told it needs artificial intelligence to stay competitive. This mad dash is reminiscent of a gold rush, as companies push and pull to be early adopters and to scrape every last dollar out of their ROI. Add to that the misconceptions about what AI can do, the ebb and flow of innovation vs. standard techniques, the grandiose promises, and the marketability of AI, and it becomes clear how we got here.
It makes me sad to see AI reduced to a gimmick. To be clear, I’m not saying AI doesn’t have an important role to play. It will define the future of technology in many ways. The challenge is looking beyond the noise.
That’s why I’m here to discuss the antidote: four mental models that I believe any business, decision-maker, or tech enthusiast interested in AI must adopt to see past all the hype, noise, and spin.
You know it when you see it. Less talk, more show.
What’s the most important rule of AI? Don’t believe it unless you can see and feel it.
Why do I think this is the most important mental model? The magic of AI still exists. There are places where innovation still occurs, and when it does, the results are undeniable. Having said that, you can’t escape the noise, the hype, the big promises.
Simple, purpose-built AI solutions have transformed many industries. AI is being used in healthcare to detect breast cancer, in agriculture for crop yield forecasting, and in autonomous driving to improve safety. These solutions use deep learning and reasoning to draw conclusions from billions of analyzed pixels. There’s no denying these use cases. They are convincing because you can see the technology in action and see it working well.
You should apply that same intuition everywhere.
From my experience creating novel conversational AI technologies, I know the power of an unforgettable experience. When it’s real, you know it. It only takes a few minutes of interaction to tell if another human is intelligent, and similarly, you know right away if a conversational AI is intelligent from actually interacting with it. You have to look past the canned experiences and the lofty promises, and see what AI looks like in practice within your industry or use case.
And if something sounds fake or unbelievable? It probably is. Trust your senses. They will guide you through the noise.
You will have trouble with certain solutions: the training dilemma.
Maybe you beat the odds and found that perfect AI solution. It can happen, right? Take a step back and think about the bigger picture. How will you apply that solution to your needs?
A promising demo isn’t everything. You still have to adapt that AI for your use case, train it, deploy it, and improve it. The more niche and customized your use case is, the harder it will be to reproduce the quality you saw in the demo in your own environment. When the quality of your AI depends on training specific to your use case, production-grade AI becomes extremely complex and often requires a dedicated team of machine learning experts, computer and data scientists, and training specialists. Each layer adds more complexity, making your solution more expensive, brittle, and likely to fail.
During my time as CEO of Clinc, I saw countless companies spend millions trying to create, configure, and train virtual assistants, only to fail. The learning curve is steeper than ever, and the stakes are even higher.
So, how can you successfully navigate the world of AI? It starts with asking the right questions, things like:
- OK, this AI is good, but can I wield it?
- How much customization does it require to solve my problems?
- Will I have to actually train the inner models in the process of tuning this solution?
And even if you know the answers to these questions, that same demo experience you saw may be untenable if you have to train the AI yourself.
You must be realistic about the logistics of making AI possible. Be ready for these costs: engineers to work it, support to keep it running, and training specialists (data scientists and ML experts) to improve it.
Next, ask yourself how it ties into mission criticality. Can you afford for it to fail? What’s at stake if your AI spectacularly fails? What will happen if you change the model’s task?
AI is some of the most complex technology on the planet. Getting it right means defining your expectations and knowing your limitations.
The revolution only applies to certain types of problems.
Let me start by saying we are already in an AI revolution, thanks to advancements in deep learning, which uses data to train models loosely inspired by the way our brain’s neural network works. The catalysts for this initial success include the availability of data, advancements in deep learning models, and innovations in computing.
Despite this, not all AI problems can be solved by advancements in neural networks. Many companies may claim to use next-generation AI, but more often than not, it’s just noise in the AI hype cycle.
Here’s what I can tell you. The biggest advancements are occurring in areas that use deep learning techniques and data to train a system, such as Natural Language Processing (NLP) and computer vision.
Think about it like this. If we see large amounts of data being used to extract patterns, that’s a direct representation of the AI revolution. When this type of approach forms the basis of new products like Myca.ai, AI is being leveraged in a transformational way.
So, where are things going wrong? Most companies are using old techniques to latch onto the AI hype cycle. Think about early chatbots and the frustrating user experiences they offered. These solutions used the old Stanford NLP library and similar classical computational linguistic approaches that leveraged grammar, nouns, synonyms, dictionaries, and other linguistic mechanics to derive patterns.
The problem? This is the wrong approach in modern times. You can’t expect to innovate if you rely on antiquated techniques.
Now for the big question: how can you see through the noise and tell if an AI solution is legitimate? I recommend you learn a selection of the latest buzzwords to see if they apply to a given technology.
If they use computational linguistics, regression models, or decision trees, it’s antiquated.
If they use neural networks, transfer learning, adversarial networks, or attention models, it’s current.
You don’t need to understand how they work theoretically. Your focus should be knowing which buzzwords to spot and keeping yourself informed of trends through projects like ImpactfulAI. Look for things like convolutional neural networks, transformers, attention models, and GANs to quickly identify if the underlying technology is part of the AI revolution.
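To make the buzzword check concrete, here is a minimal Python sketch of the heuristic described above. The two term lists come straight from this article. Everything else is an assumption made for illustration: the function names `find_terms` and `classify_pitch`, the whole-word matching, and the extra "mixed" and "unknown" labels. It is a rough first-pass filter for a vendor pitch, not a way to judge how well a product actually works.

```python
import re

# Term lists taken from the article's heuristic. Singular forms are stored,
# and an optional trailing "s" is allowed when matching.
ANTIQUATED_TERMS = ("computational linguistic", "regression model", "decision tree")
CURRENT_TERMS = (
    "neural network", "transfer learning", "adversarial network",
    "attention model", "transformer", "gan",
)


def find_terms(text, terms):
    """Return the terms that appear in text as whole words (hypothetical helper)."""
    found = []
    for term in terms:
        # Word boundaries stop "gan" from matching inside "began" or "elegant".
        pattern = r"\b" + re.escape(term) + r"s?\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(term)
    return found


def classify_pitch(text):
    """Label a product description as current, antiquated, mixed, or unknown."""
    current = find_terms(text, CURRENT_TERMS)
    antiquated = find_terms(text, ANTIQUATED_TERMS)
    if current and not antiquated:
        label = "current"
    elif antiquated and not current:
        label = "antiquated"
    elif current and antiquated:
        label = "mixed"  # assumption: the article does not cover this case
    else:
        label = "unknown"  # no listed buzzword found either way
    return label, current, antiquated


if __name__ == "__main__":
    print(classify_pitch("Our assistant uses transformers and transfer learning."))
    print(classify_pitch("Built on decision trees and a grammar dictionary."))
```

A pitch that matches nothing comes back as "unknown", which is a cue to ask the vendor what is under the hood.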
The future is zero-shot learning (GPT-3).
2020 was a milestone year for the scientific community. A little something called Generative Pre-trained Transformer 3 (GPT-3) was released by OpenAI. The language model has 175 billion parameters and is more accurate than anything we’ve ever seen. For context, the older GPT-2 had 1.5 billion parameters, more than 100 times fewer.
This model was inspired by recent work on transfer learning, which was first popularized by the Bidirectional Encoder Representations from Transformers (BERT) model, and is built on the belief that you can train an AI model really well once with massive amounts of data (say the entire internet) then use significantly less or no training data for a new task.
This transformational work popularized a new view of deep learning models as “few-shot,” “one-shot,” and “zero-shot” learners, meaning only a few, a single, or no training examples are needed for the model to perform a completely new task.
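One way to picture the difference between the three regimes is through the prompt handed to the model. In the prompt-based framing used with GPT-3, the examples are placed in the input text itself and the model is not retrained. The sketch below only assembles such prompts and does not call any model. The "Input:/Output:" layout and the function names `build_prompt` and `shot_regime` are assumptions made for illustration, not a format that any particular API requires.

```python
def build_prompt(instruction, examples, query):
    """Assemble a text prompt with zero, one, or several worked examples.

    examples is a list of (input_text, expected_output) pairs; an empty
    list gives a zero-shot prompt.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The query is left open-ended so the model completes the final output.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def shot_regime(examples):
    """Name the regime according to how many examples are supplied."""
    if len(examples) == 0:
        return "zero-shot"
    if len(examples) == 1:
        return "one-shot"
    return "few-shot"


if __name__ == "__main__":
    instruction = "Classify the sentiment of the input as positive or negative."
    examples = [("The support team was wonderful.", "positive"),
                ("The app crashes constantly.", "negative")]
    for shots in (examples[:0], examples[:1], examples):
        print(f"--- {shot_regime(shots)} ---")
        print(build_prompt(instruction, shots, "I would buy this again."))
```

The instruction and the query stay the same in all three cases. Only the number of worked examples changes.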
Let that sink in for a moment. For the first time, we may be able to create a conversational AI for new types of problems without any training. With the AI philosophy introduced by GPT-3, you can ask any question and receive an incredibly accurate answer without the need for training. One of my next major endeavors is to introduce the first commercialization of such an approach through the development of Zero Shot Bot to revolutionize the conversational AI chatbot space, and the performance of this technology is breathtaking.
I firmly believe that GPT-3 and now Zero Shot Bot serve as a bellwether of the next big game-changer for the coming decade. It’s not a matter of if, but when. This Zero Shot to Few Shot approach is the answer to the user experience problems created by other antiquated AI technologies. In the context of the Internet, it can solve a host of interesting problems and doesn’t require any training.
And Microsoft agrees. After investing a staggering $1 billion in OpenAI in 2019, it secured an exclusive license to GPT-3 in 2020.
This philosophy has been in the air for only about a year. Beyond GPT-3 and Zero Shot Bot, no products exist yet to my knowledge. But mark my words, industry-changing technology and commercializations of this Zero Shot approach are coming.
Zero Shot Bot and other platforms that use zero-shot learning remove the hardest part of deep learning AI: the training.
If you ask me, that’s the magic spark of innovation that commercial AI has been missing for years.