Conversational interfaces are exciting right now, but not because users are eager to tap 120 extra times to order a pizza. They are exciting because they provide a way to translate the output of truly exciting technology powered by machine learning into something humans can understand.
Have you ever had a conversation like this?
Me: I’d like to order a pizza
Chatbot: What kind would you like?
Me: What toppings do you have?
Chatbot: I have pepperoni, sausage, Canadian bacon, ham, pineapple, mushrooms, peppers. Would you like to hear more?
Chatbot: I also have ground beef, onions, spinach, tomatoes, cherry tomatoes, parmesan, and extra cheese. Would you like to hear more?
Me: No. I’d like a pepperoni and mushroom pizza.
Chatbot: What size would you like?
Chatbot: What is your billing address?
Me: Forget it. I’m just going to go to the website.
At its best, a conversational interface provides an accessible window into the output of algorithms that are very hard to understand. Better yet, since chatbots rely on machine learning algorithms in their own right, the domain knowledge needed to build a chatbot is similar to the domain knowledge needed to build whatever sits behind it.
When researchers compared how computer vision algorithms and humans determine the type of window treatments in a room, they found that the algorithms spent much less time looking at the windows. Instead, the algorithms spent a lot of time checking whether the picture contained a bed. In effect, they took a problem they had trouble with (confusing bedsheets for curtains) and turned it into an easier one: is there a bed in the room? Even though that decision process makes no sense to a human, it was more accurate than algorithms that more closely mimic human thinking.
This distinction between human thinking and algorithmic optimization shows up more practically when you try to use a maps app for navigation. How many times have you sat at an intersection, waiting to make an unprotected left turn onto a busy street, wondering why the app decided that three extra left turns to save 45 seconds was a good tradeoff?
Chatbots offer a path-of-least-resistance option for hiding this kind of “thinking” from people. In return, they often add lots of unnecessary steps to whatever process the user is trying to complete. While creating interfaces for interacting with AI is a new problem, some other techniques already show promise.
My company has thought a lot about this issue. We didn’t want to force our users into a “conversation” with an AI just to evaluate the effectiveness of the conversations they are having with other people via email. Part of the solution was to use a different category of algorithms when putting together our overall estimate of, and explanation for, the likelihood that an email they send will get a response.
Our best predictions came from throwing pure neural networks at the problem. The difficulty with that is that the output of those neural networks is complete gibberish to humans. The neural networks provide a matrix of weights that are applied to different parameters that the algorithms determine on their own. If you look at that matrix of weights, there is no way to make sense out of it. You just feed new data into it, let the machine make its calculations, and report a result.
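To make that opacity concrete, here is a minimal sketch of what “feeding new data in” looks like for a very small network. The weights, the three input features, and the function name are all invented for illustration; they are not taken from any real model, and a production network would have far more of them.

```python
import math

# Hypothetical weights, invented for illustration. A trained network would
# learn values like these, and typically many thousands more.
HIDDEN_WEIGHTS = [
    [0.42, -1.37, 0.08],
    [-0.91, 0.55, 1.12],
]
HIDDEN_BIASES = [0.10, -0.30]
OUTPUT_WEIGHTS = [1.75, -0.64]
OUTPUT_BIAS = 0.05


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_response_probability(features: list[float]) -> float:
    """Forward pass: weighted sums and squashing functions, nothing more."""
    hidden = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + bias)
        for row, bias in zip(HIDDEN_WEIGHTS, HIDDEN_BIASES)
    ]
    output = sum(w * h for w, h in zip(OUTPUT_WEIGHTS, hidden)) + OUTPUT_BIAS
    return sigmoid(output)


# Three made-up, already-scaled input features for one email.
probability = predict_response_probability([0.8, 0.2, 0.5])
print(f"Estimated chance of a response: {probability:.2f}")
```

The code produces a number, but nothing in the weight tables says why the number came out the way it did.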
By contrast, when we used classifiers from the decision tree family of algorithms, we were able to have a data scientist manually review the outputs of the machine learning algorithms for common combinations. We were able to manually translate the machine learning output in those frequent cases into something understandable to people. Rather than put a conversational agent between the analysis and the user, we traded off a slight bit of accuracy for a dramatic decrease in the amount of time users have to spend interacting with the product to receive the value.
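As a rough sketch of that idea (not our actual implementation), a decision tree’s prediction can be traced back to the handful of rules that produced it, and a person can write a plain-language explanation for each common path. The tree, feature names, thresholds, probabilities, and explanation text below are all hypothetical.

```python
# A tiny hand-written decision tree. Every feature name, threshold, and
# probability is hypothetical and chosen only for illustration.
TREE = {
    "feature": "word_count",
    "threshold": 200,
    "low": {
        "feature": "question_count",
        "threshold": 1,
        "low": {"probability": 0.35},
        "high": {"probability": 0.62},
    },
    "high": {"probability": 0.21},
}

# Explanations a person wrote by hand for the most common paths.
EXPLANATIONS = {
    ("word_count<200", "question_count<1"): "Short, but it never asks for anything.",
    ("word_count<200", "question_count>=1"): "Short and asks a clear question.",
    ("word_count>=200",): "Long emails are less likely to get a reply.",
}


def explain(email_features):
    """Walk the tree, record the rules that fired, and look up their explanation."""
    node = TREE
    path = []
    while "probability" not in node:
        name, threshold = node["feature"], node["threshold"]
        if email_features[name] < threshold:
            path.append(f"{name}<{threshold}")
            node = node["low"]
        else:
            path.append(f"{name}>={threshold}")
            node = node["high"]
    text = EXPLANATIONS.get(tuple(path), "No hand-written explanation for this case.")
    return node["probability"], text


probability, reason = explain({"word_count": 120, "question_count": 2})
print(f"{probability:.0%} chance of a response: {reason}")
```

Because each prediction maps to a short, readable path, the explanation can be shown directly next to the estimate, with no dialogue required.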
Even outside of new interaction models, there are ways that AI can work inside existing interfaces to make choices easier. In the pizza example above, an understanding of my tastes based on data about previous purchases is very powerful. Just making the interface aware of “the usual” changes how I order. It means “which of these 3,065 combinations is optimal for you tonight?” becomes “does this pizza that you wanted several times before sound like something you want to eat tonight?”, which is a much easier question to answer. (That’s one reason the Domino’s skill for Amazon Alexa is geared to reorder your favorite pizza and not much else.)
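Even a very simple version of “the usual” needs no sophisticated model. The sketch below assumes a hypothetical order history stored as sets of toppings and simply proposes the most frequent past order; the data, the `the_usual` function, and the `min_count` cutoff are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical order history; each order is the set of toppings chosen.
past_orders = [
    frozenset({"pepperoni", "mushrooms"}),
    frozenset({"pepperoni", "mushrooms"}),
    frozenset({"sausage", "onions"}),
    frozenset({"pepperoni", "mushrooms"}),
    frozenset({"spinach", "tomatoes"}),
]


def the_usual(orders, min_count=2):
    """Return the most frequent past order, or None if nothing repeats enough."""
    if not orders:
        return None
    order, count = Counter(orders).most_common(1)[0]
    return order if count >= min_count else None


usual = the_usual(past_orders)
if usual is not None:
    # One yes/no question replaces a walk through the whole menu.
    print(f"Your usual: {' and '.join(sorted(usual))}. Want that tonight?")
else:
    print("What would you like?")
```

The point is the shape of the interaction rather than the counting: the interface opens with a single suggestion the user can accept or decline.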
AI is capable of learning our preferences and combining them to adapt to situations that are similar to, though not the same as, earlier ones. For example, if I regularly fly to Seattle, the AI should already know that I like flying a particular airline from San Jose rather than San Francisco on one of the later flights. And I like staying in Bellevue near my friend, so it should suggest a hotel there that has availability. “Does this trip sound good to you?” sounds great to me. Book it! All of this saves me time.
Creating interfaces for people to interact with machine learning will be a challenge. At the moment, since the technology is so new, one of the few patterns we know is what Siri showed us on the iPhone.
Most conversations with Siri turn into an elaborate back-and-forth. Talk to the device, and it will ask whatever questions it needs to ask to get you what you want. But what that interface turns into is a slightly more usable command-line interface. And while Bash has its adherents even today, Windows won. Conversational interfaces will not be able to replace the rich visual interactions that we’ve become used to over the past decades.
The next few years will reveal all kinds of powerful ways that humans and machines can work together to produce creative works that go beyond what either of us could do alone. The challenge for those of us working on that technology is to make sure that we spend as much time creating interaction patterns and making interfaces that work for both humans and AI as we spend working on the core AI technology alone.