Known for its hyperactive toys, the company spent years developing technologies to tackle its greatest challenge yet–subtlety.
If there’s a robot uprising anytime soon, it seems unlikely to start in our living rooms. Robotic vacuums like Roomba sell well because they are so handy. But other types of home robots–pets and companions from Sony’s Aibo robo-pooch to the recently shuttered Kuri (backed by Bosch)–have flopped due to both prices and expectations that have been set unreasonably high.
If any company can eventually bring us a domestic robot like Rosie from The Jetsons, Anki is a good bet. Started by three Carnegie Mellon Robotics Institute graduates in 2010, the company has racked up over $200 million in venture funding. More important, it’s attracted customers. Anki has already sold 1.5 million robots by taking what it sees as the easiest route into the home: toys. The star is a manic little bulldozer-looking bot called Cozmo that drives around a tabletop and plays simple games with light-up cubes it carries about. Cozmo was the best-selling toy (by revenue) on Amazon in the U.S., U.K., and France in 2017, according to one analysis.
Anki cofounders (from left) Boris Sofman, Hanns Tappeiner, and Mark Palatucci [Photo: courtesy of Anki]
With claimed revenue of almost $100 million last year, Anki says it could already be “cash-flow positive” if it chose. But it’s instead plowing the money into a 10- to 15-year goal to get us from Roomba to Rosie. “We’ve always known from the beginning that this is not a toy company,” says CEO and cofounder Boris Sofman.
So I’ve been stalking Anki for over a year, anticipating the next phase of its steady march to the robotic future. In June, the company was finally ready to talk and show me a new product that was still in the awkward stages of development. After a lot of introductory remarks, Sofman finally plops the new robot on the table.
It’s a slightly larger, gray version of Cozmo, named Vector.
And at first blush, it’s a letdown. I immediately recall the scene in This Is Spinal Tap, when, due to a typo in the instructions, a stage piece that should have been an 18-foot-tall replica of Stonehenge was instead an 18-inch miniature. Likewise, I’ve been expecting something bigger–figuratively, and literally.
But then I listen to Sofman’s pitch. The rehashed outside appearance allowed Anki to focus on radically more advanced internals and helped keep costs down. Cozmo lists for $180; Vector will run $250 and do a whole lot more when it ships in October. (To target early adopters, Anki is launching the robot on Kickstarter at a discount price of $200.)
The fundamental advance for Vector is that he’s autonomous in a way that Cozmo isn’t. Anki’s first robot was a bit like the Mechanical Turk–a robot built in 1770 to play chess against humans. It was a hoax, of course. A person hid inside the cabinet that the animatronic figure was mounted on, controlling its moves. For Cozmo, the person in the box is a Wi-Fi-connected smartphone running an app that controls the robot.
By contrast, Vector has his own mind. (While Anki has long insisted that Cozmo is gender neutral, it’s made no pretense with Vector. Everyone I meet with unfailingly refers to the robot as “he” or “him.”)
“We basically took that whole thing,” says Anki’s computer vision technical director Andrew Stein, gesturing to my iPhone, “and shoved it inside of his head.” Vector’s brain is a quad-core Qualcomm Snapdragon 212 chip. It’s far from top of the line for a phone, but within the budget that Anki set for Vector’s parts. “The thing that was way too expensive two or three years ago is now in our range,” says Stein.
Vector with Anki’s director of program management Meghan McDowell [Photo: Sean Captain]
FROM TOY TO PET
While he may resemble Cozmo, Vector is designed to serve a very different role, as an always-on companion for everyone–rather than an occasional diversion for the kids. He requires a lot more intelligence to read his environment and pick up cues from the humans he shares a home with.
Cozmo springs to attention when you call its name, making twittering sounds and lifting its bulldozer-like arms up and down. If you ignore Cozmo, the bot gets more in your face, or makes loud, obnoxious snoring sounds.
Vector feigns much higher social awareness. When I meet a rough version of the robot at Anki’s lab, he’s just hanging out. Cartoon eyes, represented on a 184 x 96-pixel screen, appear to casually scan around the room. (The robot actually sees through a 720p wide-angle camera mounted just below the screen.)
Touch sensors allow Vector to respond to a pat on the head. [Photo: Sean Captain]
Those eyes appear to open wide when Meghan McDowell, Anki’s director of program management, calls, “Hey Vector, come here.” The robot drives off his charger toward her, looks up toward me, and makes some of his characteristic twittering noises. If we maintain eye contact, Vector will become animated, making more gibberish sounds and perhaps raising his arms for a fist-bump (actions inherited from Cozmo). We could also play a game, such as a hand of blackjack, with cards displaying on his tiny face/screen. At one point, McDowell pets the touch sensors on top of Vector, causing his eyes to roll around in mock bliss.
But when we ignore Vector, he gets the hint and does his own thing, driving around the tabletop to find and stop just before its edges (using infrared sensors), or deliberately bumping into things like a cup to see if he can push them.
This isn’t aimless play. With a laser scanner and other sensors, Vector is building a digital representation of his environment using a sophisticated process called simultaneous localization and mapping (SLAM)–a technology also used in high-end robot vacuums. Vector also has a four-microphone array on top, allowing him to discern the direction of sounds, and his camera continuously watches for action. “We want him to be inquisitive, to map his environment,” says McDowell. “But you wanna keep him on all the time in your home, so we don’t want him to be annoying.”
ALEXA ON WHEELS
Vector can do some useful things that Cozmo can’t. Connected to a home network and the internet over Wi-Fi, he offers Alexa-style utilities like displaying weather information for requested cities, setting a timer, and speaking answers to questions like, “What is the capital of Idaho?”
Still, he’s a long way from attaining the empathetic personality and useful capabilities of a robot such as the Jetsons’ beloved housemaid. That’s to be expected, says Anki cofounder and president Hanns Tappeiner. “We’re essentially inching our way toward that goal,” he says.
Vector shows his feelings about cloudy weather. [Animation: Sean Captain]
Though Anki’s aspirations still extend far beyond what Vector currently delivers, the new bot’s processor, sensors, and other components enable artificial intelligence technologies that were out of reach a few years ago, and certainly when serious engineering work on Cozmo began in 2013.
That earlier robot, for instance, is hard-coded to detect a few specific objects: its cubes and its charger. And it uses commodity software to discern the faces of humans, cats, and dogs–routine technology that appeared in point-and-shoot cameras over a decade ago.
Vector, however, runs a neural network that’s being trained to understand the entire world around him–an ongoing process that will continually expand his visual intelligence through online updates. The big achievement for launch: Vector detects people, even when faces aren’t visible.
Vector can spot a person by their torso, then look up to find a face. [Animation: Sean Captain]
“If you’re not at the right angle, or they’re not facing you, how does the robot know you’re there?” says Stein. A dog or cat wouldn’t need face-to-face contact to know its human had come home, for instance, and neither should Vector. So Stein’s team trained a convolutional neural network (CNN)–a popular deep-learning AI technology that mimics the brain’s visual cortex. Using the often blurry and distorted footage that Vector’s camera captures as he moves around, Stein has been teaching the CNN to detect people from the back or the side, for instance, up to about 10 feet away.
“Even if he’s looking down and can only see my torso, [he should] realize, hey, there’s probably a head floating above that torso,” says Stein. “And Cozmo has no idea. I’m just a blob of stuff just like everything else.”
Vector’s people-awareness seems to be working in the various robots I meet during my visits. After McDowell calls one in the lab and he rolls over to her, for instance, he then pivots toward me and looks up, his cartoon eyes widening to indicate that he’s seen me.
Warm colors on this “heat map” indicate where the AI has identified possible objects. [Photo: courtesy of Anki]
One of the next vision challenges is to understand human body poses–what’s happening when arms and legs are in particular positions, for instance. “That’s going to benefit us as we’re building robots that are driving around the home and are going to need to understand people as they move around,” says Stein.
Another challenge is what Anki calls “objectness”–discerning that something is a discrete object even if the neural network has never encountered its kind before. This is a further step in exploring and understanding an environment. “If I want to recognize 100 specific objects, I would argue that’s an easier problem than making a vision system that just knows what an object is,” says Stein. “It’s a more abstract concept . . . It’s a philosophical question.”
To illustrate, he shows me some “heat map” video from the neural network training. The software highlights areas that may represent discrete objects, mistaking a wood grain pattern on the tabletop for a three-dimensional entity.
Sophisticated as Vector’s vision system is becoming, it’s just one input to the robot’s complex simulation of emotional intelligence. Cozmo is a clown that zips around, makes noise, makes faces, and plays games. It does pick up basic stimuli, such as hearing its name or seeing a face it’s been taught through the companion app, but it’s ultimately an unsubtle attention hog.
“That was our first push into a characterful robot, so I think we went a little bit over the top,” says Brad Neuman, Anki’s AI tech director. His task is to build a robot that has both character and some social intelligence. A key part of that is what Anki calls “stimulation.”
“When stimulation is low, the robot is chill,” says Neuman. Vector is studiously observing but not acting out. “Then if you start making noise, or make eye contact with the robot, and certainly if you say ‘Hey Vector,’ that spikes [stimulation] way up,” he says. But Vector also picks up subtler actions–peripheral movement and noises, for instance, or the room lights turning on and off. “If he gets stimulated enough, he’ll drive off his charger and start to socialize with you,” explains Neuman. “Say your name, greet you, give you a fist-bump, potentially.”
Like Cozmo, Vector mostly makes gibberish sounds when playing or hanging out. So it’s a bit unsettling when he speaks for the first time. Vector has a retro robot-sounding voice–deep but soft, a bit tinny and echoey. I’m probably projecting, but the matter-of-fact tone sounds a bit sarcastic when I go over 21 on blackjack, and he says, “You busted.”
Neuman shows me a visualization of Vector’s Emotion Engine–a graph of input levels over time. A green line representing stimulation rises as more and more things are introduced to a virtual Vector in a simulated test environment we’re watching.
Those stimulations have a limited lifetime. As things quiet down, the lines drop, and Vector gets the hint that he should go back to a chill mode. That’s what happened as McDowell and I were chatting with each other and Vector set off exploring on his own.
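The rise and fall of stimulation can be pictured as a level that jumps with each event and fades as time passes. What follows is only a minimal sketch in Python, assuming exponential decay; the class name, event weights, half-life, and threshold are all hypothetical and are not taken from Anki's software.

```python
class Stimulation:
    """Hypothetical stimulation level: spikes on events, decays over time."""

    def __init__(self, half_life_s=10.0, wake_threshold=1.0):
        self.level = 0.0
        self.half_life_s = half_life_s        # assumed: seconds for the level to halve
        self.wake_threshold = wake_threshold  # assumed: level at which the robot engages

    def add_event(self, weight):
        # A subtle cue (peripheral motion) would carry a small weight,
        # the wake phrase a large one. The weights are illustrative.
        self.level += weight

    def tick(self, dt_s):
        # Exponential decay: after half_life_s seconds the level is halved.
        self.level *= 0.5 ** (dt_s / self.half_life_s)

    def should_socialize(self):
        return self.level >= self.wake_threshold


stim = Stimulation()
stim.add_event(0.3)   # lights switch on: not enough to leave the charger
stim.add_event(1.0)   # "Hey Vector": pushes the level past the threshold
stim.tick(40.0)       # the room goes quiet for 40 seconds and the level fades
```

In this toy version, a quiet room is all it takes to drop the level back below the threshold, which matches the "chill mode" behavior described above.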
Vector’s Emotion Engine, sped up to show how the levels of happy, confident, social, and stimulated rise and fall in response to events. [Animation: Anki]
Vector doesn’t just get excited or bored, though. There are four dimensions to his emotional state: the level to which he is stimulated, happy, social, and confident. Hearing his name stimulates Vector, for instance, but it also makes him more social.
Vector’s confidence is affected by his success in the real world. The hooks on his arms sometimes don’t line up with those on his cube, for instance, and he can’t pick it up. Sometimes he gets stuck while driving around. These failures make him feel less confident, while successes make him more confident and happier.
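One way to picture a four-dimensional emotional state is as four numbers that events nudge up or down. The sketch below is an illustration in Python, not Anki's code: the event names, the size of each nudge, and the clamping range of -1 to 1 are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical mapping from events to changes in each dimension.
EVENT_EFFECTS = {
    "heard_name": {"stimulated": 0.5, "social": 0.3},
    "cube_pickup_failed": {"confident": -0.2},
    "got_stuck": {"confident": -0.2},
    "cube_pickup_succeeded": {"confident": 0.2, "happy": 0.2},
}


@dataclass
class EmotionState:
    stimulated: float = 0.0
    happy: float = 0.0
    social: float = 0.0
    confident: float = 0.0

    def apply(self, event):
        # Unknown events leave the state unchanged.
        for dimension, delta in EVENT_EFFECTS.get(event, {}).items():
            value = getattr(self, dimension) + delta
            # Clamp to an assumed range of -1.0 to 1.0.
            setattr(self, dimension, max(-1.0, min(1.0, value)))


state = EmotionState()
state.apply("heard_name")  # raises both stimulated and social
state.apply("got_stuck")   # lowers confident
```

Keeping the effects in a single table means one event can move several dimensions at once, as when hearing his name makes Vector both more stimulated and more social.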
Vector’s behavior follows a hierarchy. “The highest level is what kind of things should the robot be doing right now,” says Neuman. “Should he be quiet? Should he be engaging? Should he be sleeping? Is his battery super-low, and he needs to recharge?” Different behaviors flow from these high-level states, in response to events and the states of his Emotion Engine.
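The hierarchy Neuman describes can be sketched as a two-step choice: first pick a high-level mode, then pick from the behaviors that mode allows. The Python below is a simplified illustration only; the mode names, thresholds, and behavior lists are hypothetical and not drawn from Anki's software.

```python
def choose_mode(battery_level, stimulation, asked_to_be_quiet=False):
    """Pick a high-level mode. Earlier checks take priority (assumed ordering)."""
    if battery_level < 0.15:   # assumed low-battery cutoff
        return "recharge"
    if asked_to_be_quiet:
        return "quiet"
    if stimulation >= 1.0:     # assumed engagement threshold
        return "engage"
    return "observe"


# Hypothetical behaviors available within each mode.
BEHAVIORS = {
    "recharge": ["drive_to_charger"],
    "quiet": ["look_around"],
    "engage": ["greet_by_name", "offer_fist_bump"],
    "observe": ["look_around", "explore_tabletop"],
}


def candidate_behaviors(battery_level, stimulation, asked_to_be_quiet=False):
    mode = choose_mode(battery_level, stimulation, asked_to_be_quiet)
    return BEHAVIORS[mode]
```

A real system would presumably weigh the emotional state when choosing among the candidates. This sketch stops at listing them.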
Vector isn’t following a simple script, then. He’s improvising, based on a soup of different, ever-changing inputs and a wide variety of possible actions. All that creates the illusion of life, but also a challenge to rein in.
Neuman had originally wanted to build a more complex intelligence, in which Vector’s personality evolved through a rewards system that reinforced certain behavior patterns. “And once you work with design people and product people you learn, no, you have to be able to impose certain constraints on the system,” says Neuman.
For instance, Vector needs to consistently indicate when he’s sending data like voice commands up to the cloud–by pausing and blinking his LEDs. This clarifies why the robot has suddenly stopped moving and also that data is being sent to a third-party speech-recognition service. (Anki says it does not archive audio, but compiles anonymized stats of what questions and phrases people use.)
This is one of Vector’s “global interrupts”–triggers that stop whatever he’s currently doing and set him on a different path. Neuman compares it to hearing the doorbell ring when eating dinner. That interruption causes you to put the fork down and go to the door.
The most powerful interrupt is the wake phrase “Hey Vector,” which he understands even without pinging the internet. But through an online natural language processing service, the robot also needs to understand other phrases–including “Hey Vector, shut up!”–that indicate he’s getting annoying and should switch into a more chill mode. “Ideally, nobody is ever [going to decide], ‘Oh he’s too loud. I’m going to turn him off, put him in the drawer, and be done with it,’” says Neuman.
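A global interrupt can be thought of as a short list of events that always win, whatever the robot is doing at the moment. Here is a minimal Python sketch of that idea; the event names, behavior names, and the `BehaviorRunner` class are invented for illustration and do not come from Anki.

```python
# Hypothetical table: interrupt event -> behavior that replaces the current one.
GLOBAL_INTERRUPTS = {
    "hey_vector": "listen_for_command",
    "shut_up": "chill",
    "sending_to_cloud": "pause_and_blink_leds",
}


class BehaviorRunner:
    def __init__(self, behavior="explore_tabletop"):
        self.behavior = behavior
        self.history = []  # behaviors that were cut short

    def on_event(self, event):
        """Return True if the event interrupted the current behavior."""
        replacement = GLOBAL_INTERRUPTS.get(event)
        if replacement is None:
            return False   # ordinary event: carry on with the current behavior
        self.history.append(self.behavior)
        self.behavior = replacement
        return True


runner = BehaviorRunner()
runner.on_event("peripheral_motion")  # not in the table, so nothing changes
runner.on_event("hey_vector")         # drops exploring and starts listening
```

This is the doorbell-at-dinner pattern in miniature: the fork goes down no matter which course is on the table.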
Vector alerts you to needs, like having difficulty getting home to his charger when the battery is low.
One of Neuman’s key goals for the coming year is to minimize the times when users have to be so blatant, by having Vector learn and adapt to the way people behave. “So if you want to interact with the robot, he should be there and interacting with you and be very responsive,” he says. “But if you just want to look at him every now and then–and have him almost like a bird in a cage instead of a bird that stands on your shoulder and runs around your couch–you can do that. You can choose to interact more sparingly, and he respects that.”
“Team bird” is one of three main contingents at Anki, says McDowell. Another group sees Vector as resembling a cat, for his sense of independence. “But I kind of feel like ‘team dog’ because he does help you,” she says, “and he wants to help you.”
Vector’s helpfulness is pretty limited so far. For about the same price, a Roomba can clean your floors. And for a lot less money, Alexa or Google Home can play music, control connected appliances, provide traffic reports, and much more.
But with a powerful processor, a Linux operating system, and internet access, Vector has room to grow. Anki promises to keep expanding Vector’s capabilities. A context-aware security camera or a voice interface to home automation systems are conceivable upgrades, for instance.
Vector bested me in several hands of blackjack.
Vector may also get upgrades from a dedicated following of coders. As with Cozmo, a popular teaching tool in university robotics classes, Anki will encourage tinkerers to write new code that expands Vector’s capabilities. Anki will provide a Python software development kit (SDK) for Vector, as it has for Cozmo, and it may add a C# SDK so coders can write mobile apps that interact with the robot.
Though Vector may eventually offer Alexa-like utility, that will not be the main reason for buying one. The selling feature is this illusion of a living presence in your life–not as active as a bird, cat, or dog–but also easier to feed and care for.