Machine vision AI is going to change everything about shopping.
We’re all only about ten years away from sauntering into stores, grabbing whatever it is we want, then quick-stepping out like we stole it.
It’ll be possible because many shops will be ringed with cameras and sensors that enable machine vision AI to keep tabs on what you take while inside, then charge it to your account through the store’s app as you leave.

Analysts say the big shift is being ushered in by retailers trying to stave off the online shopping explosion. People tend to cite crowds and lines as reasons they avoid stores, so the hope is that this tech will be the savior of the remaining brick-and-mortar mainstays. But while the checkout change might thrill some customers, it will also dramatically reshape low-skilled retail employment, and it comes with a host of privacy concerns.
“Consumers right now have been leaving stores, they’re shopping online a lot more,” said Yory Wurmser, a senior analyst at eMarketer. “And at the same time, the vast majority of shopping is still happening in stores, so there’s a need to stop hemorrhaging all this traffic.”
If machine vision checkout seems too futuristic to contemplate, there are already signs that the checkout-lane change is headed our way. Online ordering juggernaut Amazon released a video last December of its surprise project, the Amazon Go store. In the video, beta-testing employees at the Seattle campus stroll into a small grocery store, then let cameras, sensors, RFID tags and more associate each person with their Amazon account. The store charges the app as they walk back out.
The idea caught hold, but it’s also presumed to be quite expensive, since Amazon reportedly built sensors into the floor and walls to make the experience relatively seamless. Cheaper, “Amazon Lite” versions of this kind of tech are already popping up in various places around the world.
“We’re already seeing this in China,” said Brendan Miller, a principal analyst at Forrester. “There’s a lot of these mobile stores, the very small footprint like 200 or 300 square foot that are unattended. They’re almost like an unattended kiosk that people can walk through.”
“Too expensive to build” usually means an opportunity for a startup, which is exactly what’s happening at Standard Cognition’s lab in Santa Clara, California. Its co-founders are working on an AI checkout system that aims to skip dedicated sensors entirely and rely on video cameras alone.
On the day I visited the mock convenience store, employees were training the machine vision AI for ‘density,’ to make sure the cameras can keep track of the individual people in a horde of holiday shoppers.
Founder and CEO Jordan Fisher, co-founder and COO Michael Suswal, brand-new employee Jeff Hsu and I spent a few minutes huddled together, weaving between each other and the store wall, before unraveling to see whether the AI had kept track of us the whole time. An easy way to spot how it’s doing is to check whether we all kept our assigned colors. Mine started as purple but had swapped with Suswal’s by the end of our faux shopping party. The founders say these sorts of tests are great for the system because they surface things to adjust every time beta testers come into the lab.
Here’s how the system works. The proprietary AI is built, in part, with TensorFlow, Google’s open-source machine learning toolkit. The lab has 20 overhead cameras, used to track both people and products. Fisher says the team first trains the AI on products with a two-minute video-capture session they call the “SKU-dance”: once an item has been filmed from every possible angle, the AI can identify it in the future.
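The registration flow described above can be sketched in plain Python. This is a minimal stand-in for the real TensorFlow pipeline: the class, method names and the string "features" are all hypothetical, with a simple set of viewing-angle labels substituting for a learned visual embedding.

```python
# Hypothetical sketch of the "SKU-dance" registration flow: film an item
# from every angle once, then recognize it from any of those angles later.
# A set of angle labels stands in for the real learned visual features.

class SkuRegistry:
    def __init__(self):
        self._features = {}  # sku_id -> set of viewing-angle "features"

    def register(self, sku_id, angle_frames):
        """The two-minute SKU-dance: capture the item from every angle."""
        self._features[sku_id] = set(angle_frames)

    def identify(self, frame):
        """Match a camera frame against every registered item's angles."""
        for sku_id, frames in self._features.items():
            if frame in frames:
                return sku_id
        return None  # unknown product

registry = SkuRegistry()
registry.register("oreo-14oz", {"front", "back", "top", "side-left", "side-right"})
print(registry.identify("top"))      # -> oreo-14oz
print(registry.identify("mystery"))  # -> None
```

The point of the dance is coverage: the classifier only needs each product captured once, from all sides, before it can be stocked anywhere in the store.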
Then the cameras are trained to see when someone has picked up a known object and to track it from there. People are still being invited in to shop the shelves so the team can check whether the program keeps up, which is why I joined the density scrum.
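The tracking-and-cart logic can be sketched as follows. This is a hypothetical model, not Standard Cognition's code: cameras assign each shopper a persistent ID (the "colors" in the density test), pick events attach items to that ID's virtual cart, and the cart is charged when the shopper walks out.

```python
# Hypothetical sketch: persistent shopper IDs plus a virtual cart per shopper.
# Pick and put-back events come from the cameras; exit triggers the charge.

from collections import defaultdict

class StoreSession:
    COLORS = ["purple", "green", "orange", "blue"]

    def __init__(self):
        self.carts = defaultdict(list)
        self.shoppers = {}  # shopper -> assigned tracking color

    def enter(self, shopper):
        self.shoppers[shopper] = self.COLORS[len(self.shoppers) % len(self.COLORS)]

    def pick(self, shopper, sku):
        """Cameras saw a known item leave a shelf in this shopper's hands."""
        self.carts[shopper].append(sku)

    def put_back(self, shopper, sku):
        """Cameras saw the item return to the shelf."""
        self.carts[shopper].remove(sku)

    def exit(self, shopper):
        """Charge the account tied to the shopper's app; return the receipt."""
        return list(self.carts.pop(shopper, []))

s = StoreSession()
s.enter("jordan")
s.pick("jordan", "coke-20oz")
s.pick("jordan", "oreo-14oz")
s.put_back("jordan", "coke-20oz")
print(s.exit("jordan"))  # -> ['oreo-14oz']
```

The density test I joined stresses exactly the weak point of this model: if two shoppers' IDs swap mid-visit, their carts swap too, which is why the color assignments matter so much.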
Standard Cognition got its start in Y Combinator and, a few months after graduating, has raised $6 million in funding.
The total package consists of cameras, the AI software and a private offline server; every camera requires a dedicated GPU on that backend server. Putting three or four cameras and an offline server into, say, a 7-Eleven-sized store will cost about $30,000, though the subscription service to monitor it all will cost more.
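A back-of-the-envelope sketch of those figures: only the one-GPU-per-camera rule and the roughly $30,000 total come from the article, and the per-unit prices below are invented for illustration.

```python
# Hypothetical cost breakdown. Per-unit prices are assumptions chosen so
# that 3-4 cameras plus an offline server lands near the quoted ~$30,000;
# the one-dedicated-GPU-per-camera constraint is from the article.

CAMERA_COST = 500    # hypothetical, per overhead camera
GPU_COST = 6000      # hypothetical, one dedicated GPU per camera
SERVER_BASE = 3500   # hypothetical offline server chassis

def install_cost(cameras):
    return SERVER_BASE + cameras * (CAMERA_COST + GPU_COST)

for n in (3, 4):
    print(n, "cameras:", install_cost(n))  # 23000 and 29500
```

Whatever the real unit prices, the structure is the telling part: cost scales linearly with camera count because each camera drags a GPU along with it.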
The servers stay off the cloud both for faster processing and to head off privacy concerns from customers. That’s also why the system doesn’t use facial recognition.
But systems like these will be able to do more than just facilitate a painless check-out. One thing employees realized early on was that the ID capabilities meant they could offer inventory management in addition to payment. They also added a theft deterrent system, because the cameras can piece together how a shelf looked before you came by, to deduce whether something has been stolen.
So of course, I tested it. And while it definitely noticed the package of Oreos I stuffed under my sweater, it had no clue about my Coke nab: I slid the bottle up my sleeve, and as far as the system knew, it was still sitting on the shelf.
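The shelf-diff idea behind the theft alert can be sketched simply. This is a hypothetical reconstruction of the logic the article describes: compare what the cameras saw on a shelf before and after a shopper passed, and flag anything missing that no tracked pick accounts for.

```python
# Hypothetical shelf-diff theft check: items gone from a shelf that were
# never registered as a pick event are flagged as possible theft.

def unaccounted_items(shelf_before, shelf_after, tracked_picks):
    """Return items that vanished from the shelf with no pick event."""
    gone = set(shelf_before) - set(shelf_after)
    return gone - set(tracked_picks)

before = {"oreo-14oz", "coke-20oz", "chips-9oz"}
after = {"coke-20oz", "chips-9oz"}
picks = set()  # the Oreos vanished with no tracked pick

print(unaccounted_items(before, after, picks))  # -> {'oreo-14oz'}
```

It also shows why the Coke trick worked: if the cameras believe the bottle is still on the shelf, `shelf_after` still contains it, nothing appears in the diff, and the check never fires.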
But as cost-efficient as a machine-vision-only system would be, it also brings real limitations on the kinds of stores that could actually use it. Forrester’s Brendan Miller is quick to point out that machine vision wouldn’t be as useful for telling different sizes of clothing apart.
“Imagine being in a department store, choosing jeans… and there are all these different sizes,” Miller said. “We know within a department store all that gets mixed together, and so having this kind of video technology, being able to identify the exact size and location of a specific jean in a messy shopping environment, it’s not quite there yet, but I think the technology could be augmented with things like RFID tags.”
The idea is that Standard Cognition will wrap this all up into a single package: machine-vision checkout, inventory management and alarm system. The alarm piece goes beyond shoplifting, too: employees are also training the AI to spot medical emergencies or whether someone has a weapon, and because the system recognizes objects in milliseconds, it could alert store managers just as fast.
It will also land first in a very specific kind of store. The system does best with consumer packaged goods, items that come whole and ready to go. Weighing bulk items in a full grocery store adds a layer of complication the founders aren’t ready to tackle just yet.
“We’re still figuring that out. We’ve actually talked to quite a few different grocery stores, and each one kind of has a different preference for what they want that experience to be for the shoppers,” Fisher said.
From here, employees need even more data, the kind you only get in a true store environment. They are scouting San Francisco storefronts but will likely fall back on the lab location, since it’s so near Santa Clara University. In about six months they’ll ask beta testers to shop the wall of goods for real. The plan is then to mine that data like crazy, improve, and hopefully begin installing machine vision checkout in other stores in little more than a year.