What Robots Can’t Grasp

These engineers are training the next generation of robots to pick up just about anything.

By Marty Graham, Contributor

There are some things robots just can’t grasp. Literally. And a team of engineers and data scientists at the University of California, Berkeley’s AUTOLAB is creating more of these unwieldy objects all the time. The effort isn’t an exercise in mechanical cruelty; rather, these weird-looking objects, which researchers Ken Goldberg and Jeff Mahler call “adversarial” objects, are part of a trial-and-error approach to helping the robots at AUTOLAB develop the know-how to pick up a range of oddly shaped items. And that’s an increasingly important skill these days.


Demand for robots has increased every year, according to the Robotic Industries Association (RIA). While retail giants and auto manufacturers have historically represented the highest demand, more companies outside the vehicle sector are beginning to install robots. Of the nearly 36,000 robots purchased in 2018, 16,702 were shipped to non-automotive companies—a 41 percent increase compared to 2017.

With this more pervasive installation of robots come expanded use cases beyond the repetitive, highly controlled tasks they’ve been relegated to in warehouses and on factory floors, such as loading, unloading, and shelving pallets of goods or welding automobile parts. But they have their limits.


While robots already complete precise assembly work, repeating a very limited set of tasks over and over again, other situations stymie them, such as shifting rapidly to sort and pick up random objects to fill a retail order. Consider, for example, the few seconds it takes for a human to remove a pizza from an oven and turn the oven off; uncork a bottle of wine, and find and fill a glass; grab a plate and slice the pizza; then place a slice on the plate. This series of actions requires grasping hard, soft, and even floppy objects, hot and cold objects, and liquids as well as solids. While humans instinctively understand how to grasp any object, even one they’ve never seen before, robots have to be taught this skill: the robot has to perceive the object with its sensors, model it appropriately, determine a strategy for picking it up, then execute the desired action. And while a commercial robot could likely be built to meet such demands, it would be expensive and would require extensive training. That’s where AUTOLAB’s work comes in.

Learning Through Failure

Goldberg, Mahler, and their post-graduate students began working on AUTOLAB’s Dexterity Network (Dex-Net) in 2015. The groundbreaking venture develops and refines robot “picking” strategies and, just as importantly, has improved the machine learning behind the picking calculations.

“Dex-Net can be used to train a robotic system for handling a variety of items without advance knowledge (e.g. CAD models, mass, or images),” says Mahler. “One of the advantages is that it can be rapidly adapted to different hardware systems consisting of various arms, grippers, and 3D depth cameras, enabling faster customization of robotic learning systems.”

The first iteration of Dex-Net involved creating a system for grasping one object at a time with parallel jaws (think two fingers or pliers). The team’s current work, Dex-Net 4.0, extends the system, training robots to grasp a wider variety of objects piled in heaps that make picking more challenging. Dex-Net 4.0 now includes both the parallel-jaw gripper and a newly added pneumatic suction arm, each with its own neural network. The robot’s central programming provides size and shape information via its sensor system, but lets the two arms’ separate neural networks decide whether an object should be handled by grip or suction.
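To make the grip-or-suction decision concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that each arm’s network can be treated as a function returning a quality score between 0 and 1 for a candidate grasp, and that the robot simply takes the highest-scoring candidate. The names `GraspCandidate` and `select_grasp` are hypothetical and are not taken from the Dex-Net code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Pose = Tuple[float, float, float]  # hypothetical (x, y, angle) grasp description


@dataclass(frozen=True)
class GraspCandidate:
    gripper: str    # e.g. "parallel_jaw" or "suction"
    pose: Pose
    quality: float  # predicted probability of a successful pick


def select_grasp(
    candidates: Dict[str, List[Pose]],
    scorers: Dict[str, Callable[[Pose], float]],
) -> GraspCandidate:
    """Score every candidate with its own gripper's scorer and keep the best.

    Each scorer stands in for one arm's neural network (an assumption of
    this sketch); in a real system it would be a trained model.
    """
    best = None
    for gripper, poses in candidates.items():
        score = scorers[gripper]
        for pose in poses:
            quality = score(pose)
            if best is None or quality > best.quality:
                best = GraspCandidate(gripper, pose, quality)
    if best is None:
        raise ValueError("no grasp candidates supplied")
    return best
```

Because both scores are read as success probabilities, they can be compared directly, so the choice between arms falls out of the same comparison that picks the best grasp for either arm.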

While AUTOLAB researchers applaud their advances, what really interests them are the failures: the unintentionally adversarial objects the Dex-Net robot couldn’t pick up or hold on to.

“Part of the AUTOLAB philosophy is to probe for failure modes that provide deeper insight into a method,” Mahler says. “That’s where the adversarial objects came from. The results behind Dex-Net were only possible with countless hours of meticulous experimentation, and healthy skepticism.”

All along, AUTOLAB has been designing and creating thousands of adversarial objects, some as virtual simulations and many others 3D-printed. The physical objects are small, around 10 cm, as they’re meant to thwart a robot with 5 cm grippers, Mahler says.

Some objects look like familiar shapes, but with a peculiar twist—like a cube where part of one surface has been shaved, creating a new plane that easily slips out of the grippers of a cube-picking robot. Others seem surfaced from a nightmare: melted, twisted five-legged objects. Regardless of appearance, all the objects look symmetric—or at least understandably proportioned—but, in reality, are not.

Quicker Picking

Perhaps AUTOLAB’s biggest breakthrough is that Goldberg, Mahler, and the thousands of adversarial objects they’ve created have dramatically reduced robot training time by using simulations instead of painstaking labeling and image-learning.

Artificial intelligence is already improving robots’ picking abilities. Sensors on the robots read SKU numbers, for instance, which allow the robots to look up the target object’s size and shape. This innovation, however, requires training the robots’ software algorithm with millions of images that were labeled, often individually, by humans, a slow and tedious process that reduces productivity.

When Dex-Net first began, the source of data was hand-labeled images or examples collected from a physical system. In both cases, researchers collected millions of data points in a process that required a year or longer. That’s no longer the case.

“The idea behind Dex-Net is to automate the collection of training data by using simulation,” says Mahler. “We use analytic models based on physics and geometry to automatically determine whether or not a robotic grasp would successfully pick an object up. We also use a technique called domain randomization to randomize parameters of the simulator such as object mass, friction, and camera parameters, which aids in transferring learning from simulation to reality.”

“The result is that we can collect millions of useful data points in less than a day,” he adds. In commercial settings, where time is money, that’s a powerful innovation.
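A toy sketch in Python can illustrate the idea Mahler describes: an analytic check labels each simulated grasp, and simulator parameters are randomized on every sample. Everything here is an assumption made for illustration, not the Dex-Net model: the check is a simplified two-contact friction-cone test plus a weight-holding test, and the parameter ranges and the 5-newton squeeze force are invented.

```python
import math
import random

GRAVITY = 9.81  # m/s^2


def grasp_succeeds(angle_a, angle_b, friction, mass, squeeze_force=5.0):
    """Analytic label for a two-finger grasp (simplified, hypothetical model).

    angle_a, angle_b: angle in radians between the grasp axis and the
    surface normal at each contact. The grasp holds if both angles lie
    inside the friction cone and friction at the two contacts can carry
    the object's weight.
    """
    half_cone = math.atan(friction)
    inside_cones = abs(angle_a) <= half_cone and abs(angle_b) <= half_cone
    holds_weight = 2.0 * friction * squeeze_force >= mass * GRAVITY
    return inside_cones and holds_weight


def sample_labeled_grasps(n, seed=0):
    """Generate n labeled examples with randomized physical parameters."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n):
        # Domain randomization: draw new friction and mass for every sample
        # (ranges are assumptions chosen only for this sketch).
        friction = rng.uniform(0.2, 1.0)
        mass = rng.uniform(0.05, 1.0)  # kilograms
        angle_a = rng.uniform(-math.pi / 4, math.pi / 4)
        angle_b = rng.uniform(-math.pi / 4, math.pi / 4)
        examples.append({
            "friction": friction,
            "mass": mass,
            "angle_a": angle_a,
            "angle_b": angle_b,
            "success": grasp_succeeds(angle_a, angle_b, friction, mass),
        })
    return examples
```

No human labels any example here. The physics check supplies the label, which is what lets a simulator produce training data far faster than hand labeling.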

The Dex-Net robot handily won the Amazon Picking Challenge—Amazon’s annual event that benchmarks picking progress—with an astonishing 200 to 300 picks per hour, a tremendous increase from the standard 70 to 95 picks per hour.

The team is working to ensure its small-scale efforts will scale up to handling appliances and cars, and it makes much of the training data and tools available in an open source library for training other robots.

Although they launched their own company, Ambidextrous, a year ago, Mahler and Goldberg are still at AUTOLAB, refining what they’ve learned and working on the next big idea: leveraging advances in deep learning. “We are developing new methods [of teaching] robots to perform tasks, such as surgical needle insertion, rope-tying, and assembly,” Mahler says.

Such teaching advances may elevate robots from repetitive tasks on assembly lines to more intricate tasks, like suturing in an operating room. And that puts entirely new use cases for these dexterous robots within grasp.