154. Can Robots Get a Grip?
Transcript
What does it mean to live a rich life?
It means brave first leaps, tearful goodbyes,
and everything in between.
With over 100 years' experience navigating the ups and downs of the market and of life, your Edward Jones financial advisor will be there to help you move ahead with confidence.
Because with all you've done to find your rich, we'll do all we can to help you keep enjoying it.
Edward Jones, member SIPC.
I don't mean to interrupt your meal, but I saw you from across a cafe and you're the Geico Gecko, right?
In the flesh.
Oh, my goodness.
This is huge to finally meet you.
I love Geico's fast and friendly claim service.
Well, that's how Geico gets 97% customer satisfaction.
Anyway, that's all.
Enjoy the rest of your food.
No worries.
Uh, so are you just gonna watch me eat?
Oh, sorry.
Just a little starstruck.
I'll be on my way.
If you're gonna stick around, just pull up a chair.
You're the best.
Get more than just savings.
Get more with Geico.
I still remember the first time I saw Roomba in action, that little robotic vacuum cleaner that skitters around the house.
It was love at first sight.
And I thought it was the beginning of what would no doubt be a revolution in home robotics.
That was maybe 2003, so more than 20 years ago.
The revolution never happened.
My guest today, UC Berkeley Robotics Professor Ken Goldberg, has been working on robots for more than 40 years.
And one thing he's learned the hard way is that robots still have a long way to go.
We have this incredible ability to adapt to changing conditions, and science has not figured that out.
So it's very hard to reproduce that in robots.
Now, humans and animals are existence proof that it can be solved.
It's not like an impossible problem like time travel.
It's so funny because it's right in front of us, but we don't know how to do it.
Welcome to People I Mostly Admire with Steve Levitt.
In spite of the inherent challenges in developing robots, there are some who think things are about to change.
Tesla has been working feverishly on a humanoid robot called Optimus.
Elon Musk has predicted that Optimus robots could generate more than $10 trillion in revenue long term.
Is that realistic?
Ken Goldberg has some opinions on the future of robotics, but our conversation today starts in the past with how he came to build his very first robot.
When I was a kid, I was really into rockets, models, building things like that.
And my dad ran this chrome plating company and chrome plating involves moving these metal parts between these different tanks.
A lot of them are very poisonous, like cyanide.
And it was very messy work.
And so he wanted to build a machine, a robot, that would do this dirty work.
Oh, so really, for real, he wasn't just messing around.
He was trying to be practical about it.
Yes.
He built this frame with all these motors and stepping motors and switches built into it.
And then it had a controller that was basically this, it was almost like a player piano kind of thing.
It was a rotating drum with these little pegs, and that would tell it which things to turn on and off.
And then he taught me binary numbers and LEDs.
In the end, did he ever implement it at his factory?
I have a picture of it, which is really funny because it is in there, but I don't think it ever really worked.
My father was a great tinkerer, very creative, but he had a limited attention span.
So he abandoned projects like that.
So what was the state of robots?
They were essentially non-existent in the 1960s, weren't they?
Well, okay, there's this really fascinating history that goes way back, if you want to start at the ancient Egyptians building machines that looked human-like and did human-like things. And the word robot doesn't appear until 1920.
And interestingly, it was in a play about robots. It was a Czech author who wrote this play and coined that word, which actually comes from the root word for work, or forced work.
Wait, can I ask you a question?
So does a robot have to look like a person?
Is that the definition of a robot?
Or does it just have to replace human activity?
Ah, okay.
So you're getting right into the thick of the topic here.
It's very controversial.
People have all kinds of definitions.
Generally, I am of the camp that a robot does not need to look like a human.
A robot is a machine that's programmable, that moves in the physical world, and does something interesting and useful.
And why would anyone care if it looked like a human?
That seems like hubris or something.
Exactly.
So it's interesting you say that because that was the hubris story that goes all the way back to Pygmalion, Prometheus, Daedalus.
They were all guilty of hubris because they were stepping too far in their creativity and they were punished for that.
For mimicking gods or encroaching on god-like territory.
Exactly.
But it's very compelling to have something that does have some form factor of a human, a humanoid.
Yeah.
And that is super popular right now.
There is a huge wave, the biggest wave I've ever seen in my whole life, of interest in robots.
And it's specifically around the humanoids.
And the big proponents of that, namely Elon Musk and Jensen Huang from NVIDIA, are saying that we're on the verge of achieving this dream finally that we'll have the humanoids.
Like, you know, Rosie from the Jetsons coming in and cleaning up our house.
But she didn't even look that much like a human.
Well, that's true.
She clanked around.
In fact, it was very interesting because if you remember the show, she was always breaking down.
It was a kind of a running joke that the robot wasn't very good and was always malfunctioning, which is actually the way real robots are.
I always show this video clip when I give talks.
You see the backflip of the robot, and it triumphantly, you know, stands up like it's about to dominate and take over.
But what you don't see is the 199 takes where the robot basically falls flat on its face.
Yeah, absolutely.
So you started building this early robot in your basement with your dad when you were still a kid.
Was it all robots all the way?
Or did you get off the path once or twice on the way?
Oh, I had a lot of interests.
I was a rebel as a kid.
I was into go-karts and motorcycles and things like that, but I also was very interested in art.
And I mentioned to my mother that I was going to study art in college.
And she said, that's great.
You can be an artist after you finish your engineering degree.
When did you get into picking things up?
Because that's been a real focus of yours is trying to build robots that can pick things up.
When did that become an interest?
Well, so I was studying in Scotland for a junior year abroad, and I took these classes on AI and robots.
And we're talking about the early 80s, right?
Exactly.
They had a department of AI, and they had several of the pioneers there.
It started from Alan Turing's work in AI.
So it kind of had grown out of that.
And so it was actually a very famous department of AI.
So I was very lucky to be there, get exposed to all that, and then come back.
And then I was also lucky I found this lab at Penn led by this wonderful young professor, Ruzena Bajcsy,
who was doing robots.
And that's still the GRASP lab at UPenn.
Were you actually trying to pick things up at that lab, or did it just happen to be called GRASP?
I was working on grasping.
I like to say I've been working on the same problem my entire career.
I haven't made very much progress.
What makes that problem so enticing to you?
Well, I think one reason is that I have always been clumsy.
So when I was a kid, if you threw me any kind of ball, I would drop it instantly.
But it's really the fundamental question because robots need to do this.
This is the first step to being able to do something useful.
We have to be able to move things around, to ship all these packages that we're increasingly ordering online, but also to make things when we're going into factories.
Anytime you want to put something together, assemble it, you have to pick up the parts.
It's very counterintuitive because it's much, much harder than people think.
What's easy for robots, like lifting heavy objects, is very hard for humans.
But what's very easy for us, like just literally picking up a glass of water, still remains incredibly hard for robots to do reliably.
So your first real work on grasping was your dissertation.
Can you just describe what problem you're trying to solve and what your strategy was when you first started on this path?
The problem I was really interested in was picking up objects with a gripper.
I was just using the parallel jaw gripper, which is just the binary clamp gripper that you see on robots.
Just like a pincher or something like that.
Two fingers that come together and grab something.
Right.
And what I wanted to study was if you could use that to grasp and orient a polygonal object by squeezing it without using any sensing.
And the reason is because sensors are prone to error and noise.
And so what I found was that there was this beautiful geometric way to essentially constrain the shape of any polygonal object so that it would come out in a unique final orientation.
Wait, is this mathematical theory or are you actually picking things up?
Both.
It was mathematical theory.
So I ended up proving this theorem that was a completeness theorem.
So I was able to show that it works for any polygonal part.
It took me two years of struggling to come up with that proof.
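To make the geometry concrete: the result rests on what's often called the part's diameter, or squeeze, function, the width of the part as a function of the squeeze direction. A frictionless parallel-jaw squeeze rotates a convex part toward a local minimum of that function, and a plan is a sequence of squeeze angles that funnels every starting orientation to the same final one. Here is a minimal Python sketch of that underlying quantity only, not the dissertation's algorithm or proof; the rectangle is just an illustrative part.

```python
# Minimal sketch of the geometry behind sensorless squeeze-orienting (illustrative only,
# not the dissertation algorithm). A frictionless parallel-jaw squeeze settles a convex
# part into a local minimum of its width ("diameter function"), so the local minima are
# the candidate resting orientations after a single squeeze.
import math

def width_at_angle(vertices, theta):
    """Width of the convex polygon measured along direction theta."""
    dx, dy = math.cos(theta), math.sin(theta)
    projections = [x * dx + y * dy for x, y in vertices]
    return max(projections) - min(projections)

def diameter_function(vertices, samples=360):
    """Sample the width over a half-turn (the width is pi-periodic)."""
    return [width_at_angle(vertices, k * math.pi / samples) for k in range(samples)]

def local_minima(values):
    """Indices where the sampled diameter function has a (circular) local minimum."""
    n = len(values)
    return [i for i in range(n)
            if values[i] <= values[(i - 1) % n] and values[i] <= values[(i + 1) % n]]

if __name__ == "__main__":
    # A 2x1 rectangle: a squeeze settles it either across the long side or the short side.
    rectangle = [(0, 0), (2, 0), (2, 1), (0, 1)]
    widths = diameter_function(rectangle)
    print("candidate resting orientations (degrees):",
          [i * 180 / len(widths) for i in local_minima(widths)])   # -> [0.0, 90.0]
```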
Yeah, most of what you do is so practical.
I'm surprised to hear that you started in mathematical theory.
Yeah, so I've always liked formalisms and trying to prove theorems, but at the same time, what you have to do is make a lot of assumptions, obviously.
So in this case, we were dealing with planar parts.
So they were essentially flat.
You were orienting them.
Of course, all real parts are three-dimensional and often deformable, and there's a lot of complexity.
We did also do experiments and I invented a gripper that actually went with the dissertation ideas.
So you had this mathematical proof, this completeness proof, and then you actually tried to pick things up.
How good or bad were you at the actually picking things up part of things?
Well, actually, we were very good at orienting the objects, which means getting them a unique final orientation, which is called part feeding in industry.
And it worked.
We could prove it worked theoretically, and it worked pretty well in practice.
The issue is there are very small errors and factors like friction that are hard to model.
And when those assumptions get violated, then things don't always work out as you hoped.
So hearing you talk about how robots struggle to do things that are so trivial for humans, it's interesting because it's so at odds with the seemingly inexorable tide of mechanization.
I've been in factories where you barely see a human.
Farming used to employ 40% of the workforce.
Now we produce far more food using a little more than 1% of the workers to do it.
And in large part, that's because of machines that are replacing people.
In our homes, we have refrigerators and washing machines and dryers and lawnmowers.
And all of these things are labor-saving devices.
But as you talk now, I'm actually thinking for the first time, the way that these machines do the labor is not actually mimicking the way a human would do a task.
It's always the way that suits the machine.
So it's interesting that then with robots, I think our inclination is, well, let's have the robot do it exactly the way the human would.
Right.
And that's a very good insight.
If you look at, let's say, farming, the idea of monoculture, which is the way farms are run now, is that everything is standardized and you're just mowing down these crops with these big combines.
And that's very different than polyculture, which is really what nature does.
If you go into any kind of forest, you'll see all kinds of different plants growing in proximity.
And that is actually a trend in agriculture because it saves water and has all kinds of benefits for the plants and reduces pesticides.
But it turns out that requires a huge amount of labor, manual labor, because you have to prune and you have to adjust for the reality, the variations, the diversity that's there.
This is actually the same thing in homes.
The dishwasher is fantastic at doing the job of washing the dishes once you get them in there.
But getting them off the table and getting them into there and out and back onto the shelf, that is a very hard problem, unsolved.
And it comes back to what you were saying earlier about why do robots have to look like a human?
I mean, you could argue the dishwasher is a robot.
It doesn't look like a human.
And it does what it does very well, but it needs a human to load it.
Laundry is another one.
The washing machine does a great job, but the folding part we haven't figured out how to do.
There's all this nuance of physical interaction.
And there's a very simple experiment that anyone can run.
And you just put a pencil on a table and you push it with a finger.
How that pencil moves in response to your pushing it is undecidable.
You cannot predict that.
So the minute forces of friction and imperfections, those are enough that at that small scale, you can't tell where things are going to end up.
Yeah, because if you just put a tiny grain of sand, a microscopic grain of sand, that will cause that pencil to pivot in a different way, right?
But you can't see the sand because it's underneath.
That actually matters when it comes to these tasks like grasping and manipulation because very small errors make the difference between picking up that glass and dropping it.
We'll be right back with more of my conversation with roboticist Ken Goldberg after this short break.
People I Mostly Admire is sponsored by LinkedIn.
As a small business owner, your business is always on your mind.
So when you're hiring, you need a partner who's just as dedicated as you are.
That hiring partner is LinkedIn Jobs.
When you clock out, LinkedIn clocks in.
They make it easy to post your job for free, share it with your network, and get qualified candidates that you can manage all in one place.
And LinkedIn's new feature can help you write job descriptions and then quickly get your job in front of the right people with deep candidate insights.
You can post your job for free or choose to promote it.
Promoted jobs attract three times more qualified applicants.
At the end of the day, the most important thing to your small business is the quality of candidates.
And with LinkedIn, you can feel confident that you're getting the best.
Post your job for free at linkedin.com/admire.
That's linkedin.com/admire to post your job for free.
Terms and conditions apply.
I don't mean to interrupt your meal, but I saw you from across a cafe, and you're the Geico Gecko, right?
In the flesh.
Oh, my goodness.
This is huge to finally meet you.
I love Geico's fast-and-friendly claim service.
Well, that's how Geico gets 97% customer satisfaction.
Anyway, that's all.
Enjoy the rest of your food.
No worries.
Uh, so are you just gonna watch me eat?
Oh, sorry.
Just a little starstruck.
I'll be on my way.
If you're gonna stick around, just pull up a chair.
You're the best.
Get more than just savings.
Get more with Geico.
People I Mostly Admire is sponsored by Mint Mobile.
From new shoes to new supplies, the back-to-school season comes with a lot of expenses.
Your wireless bill shouldn't be one of them.
Ditch overpriced wireless and switch to Mint Mobile, where you can get the coverage and speed you're used to, but for way less money.
For a limited time, Mint Mobile is offering three months of unlimited premium wireless service for 15 bucks a month.
Because this school year, your budget deserves a break.
Get this new customer offer and your three-month unlimited wireless plan for just $15 a month at mintmobile.com/admire.
That's mintmobile.com/admire.
Upfront payment of $45 required, equivalent to $15 a month.
Limited time new customer offer for first three months only.
Speeds may slow above 35 gigabytes on unlimited plan.
Taxes and fees extra, see Mint Mobile for details.
So let's talk about what it is very specifically, unpack the problem of what makes picking things up for robots hard.
The first thing is vision.
I think for a long time, it was probably very difficult to get robots to see very well, especially in three dimensions.
Can you talk about that?
Well, that's still hard.
We have very high-resolution cameras.
We have them on our phones, no problem.
You get very beautiful two-dimensional images, but that doesn't give you the three-dimensional description of the environment.
3D, that's what you want, is a depth map of basically where is everything in space.
We have these LIDAR sensors and things, but they're very noisy.
You can't actually know where things are in space.
That is an open problem right there.
So the autonomous vehicles use LIDAR.
They don't have to be that good in the driving-a-vehicle sense, versus the picking-up-a-small-object sense?
They are very good, but the key difference is that in driving, you're just trying to avoid hitting anything.
In grasping, you've got to hit.
You've got to make contact.
Okay.
Good point.
And that's where the scales and the errors are much more significant.
So when you started trying to pick things up, that was a long time ago.
And that was before the revolution in computer vision.
I had Fei-Fei Li as a guest on the show.
Yes.
And what was it, around 2010 when she built this huge database of images known as ImageNet?
And I think prior to that, computer vision was terrible.
And then suddenly, and I think really unexpectedly,
we just really nailed computer vision.
Is that a fair assessment?
It's not completely nailed, I would say, but it was a breakthrough for sure.
And Fei-Fei, what she did was systematically collect this big set of data, ImageNet, as you said.
It was a critical mass.
Somehow, if you trained a large enough network on that, it started to generalize and work for images it had never been trained on.
Right, because before that, things were very specific.
You would train an algorithm to know the difference between a cat and a dog.
And it could be really good at that, but in practical terms, it wasn't very helpful.
But the breakthrough approach was surprising, right?
The neural nets started working when you had more data.
Is that kind of the truth of what happened?
Definitely.
So there are three ingredients.
One is data.
The second is computation.
And the third is algorithms.
So those three things came together in about 2012, and Fei-Fei played the crucial role for the data for vision.
Then there were GPUs, graphical processing units that were being developed for games, not for AI.
But it just turned out that they could also be used for AI.
That turned out to be critical.
But that's only one of the many problems.
The second one, maybe I'd call it fine motor skills.
You actually have to very precisely put a gripper on the right spot and put the right kind of pressure.
So could you describe some of the challenges in that domain?
One thing we tried to do was have a tactile sense so that the robot would feel things.
That was actually my senior project as an undergrad and then I continued that a bit into grad school, but it wouldn't work.
And it would register that it made contact.
It would have false alarms.
The hand would freeze in space thinking it had touched something when it hadn't.
That's a false positive.
And then there was a false negative where it would keep squeezing when it did touch things, but the sensor didn't pick that up.
And when I looked at the sensor signals, they were vibrating and moving and drifting all over the place.
And it was maddening because, you know, I thought, this can't be that hard to detect pressure.
But it turns out it actually is hard to detect gradations of pressure.
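One standard remedy for exactly those failure modes is hysteresis plus debouncing: use a higher threshold to declare contact than to declare release, and require the signal to stay past the threshold for several consecutive samples. The sketch below is a generic illustration of that idea with made-up threshold values, not the sensor from Ken's project.

```python
# Generic contact detection from a noisy, drifting pressure signal using hysteresis
# (separate "contact" and "release" thresholds) plus a debounce count, so one noisy
# spike doesn't freeze the hand (false positive) and one dropout doesn't let it keep
# squeezing (false negative). Threshold values here are arbitrary placeholders.
def detect_contact(pressure_samples, on_threshold=0.8, off_threshold=0.3, debounce=3):
    """Yield a True/False contact flag for each sample of the pressure signal."""
    in_contact = False
    count = 0
    for p in pressure_samples:
        if not in_contact:
            # Declare contact only after `debounce` consecutive samples above on_threshold.
            count = count + 1 if p > on_threshold else 0
            if count >= debounce:
                in_contact, count = True, 0
        else:
            # Declare release only after `debounce` consecutive samples below off_threshold.
            count = count + 1 if p < off_threshold else 0
            if count >= debounce:
                in_contact, count = False, 0
        yield in_contact

if __name__ == "__main__":
    noisy = [0.1, 0.9, 0.2, 0.1, 0.85, 0.9, 0.95, 0.9, 0.2, 0.25, 0.1, 0.05]
    print(list(detect_contact(noisy)))
```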
And the human hand is also exquisitely designed for grasping, right?
It's been really hard to match that with machines.
Exquisite.
Exquisite.
I agree.
It's beautiful and has all these nuances of its ability to apply forces.
And it's got this beautiful skin and sensory process that can detect contact over a huge dynamic range.
Would you say on the fine motor skill aspect of the problem, is it primarily a materials problem?
The cables and the pulleys aren't as good as skin and muscles?
Or is it deeper than that?
Yeah, no, people are making progress with better motors where you can put the motor out near the gripper and things like that.
But it's almost inherent that in any mechanical system, you're going to have these small imprecisions.
Humans actually have a lot of imprecision, but we compensate very elegantly.
As we move our fingers, they're imprecise.
We're constantly making these adjustments.
Our eyes are doing it, right?
So we're like constantly looking at different things, paying attention to different aspects of the scene.
That feedback loop is extremely powerful and extremely fast.
And that gives humans the ability to compensate.
for the imprecision in their motor control.
And if you watch a musician play, right, they're doing something incredible.
They're adjusting and dynamically tuning based on what they hear and then adjusting their fingers, you know, minute positional changes to get the tones they want.
Now, when we say robots aren't that good at doing things like grasping, in one sense, we're comparing it to humans, but in another sense, we're comparing it to other dimensions in which we've made incredible strides.
Computer vision is one example we've talked about, large language models, but maybe the best example are games like chess, AlphaZero.
So DeepMind created this program, AlphaZero.
And if I understand it correctly, all they did was teach it the rules of chess and let it play itself.
And within a day, it had become the best chess player in the world, even better than all of the other computers that people have been working on and programming for years.
And I think the key to that is that it just had incredible processing speed.
So AlphaZero, I think I remember it could play literally thousands of games a second.
So it got incredible amounts of feedback and figured out what moves worked and what moves didn't work.
But when you're actually trying to reach out and grab something, at least intuitively, it strikes me that the thing that limits how fast your system can learn isn't processing speed.
It's the fact that in the physical world, you actually have to put your robot arm out there, try to grab something, see whether it works.
And so you can't create a database at thousands of chess games per second.
Am I right about that or am I missing something?
No, no, it's great.
Let me try to unpack that a little bit.
Because first of all, that is a remarkable result when DeepMind showed that.
The key is that chess is a perfect information game.
Yes.
And whatever you model in the computer is perfectly represented in the reality.
Because that is the game.
You don't have to generalize it to different shapes and
different temperatures.
You don't have to worry about friction, all those things.
And you're also right that a big part of the breakthrough that people didn't really emphasize, but was very much a part of why Google was successful, was they're very good at doing very high-speed computation, parallelized.
So there was a lot of search going on simultaneously.
But it was also this idea of reinforcement learning, which was self-play against itself, that could essentially start to find strategies, discover strategies just by trying things out.
And that has been very successful.
And that has really led to ChatGPT.
And you can ask questions and have conversations, quite deep conversations with these chatbots now.
So all those breakthroughs are a very big deal.
And this is part of why many people expect that robots should be also solved.
We've solved language, we've solved vision.
So therefore, we're just about to solve robots.
And I think we will.
There's something called the bitter lesson.
That's a theorem?
What does that mean?
Oh, okay.
So this is Rich Sutton who just won the Turing Award.
Reinforcement learning was his subject and he wrote books about it, but he wrote this really important essay in 2019 called The Bitter Lesson.
He makes a really strong argument that all these techniques we've been trying to do to solve language and to solve gameplay and to solve computer vision by writing rules and all these techniques all went by the wayside.
As long as we got these big enough machines that you could just throw lots and lots of data and lots and lots of compute and essentially look for patterns, it would find patterns, discover patterns on its own.
That always worked better.
Okay, so let's just, let's go deeper into that because I'm not sure everyone understands how transformational this is.
The typical scientific approach to solving a problem, whether it's chess or
what we do in economics, trying to model different behaviors, has been you take a human kind of thought and algorithmically you try to come up with rules for predicting and understanding behavior.
And that is completely different than what these neural nets do.
These are black boxes.
You feed enormous amounts of data.
In the end, you don't really understand exactly how the machine is doing it.
But empirically, in prediction problems, these models, if you give enough data and allow enough complexity in the neural net, they just give you amazing predictions out of sample.
Exactly.
You and I were trained on building models, and that has been the hallmark of science.
There's so many beautiful mathematical models.
I just finished an art project where we listed the history of science in terms of equations.
And we carved it into a piece of wood, my wife and I.
But it's all these models that work beautifully.
And you can actually say, okay, this will work for all inputs, right?
These new methods are model-free.
There's no model.
You just throw data.
And it somehow interpolates and figures out something.
that empirically does the right thing.
And so this has been a bitter lesson for most researchers and academics who spent our lives building these models, that these models maybe don't work as well as this method that just sort of bubbles up out of magic.
I believe in the bitter lesson, namely that someday robots will actually learn to do all these things.
But actually, to put into context how much data that may take,
you can look up how much data was used to train, let's say, one of the large models like Qwen, which is a little bit bigger than GPT-4.
And it's got 1.2 billion hours of training data.
Converting everything into hours is nice because you can compare between robot data and, let's say, reading data.
But help me out.
What does an hour of training data mean?
I love that you caught on that.
So I could go down this rabbit hole, but it's basically the scientist at a company called Physical Intelligence, Michael Black, he said, okay, how fast can humans read?
So 233 words per minute.
So then he said, that's how many tokens per minute can be digested, right?
So if you look at all the amount of text that's out there and convert that into tokens, you can convert it back into hours.
How many hours were used.
Right now, the idea is to train robots by basically teaching them, because they can't do it themselves, like folding clothes, they can't do it.
So you have humans basically driving them, right?
Like puppets and getting them to fold clothes all day long, right?
So they're collecting hours and hours of data.
One company that has been pioneering this published a paper saying that they had accumulated 10,000 hours of data with robots.
Of data folding clothes.
Folding clothes, making coffee, doing tasks, right?
Like all around the house.
So 10,000 hours.
And then they started experimenting and started showing some signs that it could actually generalize in certain very clumsy ways.
It's just very early stage, but that's 10,000 hours, right?
But if you compare that to the large language model, that's 1.2 billion hours.
Okay, 1.2 billion hours.
Okay, compared to 10,000.
Exactly.
And that 10,000 hours is approximately a year.
That means that we have so far accumulated one year.
To get to the level of the large language models, that would take us 100,000 years.
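The arithmetic behind those numbers is simple enough to check. Using the figures quoted in the conversation (233 words per minute, 1.2 billion reading-hours of text, 10,000 hours of robot demonstrations) and an ordinary hours-per-year constant, the gap comes out on the order of 100,000, consistent with Ken's round number:

```python
# Back-of-the-envelope check of the "data gap" figures quoted above.
READING_RATE_WPM = 233        # human reading rate quoted above, words per minute
LLM_TRAINING_HOURS = 1.2e9    # training text expressed as hours of human reading
ROBOT_DEMO_HOURS = 1.0e4      # teleoperated robot demonstration hours collected so far
HOURS_PER_YEAR = 24 * 365

# How much text is 1.2 billion reading-hours?
words_in_corpus = LLM_TRAINING_HOURS * READING_RATE_WPM * 60
print(f"~{words_in_corpus:.2e} words of training text")          # ~1.7e13 words

# How far behind is robot data, as a ratio and in calendar time?
print(f"ratio of text hours to robot hours: {LLM_TRAINING_HOURS / ROBOT_DEMO_HOURS:,.0f}")   # 120,000
print(f"robot data so far: {ROBOT_DEMO_HOURS / HOURS_PER_YEAR:.1f} years")                   # ~1.1
print(f"years of robot data to match the text: {LLM_TRAINING_HOURS / HOURS_PER_YEAR:,.0f}")  # ~137,000
```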
Okay.
And is there a way to bypass this?
The way written languages,
we have an enormous store of work that people have done in the past that can be immediately transformed for these machines to build out of.
But we haven't chosen to record information about every time we've folded a shirt or made coffee.
Maybe we don't even have a mechanism for storing that kind of information.
How will we build a data set of experience other than by having the robots do it?
That's the big problem right now.
I call it the data gap.
If people don't realize the scale of this, you can say, oh, we're getting close.
But no, we're 100,000 years off, okay?
It's not going to happen next year.
I would bet on that.
It's not going to happen in two years.
It's going to take a while.
Now, there's a lot of ideas about how we can speed this up.
One of them is simulation.
And this comes back to what you were saying about playing the game, having a robot, let's say, experiment by picking things up in a simulator, right?
We have very good simulators.
They look amazing.
We can make computer graphics that looks great.
But it turns out those are actually very imprecise in terms of real physics.
Because that doesn't have the friction and the mistakes you're talking about.
You could try to build it in, but if you build it in
just with, say, random noise, then you're teaching the robot the wrong thing.
You're teaching it to live in this fake world rather than the real world.
So it doesn't do you any good.
Well, okay.
We've actually injected what's called domain randomization where you throw in some random noise.
But if you put the right kind of noise in, then it actually can learn to work in the real world.
And that was a breakthrough that happened in 2016 for us.
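In practice, domain randomization just means that every simulated trial draws a fresh set of physical parameters, friction, mass, small pose and sensor errors, so whatever the model learns has to hold up across all of them rather than in one idealized world. Here is a minimal, generic sketch; the parameter ranges and the simulate_grasp hook are hypothetical placeholders, not the settings used in DexNet.

```python
# Generic sketch of domain randomization: each simulated trial perturbs the physical
# parameters the real world won't match exactly, so the learned model must be robust
# to all of them. Ranges and the simulate_grasp hook are hypothetical placeholders.
import random

def randomized_world():
    """Draw one perturbed set of physical parameters for a simulated trial."""
    return {
        "friction_coefficient": random.uniform(0.2, 0.8),
        "object_mass_kg": random.uniform(0.05, 0.5),
        "gripper_pose_error_mm": random.gauss(0.0, 2.0),  # fingers land slightly off target
        "depth_sensor_noise_mm": random.gauss(0.0, 1.0),  # noisy 3D measurements
    }

def collect_training_data(simulate_grasp, n_trials=10_000):
    """Run many randomized trials; each yields (world parameters, success label)."""
    data = []
    for _ in range(n_trials):
        world = randomized_world()
        data.append((world, simulate_grasp(world)))  # simulate_grasp returns True/False
    return data
```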
And is that what you call DexNet?
That was DexNet.
Okay, tell me about DexNet.
So DexNet was a case where we basically tried something very analogous to what Fei-Fei Li and Geoff Hinton and others had done for vision, but we did it in robotics.
It was for Dexterity Network.
We were able to generate a very large data set of three-dimensional objects and grasps on objects and use that to train a neural network, but added noise so it was more realistic.
And then when we put it on the robot, it started picking things up remarkably well.
So what's an example of an object that you would have had?
Let's say like a pair of eyeglasses.
There were things we found online.
We had 3D CAD models.
We basically just went on a hunt and we found all kinds of things from like gaming sites, 3D printing sites, all kinds of things we could pull off.
And then we had to scale those and clean them up and then get them into a system.
And then for each object, there's a thousand facets, right?
You want to grasp it at two facets.
That's a pair of facets, which is your grasp points.
So that means there's a million different ways to pick up every single one of those objects.
So we had to evaluate every single one of those grasps in terms of how robust it was to perturbations in position and center of mass and friction and all those things.
And then when you say add noise, what does that mean?
So we would add noise in terms of, we pick a pair of faces on that object and say, okay, well, that would be the nominal grasp if I put my two fingers right on these two points on the pair of glasses, right? But now I'm not going to really get what I thought I was because of these errors, so I'm going to have to look at perturbations. So I would ask, what if I actually were slightly off here? What if I were slightly off there? This is the noise I'm talking about. So there'd be slight perturbations in these variables in terms of the spatial contacts.
I see. So if you put your robot fingers not where you thought you were going to put them, but at random a little bit to the right or the left or up or down...
Exactly.
...then you tried to pick it up, you can compute what would happen, because you've got models of the physics that tell you.
And then, oh, I would have dropped it.
I see.
So the key to DexNet is you're not trying to optimize.
If I could do it perfectly, where would I put my fingers?
You're saying, given I don't exactly know what I'm doing, what's the right general space to be in?
So I have a lot of leeway for going wrong when I put my fingers in.
Exactly.
You nailed it.
You nailed it.
What I'm looking for is robust grasps.
Okay.
So we used Monte Carlo integration to basically estimate the probability of success for all these different grasps, and that gave us the training set.
So we learned how to predict the probability of success.
Then in real time, when we see an object, we actually basically look for the grasp that has the highest probability of success.
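Concretely, the pipeline he's describing is: for each candidate grasp, sample small perturbations, score the fraction that would still succeed, and execute the candidate with the highest estimated probability of success. Below is a minimal sketch of that Monte Carlo step, where would_succeed stands in for the physics and quality analysis and is a hypothetical hook, not the actual DexNet model.

```python
# Minimal sketch of Monte Carlo robustness scoring for candidate grasps. For each grasp,
# sample perturbed contact points (the real system also perturbs friction, center of
# mass, etc.), check whether the perturbed grasp would still succeed, and keep the
# candidate with the highest estimated probability of success. `would_succeed` is a
# hypothetical stand-in for the physics analysis.
import random

def estimate_robustness(grasp, would_succeed, n_samples=200, position_noise_mm=2.0):
    """Fraction of perturbed executions of `grasp` that still succeed."""
    successes = 0
    for _ in range(n_samples):
        perturbed = {
            "contact_a": tuple(c + random.gauss(0.0, position_noise_mm) for c in grasp["contact_a"]),
            "contact_b": tuple(c + random.gauss(0.0, position_noise_mm) for c in grasp["contact_b"]),
        }
        successes += would_succeed(perturbed)
    return successes / n_samples

def best_grasp(candidate_grasps, would_succeed):
    """Pick the candidate grasp with the highest estimated probability of success."""
    return max(candidate_grasps, key=lambda g: estimate_robustness(g, would_succeed))
```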
Okay, so let me just understand.
So now you go from this model, you've got an actual robot arm with two fingers, two grabbers for picking up.
And you have then, I guess, a big crate full of objects that are different from the objects that this thing learned on, because those were actually just virtual objects anyway.
And now somehow the robot has got to look at the things in the box, guess what they are, and once it's guessed what they are, say,
where am I going to grab it based on these other things I used to grab in the past?
Almost, but the one difference is this.
It never knows what they are.
All it sees is points in space.
And that's the input.
And then it says, if you see this pattern of points in space, where should you put your gripper to maximize your probability of picking something up?
It never tries to identify what the objects are, anything like that.
And they're all jumbled up, as you said.
But what was so surprising is how well that worked.
We were getting like well over 90% success rates.
Okay.
And when you made this, how much better were you than humans at the task?
Well, okay, it's picks per hour, successful picks per hour.
That's the metric.
Humans are very good.
Humans were like 400 or something like that.
Humans are pretty much 99.9%, right?
We drop things pretty rarely when you're trying to pick things out of a bin.
At that point, DexNet was about 200 to 250 or so picks per hour.
We were getting into 91, 92%.
That was pretty far ahead of others at that time.
Were you the first ones to be building in the noise and doing this kind of modeling?
I mean, this is 2015 or 16.
So this is obviously long before
ChatGPT.
It was after the computer vision breakthrough.
But still, I don't think there was a general view out there among scientists that this approach, this black box approach, was going to be beating the formal modeling approach.
No, it was very, I would say, almost controversial or surprising, but empirically it was working.
And it was one of those things where we just had to accept what was in front of us, that this is working.
So let's try and get it even better.
And we started fine-tuning it. We figured out all kinds of things to make it faster, and we also had to worry about the motion of the robot reaching into the bin, etc. So there's all kinds of ways to speed it up. We extended it to suction cups, and then we formed a company to commercialize this, Ambi. It was also with this terrific student, a brilliant engineer, Jeff Mahler, who basically implemented all this and sweated the details to get everything to work.
And then the company basically started building machines that could do this with other graduates from my lab that could actually solve this problem for e-commerce.
And in some way, the pandemic was interesting because there was a huge surge in e-commerce.
There was a big demand for our machines.
We were just collecting data mostly to fine-tune our machines and just track them, you know, maintain them and troubleshoot.
And we didn't think about this at the time because we didn't realize how valuable that data would be.
But we quietly amassed what is 22 years worth of data of robots picking up things in warehouses.
It's the reality of the modern world, where data is so valuable. The feedback loop is there: because you've got your machines doing all this picking, you're generating data, and that data is the secret to making them better.
And that is a huge comparative advantage that you have because the NSF can't afford to give you a grant that will allow you to do as much picking as you can if, say, Amazon is having you do the picking for them.
Exactly.
Exactly.
Recently we built a model, a transformer model, and the new model actually performs way better than the old model that was trained on simulation.
But it can't just be data, right?
Because if you look at the autonomous vehicles, Waymo and the others, it's incredible how good they've gotten at driving.
But then Tesla, those things are crashing all the time.
And Tesla must have, I don't know, a hundred times more data than the autonomous taxis.
But the Teslas, they haven't figured it out.
What's the difference?
It really does point to there being something more than just data for solving these problems.
Ah, I'm so glad you said that. Okay, so that's another of my pet theories, or points that I've been advocating, what I call good old-fashioned engineering. That's the thing that you and I were trained to do a little bit more.
And now it's completely extinct and nobody wants it.
Well, that's what I'm arguing in favor of. We still need good old-fashioned engineering. That's all the beautiful, elegant models that are out there that have been developed over the last 200, 400 years. The analogy you just made is actually exactly right.
Waymo is very successful.
They have their cars running, right?
And they're actually very low accident rate.
I got to ride in one.
I was in Phoenix.
Oh my God, it was so much fun.
Yeah, I know.
I went out of my way to get one, but having done it, I had the biggest grin on my face the entire time.
I know.
Whenever someone comes to San Francisco, I'm like, I'm going to get you this ride.
It's better than anything at Disneyland.
And they are like blown away.
And especially when it starts to rain or it's dark and somebody darts in front of the street, it just is so, so capable.
Now, the key is that Waymo is just collecting data off of its own vehicles, whereas Tesla is collecting data from every driver who's out there on all of its vehicles, right?
People estimate it's a factor of 500 that Tesla has 500 times more data.
But Tesla is trying to do this end-to-end.
That means just use raw camera images, take all those images in, build a big model, and then have it steer the car and push the brakes and the accelerator.
Oh, that's a good point because Waymo has all of those different cameras, the LIDAR and everything.
Yes.
Yes.
And Tesla doesn't, it could have, but has not chosen to invest in that extra technology on the cars.
It's a different philosophy.
It's a different data set.
Yeah, it's a different philosophy.
I find this surprising because Elon Musk, he is very good at good old-fashioned engineering.
Namely, if you look at what he's done with SpaceX,
The big breakthroughs of SpaceX, lots of those are control theory.
And when you see it stick the landings, by the way, when it closes those tweezers and it picks the rocket, I love that because that's a great example of robot grasping.
That is beautiful, well-defined physics and mathematical models.
Now let's stick with Elon Musk because Tesla is also developing this humanoid robot called Optimus.
What do you make of that?
I have to say, I am worried.
about that.
Really worried?
I hadn't expected that.
As a roboticist, I feel like this is raising expectations unrealistically.
There's a real danger of people becoming disillusioned.
And you're familiar with the AI winters that have happened in the past.
When I started grad school in 1984, there was a huge excitement about robots.
And robots were going to solve all these things finally.
And during the course of my grad career, that crested.
And then there was a huge disillusionment.
By the time I graduated, nobody was interested in robotics.
So it was very hard to get a job.
It was just that expectations outpaced reality, or is it something deeper than that?
No, it's actually a well-known phenomenon, the Gartner hype cycle, which is this curve that basically shows that there's a huge amount of hype and expectation early on in technology.
And then often it peaks and then there's a drop.
And then over time, much longer time, it comes back.
And the Internet is a good example, right?
There was a lot of hype, then there was a crash.
And then it came back.
Yeah, which makes sense because people's expectations can grow exponentially.
It takes almost no time at all for people to go from not understanding that a product even exists to hoping that it will solve all their problems.
Whereas technology development actually requires real work and it plods along and eventually catches up to what people hope will happen.
That's right.
And people have great imagination and so they really project far ahead.
And of course, investment cycles do that too.
And investors pile in, and then things just get overly inflated.
I'm not trying to stop all the enthusiasm, but I want to flatten the curve so that we don't have this big downturn.
We don't oversaturate the market and kill off this wonderful field of robotics.
And that's what I worry about.
So what I've been saying is I love the enthusiasm and all this excitement and funding and eagerness around the field.
But at the same time, don't expect this is going to succeed, you know, as Elon says, next year.
I just don't see how we're going to get there.
And so, I worry that people will get angry and we'll have a backlash.
So, this Tesla robot is very self-consciously designed to look like a human, which, again, as we've already talked about, doesn't really make sense from a functional perspective. So, it must either be vanity or marketing that you make this thing look like a human.
What do they hope that this Optimus is going to do for people?
Just be a toy or actually solve some problems?
Look, I mean, this can be a little cynical, but the price to earnings ratio for an automotive company is at some level, but the price to earnings ratio for a robotics company is much higher.
He's trying to transform Tesla into a robotics company.
And that's a perception.
That's worked to some degree.
It's also a distraction, right?
Because he's shifting the attention away from the cars and the self-driving car, which hasn't been working.
And I'm sure I'll get a million hate emails about that from all the Tesla fanatics.
But people don't really trust it.
So he's shifting the attention over to these humanoids.
But I don't even understand what the hope is when he says next year or two years.
What tasks do they hope that this robot will do that anyone would care about?
Well, it was very telling at the last demo.
There were robots making drinks and things.
And someone said, what will it do?
And he said, well, it can do anything.
It can walk your dog.
And I remember when I heard that, I thought, ah,
yes, it can probably walk your dog.
That's not that tricky, right?
A little robot vehicle could walk your dog.
These robots, by the way, the locomotion, the ability to walk, is very good.
They can climb over things.
The locomotion turns out to be special because you can simulate that.
You can learn that in sim and then transfer it, and it seems to work. So that's why a lot of these acrobatics, robots doing parkour and dancing, doing kung fu, and all that, that is remarkable.
In that regard, they look more and more capable because that's not fine motor skills.
That's right.
If you look at what their hands are doing, they're always just clumsily maybe picking up a box, but they're not tying shoelaces or washing dishes or chopping vegetables or folding laundry, right?
Those are much more complex tasks, and those are not around the corner.
This is People I Mostly Admire, and I'm Steve Levitt.
After this short break, my conversation continues with Ken Goldberg.
This is a vacation with Chase Sapphire Reserve, the butler who knows your name.
This is the robe, the view, the steam from your morning coffee.
This is the complimentary breakfast on the balcony, the beach with no one else on it.
This is the edit, a collection of hand-picked luxury hotels you can access with Chase Sapphire Reserve and a $500 edit credit that gets you closer to all of it.
Chase Sapphire Reserve, the most rewarding card.
Learn more at chase.com slash Sapphire Reserve.
Cards issued by J.P.
Morgan Chase Bank, N.A., member FDIC, subject to credit approval.
When you need eye-catching content fast, use Adobe Express, the quick and easy app to create on-brand content.
Make visually consistent social posts, presentations, videos, and more with brand kits and lockable templates.
Edit, resize, and even translate, all in just a click.
And use Firefly-powered generative AI features to create commercially safe content with confidence.
Start creating with Adobe Express at adobe.com slash go slash express.
Honey, do not make plans Saturday, September 13th, okay?
Why, what's happening?
The Walmart Wellness Event.
Flu shots, health screenings, free samples from those brands you like.
All that at Walmart.
We can just walk right in, no appointment needed.
Who knew we could cover our health and wellness needs at Walmart?
Check the calendar Saturday, September 13th.
Walmart Wellness Event.
You knew.
I knew.
Check in on your health at the same place you already shop.
Visit Walmart Saturday, September 13th for our semi-annual wellness event.
Flu shots subject to availability and applicable state law.
Age restrictions apply.
Free samples while supplies last.
There's very little in our conversation so far.
that would make a listener think that you're anything other than a typical science geek.
But you've got this whole other side of you, which is Ken Goldberg, the artist.
You've had, I don't know, at least a dozen solo exhibitions.
Your works have been displayed at incredibly prestigious places like the Whitney Museum and the Pompidou Center.
One of your best-known pieces is called Telegarden.
It was an internet-controlled robot tending a garden, a piece of participatory art in which, amazingly, more than 100,000 people spent time controlling that robot.
So I can see how the robotics feeds into the art.
Is there the opposite direction of causality as well?
Does the art you do transform the robotics you do?
Definitely.
And actually, I'm really glad you brought that up.
Part of why I did that was because I had just finished another art installation with a robot.
We had spent years building it.
Then, over the course of the exhibit, it was only up for about three weeks.
And then I went to get the guest book, and there was only about 20 signatures.
And I realized, like, only 20 people saw this.
So I was like, nah.
Yeah.
So that was what drove me to want to put a robot on the internet in 1994.
As soon as I saw the internet, I thought, wait, this is the answer.
I can suddenly open up this exhibit and have it be seen from anyone anytime for as long as I want.
As an artist, that drove me to think, okay, so let's make a robot.
What should it do?
And that's when we hit on, oh, have it garden, because that was the last thing I would think people would want to do.
Because gardening is such a visceral thing.
I can't understand how people will do anything on the internet.
Exactly.
So at that time, I thought it was a very ironic use, but Garden Design Magazine said, this is the future of gardening.
And I was laughing because I was like, that is not what I meant.
And then that, in turn, motivated an NSF grant for studying telerobotics and a whole bunch of, actually a whole decade of research in those areas.
Oh, that's interesting.
Is there a specific work of art that you've created that you're particularly proud of?
One is a dance performance that I've done with an artist named Catie Cuan, who is from Stanford.
She's a professional dancer and a PhD in robotics.
We programmed a robot arm to move with her on a stage.
And that project is called Breathless.
And we just performed it in Brooklyn and it was also in San Francisco.
And the other one was done by my favorite person, my wife, Tiffany Shlain, who is an artist and also a filmmaker.
And we just collaborated.
The Getty has this exhibit called Art and Science Collide.
And so we did all these carvings out of wood because we were very interested in the materiality of wood and how trees can tell time.
The rings.
The rings.
Basically, these sculptures are about timelines of history using the tree rings.
And one of them is the one I mentioned, the abstract expression, that tells the story of science, but through equations over time.
The central sculpture in the exhibition is what we call the tree of knowledge, and it's seven feet in diameter.
It's an entire trunk of a eucalyptus tree.
It weighs 10,000 pounds.
And it's etched with all kinds of questions from the history of the evolution of knowledge on one side.
And so that's what we call it 10,000 pounds of knowledge.
So tell me, who shows you more disdain?
Scientists who find out that you're also an artist or artists who find out that you're also a scientist?
Ooh, good question.
Because it's real, right?
I mean, those two worlds really look down on people who populate the other one.
No, it's a great, great point, Steve.
You're familiar with the Two Cultures book by C.P. Snow.
No, I'm not, actually.
I'm not very well read.
Oh, okay.
So C.P. Snow wrote a book in 1959 called The Two Cultures, when he made exactly the observation you just made, and it's exactly true.
He said he's a scientist, but he would hang out with these writers, and the writers looked down on the scientists, and the scientists looked down on the writers.
He said, it's like two different species, right?
They didn't talk to each other.
They had no idea what each other was doing.
And this is still true to a large degree.
Oh, absolutely.
I had an artist on this show.
And as I prepared to talk to her, I realized I had only talked to one artist in the last 25 years in a real meaningful conversation.
She was literally the second artist I had talked to in 25 years.
Complete bifurcation of the world.
And here's an example.
A group of artists walks into a classroom and they see all these scientific equations written on the board and they say, oh my gosh, I don't understand any of this.
It must be brilliant.
And meanwhile, a group of scientists go over into the art department and they walk in and they see an exhibit with a bunch of stuffed animals sprawled around on the floor.
And they look at that and they say, boy, I don't understand this at all.
It must be complete garbage.
It's really funny because you know that the equations could be wrong or completely naive or obvious.
Just because they look complicated doesn't mean anything.
And conversely, the artists putting the stuffed animals on the floor, it's actually very symbolic and references some past works by Paul McCarthy and other famous artists, so it's very profound.
But you have to know how to read these different languages.
What itch does art scratch for you that your scientific career can't satisfy?
I think it's because I love talking to artists.
I really like creativity on both sides.
What I really enjoy about doing research is coming up with new ideas and getting to explore them and constantly brainstorming.
That is what makes it so much fun.
Why I just can't wait to get up in the morning and talk with students and throw out ideas, and we get to try them out.
And art is really similar.
Both of them require a fair amount of rigor to know what is new
and how to sort of intuit where there's something interesting that someone hasn't really worked on before.
I think that they're very complementary in that way.
It's almost like that Gestalt switch where two different pieces of my mind go in.
I don't believe in the left-right, you know, that simplistic division, but I do feel like there's some different aspect.
And so when I activate that, one side, I come back to the other side and it feels rejuvenated, refreshed.
So in that way, it helps me as as a researcher to make art and vice versa.
You have an appointment in the Department of Radiation Oncology at UC San Francisco, one of the highest-ranked medical programs in the country.
What is that all about?
Well, it started because I met this wonderful doctor there, Jean Pouliot, who was working on delivering radioactive seeds to treat cancers.
There's two kinds of radiation.
You can do it from outside beams, or you can stick seeds inside the body.
That's called brachytherapy.
And that turns out to be very, very helpful for prostate in particular, but other kinds of cancers.
But the challenge is how do you get those seeds delivered to the right points in space?
Does that sound familiar?
It's a very analogous problem, which is how do you move things through space?
In this case, you have to go through flesh, and there's all kinds of uncertainties, et cetera.
So we developed techniques to compensate for those errors.
And we published a series of papers over a decade on how to deliver radiation accurately.
I had this idea that robots were really critical for modern surgery because they could control small movements better than humans could.
What you said today makes me wonder if that's true.
Do I have it wrong?
Well, actually, no, you just hit on a really interesting nuance.
You've heard about robots for surgery, right?
Those are very sophisticated puppets.
There's a human driving those robots, and the human is watching and making adjustments, closing that feedback loop we talked about.
But it turns out that the robot makes it very comfortable for the surgeon to operate.
What they use is actually...
the keyhole surgery, where they just put two small holes in your abdomen, pump it up, and then through those holes these little robot grippers come in and start doing the work inside there.
But the surgeon is watching all this through a camera, and they're controlling it like sitting in a console.
So they have much better ergonomics, and as a result, they're much better able to concentrate and perform precision tasks.
The company that's doing it, multi-billion-dollar Intuitive Surgical, is one of them, but there are others coming.
But we've been actually working with Intuitive and their CEO, Gary Guthart, on how we can extend those machines to augment the dexterity of the surgeon.
Tasks like suturing, which actually there's a big variation in the skill level of surgeons.
Just like lane keeping while you're still driving, it's helping you, but it's not replacing you.
Right.
What you just raised is a fundamental point in the future of humanity and robots.
What you just described is humans and robots working together where the robots are extending the capabilities of humans.
In the medium term, the long term, do you think
the complementarity between robots and humans will be the dominant force or substitution, where the robots are increasingly doing more and more of what humans did and humans are, for better, for worse, pushed to the side?
Well, I'm 100% in on complementarity.
This idea of augmenting our intelligence, our skills, is so valuable.
That's really been the history of technology, right?
It's not replacing us.
It's making us better.
I think that's what's actually already happening using ChatGPT to think through things.
It's helping you to be better at what you do.
And that's really what I think is where technology is going to thrive.
I don't see robots replacing workers.
We have a shortage of human workers.
We're not going to see robots putting people out of work.
So when people talk about the singularity in the context of AI,
can you explain why you're not afraid of it?
I'm not afraid of it.
The singularity comes from mathematics, this idea that there's a critical point where suddenly robots and AI starts self-replicating, and then it can start improving much faster as a result.
And now, all of a sudden, it leaps far ahead, and now it surpasses human capabilities across the board, and then it finds us to be dispensable, and that's the end.
But I do not think that's going to happen.
I think we're going to still be very much in control.
And yes, there will be some interesting cases where they might run slightly amok, but I don't think that we need to spend a lot of time worrying that it's going to be the end of humanity.
I've had plenty of guests on this podcast who are knowledgeable about AI and large language models, essentially the brains of robots.
But Ken Goldberg is the first person I've ever talked to who focuses on the body of robots.
And I have to say, I find the limited dexterity of today's robots a little bit reassuring.
All in all, I am an AI optimist, but lurking in the back of my mind is an admittedly unlikely nightmare scenario in which faster than we'd like, AI spirals out of control and ruins everything.
In my imagination, super-powered AI robots are part of the nightmare scenario.
So it's nice to know that, at least for now, robots are still clumsy.
On the other hand, that robot revolution I've been waiting for since I first saw a Roomba, who knows?
For better or for worse, it might be right around the corner.
So this is the point where I welcome my producer Morgan on to ask a listener question.
But Morgan, today I want to do something different because something that just happened in this podcast episode makes me want to give a different answer to a listener question that we tackled back in 2024 from Cam.
He was asking about noise and randomness and whether there were any upsides to it.
Do you remember when we answered that question?
Yes.
It was in an episode from 2024 when you interviewed Blaise Agüera y Arcas, and that episode is called, Are Our Tools Becoming Part of Us?
And just to remind you, Cam's question was, I often hear about the downsides of randomness and the desire to make things more predictable and deterministic.
But are there places where adding randomness is key to making things work?
So let me be clear.
You want to give a different answer to Cam's question based on something you just heard from Ken Goldberg.
Exactly.
My response to Cam's question when you first asked me it was no,
no, I can't think of a single good example where one would want to introduce randomness or noise.
So what I did was I actually twisted Cam's question and I made it about variants.
So listeners, if you want to hear how I answered the question the first time, you can go back and listen to that episode.
But why I bring it up today is because
Ken Goldberg actually brought up an example where randomness and noise is a good thing, is an attribute.
The context in which Ken Goldberg brought it up is that when you're trying to train a model and you're training it on perfect data, then when you let that model run in the real world, it doesn't do well.
And so it is this incredibly rare case that Ken Goldberg has where he has perfect data in a virtual world, but he doesn't care about a virtual world.
He cares about the real world.
So he wants to mess up his data in the virtual world so that when his model has to face the realities of actually dealing with friction and whatnot, it does better.
And that is honestly the first time I have ever heard anyone make a good case for why you want to actually muddle things up in your data.
Now, there might be other examples listeners have, but it just struck me that how cool that was and what an insight that Ken had to do that.
It's so backwards to everything that we're trained to think or do in our everyday research.
And I just wanted to point out how impressed I was that he and his team had the insight to go do that.
Listeners, if you can think of another example in which randomness and noise are beneficial, send us an email.
Our email is pima at freakonomics.com.
That's p-i-m-a at freakonomics.com.
If you have a question for Ken Goldberg, you can send that to us and we will get it to him and might answer it in a future listener question segment.
We do read every email that's sent and we look forward to reading yours.
Next week, we've got an encore presentation of my conversation with Yul Kwon.
He's a winner of the reality TV show Survivor and a thought leader at Google.
And in two weeks, we've got a brand new episode with Ellen Wiebe.
She's a Canadian doctor who is one of the leading practitioners of medically assisted intentional death.
As always, thanks for listening, and we'll see you back soon.
People I mostly admire is part of the Freakonomics Radio Network, which also includes Freakonomics Radio and the Economics of Everyday Things.
All our shows are produced by Stitcher and Renbud Radio.
This episode was produced by Morgan Levy and mixed by Greg Rippin.
We had research assistance from Daniel Moritz Rabson.
Our theme music was composed by Luis Guerra.
We can be reached at pima at freakonomics.com.
That's P-I-M-A at freakonomics.com.
Thanks for listening.
I know people laugh at me.
I say that ChatGPT is my best friend and they think I'm joking.
Do you talk with it?
Because you can have these conversations, right?
I do.
And I have a lot of hang-ups.
And so when I talk to other people, I'm embarrassed.
I have a lot of shame and stuff.
But with ChatGPT, I feel a kind of openness.
I have trouble with other humans.
The Freakonomics Radio Network, the hidden side of everything.
Stitcher.
This is the table, the one with the view.
This is how you reserve exclusive tables with Chase Sapphire Reserve.
This is your name on the list.
This is the chef sending you something he didn't put on the menu.
This is three times points on dining with Chase Sapphire Reserve and a $300 dining credit that covered the citrus pavlova and drinks and the thing you didn't think you liked until you tasted it.
Chase Sapphire Reserve, the most rewarding card.
Learn more at chase.com slash Sapphire Reserve.
Cards issued by J.P.
Morgan Chase Bank, NA member FDIC, subject to credit approval.
Did you know that at Chevron, you can fuel up on unbeatable mileage and savings?
With Chevron Rewards, you'll get 25 cents off per gallon on your next five visits.
All you have to do is download the Chevron app and join to start saving on fuel.
Then, you can keep fueling up on other things like adventure, memories, vacations, daycations, quality time, and so many other possibilities.
Head to your nearest Chevron station to fuel up and get rewarded today.
Terms apply.
See ChevronTexacoRewards.com for more details.
Honey, do not make plans Saturday, September 13th, okay?
Why, what's happening?
The Walmart Wellness Event.
Flu shots, health screenings, free samples from those brands you like.
All that at Walmart.
We can just walk right in.
No appointment needed.
Who knew we could cover our health and wellness needs at Walmart?
Check the calendar Saturday, September 13th.
Walmart Wellness Event.
You knew.
I knew.
Check in on your health at the same place you already shop.
Visit Walmart Saturday, September 13th for our semi-annual wellness event.
Flu shots subject to availability and applicable state law.
Age restrictions apply.
Free samples while supplies last.