The Black Box: In AI we trust?
This is the second episode of our new two-part series, The Black Box.
For more, go to http://vox.com/unexplainable
It’s a great place to view show transcripts and read more about the topics on our show.
Also, email us! unexplainable@vox.com
We read every email.
Support Unexplainable by making a financial contribution to Vox! bit.ly/givepodcasts
Learn more about your ad choices. Visit podcastchoices.com/adchoices
Listen and follow along
Transcript
Support for this show comes from OnePassword.
If you're an IT or security pro, managing devices, identities, and applications can feel overwhelming and risky.
Trellica by OnePassword helps conquer SaaS sprawl and shadow IT by discovering every app your team uses, managed or not.
Take the first step to better security for your team.
Learn more at onepassword.com/podcastoffer.
That's onepassword.com/podcastoffer.
All lowercase.
Most AI coding tools generate sloppy code that doesn't understand your setup.
Warp is different.
Warp understands your machine, stack, and code base.
It's built through the entire software lifecycle from prompt to production.
With the powers of a terminal and the interactivity of an IDE, Warp gives you a tight feedback loop with agents so you can prompt, review, edit, and ship production-ready code.
Trusted by over 600,000 developers, including 56% of the Fortune 500. Try Warp free or unlock Pro for just $5 at warp.dev/topcode.
I went to see the latest Mission Impossible movie this weekend, and it had a bad guy that felt very 2023.
The entity has since become sentient.
An AI becoming super intelligent and turning on us.
You're telling me this thing has a mind of its own?
And it's just the latest entry in a long line of super smart AI villains.
Open the pod bay doors, HAL.
Like in 2001: A Space Odyssey.
I'm sorry, Dave.
I'm afraid I can't do that.
Or Ex Machina.
Ava, go back to your room.
Or maybe the most famous example, Terminator.
They say it got smart.
A new order of intelligence decided our fate in a microsecond.
But AI doesn't need to be super intelligent in order to pose some pretty major risks.
Last week, on the first episode of our Black Box series, we talked about the unknowns at the center of modern AI, how even the experts often don't understand how these systems work or what they might be able to do.
And it's true that full understanding isn't always necessary for technology to work.
Engineers don't always understand exactly how their inventions work when they first design them.
But the difference here is that researchers using AI often can't predict what outcome they're going to get.
They can't really steer these systems all that well.
And that's what keeps a lot of researchers up at night.
It's not Terminator.
It's a much likelier and maybe even stranger scenario.
It's the story of a little boat.
Specifically, a boat in this retro-looking online video game.
It's called Coast Runners, and it's a pretty straightforward racing game.
There are these power-ups that give you points if your boat hits them.
There are obstacles to dodge.
There are these kind of lagoons where your boat can get all turned around.
And a couple of years ago, the research company OpenAI wanted to see if they could get an AI to teach itself how to get a high score on the game without being explicitly told how.
We are supposed to train a boat to complete a course from start to finish.
This is Dario Amodei.
He used to be a researcher at OpenAI.
Now he's the CEO of another AI company called Anthropic.
And he gave a talk about this boat at a think tank called the Center for a New American Security.
I remember setting it running one day just telling it to teach itself and I figured that it would learn to complete the course.
Dario had the AI run tons of simulated races over and over, but when he came back to check on it, the boat hadn't even come close to the end of the track.
What it does instead, this thing that's been looping is it finds this isolated lagoon and it goes backwards in the course.
The boat wasn't just going backwards in this lagoon.
It was on fire, covered in pixelated flames, crashing into docks and other boats and just spinning around in circles.
But somehow the AI's score was going up.
Turns out that by spinning around in this isolated lagoon in exactly the right way, it can get more points than it could possibly ever have gotten by completing the race in the most straightforward way.
When he looked into it, Dario realized that the game didn't award points for finishing first.
For some reason, it gave them out for picking up power-ups.
Every time you get one, you increase your score, and they're kind of laid out mostly linearly along the course.
But this one lagoon was just full of these power-ups, and the power-ups would regenerate after a couple seconds.
So the AI learned to time its movement to get these power-ups over and over by spinning around and exploiting this weird game design.
There's nothing wrong with this in the sense that we asked it to find a solution to a mathematical problem.
How do you get the most points?
And this is how it did it.
But, you know, if this was a passenger ferry or something, you wouldn't want it spinning around, setting itself on fire, crashing into everything.
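To make that scoring quirk concrete, here is a minimal Python sketch. The numbers and function names are made up for illustration; this isn't OpenAI's actual Coast Runners setup, just the same shape of problem.

```python
# Toy sketch of the Coast Runners reward problem. All names and numbers here
# are hypothetical, chosen only to illustrate the shape of the bug: the score
# rewards power-ups, not progress, and the lagoon's power-ups respawn.

COURSE_POWERUPS = 20       # power-ups laid out along the course, collected once each
LAGOON_POWERUPS = 3        # power-ups packed into the isolated lagoon
LAGOON_RESPAWN_STEPS = 5   # how long they take to reappear after being grabbed
EPISODE_STEPS = 1000       # length of one simulated race

def finish_the_race(steps: int) -> int:
    """Drive start to finish, picking up each course power-up exactly once."""
    return COURSE_POWERUPS

def spin_in_the_lagoon(steps: int) -> int:
    """Circle the lagoon forever, re-collecting power-ups every time they respawn."""
    return (steps // LAGOON_RESPAWN_STEPS) * LAGOON_POWERUPS

strategies = {
    "finish the race": finish_the_race,
    "spin in the lagoon": spin_in_the_lagoon,
}
scores = {name: play(EPISODE_STEPS) for name, play in strategies.items()}
print(scores)                       # {'finish the race': 20, 'spin in the lagoon': 600}
print(max(scores, key=scores.get))  # a score-maximizer picks 'spin in the lagoon'
```

Nothing in the objective says "finish the race," so the highest-scoring behavior and the intended behavior simply come apart.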
This boat game might seem like a small, glitchy example, but it illustrates one of the most concerning aspects of AI.
It's called the alignment problem.
Essentially, an AI's solution to a problem isn't always aligned with its designer's values, how they might want it to solve the problem.
And like this game, our world isn't perfectly designed.
So if scientists don't account for every small detail in our society when they train an AI, it can solve problems in unexpected ways, sometimes even harmful ways.
Something like this can happen without us even knowing that it's happening, where our system has found a way to do the thing we think we want in a way that we really don't want.
The problem here isn't with the AI itself.
It's with our expectations of it.
Given what AIs can do, it's tempting to just give them a task and assume the whole thing won't end up in flames.
But despite this risk, more and more institutions, companies, and even militaries are considering how AI might be useful to make important real-world decisions.
Hiring, self-driving cars, even battlefield judgment calls.
Using AI like this can almost feel like making a wish with a super annoying, super literal genie.
You got real potential for a wish, but you need to be extremely careful.
This reminds me of the tale of the man who wished to be the richest man in the world, but who was crushed to death under a mountain of gold coins.
I'm Noam Hasenfeld, and this is the second episode of The Black Box, Unexplainable's series on the unknowns at the heart of AI.
If there's so much we still don't understand about AI, how can we make sure it does what we want the way we want?
And what happens if we can't?
Thinking intelligent thoughts is a mysterious activity.
The future of the computer is just hard to imagine.
I just have to admit, I don't really know, but you're confused, Dr.
How do you think I feel?
Activity.
Intelligence.
Can the computer think?
Let's go.
So given the risks here, that AI can solve problems in ways its designers don't intend, it's easy to wonder why anyone would want to use AI to make decisions in the first place.
It's because of all this promise, the positive side of this potential genie.
Here's just a couple examples.
Last year, an AI built by Google predicted the structures of nearly all known proteins.
It was a problem that had frustrated scientists for decades, and this development has already started accelerating drug discovery.
AI has helped astronomers detect undiscovered stars.
It's allowed scientists to make progress on decoding animal communication.
And like we talked about last week, it was able to beat humans at Go, arguably the most complicated game ever made.
In all of these situations, AI has given humans access to knowledge we just didn't have before.
So the powerful and compelling thing about AI when it's playing Go is sometimes it will tell you a brilliant Go move that you would never have thought of, that no Go master would ever have thought of, that does advance your goal of winning the game.
This is Kelsey Piper.
She's a reporter for Vox who we heard from last episode.
And she says this kind of innovation is really useful, at least in the context of a game.
But when you're operating in a very complicated context like the world, then those brilliant moves that advance your goals might do it by having a bunch of side effects or inviting a bunch of risks that you don't know, don't understand, and aren't evaluating.
Essentially, there's always that risk of the boat on fire.
We've already seen this kind of thing happen outside a video game.
Just take the example of Amazon back in 2014.
So Amazon tried to use an AI hiring algorithm that looked at candidates and then recommended which ones would proceed to the interview process.
Amazon fed this hiring AI 10 years worth of submitted resumes, and they told it to find patterns that were associated with stronger candidates.
And then an analysis came out finding that the AI was biased.
It had learned that Amazon generally preferred to hire men, so it was more likely to recommend men.
Amazon never actually used this AI in the real world.
They only tested it.
But a report by Reuters found exactly which patterns the AI might have internalized.
The technology thought, oh, Amazon doesn't like any resume that has the word women's in it.
An all-women's university, captain of a women's chess club, captain of a women's soccer team.
Essentially, when they were training their AI, Amazon hadn't accounted for their own flaws in how they'd been measuring success internally.
Kind of like how OpenAI hadn't accounted for the way the boat game gave out points based on power-ups, not based on who finished first.
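Here's a minimal sketch of how that kind of pattern-learning absorbs historical bias. The data is synthetic and the keyword feature is a stand-in for phrases like "women's chess club"; it isn't Amazon's system, just the mechanism in miniature.

```python
# Synthetic example of a screener learning bias from past decisions.
# Each record: (resume mentions a "women's" keyword, candidate was advanced to interview).
# The labels encode ten years of biased human choices, not candidate quality.
history = (
    [(True, False)] * 80 + [(True, True)] * 20 +    # keyword present: advanced 20% of the time
    [(False, False)] * 40 + [(False, True)] * 60     # keyword absent: advanced 60% of the time
)

def interview_rate(records, keyword_present):
    """Share of past resumes with/without the keyword that humans advanced."""
    outcomes = [advanced for has_kw, advanced in records if has_kw == keyword_present]
    return sum(outcomes) / len(outcomes)

# "Training" on this history just means finding the patterns that predicted past decisions.
print(interview_rate(history, keyword_present=True))   # 0.2
print(interview_rate(history, keyword_present=False))  # 0.6

# A model scoring new resumes by these learned rates will downgrade anything
# with the keyword, reproducing the bias baked into its training data.
```

The model isn't adding any prejudice of its own; it's faithfully reproducing the patterns in its training data, which is exactly what it was asked to do.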
And of course, when Amazon realized that, they took the AI out of their process.
But it seems like they might be getting back in the AI hiring game.
According to an internal document obtained by former Vox reporter Jason Del Rey, Amazon's been working on a new AI system for recruitment.
At the same time, they've been extending buyout offers to hundreds of human recruiters.
And these flaws aren't unique to hiring AIs.
The way AIs are trained has led to all kinds of problems.
Take what happened with Uber in 2018, when they didn't include jaywalkers in the training data for their self-driving cars, and then a car killed a pedestrian.
Tempe, Arizona police say 49-year-old Elaine Herzberg was walking a bicycle across a busy thoroughfare frequented by pedestrians Sunday night.
She was not in a crosswalk.
And a similar thing happened a few years ago with a self-training AI Google used in its Photos app.
The company's automatic image recognition feature in its photo application identified two black persons as gorillas and in fact even tagged them as so.
According to some former Google employees, this may have happened because Google had a biased data set.
They may just not have included enough black people.
The worrying thing is, if you're using AIs to make decisions and the data they have reflects our own biased processes, like a biased justice system that sends some people to prison for crimes where it lets other people off with a slap on the wrist, or a biased hiring process, then the AI is going to learn the same thing.
But despite these risks, more companies are using AI to guide them in making important decisions.
This is changing very fast.
Like there are a lot more companies doing this now than there were even a year ago.
And there will be a lot more in a couple more years.
Companies see a lot of benefits here.
First, on a simple level, AI is cheap.
Systems like ChatGPT are currently being heavily subsidized by investors, but at least for now, AI is way cheaper than hiring real people.
If you want to look over thousands of job applicants, AI is cheaper than having humans screen those thousands of job applicants.
If you want to make salary decisions, AI is cheaper than having a human whose job is to think about and make those salary decisions.
If you want to make firing decisions, those get done by algorithm because it's easier to fire who the algorithm spits out than to have human judgment and human analysis in the picture.
And even if companies know that AI decision-making can lead to boat on fire situations, Kelsey says they might be okay with that risk.
It's so much cheaper that that's like a good business trade-off.
And so we hand off more and more decision-making to AI systems.
for financial reasons.
The second reason behind this push to use AI to make decisions is because it could offer a competitive advantage.
Companies that are employing AI in a very winner-take-all capitalist market, they might outperform the companies that are still relying on expensive human labor.
And the companies that aren't are much more expensive, so fewer people want to work with them and they're a smaller share of the economy.
And you might have huge economic behemoths that are making decisions almost entirely with AI systems.
But it's not just companies.
Kelsey says competitive pressure is even leading the military to look into using AI to make decisions.
I think there is a lot of fear that the first country to successfully integrate AI into its decision-making will have a major battlefield advantage over anyone still relying on slow humans.
And that's the driver of a lot in the military, right?
If we don't do it, somebody else will, and maybe it will be a huge advantage.
This kind of thing may have already happened in actual battlefields.
In 2021, a UN panel determined that an autonomous Turkish drone may have killed Libyan soldiers without a human controlling it or even ordering it to fire.
And lots of other countries, including the US, are actively researching AI-controlled weapons.
You don't want to be the people still fighting on horses when someone else has invented fighting with guns.
And you don't want to be the people who don't have AI when the other side has AI.
So I think there is this very powerful pressure, not just to figure this out, but to have it ready to go.
And finally, the third reason behind the push toward AI decision-making is because of the promise we talked about at the top.
AI can provide novel solutions for problems humans might not be able to solve on their own.
Just look at the Department of Defense.
They're hoping to build AI systems that, quote, function more as colleagues than as tools.
And they're studying how to use AI to help soldiers make extremely difficult battlefield decisions, specifically when it comes to medical triage.
I'm going to talk about how we can build AI-based systems that we would be willing to bet our lives with and not be foolish to do so.
AI has already shown an ability to beat humans in wargame scenarios, like with the board game Diplomacy.
And researchers think this ability could be used to advise militaries on bigger decisions like strategic planning.
Cybersecurity expert Matt Devost talked about this on a recent episode of On the Media.
I think it'll probably get really good at threat assessment.
I think analysts might also use it to help them through their thinking, right?
They might come up with an assessment and say, tell me how I'm wrong.
So I think there'll be a lot of unique ways in which the technology is used in the intelligence community.
But this whole time, that boat on fire possibility is just lurking.
One of the things that makes AI so promising, the novelty of its solutions, is also the thing that makes it so hard to predict.
Kelsey imagines a situation where AI recommendations are initially successful, which leads humans to start relying on them uncritically, even when the recommendations seem counterintuitive.
Humans might just assume the AI sees something they don't, so they follow the recommendation anyway.
We've already seen something like this happen in a game context with AlphaGo, like we talked about last week.
So the next step is just imagining it happening in the world.
And we know that AI can have fundamental flaws, things like biased training data or strange loopholes engineers haven't noticed.
But powerful actors relying on AI for decision-making might not notice these faults until it's too late.
And this is before we get into the AI like being deliberately adversarial.
This isn't the Terminator scenario with AI becoming super intelligent and wanting to kill us.
The problem is more about humans and our temptation to rely on AI uncritically.
This isn't the AI trying to trick you.
It's just the AI exploring options that no one would have thought of that get us into weird territory that no one has been in before.
And since they're so untransparent, we can't even ask the AI, hey, what are the risks of doing this?
So if it's hard to make sure that AI operates in the way its users intend, and more institutions feel like the benefits of using AI to make decisions might outweigh the risks, what do we do?
What can we do?
There's a lot that we don't know, but I think we should be changing the policy and regulatory incentives so that we don't have to learn from a horrible disaster and so that we like understand the problem better and can start making progress on solving it.
How to start solving a problem that you don't understand.
After the break.
As a founder, you're moving fast towards product market fit, your next round, or your first big enterprise deal.
But with AI accelerating how quickly startups build and ship, security expectations are also coming in faster.
And those expectations are higher than ever.
Getting security and compliance right can unlock growth or stall it if you wait too long.
Vanta is a trust management platform that helps businesses automate security and compliance across more than 35 frameworks like SOC2, ISO 27001, HIPAA, and more.
With deep integrations and automated workflows built for fast-moving teams, Vanta gets you audit ready fast and keeps you secure with continuous monitoring as your models, infrastructure, and customers evolve.
That's why fast-growing startups like LangChain, Writer, and Cursor have all trusted Vanta to build a scalable compliance foundation from the start.
Go to vanta.com/vox to save $1,000 today through the Vanta for Startups program and join over 10,000 ambitious companies already scaling with Vanta.
That's vanta.com/vox to save $1,000 for a limited time.
Support for this show comes from Robinhood.
Wouldn't it be great to manage your portfolio on one platform?
With Robinhood, not only can you trade individual stocks and ETFs, you can also seamlessly buy and sell crypto at low costs.
Trade all in one place.
Get started now on Robinhood.
Trading crypto involves significant risk.
Crypto trading is offered through an account with Robinhood Crypto LLC.
Robinhood Crypto is licensed to engage in virtual currency business activity by the New York State Department of Financial Services.
Crypto held through Robinhood Crypto is not FDIC insured or SIPC protected.
Investing involves risk, including loss of principal.
Securities trading is offered through an account with Robinhood Financial LLC, member SIPC, a registered broker-dealer.
So here's what we know.
Number one, engineers often struggle to account for all the details in the world when they program an AI.
They might want it to complete a boat race and end up with a boat on fire.
A company might want to use it to recommend a set of layoffs only to realize that the AI has built-in biases.
Number two, like we talked about in the first episode of this series, it isn't always possible to explain why modern AI makes the decisions it does, which makes it difficult to predict what it'll do.
And finally, number three, we've got more and more companies, financial institutions, even the military considering how to integrate these AIs into their decision making.
There's essentially a race to deploy this tech into important situations, which only makes the potential risks here more unpredictable.
Unknowns on unknowns on unknowns.
So what do we do?
I would say at this point, it's sort of unclear.
Sigal Samuel writes about AI and ethics for Vox, and she's about as confused as the rest of us here.
But she says there's a few different things we can work on.
The first one is interpretability, just trying to understand how these AIs work.
But like we talked about last week, interpreting modern AI systems is a huge challenge.
Part of how they're so powerful and they're able to give us info that we can't just drum up easily ourselves is that they're so complex.
So there might be something almost inherent about lack of interpretability being an important feature of AI systems that are going to be much more powerful than my human brain.
So interpretability may not be an easy way forward, but some researchers have put forward another idea, monitoring AIs by using more AIs, at the very least just to alert users if AIs seem to be behaving kind of erratically.
But it's a little bit circular, because then you have to ask, well, how would we be sure that our helper AI is not tricking us in the same way that we're worried our original AI is doing?
So if these kinds of tech-centric solutions aren't the way forward, the best path could be political: just trying to reduce the power and ubiquity of certain kinds of AI.
A great model for this is the EU, which recently put forward some promising AI regulation.
The European Union is now trying to put forward these regulations that would basically require companies that are offering AI products in especially high-risk areas to prove that these products are safe.
This could mean doing assessments for bias, requiring humans to be involved in the process of creating and monitoring these systems, or even just trying to reasonably demonstrate that the AI won't cause harm.
We've unwittingly bought this premise that they can just bring anything to market when we would never do that for other similarly impactful technologies.
Like, think about medication.
You got to get your FDA approval.
You've got to jump through these hoops.
Why not with AI?
Why not with AI?
Well, there's a couple reasons regulation might be pretty hard here.
First, AI is different from something like a medication that the FDA would approve.
The FDA has clear, agreed-upon hoops to jump through: clinical testing.
That's how they assess the dangers of a medicine before it goes out into the world.
But with AI, researchers often don't know what it can do until it's been made public.
And if even the experts are often in the dark, it may not be possible to prove to regulators that AI is safe.
The second problem here is that even aside from AI, big tech regulation doesn't exactly have the greatest track record of really holding companies accountable, which might explain why some of the biggest AI companies like OpenAI have actually been publicly calling for more regulation.
The cynical read is that this is very much a repeat of what we saw with a company like Facebook, now Meta, where people like Mark Zuckerberg were going to Washington, D.C.
and saying, oh, yes, we're all in favor of regulation.
We'll help you.
We want to regulate too.
When they heard this, a lot of politicians said they thought Zuckerberg's proposed changes were vague and essentially self-serving, that he just wanted to be seen supporting the rules.
Rules which he never really thought would hold them accountable.
Allowing them to regulate in certain ways, but where really they maintain control of their data sets.
They're not being super transparent and having external auditors.
So really, they're getting to continue to drive the ship and make profits while creating the semblance that society or politicians are really driving the ship.
Regulation with real teeth seems like such a huge challenge that one major AI researcher even wrote an op-ed in Time magazine calling for an indefinite ban on AI research, just shutting it all down.
But Sigal isn't sure that's such a good idea.
I mean, I think we would lose all the potential benefits it stands to bring.
So drug discovery, you know, cures for certain diseases, potentially huge economic growth that if it's managed wisely, big if, could help alleviate some kinds of poverty.
I mean, at least potentially, it could do a lot of good.
And so you don't necessarily want to throw that baby out with the bathwater.
At the very least, Sigal does want to turn down the faucet.
I think the problem is we are rushing at breakneck speed towards more and more advanced forms of AI when the AIs that we already currently have, we don't even know how they're working.
When ChatGPT launched, it was adopted faster than any publicly deployed technology in history.
Twitter took two years to reach a million users.
Instagram took two and a half months.
ChatGPT took five days.
And there are so many things researchers learned ChatGPT could do only after it was released to the public.
There's so much we still don't understand about them.
So what I would argue for is just slowing down.
Slowing down AI could happen in a whole bunch of different ways.
So you could say, you know, we're going to stop working on making AI more powerful for the next few years, right?
We're just not going to try to develop AI that's got even more capabilities than it already has.
AI isn't just software.
It runs on huge, powerful computers.
It requires lots of human labor.
It costs tons of money to make and operate, even if those costs are currently being subsidized by investors.
So the government could make it harder to get the types of computer chips necessary for huge processing power.
Or it could give more resources to researchers in academia who don't have the same profit incentive as researchers in industry.
You could also say, all right, we understand researchers are going to keep doing the development and trying to make these systems more powerful, but we're going to really halt or slow down deployment and like release to commercial actors or whoever.
Slowing down the development of a transformative technology like this, it's a pretty big ask, especially when there's so much money to be made.
It would mean major cooperation, major regulation, major complicated discussions with stakeholders that definitely don't all agree.
But Sigal isn't hopeless.
I'm actually reasonably optimistic.
I'm very worried about the direction AI is going in.
I think it's going way too fast.
But I also try to look at things with a bit of a historical perspective.
Sigal says that even though tech progress can seem inevitable, there is precedent for real global cooperation.
We know historically there are a lot of technological innovations that we could be doing that we're not, because society just decided it's a bad idea.
Human cloning, or certain kinds of genetic experiments. Humanity has shown that we are capable of putting a stop, or at least a slowdown, on things that we think are dangerous.
But even if guardrails are possible, our society hasn't always been good about building them when we should.
The fear is that sometimes society is not prepared to design those guardrails until there's been some huge catastrophe like Hiroshima, Nagasaki, just horrific things that happen.
And then we pause and we say, okay, maybe we need to go to the drawing board, right?
That's what I don't want to have happen with AI.
We've seen this story play out before.
Tech companies or technologists essentially run mass experiments on society.
We're not prepared.
Huge harms happen.
And then afterwards, we start to catch up and we say, oh, we shouldn't let that catastrophe happen again.
I want us to get out in front of the catastrophe.
Hopefully, that will be by slowing down the whole AI race.
If people are not willing to slow down, at least let's get in front by trying to think really hard about what the possible harms are and how we can use regulation to really prevent harm as much as we possibly can.
Right now, the likeliest potential catastrophe might have a lot less to do with the sci-fi Terminator scenario than it does with us and how we could end up using AI, relying on it in more and more ways.
Because it's easy to look at AI and just see all the new things it can let us do.
AIs are already helping enable new technologies.
They've shown potential to help companies and militaries with strategy.
They're even helping advance scientific and medical research.
But we know they still have these blind spots that we might not be able to predict.
So despite how tempting it can seem to rely on AI, we should be honest about what we don't know here.
So hopefully the powerful actors who are actually shaping this future, companies, research institutions, governments, will, at the very least, stay skeptical of all of this potential.
Because if we're really open about how little we know, we can start to wrestle with the biggest question:
Are all of these risks worth it?
That's it for our black box series.
This episode was reported and produced by me, Noam Hasenfeld.
We had editing from Brian Resnick and Catherine Wells with help from Meredith Hoddinott, who also manages our team.
Mixing and sound design from Vince Fairchild with help from Christian Ayala.
Music from me, fact-checking from Tian Wen.
Manding Wen is a potential werewolf, we're not sure.
And Bird Pinkerton sat in the dark room at the octopus hospital listening to this prophecy.
3,000 years ago, we were told that one day there would be an octopocalypse, and that only a bird would be able to ensure the survival of our species.
You are that bird, Pinkerton.
Special thanks this week to Pawen Jain, José Hernández-Orallo, Samir Rawashdeh, and Eric Aldrich.
If you have thoughts about the show, email us at unexplainable@vox.com, or you could leave us a review or a rating, which we'd also love.
Unexplainable is part of the Vox Media Podcast Network, and we'll be back in your feed next week.
This month on Explain It To Me, we're talking about all things wellness.
We spend nearly $2 trillion on things that are supposed to make us well.
Collagen smoothies and cold plunges, Pilates classes, and fitness trackers.
But what does it actually mean to be well?
Why do we want that so badly?
And is all this money really making us healthier and happier?
That's this month on Explain It To Me, presented by Pure Leaf.
Support for this show comes from Pure Leaf Iced Tea.
When you find yourself in the afternoon slump, you need the right thing to make you bounce back.
You need Pure Leaf Iced Tea.
It's real brewed tea made in a variety of bold flavors with just the right amount of naturally occurring caffeine.
You're left feeling refreshed and revitalized so you can be ready to take on what's next.
The next time you need to hit the reset button, grab a Pure Leaf iced tea.
Time for a tea break?
Time for a Pure Leaf.