AI Ethics at Code 2023

28m
Platformer's Casey Newton moderates a conversation at Code 2023 on ethics in artificial intelligence, with Ajeya Cotra, Senior Program Officer at Open Philanthropy, and Helen Toner, Director of Strategy at Georgetown University’s Center for Security and Emerging Technology. The panel discusses the risks and rewards of the technology, as well as best practices and safety measures.
Recorded on September 27th in Los Angeles.
Learn more about your ad choices. Visit podcastchoices.com/adchoices


Transcript

Support for the show comes from Saks Fifth Avenue.

Saks Fifth Avenue makes it easy to shop for your personal style.

Fall is here, and you can invest in some new arrivals that you'll want to wear again and again, like a relaxed blazer and Gucci loafers, which can take you from work to the weekend.

Shopping from Saks feels totally customized, from the in-store stylist to a visit to Saks.com, where they can show you things that fit your style and taste.

They'll even let you know when arrivals from your favorite designers are in, or when that Brunello Cucinelli sweater you've been eyeing is back in stock.

So, if you're like me and you need shopping to be personalized and easy, head to Saks Fifth Avenue for the best fall arrivals and style inspiration.

Hi, everyone.

This is Pivot from New York Magazine and the Vox Media Podcast Network.

I'm Scott Galloway.

We have a bonus episode for you today from the 2023 Code Conference.

Platformer's Casey Newton, also Kara's roommate, leads a fascinating panel on the ethics of artificial intelligence.

Oh my God, that's a page turner.

Enjoy the episode.

Please welcome back the founder of Platformer, Casey Newton.

All right.

Hi, everybody.

How are we doing?

Good?

All right.

I've very much been enjoying the programming today.

If there was a theme that I kept seeing over and over again, it's that AI is going to help businesses.

It's going to be disruptive.

There's a lot of money to be made.

It'll be sort of fascinating to watch it play out.

With this next session, I want to propose something in addition to all of that, which is that it's developing quite fast and in ways that are scary and might pose significant risks to all of us.

And over the past year, I've gotten to speak to a couple of individuals who have really helped to educate me on this subject, and I'm hoping that they can do the same for you.

So please welcome Ajeya Cotra, the senior program officer at Open Philanthropy, and Helen Toner, the Director of Strategy and Foundational Research Grants at Georgetown University's CSET, and also a board member of OpenAI.

Welcome to the stage.

Have a seat.

All right, so I want to spend most of our time talking about the medium and the long-term risks from AI, but I also want to start by acknowledging that there are some risks and harms that are being presented today that maybe aren't getting enough attention.

So I think my first question for the two of you is: what sort of harms and risks do you see unfolding right now that maybe we are paying less attention to as we focus on the more exciting business use cases?

Yeah, I mean, we're seeing all kinds of ways that AI is being used in practice, of course, and also ways that it's being used where it perhaps shouldn't be.

So I think a classic example for me is police starting to use facial recognition systems to actually make arrests and determine how they're pursuing their cases, even when we know that these systems don't work well, especially for minority groups.

And there was a story just a month or two ago of a woman who was eight months pregnant and held in a cell while having contractions purely because of a facial recognition match for a crime that she did not commit.

So I think

we can talk about lots of ways in which the tech is getting more advanced that are potentially quite scary.

But I also think it's important to recognize there are ways the tech is not working well that are already harming people.

And in some ways, those things are connected as well, where a lot of the issues that we have now and we might have in the future come from the fact that we really don't understand how these systems work

in detail.

Yeah, absolutely.

Well, let's turn, and because we have such limited time, let's then immediately turn to maybe the medium term and talk about AI safety.

As a practical matter, how do we figure out in advance if AI is going to hurt us?

This is a question that the two of you spend a lot of time thinking about.

Yeah, so

from my perspective, my day job is funding technical research that could help make especially advanced AI systems safer.

And one thing that poses a technical, logistical, and policy challenge is that, particularly for large language models and other frontier systems,

we don't really have a systematic way of predicting in advance what capabilities will emerge when you train a larger model for longer, with more data, with more compute.

So we have these scaling laws that predict their performance on certain sort of narrow benchmarks will improve in a kind of predictable way.

So you might know that it'll go from having 83% performance to 92% performance on some

artificial sort of coding task.

But you don't know how that translates to real-world relevant capabilities.

So you don't know if going from 83% to 92% means now you're able to effectively find and exploit software vulnerabilities in real-world systems.
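
To make that concrete, here is a minimal sketch of what a scaling-law extrapolation looks like in practice. It is my own illustration, not something from the panel, and the compute and error numbers are invented: you fit a power law to benchmark results from smaller runs and extrapolate the score for a bigger one, which is the kind of prediction that works even though it says nothing about what the new score means in the real world.

```python
# Illustrative sketch (numbers are invented): fit a power law to benchmark
# error from smaller training runs, then extrapolate to a larger run.
import numpy as np

compute = np.array([1.0, 3.0, 10.0, 30.0, 100.0])   # hypothetical compute budgets (PF-days)
error = np.array([0.40, 0.31, 0.24, 0.19, 0.15])    # hypothetical benchmark error rates

# A power law error ~ a * compute**slope is a straight line in log-log space.
slope, log_a = np.polyfit(np.log(compute), np.log(error), 1)
a = np.exp(log_a)

predicted_error = a * 1000.0 ** slope                # extrapolate to a 10x larger run
print(f"Predicted benchmark accuracy at 1000 PF-days: {1 - predicted_error:.1%}")
# The curve predicts the benchmark number; it does not predict which real-world
# capabilities arrive along with it.
```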

I want to say as well how kind of crazy that is.

So, we just heard from this wonderful bird app, which is a great example of

typical AI use cases that we've had in the past, where you have an AI system that is being trained for one specific purpose.

You have a metric for how good it is at that purpose.

I'm sure that our previous speaker could tell you all about the accuracy of that bird recognition system.

When you train your system, you look at the accuracy, you know how good it is, you know what it can do.

But these large language model systems that we have now, it's not that way at all.

The only thing they're being trained to do, or almost the only thing, is predict the next word.

And so you can look at your system once you finish training it as an AI scientist and say, okay, here's how good it is in some giant database at predicting the next word.
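
That single number is easy to make concrete. Here is a minimal sketch, my own illustration rather than anything from the panel, of measuring how good a small open model (gpt2, used purely as a stand-in) is at predicting the next word, reported as cross-entropy loss and perplexity on a snippet of text.

```python
# Sketch: the "one number" pretraining gives you is next-word prediction loss,
# measured here on a single stand-in sentence with the small open gpt2 model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "The panel discussed how hard it is to predict what large models can do."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return average cross-entropy over next-token predictions.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"Next-token loss: {loss.item():.2f} (perplexity {torch.exp(loss).item():.1f})")
```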

But what we see in these companies is that they then spend months after that, months, figuring out, so what does that actually mean?

Like, can it do like grade school math?

Can it do college-level math?

Can it like code a simple app, a complicated app?

Can it like help

some kid in a basement learn how to hack like, I don't know, a water plant or something?

Like maybe, we don't know.

And you sort of figure it out as you go.

And so to me, I think the we don't know how to tell in advance if they're safe is sort of the mind-blowing piece.

Well, we don't know how to tell in advance what they do, period.

Yes.

Like it turns out that in order to get really, really good at predicting what the next word will be in a complicated piece of text,

the simplest way to do that involves forming models of the world.

And we're starting to do interpretability to actually demonstrate this, which is really cool.

There was a paper a year ago that showed that if you train a transformer model, which is similar to large language models, on just a sequence of moves in the game of Othello, just a board game, just text without ever showing it the actual board, it actually figures out and forms a model of the board in its brain, in the weights of its neural network, in order to

do well at predicting what Othello move will come next.

And it like taught itself the rules of Othello in order to do well at predicting what comes next.

And you see that it's doing that with the world in order to learn to predict the next word well, but you don't know what models it's building.

You don't know how good they are, and you don't know

in which domains they're better and worse.
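
The Othello result mentioned above comes from exactly this kind of interpretability work: freeze the trained model, record its hidden activations, and train a small linear probe to see whether some piece of world state, like whether a given square is occupied, can be read off them. Here is a toy sketch of the probing idea, with random placeholder arrays standing in for real activations, so it is not the paper's actual code.

```python
# Toy probing sketch: train a linear classifier on hidden activations and check
# whether a world-state feature is decodable. Placeholder random data stands in
# for real transformer activations and real Othello board labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(5000, 512))       # (examples, hidden dimension)
square_occupied = rng.integers(0, 2, size=5000)    # 1 if a chosen board square is occupied

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, square_occupied, test_size=0.2, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Probe accuracy: {probe.score(X_test, y_test):.2f}")
# With real activations, accuracy well above chance is evidence the model built
# an internal representation of a board it was never explicitly shown.
```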

Right.

So just to underline that: these text-based models are teaching themselves how to play and win games.

That's just something I like to sit with from time to time as I fall asleep.

But Helen, what you just said, this idea of, well, we've invented some technology, we're going to put it out in the world and see what happens.

That's sort of how most of the people in this room develop products, right?

So, where is the point that this starts to feel like a safety issue?

And as you guys look to fund research into this, how do you try to make it safer?

Yeah, I mean, I think in a lot of ways, for a lot of types of things we build, it's really good to have this iterative loop of: we build something, we're not totally sure how it works, we're not totally sure how risky it is, put it out into the world, let people use it.

That's what OpenAI did with ChatGPT.

I think there haven't been enormous disasters yet, except for your co-host almost having his marriage broken up, as we all know.

It's a very close call, as I understand it.

But I think at some level, you don't really want to start experimenting by just throwing the thing out there and seeing what happens.

So if you have

some of the top research scientists in AI saying, oh, this system could make it really accessible to build a bioweapon.

Or, oh, this system could actually just get out of hand and escape our control entirely, deceive us so that we think that it's doing great things and take over data centers.

And you have really serious

experts telling us that this might be possible, not with the technology we have now, but maybe with the next generation in six to 18 months, maybe with the generation after that in the next three to five years.

I think you don't just say, let's just give it to the consumers and see what they do with it.

I think at that point, you really need to take a step back and sort of think about what you're doing.

Right.

Yeah.

And the challenge with this, I feel like I struggle to think of good analogies for this, so I've thought of bad analogies, which is, you know: imagine if in 2018 OpenAI is a toaster-making company, and they discovered that the toasters just get better and better as they make them bigger and bigger.

And eventually, you know, you go from making toasters to making a stick of dynamite to making a nuclear bomb.

And the process you follow to build this product is the same at each time, and the product looks superficially the same at each stage.

But in 2019, you're making a toaster, and in twenty-whatever, you're making a nuclear bomb.

At what stage do you say, okay,

hold on.

Now

it's like too powerful.

You have to think of it in a different way.

Like, I think the thing that's tough about not knowing what capabilities will emerge when you make the systems bigger is that you don't know what kind of system you're even dealing with.

Like in society, we have less stringent norms and rules and laws and regulations around making toasters than around making weaponry.

But if you don't know if what you're making is a toaster or a weapon, then you don't know how you should think about it.

Right.

And I would say, like, just based on the capabilities these things have today, they seem much closer to a weapon than a toaster.

And we know that because of all of the training that they have to do on these models before they let people like me use them.

We'll take a quick break, and when we come back, more of Casey Newton's conversation on the ethics of AI from code.

Thumbtack presents.

Uncertainty strikes.

I was surrounded.

The aisle and the options were closing in.

There were paint rollers, satin and matte finish, angle brushes, and natural bristles.

There were too many choices.

What if I never got my living room painted?

What if I couldn't figure out what type of paint to use?

What if

I just used thumbtack?

I can hire a top-rated pro in the Bay Area that knows everything about interior paint, easily compare prices, and read reviews.

Thumbtack knows homes.

Download the app today.

We're back with this bonus episode recorded live at the 2023 Code Conference panel on artificial intelligence.

Earlier this year, a group of scientists said, okay, everybody, this stuff is moving way too fast.

We need to slow down.

We need to figure out if it's a toaster or not.

And I think it's safe to say that the industry said, absolutely not.

We are going to, if anything, we're going to release things even faster.

In such a world, who needs to agree on this question of slowing down or regulating?

Should we look to the industry to self-regulate at all?

Does the government need to come in?

Do average people have a role to play here?

I think there's a really interesting concept that's being kind of adopted gradually over the last month or two, and I think is gaining steam, called this idea of responsible scaling policies.

And the idea here is that it's trying to kind of bridge this gap of, like, look, if we have a toaster, I don't want to be building my toaster in a bunker that has the army outside.

That's ridiculous.

At the same time, if I'm building a nuclear weapon, I don't want it to be happening in some

nice leafy office in SoMa in San Francisco.

And so the idea of these responsible scaling plans is gaining traction, and Anthropic, which is obviously a major AI company, just put out the first responsible scaling policy that I'm aware of.

The idea is that you need to state clearly, you as an AI company, you need to state clearly, here are the kinds of capabilities we're ready to deal with.

If our AI system can describe to you how to make beautiful toast, can tell you what birds are in your backyard, we don't have that many protective measures.

We don't need many protective measures.

But here are some other capabilities that might emerge as we make our systems smarter, maybe related to developing weapons, maybe related to kind of evading human control,

replicating autonomously, things like that.

Here's how we'll detect those capabilities, and here's what we will do if we find them.

And here's the level of protections we would need to have in place before we would actually build or deploy a system like that.

And I think this concept is pretty new, and I think still under development, so we'll see where it goes.

But I really like it as a sort of,

I think no one really, or most people don't find it very satisfying to say, let's pause for six months.

Let's just all freeze in place, play musical statues, and then hopefully by the time the music starts again, things will be better.

That doesn't feel very useful.

And so what I like about this sort of responsible scaling concept is you're really trying to set out what's the concern, what would we do about it, how would we tell if we have to do that.

Yeah.

And another thing I really like about the framework of responsible scaling policies is that they sort of are designed to force you to pause when you need to and for as long as you need to, which might be much longer than six months.

So the way the workflow works is before you train an AI system, you are trying to forecast what will this system be capable of and what will it not be capable of.

And in the middle of training, you have devised tests.

So if you think it's not going to be capable of hacking into your servers, you actually have a test in which you tell it to try to hack into your server, hopefully something easier than what you're actually worried about.

You test it in the middle of training and you see if you were wrong.

And if you realize, oh, this is more capable, more powerful than we thought it was, then you need to pause until you have figured out how to deal with the level of capability that it does actually have, which

maybe it takes you three months to beef up your security, run more tests, and confirm to your satisfaction that it isn't able to get past your greater defenses, or maybe it takes you three years.
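
Stripped down to its shape, the workflow described here is a gate on training: pre-commit to capability evaluations and forecasts, run them at checkpoints, and pause whenever a score exceeds the forecast and the matching protections are not yet in place. Below is a schematic sketch, with hypothetical eval names, scores, and thresholds invented purely for illustration.

```python
# Schematic sketch of a responsible-scaling-style gate; every name, score, and
# threshold here is hypothetical and exists only to show the structure.
from dataclasses import dataclass
from typing import Callable

@dataclass
class CapabilityEval:
    name: str
    run: Callable[[], float]      # returns a score for the current checkpoint
    forecast_ceiling: float       # the capability level forecast before training
    required_protection: str      # protection that must exist if the ceiling is exceeded

def checkpoint_gate(evals: list[CapabilityEval], protections_in_place: set[str]) -> bool:
    """Return True if training may continue past this checkpoint."""
    for ev in evals:
        score = ev.run()
        if score > ev.forecast_ceiling and ev.required_protection not in protections_in_place:
            print(f"PAUSE: {ev.name} scored {score:.2f} > forecast {ev.forecast_ceiling:.2f}; "
                  f"need '{ev.required_protection}' before continuing.")
            return False
    return True

# Example: a stand-in eval for breaking into a deliberately easier sandboxed server.
evals = [CapabilityEval("sandbox_intrusion", run=lambda: 0.31,
                        forecast_ceiling=0.10,
                        required_protection="hardened weight security")]
if not checkpoint_gate(evals, protections_in_place=set()):
    print("Pause until protections catch up, whether that takes three months or three years.")
```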

I like that.

So fewer sort of arbitrary slowdowns and more sort of like rigorous measures of the capabilities.

Helen, you're on the board of OpenAI, and one of the unusual things about the OpenAI board is that a majority of members can vote to shut down the company if they get too worried about what they see.

That is a wild responsibility.

How do you prepare yourself to make a decision like that?

And do you think it will come to that?

So, I don't believe we have the authority to shut down the company.

We certainly, like any board, have the authority to hire and fire the CEO, which is closely related.

Obviously, Sam Altman has played an incredible role in OpenAI's success.

Yeah, so it's a very unusual company structure.

As you may know,

the sort of for-profit side is a limited profit model or capped profit model.

And so, the board of the non-profit, which

governs the entire rest of the organization, I think of us as having kind of two roles.

The simple version of the role would be: the for-profit arm has a capped profit model.

That means when all the investors have hit their caps, any remaining profit flows into the nonprofit organization.

So the sort of simple version of our role as a board would be kind of sit back, wait, hope that OpenAI makes ungodly amounts of profit.

And if they do, then the nonprofit will have large amounts of capital to distribute, hopefully in ways that are beneficial for humanity.

The more complicated piece of the mission for the nonprofit is sort of governing the company along the way and also keeping an eye on the kinds of issues we're discussing, thinking about, you know, are there

product decisions or investment decisions that maybe aren't in line with the company's mission of ensuring that AGI is beneficial to all of humanity?

And if so, you know, what kind of processes or checks could we put in place in order to kind of steer things in a more beneficial direction?

And of course, like any board, you know, we understand that the management team, the company, has so much more day-to-day visibility into the details of what's going on.

So we try to mostly stay out of things and not micromanage or be unhelpful.

But we also try to be engaged enough so that we can be involved in the big decisions.

Got it.

Well, so over the past year, we've seen, I would say, two major approaches to releasing AI tools.

One is the more closed-source way, which OpenAI, Anthropic, and Google have all done.

Another is to make the underlying source code open source, which Meta has done.

Both sides argue that theirs is the safer approach.

So, who's right?

I think it comes back again to are you building a toaster or are you building a nuclear bomb?

And do you know which you're building?

Well, if they're building a toaster, I sort of don't care.

So, like, what if they're building a nuclear bomb?

I think that

at the policy level, my ideal situation for what I would want an AI company to do, if any AI companies are listening, is to train your model, have in place some testing mechanisms where you figure out what its capabilities are, try to red team

what dangerous things it would be possible to do with what levels of access to your model, what would it be possible for a malicious actor to do with just sampling access where they can just chat with the model, what level of mischief could they get up to if they could fine-tune the model and further train it,

what could they get up to if they had open source access to the model, and sort of rigorously test all of those things and release to the degree that you've determined is safe, maybe with the involvement of the board or the involvement of the government going down the line.

And I think what that will conclude, kind of on the object level, is that yes, some models totally should be open sourced, and other models maybe shouldn't be trained at all.

Maybe they're too dangerous to be sitting there in your servers, even if you release them to nobody at all, because people might steal them.

And then other models are in between where they're safe to release with appropriate security measures in certain ways, but not to fully open source.

Another image I got from a guy called Sam Hammond, which I found really helpful here as well, is an idea of what we want, because obviously open sourcing things has enormous benefits.

It really spreads opportunity and allows lots of people to tinker with these models, build things that are useful for their community or their particular use case.

It also distributes power, since there are a lot of concerns about power getting too concentrated.

So an image that I really liked was this idea of could you have vertical deceleration and horizontal acceleration, meaning the most advanced systems that are developing these new capabilities we don't understand well, we can't predict well, maybe we should slow down how rapidly those are sort of put out onto the internet, never to be recalled, because once something is up and being torrented, you're not going to get it back.

So maybe with those models we should just ease up on the pace there, give it a few years so we get a little time to get used to them.

At the same time, could we be accelerating how much models that are safe, that have lots of beneficial use cases, can be used by different communities, giving people resources so that they can

better understand how to make use of them, how to fine-tune them, how to build them into products?

I really like that image because I do think that there are

real benefits as well.

And I think there are plenty of AI systems.

Llama 2, for instance, I'm not sure that Meta had the most responsible process for releasing that, but I also doubt that it's going to cause really major harms.

And so it's great that a larger community gets access to that.

So can we kind of have the best of both worlds is my hope.

And as a fan of science and a funder of many scientists, it is really frustrating that these fascinating objects of scientific study are behind closed doors so often.

And so I would really like it to be the case that there's

a process for making this decision that minimizes both false positives and false negatives, that ensures the company isn't sitting forever on something that it would actually be beneficial to the world to open source, while at the same time ensuring that it isn't open sourcing the recipe for building a bioweapon or a nuclear bomb or something.

One more break.

Stay with us for more on the ethics of AI.

Support for Pivot comes from LinkedIn.

From talking about sports to discussing the latest movies, everyone is looking for a real connection to the people around them.

But it's not just person to person, it's the same connection that's needed in business.

And it can be the hardest part about B2B marketing, finding the right people, making the right connections.

But instead of spending hours and hours scavenging social media feeds, you can just tap LinkedIn ads to reach the right professionals.

According to LinkedIn, they have grown to a network of over 1 billion professionals, making it stand apart from other ad buys.

You can target your buyers by job title, industry, company role, seniority, skills, and company revenue, giving you all the professionals you need to reach in one place.

So you can stop wasting budget on the wrong audience and start targeting the right professionals only on LinkedIn ads.

LinkedIn will even give you $100 credit on your next campaign so you can try it for yourself.

Just go to linkedin.com/pivotpod.

That's linkedin.com/pivotpod.

Terms and conditions apply. Only on LinkedIn Ads.

And we're back with the conclusion of Casey Newton's 2023 code panel on the ethics of AI.

You know, a variety of recent polls have shown that Americans are fairly worried about the negative impact of AI, what it might do to their jobs or for safety issues.

Should AI developers be taking those opinions into account when developing?

I mean, I think yes.

And I think also the history of not taking public opinion into account when the public feels strongly about something that is affecting them,

that can be a pretty bad idea.

It can go pretty badly for you if you just totally disregard that as a factor.

I think the way to take it into account as well is to make sure that the AI systems that we're building are not just selling targeted ads more effectively or like getting people to binge watch videos more.

But the reason that many AI researchers are excited about AI is because it's supposed to make the future better.

It's supposed to give us clean energy and better medical solutions.

And so I think taking the public's opinion into account doesn't have to mean slowing down or not doing AI.

It can also mean like really making good on these sort of promises of what a positive future could look like.

Right.

All these folks, I'm sure, will have the opportunity to integrate AI into their businesses in some way, you know,

literally this week and at any time in the next couple years.

What would you just sort of say to them, or what do you think they should keep in mind as they're evaluating technologies to bring into their businesses from that safety perspective?

I guess I would say that different AI products are very different.

And

the kinds of AI products I think about are large language models in particular.

And with large language models, I think

it's a kind of nerve-wracking and also scientifically fascinating situation we're in, in which I would claim that the people building these systems don't fully understand how they work and what they can do and what they can't do.

And you kind of need to play with them yourself to get a feel for it.

And we're pretty far from having a systematic science of these things.

Yeah, I would have two suggestions.

One is

remember that you get what you measure for.

And so, you know, whatever metrics you're testing your AI system with, that's what you will get.

That's the performance that will be good.

And it's really important to think about ways that what you're testing for in development might be different than what you get in practice.

So for instance, with Microsoft's Bing release,

one thing it seems like they didn't anticipate is that if you have Bing searching the web and feeding that into what the chatbot says, I don't know if folks saw this, but there was one example of a researcher, I think in Germany, who had some unpleasant conversation with Bing.

Then that got reported on.

So there were newspaper articles about it.

And so then the same researcher went back to Bing and said, hey, this is my name.

You don't like me.

Bing searched the web and found a bunch of coverage of how it didn't like him.

And it

became even more extreme and even more aggressive.

And so that was an example of, in real life,

your use case might look a little different than what you expect.

And so make sure to think about ways that the task might be different than what you're testing.

That's great.

Let's start right here.

Hi there, Nicolas DuPont from Cyborg.

So AI is obviously very data hungry, both for training and inference.

Where does digital privacy fit into this conversation, particularly when it comes to private data that's used by companies like Meta or Microsoft or whomever else to train these models?

I think it depends on the use case.

I think lots of use cases don't necessarily need private data.

And for the use cases that do, I think there are a few different ways that you can approach that.

There are actually privacy-enhancing technologies, or privacy-preserving machine learning, as this gets called, where you can actually use techniques like federated learning.

So that's what's used

for predictive text on your phone.

Obviously, you don't want all your texts being sent to some massive Apple database, and that's actually not how they train their systems.

They use the system that can learn on your device and send back sort of particular lessons without sharing all of your data.

Or techniques called differential privacy, which gives guarantees of how much private data is being released.

So I think it's something to be very kind of aware of and cautious about, but there are also practical techniques that can help.
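
As a rough picture of what those techniques look like mechanically, here is a toy sketch, my own illustration rather than any vendor's production system, of federated averaging with noise added to the aggregate in the spirit of differential privacy: raw data stays on each device, only clipped updates are shared, and the server releases a noised average.

```python
# Toy federated averaging with a differential-privacy-style noised aggregate.
# Everything here is a simplified stand-in, not a production recipe.
import numpy as np

rng = np.random.default_rng(0)
global_model = np.zeros(10)    # toy "model": just a weight vector

def local_update(model: np.ndarray, device_data: np.ndarray) -> np.ndarray:
    """Each device computes an update from its own data; raw data never leaves the device."""
    gradient = device_data.mean(axis=0) - model           # stand-in for real local training
    return gradient / max(1.0, np.linalg.norm(gradient))  # clip so no single user dominates

device_datasets = [rng.normal(loc=1.0, size=(50, 10)) for _ in range(100)]
updates = np.stack([local_update(global_model, d) for d in device_datasets])

# The server only sees the average of clipped updates, plus calibrated noise,
# so the released aggregate reveals little about any one device's data.
noise = rng.normal(scale=0.1, size=global_model.shape) / len(device_datasets)
global_model = global_model + updates.mean(axis=0) + noise
print("Updated global model:", np.round(global_model, 2))
```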

Jay Peters from The Verge.

A lot of companies are trying to make chatbots more friendly now, and that makes them more approachable.

You want to chat with them more.

Is this a good thing or a bad thing?

I guess I would say it depends again.

I think, obviously, there's significant upsides to your systems having personalities that are more engaging and that make you want to talk to them, make you feel more comfortable talking to them.

A thing I do worry about, though, is

sort of like what Helen said before, you get what you measure for.

So if you're optimizing your AI system to be engaging and get humans to like it and like its personality and want to engage with it, that may not necessarily reflect kind of what the human would endorse if they thought about it more, if they didn't have that kind of emotional influence in their life.

So it feels similar to the sort of the question lots of people ask with social media and optimizing for engagement.

Is optimizing for engagement what people would sort of endorse if they could step back and look at it more objectively?

To me as well, I think this is like

almost a UX question, and I feel like the UX, the user experience, of AI is something that we're still really only beginning to explore.

So I thought it was really interesting.

Reid Hoffman and Mustafa Suleyman's company, Inflection, has an AI bot called Pi, which people may have played with.

And apparently, it was a deliberate choice that the text of that bot kind of fades in in this unusual way.

It doesn't look like you're kind of texting with your friend.

Apparently, they did that deliberately, so it would be clear: you know, this is a machine, this is not a human.

I think there are probably like thousands of little choices like that, including sort of how friendly is your AI, what kinds of ways is it friendly, that we're just starting to explore.

And I hope we have lots of

time to kind of experiment with those different possibilities.

Very cool.

Well, that is our time.

Thank you so much, Helen and Ajeya.

Thank you, Casey.

Thank you.

This month on Explain It To Me, we're talking about all things wellness.

We spend nearly $2 trillion on things that are supposed to make us well.

Collagen smoothies and cold plunges, Pilates classes, and fitness trackers.

But what does it actually mean to be well?

Why do we want that so badly?

And is all this money really making us healthier and happier?

That's this month on Explain It to Me, presented by Pure Leaf.