Meta's Chief AI Scientist Yann LeCun Makes the Case for Open Source

Transcript

Speaker 2 Hi, everyone, from New York Magazine and the Vox Media Podcast Network. This is On with Kara Swisher, and I'm Kara Swisher.

Speaker 2 Today, we've got a special episode for you, my conversation with Yann LeCun, Chief AI Scientist at Meta.

Speaker 2 This was recorded live as part of a series of interviews on the future of AI I'm conducting in collaboration with the Johns Hopkins University Bloomberg Center.

Speaker 2 And Yann is really the perfect person for this. He's known as one of the godfathers of AI.
Some even call him an early AI prophet.

Speaker 2 He's been pushing the idea that computers could develop skills using artificial neural networks since the 1980s, and that's the basis for many of today's most powerful AI systems.

Speaker 2 Yann joined what was then known as Facebook as director of AI research in 2013, and he currently oversees one of the best-funded AI research organizations anywhere.

Speaker 2 He's also been a longtime professor at New York University and received the 2018 Turing Award, which is often called the Nobel Prize of Computing, together with Geoffrey Hinton and Yoshua Bengio for their breakthroughs on deep neural networks that have become critical components of computing.

Speaker 2 Yann is also a firebrand. He's pretty outspoken politically.
He's not a fan of president-elect Donald Trump or Elon Musk and lets you know it on social media.

Speaker 2 And is also not without controversy in his own field.

Speaker 2 When others, including Hinton and Bengio, started warning about the potential dangers of unmitigated AI research and calling for government regulation, Yann called it BS.

Speaker 2 In fact, he said that regulating AI R&D would have apocalyptic consequences. I want to talk to him about this dispute.

Speaker 2 We'll also get into what Meta is doing in this space right now, where he sees the potential and risks for all the new generative AI agents coming on the market, and perhaps most importantly, how close we are to artificial general intelligence or AGI.

Speaker 3 Support for On with Kara Swisher comes from Saks Fifth Avenue. Saks Fifth Avenue makes it easy to holiday your way, whether it's finding the right gift or the right outfit.

Speaker 3 Saks is where you can find everything from a lovely silk scarf from Saint Laurent for your mother or a chic leather jacket from Prada to complete your cold weather wardrobe.

Speaker 3 And if you don't know where to start, Saks.com is customized to your personal style so you can save time shopping and spend more time just enjoying the holidays.

Speaker 3 Make shopping fun and easy this season and get gifts and inspiration to suit your holiday style at Saks Fifth Avenue.

Speaker 2 Welcome. Thank you for joining me here at the new Johns Hopkins University Bloomberg Center for this special live conversation.

Speaker 2 You're obviously known as one of the godfathers of AI because of your foundational work on neural networks.

Speaker 2 There's a few people like that who've been around, which is the basis for today's most powerful AI systems. For people who don't know, AI has been with us for a while.
It's just reached a moment.

Speaker 2 You know, we're here with a new administration coming in. And I have to tell you, you are the most entertaining person on social media
that's a wonk that I've ever met.

Speaker 2 You're also quite outspoken as a scientist, as a person, I think as a citizen is what you're talking about. And I promised the Meta PR people that I wouldn't get them fired today.

Speaker 2 But

Speaker 2 you're an astonishing person. I just want to like, I'm going to read a few and I want you to talk about why you do this.

Speaker 2 I don't see a lot of people in tech do this except for Elon Musk, but you actually, I like.

Speaker 2 So you write, Trump is a threat to democracy. Elon is his loudest advocate.
You won't get me to stop fighting enemies of democracy.

Speaker 2 Elon didn't just buy Twitter. He bought a propaganda machine to influence how you think.
Those were the nice ones.

Speaker 2 As I've said multiple times about Elon, I like his cars, his rockets, satellite network. I disagree with his stance on AI existential risk.
I don't like his constant hype.

Speaker 2 I positively hate his newfound, vengeful, conspiracist, paranoid, far-right politics.

Speaker 2 I'm nicer to him than you are, and that's the thing.

Speaker 2 And you talk about this a lot, and

Speaker 2 you've been pretty...

Speaker 2 Not supportive of Donald Trump, too. I'm not going to read them all, but they're tough, tougher than I've ever been.
So I want to talk about that.

Speaker 2 You've gotten in open disputes with Elon. You've called President Trump a pathological liar.

Speaker 2 And Mark was just in Mar-a-Lago enjoying a lovely meal on the terrace there.

Speaker 2 Talk about your relationship with the upcoming administration and how you're going to, are you going to have to start to not do this? Or do you give a fuck?

Speaker 1 Well,

Speaker 1 I mean, I worry about many things, but

Speaker 1 or I'm interested in a lot of questions.

Speaker 1 I'm sort of

Speaker 1 politically pretty clearly a classic liberal,

Speaker 1 right?

Speaker 1 Which

Speaker 1 on the European political spectrum puts me right in the center.

Speaker 1 Not in the US.

Speaker 2 No.

Speaker 1 And

Speaker 1 what got me riled up with Elon was when he started sort of attacking the institutions of higher learning, of science

Speaker 1 and scientists, like Anthony Fauci and things like this. And I'm a scientist, I'm a professor as well as

Speaker 1 an executive at Meta.

Speaker 1 And I have a very independent voice. And I really appreciate the fact that at Meta I can have an independent voice.
I'm not doing corporate speech, as you can tell.

Speaker 1 So

Speaker 1 that tells you something about, I think, how the company is run.

Speaker 1 And it's also reflected in the fact that in the research lab that I started at Meta, we publish everything we do. We distribute our code in open source.

Speaker 1 We're so very open about things and about our opinions as well.

Speaker 1 So

Speaker 1 that's the story.

Speaker 2 So now he's at the red-hot center of things. How are you going to cope with that going forward?

Speaker 1 Well, I mean, I met Elon a bunch of times.

Speaker 1 He can be reasonable.

Speaker 1 I mean, you have to work with people, right? Regardless of disagreements about political or philosophical opinions, at some point you have to work with people.

Speaker 1 And that's what's going to happen between...

Speaker 1 I don't do policy at Meta, right? I'm a... You don't.

Speaker 1 I work on fundamental research, right? I don't do content policy. I don't do any of that.

Speaker 1 I talk to a lot of governments around the world, but mostly about AI technology and how it impacts

Speaker 1 their policies.

Speaker 1 But

Speaker 1 I don't have any influence on

Speaker 1 the relationship between Meta and the political system.

Speaker 2 I'm curious why you then didn't, why you went to a place like Meta versus, in the old days, you would have been at a big research university or somewhere else. How do you look at your power then?

Speaker 2 What is your influence? I mean, you're sort of saying I'm just a simple scientist making, you know, making things.

Speaker 1 Okay, I'm also an academic, right? I'm a professor at NYU.

Speaker 1 And I've kept my position at NYU. When Mark Zuckerberg approached me 11 years ago, almost to the day,

Speaker 1 He asked me to kind of create basically a research lab in AI for Meta because he had this vision that this was going to have a big impact and a big importance. He was right.

Speaker 1 And I told him, I only have three conditions. I don't move from New York.
I don't quit my job at NYU. And all the research that we're going to do, we're going to do it in the open.

Speaker 1 We're going to publish everything we do. And we're going to open source our code.
And his answer was, yes, yes. And the third thing is, you don't have to worry about this.

Speaker 1 It's in the DNA of the company. We already open source all of our platform code.
And so, yeah, there's no problem with that. And this is not an answer I would have had anywhere else

Speaker 1 that would have had the resources to create a research lab. And I had the opportunity there.
I was given the opportunity basically to create a research organization in industry from scratch

Speaker 1 and basically shape it the way I thought was proper. I had some experience with this because I started my career at Bell Labs.

Speaker 1 So I had some experience with sort of, you know, how you do real ambitious research in industry.

Speaker 1 So I thought that was the most exciting challenge.

Speaker 2 So Trump recently named David Sacks as the AI and crypto czar. For those who don't know, Sacks is an investor and part of the PayPal Mafia, also a longtime friend of Elon Musk.

Speaker 2 He's shifted his politics pretty dramatically. Talk about the, is there a need for that right now in Washington as someone who's doing this research or do you not care whatsoever?

Speaker 2 Does it matter to you? Is it important that the government do something like this? Oh, absolutely. And tell us why.

Speaker 1 Well, there's a number of different things. The first thing is to not make regulations that make open source AI platforms illegal, because I think it's essential

Speaker 1 for the future of

Speaker 1 not just the progress of technology, but also the way people use them, like to make it widely disseminated and everything. So that's the first thing.

Speaker 1 The second thing is,

Speaker 1 and by the way, there is no problem regulating products that are based on AI. That's perfectly fine.
I'm not anti-regulation or anything.

Speaker 1 The second thing is

Speaker 1 the academic world is falling behind and has a hard time contributing to the progress of AI because of the lack of computing resources.

Speaker 1 And so I think there should be resources allocated by the government to

Speaker 1 give computing resources to academics.

Speaker 2 To academics. Now this is, as you said, it's shifted rather dramatically because academics is where a lot of the research, the early computing research, was done, and now it's moved away from that.

Speaker 2 Andrew Ferguson has been tapped to head the FTC. Former Fox News anchor Pete Hegseth is nominated for Defense Secretary.

Speaker 2 Ferguson seems to want to roll back any attempts to be a regulator. Is this important for government to be more active in this area?

Speaker 1 It's certainly important for the government to be more informed

Speaker 1 and educated about it.

Speaker 1 But

Speaker 1 I mean, active certainly, for the reasons that

Speaker 1 I said before, because there's probably an industrial policy to have.

Speaker 1 All the chips that enable AI at the moment are all fabricated in Taiwan,

Speaker 1 designed by a single company.

Speaker 1 There's probably something to do there to sort of

Speaker 1 maybe

Speaker 1 make the landscape a little more competitive or

Speaker 1 for chips, for example.

Speaker 1 And there's another question, I think, that's really crucial also, and that

Speaker 1 has consequences not just for the US government but governments around the world, which is that, you know, AI is quickly going to become a kind of

Speaker 1 universal knowledge platform, basically, you know, sort of a repository of all human knowledge. But that can only happen with free and open source platforms that are trained on data from around the world.

Speaker 1 You can't do this within the walls of a single company on the West Coast of the U.S. You can't have a system speak all 700 languages of India, or however many there are.

Speaker 1 So eventually, those platforms will have to be trained in a distributed fashion with lots of contributors from around the world.

Speaker 1 And it will need to be open.

Speaker 2 I know you worry about premature regulation, stifling innovation, but you signed an open letter to President Biden against his AI executive order. Talk about why you did that more broadly.

Speaker 2 What role you think the government should play exactly.

Speaker 1 So I think there were plenty of completely reasonable things in that executive order. Similarly, in the EU AI Act,

Speaker 1 like for protection of privacy and things like this, which make complete sense.

Speaker 1 What I really sort of disagreed with, both in the EU AI Act in its original form and in the executive order, is that there was a limit established where if you train a model with more than 10 to the 24th or 10 to the 25th FLOPs,

Speaker 1 you have to basically get a license from the government

Speaker 1 or get authorization of some kind, based again on the idea that AI is intrinsically dangerous, that if you're above a certain level of sophistication it's intrinsically dangerous. And I completely disagree with this approach.
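To make the idea of a compute threshold concrete, here is a rough sketch of how training compute is often estimated. It assumes the common rule of thumb that training a dense model costs about 6 floating-point operations per parameter per training token; that rule, the 10^25 cutoff used below, and the two model sizes are illustrative assumptions, not figures from the executive order, the EU AI Act, or Meta.

```python
# Rough estimate of training compute, assuming the common
# "6 * parameters * tokens" rule of thumb for dense models.
# All model sizes below are hypothetical, chosen only for illustration.

THRESHOLD_FLOPS = 10**25  # illustrative cutoff, as mentioned in the conversation


def training_flops(n_params: int, n_tokens: int) -> int:
    """Approximate total floating-point operations for one training run."""
    return 6 * n_params * n_tokens


def exceeds_threshold(n_params: int, n_tokens: int) -> bool:
    """True if the estimated compute is above the illustrative cutoff."""
    return training_flops(n_params, n_tokens) > THRESHOLD_FLOPS


# Hypothetical runs: (parameters, training tokens)
runs = {
    "70B params, 15T tokens": (70 * 10**9, 15 * 10**12),
    "400B params, 15T tokens": (400 * 10**9, 15 * 10**12),
}

for name, (params, tokens) in runs.items():
    flops = training_flops(params, tokens)
    print(f"{name}: {flops:.2e} FLOPs, over threshold: {exceeds_threshold(params, tokens)}")
```

Under these assumptions the smaller run lands just below the cutoff and the larger one above it, which shows how a fixed number of this kind draws a line based on scale alone.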

Speaker 1 I mean, there are

Speaker 1 important questions about AI safety that need to be discussed, but a limit on computation just makes absolutely no...

Speaker 2 Makes no sense.

Speaker 2 Recently, many big tech companies rolled out either LLM updates or new AI agents or AI features. I want to get an overview of what you're doing at Meta right now.
It's a little different.

Speaker 2 You released Llama 3.3 as the latest update that powers Meta AI. Talk about

Speaker 2 what it does, and I'm going to ask you to compare it to other models out there and be honest. Like,

Speaker 2 how good is it compared to the others, and how do you look at that?

Speaker 1 Scientists need to be honest.

Speaker 1 I mean, the main difference between Llama and most of the other models is that it's free and open.

Speaker 2 Right, open source.

Speaker 1 So technically, it's.

Speaker 2 Explain to people who may not understand what that means.

Speaker 1 Okay, so

Speaker 1 open source software is software that comes to you with the source code. So you can modify it,
compile it yourself, you can use it for free.

Speaker 1 And in most licenses, if you make some improvement to it and you want to use it in a product, you have to release your improvement as well in the form of source code. So that allows

Speaker 1 platform-style software to progress really quickly. And it's been astonishingly successful

Speaker 1 as a way to distribute platform software over the years. The entire internet runs on open source software.
Most computers in the world run on Linux. Almost all computers in the world run on Linux.

Speaker 1 In your car, in your Wi-Fi, whatever. So that's incredibly successful.
And the reason is it's a platform. People need to be able to modify it, make it safer, more secure, et cetera,

Speaker 1 make it run on various hardware.

Speaker 1 That's what happens. And

Speaker 1 it's not by design, it's just the market forces naturally push the industry to pick open source platforms, open source code,

Speaker 1 when it's a platform. Now, for AI, the question of whether something is open source is complicated because when you build an AI system, first of all, you have to collect training data.

Speaker 1 Second, you have to train what's called a foundation model on that training data. Okay, and the training code for that and the data generally are not distributed.

Speaker 1 So Meta, for example, does not distribute the training data nor the training code, or most of it,

Speaker 1 for the Llama models, for example. Okay, then you can distribute the trained foundation model, and so that's what Llama is.

Speaker 1 And it comes with open source code, which allows you to run the system and also fine-tune it any way you want. You don't have to pay Meta.
You don't have to ask questions. You don't have to ask Meta.

Speaker 1 You can do this.

Speaker 1 There are some limits to this that are due to the legal landscape, essentially.

Speaker 2 So why is that better? You make the argument, though. All the others are not.
They're closed systems. They develop their own.

Speaker 1 There are a few others that are open.

Speaker 1 OpenAI, Anthropic, and Google are closed.

Speaker 2 Why did they choose that from your perspective?

Speaker 1 Well, quite possibly to

Speaker 1 get a commercial advantage. Like, if you want to derive revenue directly from a product of this type, and you think you are ahead technologically, or you think you can be ahead technologically,

Speaker 1 and your main source of revenue is going to come from

Speaker 1 those services, then maybe there is an argument for keeping it closed.

Speaker 1 But this is not the case for Meta.

Speaker 1 For Meta, AI tools are part of a whole set of

Speaker 1 experiences, which are all funded by advertising, right?

Speaker 1 And so that's not the main source of revenue. On the other hand, what we think is that the platform will progress faster.
In fact, we've seen this.

Speaker 2 More innovative because it's. More innovative.

Speaker 1 There are a lot of innovations that we would not have had the idea of, or that we didn't have the bandwidth to do, that people have done because they had the Llama system in their hands and they were able to experiment with it and sort of come up with new ideas.

Speaker 2 So one of the criticisms is that you were behind and this was your way to get ahead.

Speaker 2 How do you address that? I've heard that from your competitors.

Speaker 1 So

Speaker 1 there's an interesting history to all of this, right?

Speaker 1 So first of all,

Speaker 1 you have to realize that everyone in the industry, except Google,

Speaker 1 to build AI systems, uses an open source software platform called PyTorch, which was originally developed at Meta.

Speaker 1 Meta transferred the ownership of it to the Linux Foundation, so now it's not owned by Meta anymore. But OpenAI, Anthropic, everybody uses PyTorch.

Speaker 1 So, without Meta, there would not be ChatGPT and Claude and all of those things.

Speaker 1 Or not to, you know, the same extent that they are today.

Speaker 1 There have been developments. The underlying techniques that are used in tools like ChatGPT were invented in various places. OpenAI made some contributions back when they were not secretive.

Speaker 1 Google certainly made some contributions.

Speaker 2 I like how you just put that in there when they were not secretive.

Speaker 1 When they were not secretive, because they became secretive, right? They kind of clammed up

Speaker 1 in the last three years or so. Google clammed up too, to some extent, not completely, but they did.
And Anthropic has never been open.

Speaker 1 So

Speaker 1 they sort of tried to push the technology in secret.

Speaker 1 I think we are perhaps at Meta, we're a pretty large research organization, and we also have an applied research and advanced development organization called Gen AI.

Speaker 1 The research organization is called FAIR. That used to mean Facebook AI Research.
Now that means Fundamental AI Research.

Speaker 1 And it's about 500 people. And what we're working on is really sort of the next generation AI system.
So beyond LLMs, beyond large language models, beyond chatbots. There was this idea by some people

Speaker 1 in the past that you take LLMs like the

Speaker 1 ChatGPT, Meta AI, Gemini of the world, and you just scale them up, train them on more data with more compute, and somehow sort of human-level intelligence will emerge from it.

Speaker 1 And I never believed in this concept. Right.

Speaker 2 We've reached the end and there's no more data. Right.

Speaker 1 And it's pretty clear that we're reaching kind of a ceiling in the performance of those systems because we basically run out of natural data.

Speaker 1 Like all the text that's publicly available on the internet is currently being used to train all those

Speaker 1 LLMs. And we can't get much more than that.
So people are kind of generating synthetic data and things like this. But we're not going to improve this by a factor of 10 or 100.

Speaker 1 So it's hitting a saturation.

Speaker 1 And what we're working on is basically the next generation AI system that is not based on just predicting the next word. So an LLM is called a large language model

Speaker 1 because it's basically trained to just predict the next word in a text.

Speaker 1 You collect

Speaker 1 typically something like 20 trillion words, something of that order. That's all the publicly available text on the internet with some filtering.

Speaker 1 And you train some gigantic neural net with billions or hundreds of billions of tunable parameters in it.

Speaker 1 to just predict the next word. Given a sequence of a few thousand words, can you predict the next word that will occur?

Speaker 1 You can never do this exactly, but what those systems do is that they predict basically a probability distribution over words, which you can use to then generate text.
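The next-word prediction described above can be illustrated with a toy sketch. It replaces the gigantic neural net with simple word-pair counts over a made-up one-sentence corpus, so it is only an illustration of "predict a probability distribution over words, then sample from it to generate text", not how Llama or any production LLM is implemented. The function names and the corpus are hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Turn follower counts into a probability distribution over next words."""
    total = sum(counts[prev].values())
    return {word: c / total for word, c in counts[prev].items()}


def generate(counts, start, length, seed=0):
    """Generate text by repeatedly sampling the next word from the distribution."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        dist = next_word_distribution(counts, out[-1])
        if not dist:  # the last word never had a follower in the training text
            break
        words, probs = zip(*dist.items())
        out.append(rng.choices(words, weights=probs, k=1)[0])
    return " ".join(out)


corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
print(next_word_distribution(model, "the"))
print(generate(model, "the", 8))
```

In this toy corpus "the" is followed by "cat" twice and by "mat" and "fish" once each, so the distribution gives "cat" half of the probability. Nothing in the sampling step checks whether the generated sequence is true or sensible, which is the limitation raised next in the conversation.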

Speaker 1 Now, there is no guarantee that whatever sequence of words is produced makes sense,

Speaker 1 you know, doesn't generate confabulations or make stuff up, right?

Speaker 1 So what a lot of the industry has been working on is basically fine-tuning those systems, training them with humans in the loop, to train them to do particular tasks and not produce nonsense.

Speaker 1 And also

Speaker 1 to kind of interrogate a database or search engine where they don't actually know the answer. And so you have to have systems that can actually detect whether they know the answer or not.

Speaker 1 And then perhaps generate multiple answers and then pick which ones are good.

Speaker 1 But ultimately, this is not how future AI systems will work.

Speaker 2 So talk about that. Last week, Meta released Meta Motivo.

Speaker 2 It's made to make digital avatars that seem more lifelike, because I understand.

Speaker 2 I feel like it's Mark trying to bring the metaverse and make it happen again. But

Speaker 2 talk about how it's

Speaker 2 what it is. I don't quite understand it, because there's a lot of money you're all investing in all these things.
Yeah, a lot of money. To make something that people would want to buy, right?

Speaker 2 Not just to make better advertising. You've got to have a bigger goal than that.

Speaker 1 Okay.

Speaker 1 Let you in on the secret. I'm wearing smart glasses right now, right?

Speaker 2 Yes, I have a pair myself.

Speaker 1 It's got, it's pretty cool, right? It's got cameras. Yeah.
If you're smart, I can take a picture of you guys. Yeah, yeah.

Speaker 2 This is how far we've come.

Speaker 2 I had one of the first pairs of Google Glass, but it's a low bar from that.

Speaker 1 Go ahead. Now, here's the thing.
Eventually,

Speaker 1 we'll be walking around, you know, we're talking five, ten years from now, we'll be walking around with smart glasses, perhaps other smart devices, and they will have AI assistants in them.

Speaker 1 This one has one. I can talk to Meta AI through this, right?

Speaker 1 And, you know, those things will be sort of assisting us in our daily lives.

Speaker 1 And we need those systems to have essentially human-like intelligence, human-level intelligence, or perhaps even superhuman intelligence in many ways.

Speaker 1 And now, how do we get to that point? And we're very far from that point.

Speaker 1 Some people are kind of making us believe that we're really close to what they call AGI, artificial general intelligence. We're actually very far from it.

Speaker 1 I mean, when I say very far, it's not centuries. It may not be decades, but it's several years.

Speaker 1 And the way you can tell is that

Speaker 1 the type of task, right, we have LLMs that can pass the bar exam or

Speaker 1 pass some college exam or whatever.

Speaker 1 But where is our domestic robot that cleans the house and

Speaker 1 clears up the dinner table and fills up the dishwasher? We don't have that. And it's not because we can't build the robots.
We just cannot make them smart enough.

Speaker 1 We can't get them to understand the physical world. Turns out the physical world is much harder for AI systems to understand than language.
Language is simple.

Speaker 1 I mean, it's kind of counterintuitive for humans to think that, you know, we think language is the pinnacle of intelligence.

Speaker 1 It's actually simple, because it's, you know, just a sequence of discrete symbols.

Speaker 1 We can handle that. The real world, we don't.

Speaker 1 So what we're working on basically are kind of new architectures, new systems that understand the physical world and learn to understand the physical world the way babies and young animals do it by basically observing the world and acting in it.

Speaker 1 And those systems will eventually be able to plan sequences of actions so as to fulfill a particular goal. And that's what we call agentic, right?

Speaker 1 So an agentic system is a system that can plan a sequence of actions to arrive at a particular result. Right now, the agentic systems that everybody talks about don't actually do this planning.

Speaker 1 They kind of cheat a little bit. They kind of learn templates of plans.

Speaker 2 Right, but they can't do this. You're also working on, The Information just reported, Meta is developing an AI search engine.

Speaker 2 So, well, I assume you want to best Google search.

Speaker 2 Is that true? And do you think that's important?

Speaker 1 Well,

Speaker 1 a component of an intelligent assistant that you want to talk to

Speaker 1 obviously is search. You want to search for facts, right, and link to the sources of that fact so that the person you talk to kind of trusts

Speaker 1 the results. So search engine is a component of an overall complete AI system.

Speaker 2 And an end run around the Google system, presumably.

Speaker 1 Well, I mean,

Speaker 1 the goal is not necessarily to compete with Google directly, but to

Speaker 1 serve people who want an AI system.

Speaker 2 So what do you imagine it's going to be for? Because most people perceive that Meta was lagging in the AI race, especially with all the hype around ChatGPT.

Speaker 2 But Mark Zuckerberg just said it had nearly 600 million monthly active users and on track to be the most used AI globally by the end of the year.

Speaker 2 It's very different from what people are doing on ChatGPT, which is a standalone app or with search. So what is it for for you besides to make advertising more efficient?

Speaker 2 I know Mark has talked about that, but from your perspective and Meta's perspective, what is it for for Meta? What does it mean for Meta?

Speaker 1 It is that vision of the future where everyone will have an AI assistant with them at all times. And it's going to completely, I mean, it's a new computing platform, right?

Speaker 1 I mean, before we used to call this a metaverse, but I mean, those glasses eventually will have displays, you know, augmented reality displays.

Speaker 1 I mean, there's already demonstrations of this with the Orion

Speaker 1 project that

Speaker 1 was shown recently.

Speaker 1 We can't build them cheap enough

Speaker 1 right now, so we can't sell them yet, but eventually

Speaker 1 they'll be there. So that's that vision, that long-term vision.

Speaker 2 So to be our helper, our agent.

Speaker 1 It'll be our helper, daily helper.

Speaker 1 I mean, it's like everyone will walk around with a virtual assistant, which is like a human assistant, basically, or eventually like a staff of really smart people, maybe smarter people than you, working for you.

Speaker 2 That's great. But right now, Meta is forecasting to spend between $38 billion and $40 billion.
Google says it's going to spend more than $51 billion. It's spent this year.

Speaker 2 Analysts predict Microsoft's spend will come close to $90 billion.

Speaker 2 Too much spending? Marc Benioff recently told me it was a race to the bottom.

Speaker 2 Are you worried about being outspent?

Speaker 2 To get me a smarter assistant doesn't seem to be a great business, but I don't know. I didn't take the job at Facebook when I was offered it in the early days, so don't ask me, but go ahead.

Speaker 1 Well,

Speaker 1 it's a long-term investment. I mean, you need the infrastructure to be able to run those

Speaker 1 AI assistants at a reasonable speed for a growing number of people.

Speaker 1 As you say, there are 600 million people using Meta AI

Speaker 1 right now.

Speaker 1 By the way, there's another interesting number. The open source engine, Llama,

Speaker 1 on top of which Meta AI is built, but which is open source, has been downloaded 650 million times.

Speaker 1 That's an astonishing number.

Speaker 1 I don't know who all these people are, by the way,

Speaker 1 but that's an astonishing number. There are 85,000 projects that have been derived from Llama that are publicly available, all open source.
Mostly in parts of the world, a lot of those projects are...

Speaker 1 basically training Llama, for example, to speak a bunch of languages from

Speaker 1 Senegal or from India or

Speaker 2 so you don't think this money is ill-spent?

Speaker 1 No, I don't think so because there's going to be a very large population who will use those AI systems on a daily basis, you know,

Speaker 1 within a year or two and then growing. And then those systems are more useful if they're more powerful and the more powerful they are, the more expensive they are computationally.

Speaker 1 So this investment is investment in infrastructure.

Speaker 2 In infrastructure, what's happening by private companies. Now you said the concentration of proprietary AI models in the hands of just a few companies was a huge danger.

Speaker 2 Obviously, there have also been critics of the open source model. They worry that bad actors could use them to spread disinformation, cyber warfare, bioterrorism.

Speaker 2 Talk about the difference.

Speaker 2 Does Meta have a role in preventing that happening, given you're handing these tools, these powerful tools, in an open source method?

Speaker 1 Okay, so this was a huge debate. It was.

Speaker 1 You know, in the,

Speaker 1 you know, just

Speaker 1 until fairly recently, you know, early 2023, when we started

Speaker 1 distributing Llama, the first Llama was not open source. You had to ask permission and you had to show that you were a researcher.

Speaker 1 And it's because, you know, the legal landscape was uncertain and we didn't know what people were going to do with it. So

Speaker 1 it wasn't open source. But then all of us at Meta received a lot of requests from industry saying like, you have to open source the next version because this is going to create a whole industry.

Speaker 1 It's going to enable a lot of startups and kind of new products and new things.

Speaker 1 And so we had a big internal discussion for several months,

Speaker 1 a weekly discussion, two hours with 40 people from Mark Zuckerberg down.

Speaker 1 Very serious discussions about this, about safety, about legal landscape, about all kinds of questions.

Speaker 1 And then at some point, the decision was made by Mark to say, okay, we're going to open source Llama 2.

Speaker 1 Tell me how to do it. And that was done in kind of summer 2023.

Speaker 1 And since then, it's basically completely jump-started a whole industry.

Speaker 2 Why is it more safe than these proprietary models that are controlled by the companies?

Speaker 1 Because there are more eyeballs on it. And so there are more people kind of fine-tuning them for all kinds of things.

Speaker 1 And so there was a question as to, you know, maybe a lot of badly intentioned people will get their hands on it and then will use it for nefarious purposes.

Speaker 2 Well, Chinese researchers developed an AI model for military use with an older version of Meta's Llama model as a backbone.

Speaker 1 It's actually a very kind of minor bad thing, and they could have used one of the many excellent open source Chinese models. The one called Qwen is really good; it's on par with the best.

Speaker 1 So, I mean, the Chinese have good research, good engineers. They open source a lot of their own models.

Speaker 1 This is not like...

Speaker 2 So you don't think that's Meta's responsibility. You put it out there, the tools, and then what people do with it.

Speaker 1 No, it is to some extent, of course. So there is a big effort in the Llama team, in the Gen AI organization, to red team all the systems that we put out so that we ensure that they are, at least when they come out, minimally toxic and things like that, right? And mostly safe. That's a really important effort, actually.

Speaker 1 We even initially gave Llama to a bunch of hackers at DEF CON and sort of had them try to do something bad with it.

Speaker 1 And the result is we haven't been aware of anything really bad done with any of the models that we've been distributing over the last almost two years.

Speaker 2 "Yet" would be the word I would put behind that.

Speaker 1 Well, yeah, but, you know, it would have happened already. I mean, the public doesn't realize this because they think it just appeared with ChatGPT, but there have been LLMs, open source LLMs, available for many years before that.

Speaker 1 And I don't know if you remember this, but when OpenAI came up with GPT-2, they said, oh, we're not going to open source it because it's very dangerous. People could do really bad things. They could flood the internet with disinformation and blah, blah, blah. So we're not going to open source it.

Speaker 1 I made fun of them because, I mean, it was kind of ridiculous. At the time, the capabilities of the system really were not that bad.

Speaker 1 And so, I mean, you have to accept the fact that those things have been available for several years and nothing really bad has happened. There was some bit of worry that people would use this for disinformation in the run-up to the elections in the U.S., and all kinds of things like this, cyber attacks and things. None of that really has happened.

Speaker 2 It's still good to be worried about such things.

Speaker 1 Well, I mean, you have to be watchful and do what you can to prevent those things from happening. My point is, you know, you don't need any of those AI systems for disseminating disinformation, as Twitter has shown us.

Speaker 1 Okay, good, there, good.

Speaker 2 I like how you get your little digs in. I'm watching it very carefully. You did an Elon one, the secretive drama queens of OpenAI. I got that.

Speaker 2 So you also got a lot of flak online recently for saying that cultural institutions, libraries, foundations should make their content available for training by free and open AI foundation models like Llama, presumably.

Speaker 2 You were responding to a new data set that Harvard released, made up of over a million books. But those are public domain works, not works by living authors, artists, academics.

Speaker 2 Talk about the concerns, and the flak you got, about these AI models vacuuming up all of our cultural knowledge from the creators, writers, and researchers without them getting any credit.

Speaker 2 I mean, internet companies are known for scraping.

Speaker 2 I think Walt called Facebook, when it used to be called Facebook, "rapacious information thieves," but he may have been talking about Google.

Speaker 2 So talk to me about that, the controversy that happened with that. Okay.

Speaker 1 Outside of all of those kind of legal questions, if you have this vision that AI is going to be the repository of all human knowledge, then all human knowledge has to be available to train those models, right?

Speaker 1 And most of it is either not digitized or digitized but not available publicly. And it's not necessarily copyrighted material.

Speaker 1 It could be the entire content of the French National Library, a lot of which is digitized but not available for training. So I was not necessarily talking about copyrighted work in that case.

Speaker 1 It's more like, you know, my family, my father's family, is from Brittany, okay, the western part of France, right? The traditional language spoken there, which was spoken until my great-grandfather, is Breton. Breton is disappearing. There is something like 30,000 people speaking it on a daily basis, which is very small.

Speaker 1 If you want future LLMs to speak Breton, there needs to be enough training data in Breton. Where are you going to get that?

Speaker 1 You're going to have cultural non-profits, you know, kind of collecting all the stuff that they have, maybe governments helping, things like that.

Speaker 1 And they're going to say, like, you know, use my data. Like, I want your system to speak Breton. Now, they may not want to just hand that data, just like that, to, you know, big companies on the west coast of the US.

Speaker 1 But a future that I envision (this is not company policy, right? This is my view) is that the best way to get to that level is by kind of training a common AI system, a repository of all human knowledge, in a distributed fashion, so that there would be several data centers around the world using local data to contribute to training a global system. And you don't have to copy that data.
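One way to picture "local data contributes to a global system without being copied" is federated averaging, sketched below in Python. This is an illustrative assumption, not something named in the conversation: the one-parameter model `y = w * x`, the `sites` data, the learning rate, and the function names are all hypothetical. Each site trains on its own data, and only the updated weight and a sample count leave the site.

```python
def local_update(w, data, lr=0.1, steps=20):
    """Run a few gradient steps on one site's private data for the model y = w * x."""
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, sites):
    """One round: every site trains locally, then weights are averaged by sample count."""
    updates = [(local_update(w_global, data), len(data)) for data in sites]
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Hypothetical private datasets held at three separate sites (all follow y = 2x).
sites = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0)],
]

w = 0.0
for _ in range(10):
    w = federated_round(w, sites)
print(w)  # close to 2.0, the slope shared by all three sites
```

The raw `(x, y)` pairs never appear in `federated_round`'s return value; only the averaged weight does, which is the property the passage describes.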

Speaker 2 Who runs that global system?

Speaker 1 Who writes Linux?

Speaker 1 Okay.

Speaker 2 Right. So that should exist for all of humanity.

Speaker 1 Yeah. I mean, who pays for Wikipedia? Right.

Speaker 2 I pay $7 a month because

Speaker 1 For Linux, in that case, actually, Linux is mostly supported by employees of companies who tell them to actually distribute their contributions. And you can have kind of a similar system where everyone contributes to this kind of global model.

Speaker 2 That's AI for everybody else.

Speaker 1 Which is AI, you know, LLMs in the short term.

Speaker 2 It's not necessarily monetizable. Yeah.

Speaker 1 Well, you monetize on top of it, right? I mean, Linux, you don't pay for Linux, but if you buy a widget that runs Linux, like an Android phone or a car that has Linux in its touchscreen, you pay for the widget that you buy. So it's going to be the same thing with AI. The basic foundation model is going to be open and...

Speaker 2 It does feel like it's a coalescing of a small number of powers running everything. It does at this point. And that vision is a lovely one, but it's not occurring, right?

Speaker 1 Well, in my opinion, it's actually inevitable.

Speaker 2 You've been in a public debate, you like to debate, with other godfathers of AI. Your Turing Award co-winners, Geoffrey Hinton and, I think it's, Yoshua Bengio. Yep.

Speaker 2 They've both been ringing alarm bells, warning about the potential dangers of AI, quite dramatically, I would say. They've called for stricter government regulation and oversight, including of R&D.

Speaker 2 You've called their warnings complete BS. I don't think you minced words there.
Talk to me about why that's complete BS.

Speaker 2 And one of the things you disagreed on was one of the first attempts at AI regulation here in the U.S., California bill SB 1047. Hinton and Bengio both endorsed it. You lobbied against it.

Speaker 2 You wrote, regulating R&D would have apocalyptic consequences on the AI ecosystem. Very dramatic of you, sir.

Speaker 2 You said the illusion of existential risk is being pushed by a handful of, quote, delusional think tanks. These two aren't delusional, I don't believe.
Hinton just won the Nobel Prize for his work.

Speaker 2 Talk about that in particular. And by the way, Governor Newsom vetoed the bill, but is working with people like Stanford professor Fei-Fei Li to overhaul it. Talk about why you called it complete BS.

Speaker 2 You're very strong on this.

Speaker 1 I'm very vocal about that. Yes.

Speaker 1 So, Geoff and Yoshua are both good friends. We've been friends for decades.

Speaker 1 You know, I did my postdoc in 1987-88 with Geoff Hinton. So we've known each other for a very long time, for 40 years now.

Speaker 1 Same with Yoshua. When I met him the first time, he was a master's student and I was a postdoc. So we've been kind of working together.

Speaker 1 We won this prize together because we worked together on sort of reviving interest in what we now call deep learning, which is at the root of a lot of AI technology today.

Speaker 1 So we agree on many things.

Speaker 1 We disagree on a few things, and that's one of them.

Speaker 2 The existential threat to the human race.

Speaker 1 Existential threat.

Speaker 1 So, exactly. So Geoff...

Speaker 1 You're like, ah, no.

Speaker 2 They're like, oh, yeah, they're coming for us.

Speaker 1 I mean, Geoff believes that current LLMs have subjective experience. I completely disagree with this. I think he's completely wrong about that. We disagreed on technical things before.

Speaker 1 It was kind of less public. It was more kind of technical, but it's not the first time we disagree.

Speaker 1 I just think he's wrong. We're still good friends. Yoshua comes from a slightly different point of view. He's worried a little bit about this, but he's more worried about bad people doing bad things with AI systems.

Speaker 2 Yeah, I'm with him.

Speaker 1 Developing, like, bioweapons or chemical weapons or things like this. I think, frankly, those dangers have been formulated for several years now and have been incredibly inflated, to the point of being distorted so much that they really don't make any sense.

Speaker 2 Yes, delusional is the word you use.

Speaker 1 Well, I don't call them delusional. I call some of the other people, who are more extreme and are pushing for regulation like SB 1047, yes, delusional.

Speaker 1 I mean, some people will tell you to your face. A year ago, you asked them, like, how long is it going to take for AI to kill us all? And they'd say, like, five months. And obviously, they were wrong.

Speaker 2 So, this is what you're talking about. It's over AGI, artificial general intelligence, and how close we are. I would like you to explain it for people.

Speaker 2 When they hear it, they think about the plot of Terminator or I, Robot or something like that. So Hinton and Bengio think the timeline for AGI could be more like five years and that we are not prepared.

Speaker 2 You said several years, if not a decade.

Speaker 2 You know, if you're wrong, you're going to be real wrong when it does kill us. So talk about why, you know, you'll be like, oh, we're not dead yet, and then we're dead.

Speaker 2 So talk about why you're not worried.

Speaker 1 So first of all, there's no question that at some point in the future, we're going to have AI systems that are smarter than us. Okay.
It's going to happen.

Speaker 1 Is it five years, 10 years, 20 years? It's really hard to tell.

Speaker 1 In our, or at least my personal, vision of it, the earliest it could happen is about five years, six years, but probably more like 10, and probably longer, because it's probably harder than we think. And it's almost always harder than we think. There is this history over the several decades of AI of people sort of completely underestimating how hard it is.

Speaker 1 And again, we don't have automatic robots, we don't have level five self-driving cars, there's a lot of things that we don't know how to do with AI systems today.

Speaker 1 And so, until we figure out kind of a new set of techniques to get there, we're not even on a path towards human-level intelligence.

Speaker 1 So, a few years from now, once we have kind of a blueprint and some kind of believable demonstration that we might have a path towards human-level AI (I don't like to call it AGI, because human intelligence is very specialized, actually; we think we have general intelligence, we don't), so once we have a blueprint, we're going to have a good way to think about how to make it safe.

Speaker 1 It's kind of like if you backpedal to the 1920s and someone is telling you, you know, in a few decades, we're going to be flying millions of people across the Atlantic at near the speed of sound. And someone would say, like, oh my God, how are you going to make this safe?

Speaker 1 The turbojet was not invented yet. How can you make turbojets safe if you haven't invented the turbojet? We are in this situation today. So making AI safe means designing those AI systems in ways that are safe. But until we have a design, we're not going to be able to make them safe. So the question makes no sense.

Speaker 2 You don't seem worried that AI would ever want to dominate humans.

Speaker 2 You said that current AI is dumber than a house cat.

Speaker 2 Whether AI is sentient or not doesn't seem to matter if it feels real, right?

Speaker 2 And so, if it's dumb, or it doesn't want to dominate us, or it doesn't want to kill us, what would be restrictions on AI, and maybe AI R&D, that you would deem reasonable, if any?

Speaker 2 I think none is what you're saying to me.

Speaker 1 Well, none on R&D.

Speaker 1 I mean, clearly if you want to put out a domestic robot and that robot can cook for you, you probably want to hardwire some rules so that when there are people around the robot and the robot has a knife in its hand, it's not going to flail its arm around or something.

Speaker 1 So those are guardrails. The design of current AI systems, to some extent, is intrinsically unsafe, you could say it this way.

Speaker 1 A lot of people at Meta are going to hate me for saying this, but they're kind of hard to control. You basically have to train them to behave properly.

Speaker 1 What you want, and this is something I've proposed, is another type of architecture, which I call objective-driven, where the AI system basically is there to fulfill an objective and cannot do anything but fulfill this objective, subject to a number of guardrails, which are just other objectives.

Speaker 1 And that will guarantee that whatever output the system produces, whatever action it takes, satisfies those guardrails and objectives and is safe.
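A toy way to read "fulfill an objective, subject to guardrails which are just other objectives" is constrained selection: score candidate actions on a task cost, but only among candidates that every guardrail accepts. The Python sketch below is an illustration under that reading, not the architecture the speaker has proposed; `choose_action`, the candidate speeds, and the speed-limit guardrail are all hypothetical.

```python
from typing import Callable, Optional, Sequence

Action = float
Cost = Callable[[Action], float]


def choose_action(
    candidates: Sequence[Action],
    task_cost: Cost,
    guardrails: Sequence[Cost],
) -> Optional[Action]:
    """Pick the lowest-task-cost action among those that satisfy every guardrail.

    A guardrail is satisfied when its cost is <= 0. Returns None when no
    candidate is admissible, so the system does nothing rather than violate one.
    """
    feasible = [a for a in candidates if all(g(a) <= 0.0 for g in guardrails)]
    if not feasible:
        return None
    return min(feasible, key=task_cost)


# Hypothetical example: an arm "wants" to move at speed 3.0 to finish a task,
# but a guardrail caps the speed at 1.0 while a person is nearby.
candidates = [0.0, 0.5, 1.0, 2.0, 3.0]
task_cost = lambda speed: (speed - 3.0) ** 2
speed_limit = lambda speed: speed - 1.0

print(choose_action(candidates, task_cost, [speed_limit]))  # 1.0
```

Because infeasible actions are removed before the task cost is consulted, no value of the task cost can buy a guardrail violation, which is the sense of "by construction" in this sketch.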

Speaker 1 Now, the next question is: how do we design those objectives? And a lot of people are saying, Oh, we've never done this before. This is completely new.

Speaker 1 We're going to have to kind of invent a new science. No, actually, we're pretty familiar with this.
It's called making laws. We do this with people.

Speaker 1 We establish laws, and the laws basically change the cost of taking actions, right?

Speaker 1 And so, we've been shaping the behavior of people by making laws.

Speaker 1 And we're going to do the same for AI systems. The difference is that people can choose to not respect the law, whereas the AI system by construction will have to.

Speaker 2 Now, both these people, Hinton and Bengio, endorsed a letter signed by current and former OpenAI employees calling for employees at AI companies to have the right to warn about serious risks posed by the technology, since ordinary whistleblower protections wouldn't cover them.

Speaker 2 You didn't endorse it. At the same time, we've seen some regulation in the EU.
They differentiate between high-risk AI systems and more general-purpose models.

Speaker 2 They have bans on certain applications that, quote, threaten citizens' rights: facial images, and, I suppose, this robot who wants to knife you.

Speaker 2 What is the model here to make it safer? Are you suggesting we wait and see when bad things happen before putting up guardrails? Let's wait till there's some murder happening or not? I can't tell.

Speaker 1 No, no, no, that's not what I'm suggesting. I mean, you know, measures like banning massive face recognition in public places, that's a good thing. Like, nobody would really think that's a bad thing, except if you are an authoritarian government.

Speaker 1 Some people think it's a great thing. It already exists in some countries, actually. But, you know, that's a good thing, right? And there are measures like this that make complete sense, but they are at the product level.

Speaker 1 Also, you know, changing the face of someone on some embarrassing video and stuff like that. I mean, it's kind of already illegal, more or less.

Speaker 1 The fact that we have the tools to do it doesn't make it less illegal. There may be a need for specific rules against that, but I have no problem with that.

Speaker 1 I have a problem with this idea that AI is intrinsically dangerous and you need to regulate R&D.

Speaker 1 And the reason I think it's counterproductive is in a future in which you would have those open source platforms I was talking about, which I think are necessary for things like democracy in the future,

Speaker 1 then those rules would be counterproductive. They would basically make open source too risky for any company to distribute.

Speaker 2 So that just private companies will control everything. That's right.

Speaker 1 A small number of private companies on the west coast of the U.S. would control everything. Now, talk to any government outside the U.S. and tell them about this future where everyone's digital diet will be mediated by AI assistants.

Speaker 1 And tell them that this will come from three companies on the west coast of the US. And they say, like, that's completely unacceptable. Right. Like, this is the death of democracy.

Speaker 1 Like, how will people get a diversity of opinions, right? If it all comes from three companies on the west coast of the US, we'll all have the same culture, we'll all speak the same language.

Speaker 1 Like, this is completely unacceptable. So what they want

Speaker 1 are open platforms that then can be fine-tuned for any culture, value system, center of interest, whatever, so that users around the world have a choice.

Speaker 1 They don't have to use like three systems. They can use, you know, the...

Speaker 2 So you're worried about the domination by OpenAI, Microsoft, Google, possibly Amazon.

Speaker 1 Anthropic.

Speaker 2 Anthropic, which is Amazon, really. So, last two questions. You were awarded the 2024 VinFuture Prize.
There's so many prizes in your area. I never get any prizes.

Speaker 2 For transformational contributions to deep learning.

Speaker 2 In your acceptance speech, you said AI does not learn like humans or animals, which take in a massive amount of visual observation from the physical world. But you've been working to make this happen.

Speaker 2 You've been talking about it a while. Where do you imagine it being in years? Will it be like humans or animals or

Speaker 2 where?

Speaker 1 Well, so yeah, I mean,

Speaker 1 there is a point at which we're going to have systems that learn

Speaker 1 a little bit like humans and animals and can learn new skills and new tasks as efficiently as humans and animals, which is, frankly, astonishingly fast.

Speaker 1 Like, we can't reproduce this with machines, right? You know, companies like Tesla and others have hundreds of thousands or millions of hours of cars being driven by people.

Speaker 1 They could use this to train AI systems, which they do.

Speaker 1 They're still not as good as humans.

Speaker 1 We can't buy a car that actually drives itself, or a robotaxi, unless we cheat. Like, Waymo can do it, but there's a lot of tricks to it.

Speaker 1 And again, we can't buy a domestic robot because we can't make them smart enough. The reason for this is very simple.

Speaker 1 As I said before, we train LLMs and chatbots on all the publicly available text and some more. That's about 20 trillion words, right?

Speaker 1 A four-year-old has seen essentially the same amount of data visually as the biggest LLM has seen through text. That text would take any of us several hundred thousand years to read through.

Speaker 1 Okay, so what that tells you is that we're never going to get to human level AI by just training on text. We have to train on sensory input, which is basically an unlimited supply.

Speaker 1 16,000 hours of video is 30 minutes of YouTube uploads.
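The figures quoted above can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch: the word count, the 16,000 hours, and the 30 minutes come from the conversation, while the reading speed and reading hours per day are assumptions chosen for illustration.

```python
# Figures quoted in the conversation.
WORDS_IN_TRAINING_TEXT = 20e12   # "about 20 trillion words"
VIDEO_HOURS = 16_000             # "16,000 hours of video"
UPLOAD_MINUTES = 30              # "30 minutes of YouTube uploads"

# Assumptions (not from the transcript).
WORDS_PER_MINUTE = 250           # assumed adult reading speed
READING_HOURS_PER_DAY = 8        # assumed time spent reading each day


def years_to_read(words, wpm=WORDS_PER_MINUTE, hours_per_day=READING_HOURS_PER_DAY):
    """Years needed to read `words` at the assumed pace."""
    minutes = words / wpm
    days = minutes / 60 / hours_per_day
    return days / 365


def implied_upload_hours_per_minute(video_hours=VIDEO_HOURS, upload_minutes=UPLOAD_MINUTES):
    """Upload rate implied by '16,000 hours is 30 minutes of uploads'."""
    return video_hours / upload_minutes


print(f"{years_to_read(WORDS_IN_TRAINING_TEXT):,.0f} years to read the text")
print(f"{implied_upload_hours_per_minute():.0f} hours of video uploaded per minute")
```

Under these assumptions the reading time comes out in the hundreds of thousands of years, consistent with "several hundred thousand years", and the quoted comparison implies an upload rate of a bit over 500 hours of video per minute.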

Speaker 1 We have way more video data than we know what to do with.

Speaker 1 So the big challenge for the next few years in AI to make progress to the next level is get systems to understand how the world works by basically watching the world go by, watching video, and then interacting in the world.

Speaker 1 And this is not solved, but, you know, there's a good chance that significant progress will be made over the next five years, which is why you see all of those companies starting to build humanoid robots.

Speaker 1 They can't make them smart enough yet, but they're counting on the fact that AI is going to make sufficient progress over the next five years that, by the time those things can be sold to the public, the AI will be powerful enough.

Speaker 2 Right. Now I'm getting the glasses.
I understand what you're up to now, finally.

Speaker 2 I actually believe in a four-year-old more than I believe in most of Silicon Valley, I'll be honest with you.

Speaker 2 This is my very last question, very quick, since we've got to go. I've met people like you, as I was saying, who are like this: it's going to change learning, it's going to change this, it's going to make everything better, everyone's going to get along.

Speaker 2 And as you cite all the time, and I respect you for that, there's hate, there's dysfunction, there's loneliness, self-esteem issues among girls, danger to people who are often in danger, control by billionaires of our government.

Speaker 2 Why do I trust you this time?

Speaker 1 Me?

Speaker 2 You, just you.

Speaker 1 Okay. I'm not a billionaire.

Speaker 2 What?

Speaker 1 I'm not a billionaire. That's the first thing.

Speaker 1 I'm doing okay, though.

Speaker 2 I'm guessing you are.

Speaker 1 Okay.

Speaker 1 I'm first and foremost a scientist. And I would not, you know, be able to look at myself in the mirror unless I had some level of integrity, scientific integrity at least.

Speaker 1 I might be wrong. So you can trust that I'm not lying to you and that I'm not, you know, motivated by nefarious motives like greed or something like this. But I might be wrong. I might very well be wrong. In fact, that's kind of the whole process of science: you have to accept the fact that you might be wrong, and elaborating the correct ideas comes from the collision of multiple ideas and people who disagree.

Speaker 1 But, you know, look at the evidence. If we look at the evidence from the people who said that AI was going to destroy society because we're going to be inundated with disinformation or generated hate speech or things like this, we're just not seeing this at all. We've not seen it. I mean, people produce hate speech.

Speaker 1 People produce disinformation. And they try to disseminate it

Speaker 1 in every way they can. A lot of people are trying to disseminate hate speech on Facebook.
And it's against the content policy at Facebook to do this.

Speaker 1 Now, the best protection we have against this is AI systems.

Speaker 1 We couldn't do this in 2017, for example. 2017 AI technology was not good enough to allow Facebook and Instagram to detect hate speech in every language in the world.

Speaker 1 And what happened in between is progress in AI. Okay, so AI is not the tool that people use to produce hate speech or disinformation or whatever. It's actually the best countermeasure against it. So what you need is just more powerful AI in the hands of the good guys than in the hands of the bad guys.

Speaker 2 I'm worried about the bad guys, but that's a great answer. Thank you so much.
I really appreciate it.

Speaker 2 On with Kara Swisher is produced by Cristian Castro Rossel, Kateri Yoakum, Jolie Myers, Megan Burney, and Kaelyn Lynch. Nishat Kurwa is Vox Media's executive producer of audio.

Speaker 2 Special thanks to Corinne Ruff and Kate Furby. Our engineers are Rick Kwan, Fernando Arruda, and Aaliyah Jackson. And our theme music is by Trackademicks.

Speaker 2 If you're already following the show, you get a free pair of Meta glasses. If not, watch out for that stabby robot.

Speaker 2 Go wherever you listen to podcasts, search for On with Kara Swisher, and hit follow. Thanks for listening to On with Kara Swisher from New York Magazine, the Vox Media Podcast Network, and us.

Speaker 2 We'll be back on Monday with more.
