Carl Shulman (Pt 1) - Intelligence Explosion, Primate Evolution, Robot Doublings, & Alignment

2h 44m

In terms of the depth and range of topics, this episode is the best I’ve done.

No part of my worldview is the same after talking with Carl Shulman. He's the most interesting intellectual you've never heard of.

We ended up talking for 8 hours, so I'm splitting this episode into 2 parts.

This part is about Carl’s model of an intelligence explosion, which integrates everything from:

* how fast algorithmic progress & hardware improvements in AI are happening,

* what primate evolution suggests about the scaling hypothesis,

* how soon before AIs could do large parts of AI research themselves, and whether there would be faster and faster doublings of AI researchers,

* how quickly robots produced from existing factories could take over the economy.

We also discuss the odds of a takeover based on whether the AI is aligned before the intelligence explosion happens, and Carl explains why he’s more optimistic than Eliezer.

The next part, which I’ll release next week, is about all the specific mechanisms of an AI takeover, plus a whole bunch of other galaxy brain stuff.

Maybe 3 people in the world have thought as rigorously as Carl about so many interesting topics. This was a huge pleasure.

Watch on YouTube. Listen on Apple Podcasts, Spotify, or any other podcast platform. Read the full transcript here. Follow me on Twitter for updates on future episodes.

Timestamps

(00:00:00) - Intro

(00:01:32) - Intelligence Explosion

(00:18:03) - Can AIs do AI research?

(00:39:00) - Primate evolution

(01:03:30) - Forecasting AI progress

(01:34:20) - After human-level AGI

(02:08:39) - AI takeover scenarios



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe


Transcript

Speaker 1 Human-level AI is deep, deep into an intelligence explosion.

Speaker 1 Things like inventing the transformer or discovering chinchilla scaling and doing your training runs more optimally or creating flash attention.

Speaker 1 That set of inputs probably would yield the kind of AI capabilities needed for intelligence explosion.

Speaker 1 You have a race between, on the one hand, the project of getting strong interpretability and shaping motivations, and on the other hand, these AIs, in ways that you don't perceive, making an AI takeover happen.

Speaker 1 We spend more compute by having a larger brain than other animals, and then we have a longer childhood. It's analogous to like having a bigger model and having more training time with it.

Speaker 1 It seemed very implausible that we couldn't do better than completely brute force evolution. How quickly are we running through those orders of magnitude?

Speaker 1 Hey, everybody.

Speaker 2 Just wanted to give you a heads up.

Speaker 2 So I ended up talking to Carl for like seven or eight hours. So we ended up splitting this episode into two parts.
I don't want to put all of that on you at once.

Speaker 2 In this part, we get deep into Carl's model of an intelligence explosion and what that implies for alignment.

Speaker 2 The next part, which we'll release next week, is all about the specific mechanisms of an AI takeover.

Speaker 2 In terms of the depth and the range of interesting topics, this set of episodes is the best I've ever done. So I hope you all enjoy.
Here's Carl. Okay,

Speaker 2 today I have the pleasure of speaking with Carl Shulman.

Speaker 2 Many of my former guests, and this is not an exaggeration, many of my former guests have told me that a lot of their biggest ideas, perhaps most of their biggest ideas, have come directly from Carl, especially when it has to do with the intelligence explosion and its impacts.

Speaker 2 And so I decided to go directly to the source, and we have Carl today on the podcast. Carl keeps a super low profile, but he is one of the most interesting intellectuals I've ever encountered.

Speaker 2 And this is actually his second podcast ever. So, we're going to get deep into the heart of many of the most important ideas that are circulating right now, directly from the source.

Speaker 2 So, and by the way, so Carl is also an advisor to the Open Philanthropy Project, which is one of the biggest funders on causes having to do with AI and its risks, not to mention global health and well-being.

Speaker 2 And he is a research associate at the Future of Humanity Institute at Oxford. So, Carl, it's a huge pleasure to have you on the podcast.
Thanks for coming.

Speaker 1 Thank you, Dwarkesh. I've enjoyed seeing some of your episodes recently, and I'm glad to be on the show.

Speaker 2 Excellent. Let's talk about AI.

Speaker 2 Before we get into the details, give me the sort of big picture explanation of the feedback loops and just the general dynamics that would start when you have something that is approaching human-level intelligence.

Speaker 1 Yeah, so I think the way to think about it is: we have a process now where humans are developing new computer chips, new software,

Speaker 1 running larger training runs.

Speaker 1 And

Speaker 1 it takes a lot of work to keep Moore's Law chugging. Well, it was, it's slowing down now.

Speaker 1 And it takes a lot of work to develop things like transformers

Speaker 1 to develop a lot of the improvements to AI and neural networks that are advancing things.
And

Speaker 1 the core method that I think I want to highlight on this podcast, and I think is underappreciated, is the idea of input-output curves.

Speaker 1 So we can look

Speaker 1 at the increasing difficulty of improving chips.

Speaker 1 And so sure,

Speaker 1 each time you double the performance of computers, it's harder. And as we approach physical limits, eventually it becomes impossible.

Speaker 1 But how much harder?

Speaker 1 So there's a paper called Are Ideas Getting Harder to Find that was published a few years ago.

Speaker 1 Only like 10 years ago at MIRI, we did, I mean I did, an early version of

Speaker 1 this analysis using mainly data from Intel and like the large semiconductor fabricators. Anyway, and so in this paper, they cover a period where the productivity of computing went up a million fold.

Speaker 1 So you could get a million times the computing operations per second per dollar. Big change, but

Speaker 1 it got harder. So the amount of investment, the labor force required to make those continuing advancements went up and up and up.

Speaker 1 Indeed, it went up 18 fold over that period.

Speaker 1 So some take this to say, oh, diminishing returns. Things are just getting harder and harder, and so that will be the end of progress eventually.

Speaker 1 However, in a world where AI is doing the work,

Speaker 1 that doubling of computing performance

Speaker 1 translates pretty directly to a doubling or better of the effective labor supply. That is,

Speaker 1 if when we had that million-fold compute increase, we used it to run artificial intelligences

Speaker 1 who would replace human scientists and engineers, then the 18x increase in the labor demands of the industry would be trivial.

Speaker 1 We're getting more than one doubling of the effective labor supply that we need for each doubling

Speaker 1 of the labor requirement. And in that data set, it's like over four.

Speaker 1 So

Speaker 1 we double compute. Okay, now we need somewhat more researchers, but a lot less than twice as many.
And so, okay, we use up

Speaker 1 some of those doublings of compute on the increasing difficulty of further research, but most of them are left to expedite the process. So if

Speaker 1 you double your labor force, that's enough to get several doublings of compute.

Speaker 1 You use up one of them

Speaker 1 on meeting the increased demands from diminishing returns.

Speaker 1 The others can be used to accelerate the process. So

Speaker 1 your first doubling takes however many months, your next doubling can take a smaller fraction of that. The next doubling less and so on, at least insofar as this,

Speaker 1 the outputs you're generating compute for AI in this story, are able to serve the function of the necessary inputs.

Speaker 1 If there are other inputs that you need, eventually those become a bottleneck and you wind up more restricted on those.

Speaker 2 Got it. Okay.
So yeah, I think the Bloom paper had that there was a 35% increase in, was it transistor density or cost per flop?

Speaker 2 And there was a 7% increase per year in the number of researchers required to sustain that pace.

Speaker 1 Something like that, yeah, it's like four to five doublings of compute per doubling of labor inputs.
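
A rough sketch of the input-output arithmetic being discussed here. The 35% and 7% figures are the approximate ones recalled in the conversation, not exact values from the Bloom et al. paper, so treat this as illustrative only.

```python
import math

# Approximate figures from the conversation, not exact values from Bloom et al.
chip_progress_rate = 0.35      # ~35%/year growth in transistor density / compute per dollar
researcher_growth_rate = 0.07  # ~7%/year growth in researchers needed to sustain that pace

# Doublings of compute obtained per doubling of the research workforce.
ratio = math.log(1 + chip_progress_rate) / math.log(1 + researcher_growth_rate)
print(f"~{ratio:.1f} doublings of compute per doubling of researchers")  # ~4.4

# Feedback implication: if the effective AI workforce scales with compute, each
# doubling of "labor" buys 4+ doublings of compute; one covers the rising
# difficulty of research, and the rest shorten the next doubling.
```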

Speaker 2 I guess there's a lot of questions you can delve into in terms of whether you would expect a similar scale with AI and whether it makes sense to think of AIs as a population of researchers that keeps growing with compute itself.

Speaker 2 Actually, let's go there. So, can you explain the intuition that compute is a good proxy for the number of AI researchers, so to speak?

Speaker 1 So far, I've talked about hardware as an initial example because we had good data

Speaker 1 about a past period. You can also make

Speaker 1 improvements on the software side. And when we think about an intelligence explosion, that can include AIs doing work on making hardware better, making better software, making more hardware.

Speaker 1 But the basic idea for the hardware is especially simple in that if you have a worker, an AI worker that can substitute for a human, if you have twice as many computers, you can run two separate instances of them.

Speaker 1 And then they can do two different jobs, manage two different machines, work on two different design problems. Now, you can get more gains than just what you would get by having two instances.

Speaker 1 We get improvements from using some of our compute, not just to run more instances of the existing AI, but to train larger AIs.

Speaker 1 So there's hardware technology, how much you can get per dollar you spend on hardware. And there's software technology.
And the software can be copied freely. So

Speaker 1 if you've got the software, it doesn't necessarily make that much sense to say that, oh, we've got 100 Microsoft Windows. You can make as many copies as you need

Speaker 1 for whatever Microsoft will charge you. But for hardware, it's different.
It matters how much we actually spend on the hardware at a given price.

Speaker 1 And if we look at the changes that have been driving AI recently, that is the thing that is really off-trend. We are spending tremendously more money

Speaker 1 on computer hardware for training big AI models.

Speaker 2 Yep, okay. So

Speaker 2 there's the investment in hardware, there's the hardware technology itself, and there's the software progress itself.

Speaker 2 The AI is getting better because we're spending more money on it, because our hardware itself is getting better over time, and because we're developing better models or better adjustments to those models.

Speaker 2 Where is the loop here?

Speaker 1 The work involved in designing new hardware and software is being done by people now.

Speaker 1 They use computer tools to assist them, but like computer time is not like the primary cost

Speaker 1 for NVIDIA designing chips,

Speaker 1 for

Speaker 1 TSMC, producing them, for ASML, making lithography equipment to serve the TSMC fabs. And even in AI software research, that has become quite compute-intensive.

Speaker 1 But I think we're still in the range where, you know, at a place like DeepMind, salaries were still larger than compute for the experiments.

Speaker 1 Although tremendously, tremendously more of the expenditures were on compute relative to salaries than in the past.

Speaker 1 If you take all of the work that's being done by those humans, there's like low tens of thousands of people working at NVIDIA designing GPUs specialized for AI.

Speaker 1 I think there's more like 70,000 people at TSMC,

Speaker 1 which is the leading producer of cutting-edge chips. There's a lot of additional people at companies like ASML that supply them with the tools they need.

Speaker 1 And then a company like DeepMind, I think from their public filings, they recently had a thousand people. OpenAI, I think, is a few hundred people.
Anthropic is less.

Speaker 1 If you add up things like Facebook AI research, Google Brain, other

Speaker 1 R&D efforts, you get thousands or tens of thousands of people who are working on AI research. We'd want to zoom in on those who are developing new methods rather than narrow applications.

Speaker 1 So inventing the transformer definitely counts. Optimizing for some particular businesses, data set cleaning, probably not.

Speaker 1 But so those people are doing this work. They're driving quite a lot of progress.

Speaker 1 What we observe in the growth of people relative to the growth of those capabilities is that pretty consistently, the capabilities are doubling on a shorter time scale than the people required to do them are doubling.

Speaker 1 And so there's work. So

Speaker 1 we talked about hardware and how historically it was pretty dramatic, like four or five doublings of compute efficiency per doubling of human inputs.

Speaker 1 I think that's a bit lower now as we get towards the end of Moore's Law, although interestingly, not as much lower as you might think, because the growth of inputs has also slowed recently.

Speaker 1 On the software side, there's some work by

Speaker 1 Tamay Besiroglu

Speaker 1 and I think collaborators.

Speaker 1 It may have been a thesis.

Speaker 1 It's called Are Models Getting Harder to Find? And so it's applying the same sort of analysis as the Are Ideas Getting Harder to Find paper.

Speaker 1 And you can look at growth rates of

Speaker 1 papers from citations, employment at these companies, and it seems like the doubling time of these workers driving the software advances is like several years,

Speaker 1 or at least a couple years, whereas the doubling of effective compute from algorithmic progress is faster. So there's a group called Epoch.
They received grants from Open Philanthropy.

Speaker 1 And they do work collecting data sets that are relevant to forecasting AI progress. And so

Speaker 1 their headline results for what's the rate of progress in hardware, in software, and just like growth in budgets are as follows.

Speaker 1 So for hardware, they're looking at like a doubling of hardware efficiency that's like two years.

Speaker 1 It's possible it's a bit better than that when you take into account certain specializations for AI workloads.

Speaker 1 For the growth of budgets, they find a doubling time that's like something like six months in recent years, which is pretty tremendous relative to the historical rates.

Speaker 1 We should maybe get into that later. And then on the algorithmic progress side, mainly

Speaker 1 using ImageNet type data sets right now, they find a doubling time that's less than one year. And so you combine all of these things and the growth of effective compute for training big

Speaker 1 AIs.

Speaker 1 It's pretty, pretty drastic.
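
A back-of-the-envelope combination of the three trends just quoted. This is a sketch assuming the hardware, budget, and algorithmic doubling times are roughly 2 years, 6 months, and a bit under a year respectively; the 0.9-year figure is an assumption standing in for "less than one year".

```python
# Approximate doubling times, in years, for the three factors discussed above.
doubling_times_years = {
    "hardware price-performance": 2.0,
    "training budgets": 0.5,
    "algorithmic efficiency": 0.9,  # stand-in for "less than one year"
}

# When the factors multiply into effective training compute, doublings per year add.
doublings_per_year = sum(1 / t for t in doubling_times_years.values())
print(f"{doublings_per_year:.1f} doublings of effective compute per year")
print(f"combined doubling time ~ {12 / doublings_per_year:.1f} months")
# -> ~3.6 doublings/year, i.e. effective training compute doubling every ~3-4 months.
```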

Speaker 2 I think I saw an estimate that GPT-4 cost like $50 million, around that range, to train. Now, suppose that like AGI takes a thousand X that,

Speaker 2 if you were to just scale up GPT-4,

Speaker 2 it might not be that, I'm just for the sake of example.

Speaker 2 So part of that will come from companies just spending a lot more to train the models and that just greater investment.

Speaker 2 Part of that will come from them having better models, so that what would have naively taken a 10x increase in the model, you can do with a better model without needing as much scale-up.

Speaker 2 You get the same effect of increasing it by 10x just from having a better model. And so, yeah, you can spend more money on it to train a bigger model.

Speaker 2 You can just have a better model, or you can have chips that are cheaper to train, so you get more compute for the same dollars.

Speaker 2 And okay, so those are the three you were describing, the ways in which the quote-unquote effective compute would increase.

Speaker 1 From looking at it right now, it looks like, yeah, you might get two or three doublings of effective compute for this thing that we're calling software progress,

Speaker 1 which is

Speaker 1 which people get by asking, well, how much less compute can you use now to achieve the same benchmark as you achieved before?

Speaker 1 There are reasons to not fully identify this with like software progress, as you might naively think of it, because some of it can be enabled by the others.

Speaker 1 So, like, when you have a lot of compute, you can do more experiments and find algorithms that work better.

Speaker 1 Sometimes, with the additional compute, you can get higher efficiency by running a bigger model, as we were talking about earlier.

Speaker 1 And so that means you're getting more for each GPU that you have because you made this larger expenditure. And that can look like a software improvement,

Speaker 1 because it's not a hardware improvement directly; the model is doing more with the same hardware. But you wouldn't have been able to achieve it without having a ton of GPUs to do the big training run.

Speaker 2 The feedback loop itself involves the AI that is the result of this greater effective compute helping you train better AI, right? Or use less effective compute in the future to train better AI.

Speaker 1 It can help on the hardware design.

Speaker 1 So, like, NVIDIA is a fabless chip design company. They don't make their own chips, they send files of instructions to TSMC, which then fabricates the chips in their own facilities.

Speaker 1 And so,

Speaker 1 the work of those

Speaker 1 10,000 plus people, if you could automate that and have the equivalent of a million people doing that work, then I think you would pretty quickly get the kind of improvements that can be achieved with the existing nodes that TSMC is operating on.

Speaker 1 You could get a lot of those chip design gains. Basically, like doing the job of improving chip design that those people are working on now, but get it done faster.
So that's one thing.

Speaker 1 I think that's less important for the intelligence explosion.

Speaker 1 The reason being that when you make an improvement to chip design, it only applies to the chips you make after that.

Speaker 1 If you make an improvement in AI software, it has the potential to be immediately applied to all of the GPUs that you already have.

Speaker 1 Yeah, and so the thing that I think is most disruptive and most important, the leading edge of the change from AI automation of the inputs to AI, is on the software side.

Speaker 2 At what point would it get to the point where the AIs are helping develop better software or better models for future AIs?

Speaker 2 Some people claim today, for example, that programmers at OpenAI are using Copilot to write programs now. So in some sense, you're already having that sort of feedback loop.

Speaker 2 I'm a little skeptical of that as a mechanism.

Speaker 2 At what point would it be the case that the AI is contributing significantly in the sense that it would almost be the equivalent of having additional researchers to AI progress and software?

Speaker 1 The quantitative magnitude of the help is absolutely central. So, there are plenty of companies that make some product that very slightly boosts productivity.

Speaker 1 So, when Xerox makes fax machines, it maybe increases people's productivity in office work by 0.1% or something. You're not going to have explosive growth out of that because, okay, now 0.1% more

Speaker 1 effective R&D at Xerox and at any customers buying the machines.

Speaker 1 Not that important. So I think

Speaker 1 the thing to look for

Speaker 1 is

Speaker 1 when is it the case that the contributions from AI are starting to become as large as or larger than the contributions from humans?

Speaker 1 So like when this is boosting their effective productivity by 50 or 100%. And if you then go from

Speaker 1 eight months doubling time, say, for effective compute from software innovations, things like inventing the transformer or discovering chinchilla scaling and doing your training runs more optimally or creating flash attention.

Speaker 1 Yeah, if you move that from, say, eight months to four months,

Speaker 1 and then the next time you apply that, it significantly increases the boost you're getting from the AI. So now maybe instead of giving a 50% or 100% productivity boost, now it's more like a 200%.
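
An illustrative sketch, not a forecast, of the accelerating-doublings dynamic being described. The eight-month starting point is the figure from the conversation; the assumption that each completed doubling roughly doubles the AI productivity boost is a toy simplification.

```python
months_per_doubling = 8.0  # starting doubling time for effective compute from software
ai_boost = 1.0             # AI contribution roughly equal to the human workforce (100% boost)

elapsed = 0.0
for step in range(4):
    # Effective workforce = humans * (1 + boost), so the doubling time shrinks accordingly.
    effective_doubling = months_per_doubling / (1 + ai_boost)
    elapsed += effective_doubling
    print(f"doubling {step + 1}: {effective_doubling:.1f} months (cumulative {elapsed:.1f})")
    ai_boost *= 2  # toy assumption: each doubling of capability doubles the boost
```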

Speaker 1 And so, it doesn't have to have been able to automate everything involved in the process of AI research. It can be that it's automated a bunch of things, and then those are being done in extreme profusion.

Speaker 1 Because, I think, anything that AI can do, you have it done much more often because it's so cheap.

Speaker 1 And so, it's not a threshold of this is human-level AI. It can do everything a human can do with no weaknesses in any area.
It's that even with its weaknesses,

Speaker 1 it's able to bump up the performance so that

Speaker 1 instead of getting like the results we would have with, say, the 10,000 people working on finding these innovations, we get the results that we would have if we had twice as many of those people with the same kind of skill distribution.

Speaker 1 And so that's, like, a demanding challenge.

Speaker 1 It's like you need quite a lot of capability for that, but it's also important that it's significantly less than this is a system where there's no way you can point at it and say, in any respect, it is weaker than a human.

Speaker 1 A system that was just as good as a human in every respect, but also had all of the advantages of an AI, that is just way beyond this point. Like, if you consider that

Speaker 1 the output of our existing fabs can make tens of millions of advanced GPUs per year.

Speaker 1 Those GPUs, if they were running sort of AI software that was as efficient as humans, as sample efficient, and doesn't have any major weaknesses, well, they can work four times as long,

Speaker 1 you know, the 168-hour work week. They can have much more education than any human.
So it's, you know, like the human, you know, you got a PhD, you know, it's like,

Speaker 1 wow, it's like 20 years of education, maybe longer if

Speaker 1 they take a slow route on the PhD. It's just normal for us to train large models by eat the internet, eat all the published books ever,

Speaker 1 read everything on GitHub and get good at predicting it.

Speaker 1 So like the level of education vastly beyond any human, the degree to which the models are focused on task

Speaker 1 is higher than all but like the most motivated humans when they're really, really gunning for it.

Speaker 1 So you combine the things, tens of millions of GPUs, each GPU

Speaker 1 is doing the work of the very best humans in the world.

Speaker 1 And like the most capable humans in the world can command salaries that are a lot higher than the average, and particularly in a field like STEM or

Speaker 1 narrowly AI.

Speaker 1 There's no human in the world who has a thousand years of experience with TensorFlow, let alone the new AI technologies that were invented the year before. But if they were around,

Speaker 1 yeah, they'd be paid millions of dollars a year.

Speaker 1 And so when you consider this, okay, tens of millions of GPUs, each is doing the work of maybe 40, maybe more

Speaker 1 of these kind of existing workers. This is like going from a workforce of tens of thousands to hundreds of millions.

Speaker 1 You immediately make all kinds of discoveries then. You immediately develop all sorts of tremendous technologies.
So human level AI is deep, deep into an intelligence explosion.

Speaker 1 The intelligence explosion has to start with something weaker than that.

Speaker 2 Yep, yep, yep.

Speaker 2 Yeah, what is the thing it starts with?

Speaker 2 And how close are we to that?

Speaker 2 Because if you think of a researcher at OpenAI or something, you know, these are

Speaker 2 To be a researcher is not just completing the hello world

Speaker 2 prompt that Copilot does, right? It's like you had to choose a new idea. You had to figure out the right way to approach it.

Speaker 2 You perhaps have to manage the people who are also working with you on that problem. You know, it's an incredibly complicated portfolio of skills, rather than just a single skill.

Speaker 2 So, yeah,

Speaker 2 what is the point at which that feedback loop starts where you can even

Speaker 2 you're not just doing the 0.5% increase in productivity that a sort of AI tool might do, but is actually the equivalent of a researcher or close to it? Like, what is that point?

Speaker 1 So, I think maybe a way to look at it is to give some illustrative examples of the kinds of capabilities that you might see. And so, because

Speaker 1 these systems have to be a lot weaker than this sort of human-level things, what we'll have is intense application of the ways in which AIs have advantages,

Speaker 1 partly offsetting their weaknesses. And so, AIs are cheap.
We can call a lot of them to do many small problems.

Speaker 1 And so you'll have situations where you have dumber AIs that are deployed thousands of times to equal, say, one human worker.

Speaker 1 And they'll be doing things like

Speaker 1 these voting algorithms where, with an LLM, you generate a bunch of different responses and take a majority vote among them, which improves performance some.

Speaker 1 You'll have things like the AlphaGo kind of approach, where you use the neural net to do search, and you go deeper with the search by plowing in more compute, which helps to offset the inefficiency and weaknesses of the model on its own.

Speaker 1 You'll do things that would just be totally impractical

Speaker 1 for humans because of the sheer number of steps. And so, an example of that would be designing synthetic training data.

Speaker 1 So, humans do not learn by just going into the library and opening books at random pages.

Speaker 1 It's actually much, much more efficient to have things like schools and classes where they teach you things in an order that makes sense, that's focusing on the skills that are more valuable to learn.

Speaker 1 They give you tests and exams that are designed to try and elicit the skill they're actually trying to teach.

Speaker 1 And right now we don't bother with that because we can hoover up more data from the internet. We're getting towards the end of that.

Speaker 1 But yeah, as the AIs get more sophisticated, they'll be better able to tell

Speaker 1 what is a useful kind of skill to practice and to generate that. And we've done that in other areas.
So AlphaGo,

Speaker 1 the original version of AlphaGo was booted up with data from human Go play

Speaker 1 and then improved with reinforcement learning and Monte Carlo tree search.

Speaker 1 But then

Speaker 1 AlphaZero, with a somewhat more sophisticated model, benefited from some other improvements, but was able to go from scratch.

Speaker 1 And it generated its own data through self-play.

Speaker 1 So

Speaker 1 getting data of a higher quality than the human data, because there are no human players that good available in the data set, and also a curriculum, so that at any given point, it was playing games against an opponent of equal skill itself.

Speaker 1 And so it was always in an area where it was easy to learn.

Speaker 1 If you're just always losing no matter what you do, or always winning no matter what you do, it's hard to distinguish which things are better and which are worse.

Speaker 1 And when we have somewhat more sophisticated AIs that can generate training data and tasks for themselves, for example, if the AI can generate a lot of unit tests and then can try and produce programs that pass those unit tests, then the interpreter is providing a training signal.

Speaker 1 And the AI can get good at figuring out what's the kind of programming problem that is hard for AIs right now that will develop more of the skills that I need

Speaker 1 and then do them. And now you're not going to have employees at OpenAI write like a billion programming problems.
That's just not going to happen.

Speaker 1 But you are going to have AIs given the task of producing those enormous number of programming challenges.
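
A hypothetical sketch of the self-generated training signal just described: an AI writes programming problems as unit tests, attempts solutions, and the interpreter, not a human, grades them. The `generate_tests`, `propose_solution`, and `fine_tune` calls stand in for model calls and are not real APIs.

```python
def passes(solution_code: str, test_code: str) -> bool:
    """Run a candidate solution against the generated tests (sandbox this in practice)."""
    namespace: dict = {}
    try:
        exec(solution_code, namespace)  # define the candidate function(s)
        exec(test_code, namespace)      # run the generated unit tests; asserts fail if wrong
        return True
    except Exception:
        return False


def self_training_round(model, n_problems: int) -> None:
    """One round of generating problems, solving them, and training on verified solutions."""
    verified = []
    for _ in range(n_problems):
        tests = model.generate_tests()           # AI invents a problem, expressed as unit tests
        attempt = model.propose_solution(tests)  # AI tries to solve its own problem
        if passes(attempt, tests):               # the interpreter supplies the reward signal
            verified.append((tests, attempt))
    model.fine_tune(verified)                    # train on solutions that actually pass
```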

Speaker 2 And LLMs themselves, you know, there's a paper out of Anthropic called Constitutional AI, or constitutional RL, where they basically have the program just talk to itself and say, is this response helpful?

Speaker 2 If not, how can I make this more helpful? And the response is improved. And then you train the model on the more helpful responses that it generates by talking to itself, so that it generates them natively.
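
A minimal sketch of the critique-and-revise loop being described; the actual Constitutional AI setup has more machinery, and `llm` here is a hypothetical text-completion call, not a real API.

```python
def improve_response(llm, prompt: str, n_rounds: int = 2) -> str:
    """Generate a response, then repeatedly self-critique and revise it."""
    response = llm(prompt)
    for _ in range(n_rounds):
        critique = llm(
            f"Is this response to '{prompt}' helpful? If not, how could it be more helpful?\n{response}"
        )
        response = llm(
            f"Rewrite the response to address the critique.\nCritique: {critique}\nResponse: {response}"
        )
    # The revised responses can then be used as fine-tuning targets for the model.
    return response
```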

Speaker 2 And you could imagine, you know, more sophisticated ways to do that or better ways to do that. Okay, so but then the question is,

Speaker 2 listen, you know, GPT-4 already costs like 50 million or 100 million or whatever it was.

Speaker 2 Even if we have greater effective compute from hardware increases and better models, it's hard to imagine how we could sustain like four or five more orders of magnitude greater size, effective size than GPT-4, unless we're dumping in like trillions of dollars, like the entire economies of big countries, into training the next version.

Speaker 2 So the question is, do we get something that can significantly help with AI progress before we run out of

Speaker 2 the sheer

Speaker 2 money and scale and compute that would be required to train it. Do you have a take on that?

Speaker 1 Well, first, I'd say remember that there are these three contributing trends. So the new H100s are significantly better than the A100s.

Speaker 1 And a lot of companies are actually waiting for their deliveries of H100s to do even bigger training runs,

Speaker 1 along with the work of hooking them up into clusters and engineering the thing. Yeah.
So all of those factors are contributing. And of course, mathematically,

Speaker 1 yeah, if you do four orders of magnitude more than 50 or 100 million, then you're getting to trillion dollar territory. And yeah, I think the way to look at it is

Speaker 1 at each step along the way, does it look like it makes sense to do the next step?

Speaker 1 And so from where we are right now, seeing the results with GPT-4 and ChatGPT, companies like Google and Microsoft and whatnot are pretty convinced that this is very valuable.

Speaker 1 You have like talk at Google and Microsoft with Bing that, well, it's like

Speaker 1 a billion-dollar matter to change market share in search by a percentage point.

Speaker 1 And so that can fund a lot. And

Speaker 1 on the far end, on the extreme, if you automate human labor, we have a hundred trillion dollar economy. Most of that economy is paid out in wages.
So like between $50 and $70 trillion

Speaker 1 per year. If you create AGI, it's going to automate all of that

Speaker 1 and keep increasing beyond that.

Speaker 1 So the value of the completed project is very much worth throwing our whole economy into it if you're going to get the good version, not the catastrophic destruction of the human race or

Speaker 1 some other disastrous outcome.

Speaker 1 And

Speaker 1 in between, it's a question of, well, the next step, how risky and uncertain is it? And how much growth in the revenue you can generate with it do you get?

Speaker 1 And so if we're moving up to a billion dollars, I think that's absolutely going to happen. These large tech companies have R&D budgets of tens of billions of dollars.

Speaker 1 And when you think about it, like in the relevant sense, like all the employees at Microsoft who are doing software engineering, that's like contributing to creating software objects.

Speaker 1 It's not weird to spend tens of billions of dollars on a product that would do so much.

Speaker 1 And I think it's becoming more clear that there is sort of market opportunity to fund the thing. Going up to $100 billion,

Speaker 1 that's like, okay, the existing R&D budgets spread over multiple years.

Speaker 1 But if you keep seeing that when you scale up the model, it substantially improves the performance, it opens up new applications.

Speaker 1 You're not just improving your search, but maybe it makes self-driving cars work.

Speaker 1 You replace bulk software engineering jobs, or if not replace them, amplify productivity.

Speaker 1 In this kind of dynamic, you actually probably want to employ all the software engineers you can get as long as they're able to make any contribution because the returns of improving stuff in AI itself get so high.

Speaker 1 But yeah, so I think that can go up to 100 billion.

Speaker 1 And at 100 billion,

Speaker 1 you're using like a significant fraction of our existing fab capacity. Like right now, the revenue of NVIDIA is like 25 billion.

Speaker 1 The revenue of TSMC, I believe, is like over 50 billion. Last I checked, in 2021, NVIDIA was maybe 7.5%,

Speaker 1 less than 10%

Speaker 1 of TSMC revenue.

Speaker 1 So there's a lot of room, and most of that was not AI chips.

Speaker 1 They have a large gaming segment. There are data center GPUs that are used for video and the like.

Speaker 1 So there's room for

Speaker 1 more than an order of magnitude increase by redirecting existing fabs to produce more AI chips and just actually using the AI chips that these companies have in their cloud for the big training runs.

Speaker 1 And so I think that's enough to go to the $10 billion and then, combined with stuff like the H100, to go up to the $100 billion.

Speaker 2 Just to emphasize for the audience the initial point about revenue you made, if it cost OpenAI $100 million to train GPT-4 and it generates $500 million in revenue,

Speaker 2 you pay back your expenses with $100 million, you have $400 million for your next training run, then you train

Speaker 2 4.5, you know, you get, let's say, $4 billion of revenue out of that.

Speaker 2 That's where the feedback loop of revenue comes from, where you're automating tasks and therefore you're making money. You can use that money to automate more tasks.
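
A toy illustration of the reinvestment loop sketched here; the numbers are the hypothetical ones from the conversation, not actual figures, and the fixed 5x revenue multiple is an assumption.

```python
budget = 100e6        # $100M spent training the current model
revenue_multiple = 5  # assume each model returns ~5x its training cost in revenue

for generation in range(1, 5):
    revenue = budget * revenue_multiple
    budget = revenue - budget  # pay back the training cost, reinvest the remainder
    print(f"generation {generation}: next training budget ~ ${budget / 1e9:.1f}B")
# Budgets compound under these toy assumptions: ~$0.4B, ~$1.6B, ~$6.4B, ~$25.6B.
```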

Speaker 2 On the ability to redirect the fab production towards AI chips:

Speaker 2 So

Speaker 2 then the TLDR is, you want $100 billion worth of compute. I mean, fabs take what, like a decade or so to build.

Speaker 2 So given the ones we have now and the ones that are going to come online in the next decade, is there enough to sustain $100 billion of GPU compute if you wanted to spend that on a training run?

Speaker 1 Yes, you could definitely make the $100 billion one. As you go up to a trillion-dollar run and larger,

Speaker 1 it's going to involve more fab construction. And yeah, fabs can take a long time to build.

Speaker 1 On the other hand, if in fact you're getting very high revenue from the AI systems, and you're actually bottlenecked on the construction of these fabs,

Speaker 1 then their price could skyrocket. And that would lead to measures we've never seen before

Speaker 1 to expand and accelerate fab production. Like, if you consider, at the limit, you're getting models that approach human-like capability.

Speaker 1 If you imagine things that are getting close to like brain-like efficiencies plus AI advantages, we were talking before about, well, a GPU

Speaker 1 that is supporting an AI, really, it's a cluster of GPUs supporting AIs that do things in parallel, data parallelism.

Speaker 1 But if that can work four times as much as a human, a highly skilled, motivated, focused human with levels of education that have never been seen in the human population.

Speaker 1 And so if like a typical software engineer can earn hundreds of thousands of dollars, the world's best software engineers can earn millions of dollars today, and maybe more in a world where there's so much demand for AI.

Speaker 1 And then times four for working all the time. Well, I mean, if you have, if you generate like close to $10 million

Speaker 1 a year

Speaker 1 out of the future version of H100, they cost tens of thousands of dollars with a huge profit margin now. The profit margin

Speaker 1 could be reduced with like large production.

Speaker 1 That is a big difference. That chip pays for itself almost instantly.

Speaker 1 And so you could support paying 10 times as much to have these fabs constructed more rapidly.

Speaker 1 You could have, if AI is starting to be able to contribute, you could have AI contributing more of the skilled technical work that makes it hard for, say, NVIDIA to suddenly find thousands upon thousands of top quality engineering hires if AI can provide that.

Speaker 1 Now, if AI hasn't reached that level of performance, then this is how you can have things stall out. And like a world where AI progress stalls out is one where you go to the hundred billion and then

Speaker 1 over succeeding years, trillion-dollar things, software progress

Speaker 1 turns out to stall. You lose the gains that you are getting from moving researchers from other fields.

Speaker 1 Lots of physicists and people from other areas of computer science have been going to AI, but you sort of tap out those resources as AI becomes a larger proportion of the research field.

Speaker 1 And, like, okay, you've put in all of these inputs, but they just haven't yielded AGI yet.

Speaker 1 I think that set of inputs probably would yield the kind of AI capabilities needed for intelligence explosion. But if it doesn't,

Speaker 1 after we've exhausted this current scale-up of like increasing the share of our economy that is trying to make AI, if that's not enough, then after that, you have to wait for the slow grind of things like general economic growth, population growth, and such, and so things slow.

Speaker 1 And that results in my credences on this kind of advanced AI happening being relatively concentrated over the next 10 years compared to the rest of the century.

Speaker 1 Because we just can't keep going with this rapid redirection of resources into AI.

Speaker 1 That's a one-time thing.

Speaker 2 If the current scale-up works, it's going to happen. We're going to get to AGI really fast, like within the next 10 years or something.

Speaker 2 If the current scale up doesn't work, all we're left with is just like our economy growing like 2% a year. So we have like 2% a year more resources to spend on AI.

Speaker 2 And at that scale, you're talking about decades before you can,

Speaker 2 just through sheer brute force, train the $10 trillion model or something. Let's talk about why you have your thesis that the current scale-up would work.

Speaker 2 What is the evidence from AI itself, or maybe from primate evolution and the evolution of other animals? Just give me the whole

Speaker 2 confluence of reasons that make you.

Speaker 1 I think maybe the best way to look at that might be to consider when I first became interested in this area, so in the 2000s, which was before the deep learning revolution, how would I think about timelines?

Speaker 1 How did I think about timelines? And then how have I updated based on what has been happening with deep learning? And so back then, I would have said,

Speaker 1 we know the brain is a physical object, an information processing device. It works.

Speaker 1 It's possible. And not only is it possible, it was created by evolution on Earth.
And so that gives us something of an upper bound in that this kind of brute force. was sufficient.

Speaker 1 There are some complexities with, like, well, what if it was a freak accident that, you know, didn't happen on all of the other planets, and that adds some doubt.

Speaker 1 I have a paper with Nick Bostrom on this. I think basically that's not that important an issue.

Speaker 1 There's convergent evolution, like octopi are also quite sophisticated.

Speaker 1 If a special event was at the level of forming cells at all or forming brains at all, we get to skip that because we're choosing to build computers and we already exist. We have that advantage.

Speaker 1 So say, evolution gives something of an upper bound, really intensive, massive brute force search.

Speaker 1 And things like evolutionary algorithms can produce intelligence.

Speaker 2 Doesn't the fact that Octopi and I guess other mammals got to the point of being like pretty intelligent, but not human-level intelligent, is that some evidence that there's a hard step between a cephalopod and a human?

Speaker 1 Yeah, so that would be a place to look.

Speaker 1 It doesn't seem particularly compelling. One source of evidence on that is work by

Speaker 1 Herculano-Houzel.

Speaker 1 I hope I haven't mispronounced her name, but she's a neuroscientist who has dissolved the brains of many creatures. And by counting

Speaker 1 the nuclei, she's able to determine how many neurons are

Speaker 1 present in different species and find a lot of interesting trends and scaling laws. And she has a paper

Speaker 1 discussing the human brain as a scaled-up primate brain.

Speaker 1 And across like a wide variety of animals and mammals in particular, there are certain characteristic changes in the relative number of neurons, size of different brain regions, how things scale up.

Speaker 1 There's a lot of

Speaker 1 yeah, there's a lot of structural

Speaker 1 structural similarity there. And you can explain a lot of what is different about us with a pretty brute force story, which is that

Speaker 1 you expend resources on having a bigger brain, keeping it in good order, giving it time to learn. So, we have an unusually long childhood, an unusually long neonatal period.

Speaker 1 We spend more compute by having a larger brain than other animals,

Speaker 1 more than three times as large as chimpanzees. And then we have a longer childhood than chimpanzees and much more than many, many other creatures.

Speaker 1 So, we're spending more compute in a way that's analogous to like having a bigger model and having more training time with it.

Speaker 1 And given that we see

Speaker 1 with our AI models, these sorts of large, consistent benefits from increasing compute spent in those ways, and with qualitatively new capabilities showing up over and over again, particularly in areas that AI skeptics call out.

Speaker 1 In my experience, like over the last 15 years, the things that people call out as like, ah, but AI can't do that. And it's because of a fundamental limitation.
We've gone through a lot of them.

Speaker 1 You know, there were Winograd schemas, catastrophic forgetting, quite a number.

Speaker 1 And yeah, they have repeatedly gone away through scaling.

Speaker 1 And so

Speaker 1 there's a picture that we're seeing supported from biology and from our experience with AI, where you can explain like,

Speaker 1 yeah, in general, there are trade-offs where the extra fitness you get from a brain is not worth it.

Speaker 1 And so creatures wind up mostly with small brains because they can save that biological energy and that time to reproduce for digestion and so on.

Speaker 1 And humans, we actually seem to have wound up in a niche

Speaker 1 that is self-reinforcing, where we greatly increase the returns to having large brains. And language and technology are the sort of obvious candidates.

Speaker 1 When you have humans around you who know a lot of things and they can teach you, and compared to almost any other species, we have vastly more instruction from parents and from the society around the young,

Speaker 1 then you're getting way more from your brain, because per minute

Speaker 1 you can learn a lot more useful skills, and then you can provide the energy you need to feed that brain by hunting and gathering, by having fire that makes digestion easier.

Speaker 1 And basically, as this process goes on, it's increasing the marginal gain in reproductive fitness you get from allocating more resources along a bunch of dimensions towards cognitive ability.

Speaker 1 And so

Speaker 1 that's bigger brains, longer childhood, having our attention be more on learning. So humans play a lot and we keep playing as adults, which is a very weird thing compared to other animals.

Speaker 1 We're more motivated to copy other humans around us, even more than the other primates.

Speaker 1 And so these are sort of motivational changes that keep us using more of our attention and effort on learning, which pays off more when you have a bigger brain and a longer lifespan in which to learn.

Speaker 1 Many creatures are subject to lots of predation or disease.

Speaker 1 And so if you try, you know, you're a mayfly or a mouse, if you try and invest in like a giant brain and a very long childhood, you're quite likely to be killed by some predator or some disease before you're able to actually use it.

Speaker 1 And so that means you actually have exponentially increasing costs in a given niche. So, if I have a 50% chance of dying every few months as

Speaker 1 a little mammal or a little lizard or something, that means the cost of going from three months to 30 months of learning and childhood development is not 10 times the loss.

Speaker 1 Now it's 2 to the negative 10. So

Speaker 1 a factor of 1,024 reduction in the benefit I get

Speaker 1 from what I ultimately learn, because 99.9%

Speaker 1 of the animals will have been killed before that point. We're in a niche where we're like a large, long-lived animal with language and technology, so where we can learn a lot from our groups.

Speaker 1 And that means it pays off to really

Speaker 1 just expand our investment on these multiple fronts in intelligence.

Speaker 2 That's so interesting.

Speaker 2 Just for the audience, the calculation about like two to the whatever months is just like you have a half chance of dying this month, a half chance of dying next month. You multiply those together.
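
The discounting arithmetic just recapped, as a quick sketch: with roughly a 50% chance of dying in each few-month period, stretching development from about 3 months (one period) to 30 months (ten periods) divides the expected payoff of learning by about a thousand.

```python
p_survive_period = 0.5  # ~50% chance of surviving each few-month period
periods = 10            # ~30 months of development at ~3 months per period

survival = p_survive_period ** periods
print(survival)      # 2**-10 ~ 0.001, i.e. ~99.9% are killed before the investment pays off
print(1 / survival)  # ~1024-fold reduction in the expected benefit of the longer childhood
```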

Speaker 2 Okay, there's other species, though, that do live in flocks or packs, where you could imagine, I mean, they do have like a smaller version of that development, with cubs that play with each other.

Speaker 2 Why isn't this a hill on which they could have climbed to human-level intelligence themselves? If it's something like language or technology, humans were getting smarter before we got language.

Speaker 2 I mean, obviously, we had to get smarter to get language, right? We couldn't just get language without becoming smarter. So, yeah, where did it?

Speaker 2 It seems like there should be other species that should have beginnings of this sort of cognitive revolution, especially given how valuable it is, given, listen, we've dominated the world.

Speaker 2 You would think there would be selective pressure for it.

Speaker 1 Evolution doesn't have foresight.

Speaker 1 The thing in this generation that gets more surviving offspring and grandchildren, that's the thing that becomes more common.

Speaker 1 Evolution doesn't look ahead and say, oh, in a million years, you'll have a lot of descendants. It's what survives and reproduces now.

Speaker 1 And so, in fact, there are correlations where social animals do on average have larger brains.

Speaker 1 And part of that is probably that the additional social applications of brains, like keeping track of which of your group members have helped you before so that you can reciprocate.

Speaker 1 You scratch my back, I'll scratch yours, remembering who's dangerous within the group, that sort of thing, is an additional application of intelligence.

Speaker 1 And so there's some correlation there. But what it seems like is that,

Speaker 1 yeah, in most of these cases,

Speaker 1 it's enough to invest more,

Speaker 1 but not invest to the point where a mind can easily develop language and technology and pass it on.

Speaker 1 And so you see bits of tool use in some other primates, who have an advantage compared to, say, the whales, who have quite large brains, partly because they are so large themselves, and they

Speaker 1 have some other things going for them, but they don't have hands, which reduces a bunch of ways in which brains, and investments in the functioning of that brain, can pay off.

Speaker 1 But yeah, so primates will use sticks to extract termites. Capuchin monkeys will open clams by smashing them with a rock.

Speaker 1 So there's bits of tool use, but what they don't have is the ability to sustain culture.

Speaker 1 A particular primate will maybe discover one of these tactics and maybe it'll be copied by their immediate group.

Speaker 1 But they're not holding onto it that well. They're like, well, when they see the other animal do it, they can copy it in that situation.
They don't actively teach each other.

Speaker 1 Their population locally is quite small. So it's easy to forget things, easy to lose information.
And in fact, they remain technologically stagnant for hundreds of thousands of years.

Speaker 1 And we can actually look at some human situations. So there's an old paper, I believe, by the economist Michael Kremer.

Speaker 1 It talks about technological growth in the different continents for human societies. And so you have Eurasia, which is the largest integrated, connected area.

Speaker 1 Africa is partly connected to it, but the Sahara Desert restricts the flow of information and technology and such.

Speaker 1 And then you have the Americas, which, after the colonization from the land bridge, were largely separated

Speaker 1 and are smaller than Eurasia, then Australia, and then you had like smaller island situations like Tasmania.

Speaker 1 And so technological progress seemed to have been faster, the larger the connected group of people.

Speaker 1 And in the smallest groups, so like in Tasmania, you had a relatively small population, and they actually lost technology.

Speaker 1 So things like they lost some like fishing techniques. And if you have a small population and you have some limited number of people who know a skill and

Speaker 1 they happen to die, or it happened, there's like, you know, some change in circumstances that causes people not to practice or pass on that thing, and then you lose it.

Speaker 1 And if you have few people, you're doing less innovation.

Speaker 1 The rate at which you lose technologies to some kind of local disturbance and the rate at which you create new technologies can wind up in balance.

Speaker 1 And the great change of hominids and humanity is that we wound up in this situation where we were accumulating faster than we were losing.

Speaker 1 And as we accumulated, those technologies allowed us to expand our population. They created additional demand for intelligence so that our brains became three times as large.
Speaker 2 As chimpanzees?

Speaker 1 As chimpanzees, yeah. And our ancestors who had a similar brain size.

Speaker 2 Okay.

Speaker 2 And the crucial point, I guess, in relevance to AI is that the selective pressures against intelligence in other animals are not acting on these neural networks, because, you know, they're not going to get eaten by a predator if they spend too much time becoming more intelligent.

Speaker 2 We're explicitly training them to become more intelligent.

Speaker 2 So we have good first principles reason to think that if it was scaling that made our minds this powerful, and if the things that prevented other animals from scaling are not impinging on these neural networks, that these things should just continue to become very smart.

Speaker 1 Yeah, we are growing them in a technological culture where there are jobs like software engineer

Speaker 1 that depend much more on sort of cognitive output and less on things like metabolic resources devoted to the immune system or to like building big muscles to throw spears.

Speaker 2 This is kind of a side note, but I'm just kind of interested. I think you have referenced at some point Chinchilla scaling.

Speaker 2 For the audience, this is a paper from DeepMind, which describes: if you have a model of a certain size, what is the optimum amount of data that it should be trained on?

Speaker 2 So you can imagine bigger models.

Speaker 2 You can use more data to train them. And in this way, you can figure out where you should spend your compute.

Speaker 2 Should you spend it on making the model bigger or should you spend it on training it for longer?

Speaker 2 I'm curious if, in the case of different animals, in some sense their model size is like how big their brain is, and their training data size is like how long they're cubs or infants or toddlers before they're full adults.

Speaker 2 Is there some sort of scaling law like that?

Speaker 1 Yeah, I mean, so the Chinchilla scaling is interesting because we were talking earlier about the cost function for having a longer childhood.

Speaker 1 And so where it's like exponentially increasing in the amount of training compute you have when you have exogenous forces that can kill you.

Speaker 1 Whereas when we do big training runs, the cost of throwing in more GPUs is almost linear. And it's much better to be linear than exponentially decaying.

Speaker 2 Oh, that's a really good point.

Speaker 1 As you expand resources. And so Chinchilla scaling would suggest that, like, yeah,

Speaker 1 for a brain of sort of human size, it would be optimal to have many millions of years of education. But obviously, that's impractical because of exogenous mortality for humans.

Speaker 1 And so there's a fairly compelling argument that relative to the situation where we would train AI, that animals are systematically way undertrained. That's true.

Speaker 1 And now they're more efficient than our models. We still have room to improve our algorithms to catch up with the efficiency of brains, but

Speaker 1 They are laboring under that disadvantage.
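
For reference, a rough sketch of the Chinchilla compute-optimal rule being discussed above, using the commonly cited approximations that training compute is about 6·N·D FLOPs and that the optimal data size is around 20 tokens per parameter; both figures are approximate rules of thumb, not exact values from the paper.

```python
def chinchilla_optimal(train_flops: float) -> tuple[float, float]:
    """Approximate compute-optimal parameter and token counts for a given FLOP budget."""
    # Solve 6 * N * (20 * N) = C for parameters N, then tokens D = 20 * N.
    n_params = (train_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens


n, d = chinchilla_optimal(1e24)  # e.g. a 1e24 FLOP training run
print(f"~{n / 1e9:.0f}B parameters trained on ~{d / 1e12:.1f}T tokens")
```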

Speaker 2 Yeah. That is so interesting.
Okay, so I guess another question you could have is

Speaker 2 humans got started on this evolutionary hill climbing route where we're getting more intelligent and it has more benefits for us. Why didn't we go all the way on that route?

Speaker 2 If intelligence is so powerful, why aren't all humans as smart as we know humans can be, right? At least that smart.

Speaker 2 If intelligence is so powerful, why hasn't there been stronger selective pressure?

Speaker 2 I understand like, oh, listen, hip size, you can't give birth to a really big-headed baby or whatever, but you would think, like, evolution would figure out some way to offset that if intelligence has such big payoffs and is so useful.

Speaker 1 Yeah, I think if you actually look at it quantitatively, that's not true.

Speaker 1 And even in sort of recent history, there has been, it looks like a pretty close balance between the costs and the benefits of having more cognitive abilities. And so, you say, like,

Speaker 1 you know, who needs to worry about, like, you know, the metabolic costs? Well, humans put like order 20%

Speaker 1 of our metabolic energy into the brain and it's higher for like young children.

Speaker 1 So 20%

Speaker 1 of the energy, and then, you know, there's like breathing and digestion and the immune system. And so

Speaker 1 for most of history, people have been dying left and right. Like a very large proportion of people will die of infectious disease.

Speaker 1 And if you put more resources into your immune system, you survive. So it's like life or death pretty directly via that mechanism.

Speaker 1 And then this is related also:

Speaker 1 people die more of disease during famine. And so there's boom or bust.

Speaker 1 And so if you have 20% less metabolic requirements, or, as a child, where you have a lot more, I mean, it's like 40 or 50% less metabolic requirements, you're much more likely to survive that famine.

Speaker 1 So

Speaker 1 these are pretty big.

Speaker 1 And then there's a trade-off about just clearing mutational load. So every generation, new mutations and errors happen in the process of reproduction.
And so, like we know, there are many

Speaker 1 genetic abnormalities that occur through new mutations each generation. And in fact,

Speaker 1 Down syndrome is about the only chromosomal abnormality of that sort that you can survive. Almost all the others just kill the embryo.

Speaker 1 And so we never see them.

Speaker 1 But, like, Down syndrome occurs a lot. And there are many other lethal mutations.

Speaker 1 And there's, as you go to less damaging ones, there are enormous numbers of less damaging mutations that are degrading every system in the body. And so evolution, each generation, has to

Speaker 1 pull away at some of this mutational load. And the priority with which that mutational load is pulled out scales in proportion to how much the traits it's affecting impact fitness.
So

Speaker 1 you get new mutations that impact your resistance to malaria.

Speaker 1 You get new mutations that damage brain function.

Speaker 1 And then those mutations are purged each generation.

Speaker 1 If malaria makes a bigger difference in mortality than the incremental effectiveness as a hunter-gatherer you get from being slightly more intelligent, then you'll purge that mutational load first.

Speaker 1 And similarly, if there's like, you know, humans have been vigorously adapting to new circumstances. So since agriculture, people have been developing things like the ability to,

Speaker 1 you know, produce amylase to digest breads, the ability to digest milk.

Speaker 1 And if you're evolving for all of these things, and some of the variants that give an advantage there incidentally carry along, nearby on the genome, some negative effect on another trait, then that other trait can be damaged.

Speaker 1 So it really matters how important to survival and reproduction cognitive abilities were compared to everything else the organism has to do.

Speaker 1 And that, in particular, means surviving feast and famine, having the physical abilities to do hunting and gathering. And even if you're very good at planning your hunting,

Speaker 1 being able to throw a spear harder can be a big difference. And that needs energy to build those muscles and then to sustain them.

Speaker 1 And so, given all these

Speaker 1 factors,

Speaker 1 it's like, yeah, it's not a slam dunk

Speaker 1 to invest at the margin. And like today,

Speaker 1 having bigger brains, for example, it's associated with like greater cognitive ability, but it's like, it's modest.

Speaker 1 Large-scale pre-registered studies with MRI data,

Speaker 1 it's like, you know, in a range, maybe like a correlation of 0.25, 0.3.

Speaker 1 And the standard deviation of brain size is like 10 percent. So if you

Speaker 1 double the size of the brain, so the existing brain costs like 20 percent of metabolic energy and you go up to 40 percent, okay, that's like eight standard deviations of brain size. If the correlation is,

Speaker 1 say, 0.25, then, yeah, you can get a gain from that.

Speaker 1 Eight standard deviations of brain size is about two standard deviations of cognitive ability. And in our modern society, where cognitive ability is very rewarded and, you know, finishing school, becoming an engineer or a doctor or whatever can pay off a lot financially, still the average observed return in income is like a one or two percent proportional increase.
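
A back-of-the-envelope version of the cost-benefit calculation just described, using the rough figures mentioned above (all of them approximations):

```python
# Back-of-the-envelope version of the calculation above, using the rough figures
# from the conversation (all approximations).

sd_of_brain_size_gain = 8        # doubling the brain ~ "eight standard deviations" of size
correlation = 0.25               # observed brain size vs. cognitive ability correlation
cognitive_gain_sd = correlation * sd_of_brain_size_gain

extra_metabolic_share = 0.40 - 0.20   # brain goes from ~20% to ~40% of the energy budget

print(f"Expected cognitive gain: ~{cognitive_gain_sd:.0f} standard deviations")
print(f"Extra metabolic cost: {extra_metabolic_share:.0%} of total energy use")
# A ~2 SD cognitive gain purchased with ~20 percentage points of the body's entire
# energy budget, which is why this was never a slam dunk for evolution.
```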

Speaker 1 There are larger effects at the tail. There's more effect in professions like STEM.
But on the whole, it's not like,

Speaker 1 you know, if it was like a 5% increase or a 10% increase, then

Speaker 1 you could tell a story where, yeah, this is hugely increasing the amount of food you could have, you could support more children, but it's like, it's a modest effect, and the metabolic cost will be large, and then throw in

Speaker 1 these other aspects. And I think you can tell that story.
And also, we can just see there was not

Speaker 1 very strong, rapid, directional selection on this trait, which there would be if, like,

Speaker 2 you know, you could, by solving a math puzzle, defeat malaria, then there would be more evolutionary pressure. That is so interesting. Not to mention, of course, that if you had like 2x the brain size, then without a C-section you or your mother or both would die. This is a question I've actually been curious about for over a year, and I've briefly tried to look up an answer. I know this is off topic, and I apologize to the audience, but I was super interested, and that was the most comprehensive and interesting answer I could have hoped for.

Speaker 2 Okay, so yeah, we have a good explanation, good first principles, evolutionary reason for thinking that intelligence scaling up to humans is

Speaker 2 not

Speaker 2 implausible just by throwing more scale at it.

Speaker 1 I would also add, this was something that would have mattered to me more in the 2000s. We also have the brain right here with us

Speaker 1 available for neuroscience to reverse engineer its properties.

Speaker 1 And so in the 2000s, when I said, yeah, I expect this by middle of the century-ish, that was a backstop if we found it absurdly difficult to get to the algorithms.

Speaker 1 And then we would learn from neuroscience. But in

Speaker 1 the actual history, it's really not like that. We develop things in AI, and then often we can say, oh, yeah, this is sort of like this thing in neuroscience, or maybe this is a good explanation.

Speaker 1 But it's not as though neuroscience is driving AI progress. It turns out not to be that necessary.

Speaker 2 Similar to, I guess, you know, how

Speaker 2 planes were inspired by the existence proof of birds, but

Speaker 2 jet engines don't flap. All right, so yeah, scaling, good reason to think scaling might work.

Speaker 2 So we've spent $100 billion and we have something that is like human level or can help significantly with AI research.

Speaker 1 I mean, that might be on the earlier end,

Speaker 1 but I mean, I definitely would not rule that out given the rates of change we've seen with the last few scale-ups.

Speaker 2 All right. So at this point,

Speaker 2 somebody might be skeptical. Okay, like, listen, we already have a bunch of human researchers, right? Like, the incremental researcher, how powerful is that?

Speaker 2 And then you might say, well, no, this is like thousands of researchers.

Speaker 2 I don't know how to express the skepticism exactly, but it's skepticism of the general claim that scaling up the number of people working on a problem leads

Speaker 2 to rapid progress on that problem.

Speaker 2 Somebody might think, okay, listen, with humans, the reason the population working on a problem is such a good proxy for progress on the problem is that there's already so much variation in ability within that population.

Speaker 2 When you say there's like a million people working on a problem, you know, there are

Speaker 2 hundreds of super geniuses working on it, thousands of people who are like very smart working on it. Whereas with an AI, all the copies are like the same level of intelligence.

Speaker 2 And if it's not super genius intelligence,

Speaker 2 the total quantity might not matter as much.

Speaker 1 Yeah, I'm not sure what your model is here. So is this a model that

Speaker 1 the diminishing returns kick in suddenly, that there's a cliff right where we are? Because there were results in the past from throwing more people at problems.

Speaker 1 And I mean, this has been useful in historical prediction. There's this idea of experience curves and Wright's law,

Speaker 1 basically measuring cumulative production in a field, which is also going to be a measure of the scale of effort and investment.

Speaker 1 And people have used this correctly to argue that renewable energy technology like solar would be falling rapidly in price because it was going from a low base of very small production runs, not much investment in doing it efficiently.

Speaker 1 And yeah,

Speaker 1 climate advocates correctly called this out.

Speaker 1 People like David Roberts and

Speaker 1 the futurist Ramez Naam actually have some

Speaker 1 interesting writing on this, and they correctly called out that there would be a really drastic fall in the prices of solar and batteries because of the increasing investment going into them.

Speaker 1 The Human Genome Project would be another. So I'd say there's real evidence.
These observed correlations,

Speaker 1 from the ideas-getting-harder-to-find literature, have held over a fair

Speaker 1 range of data and over quite a lot of time.
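
A minimal sketch of the Wright's-law relationship being invoked here, assuming a ~20% cost decline per doubling of cumulative production, a commonly cited ballpark for solar PV used purely as an illustration:

```python
import math

# Wright's law / experience curve: cost falls by a fixed fraction per doubling of
# cumulative production. The 20% learning rate is an illustrative assumption.

def wrights_law_cost(initial_cost, cumulative_units, initial_units, learning_rate=0.20):
    """Unit cost after growing cumulative production from initial_units to cumulative_units."""
    doublings = math.log2(cumulative_units / initial_units)
    return initial_cost * (1 - learning_rate) ** doublings

# Example: a 1000x increase in cumulative production (~10 doublings)
print(wrights_law_cost(initial_cost=100.0, cumulative_units=1000.0, initial_units=1.0))
# ~10.8, i.e. roughly a 10x cost reduction from a 1000x scale-up in cumulative output
```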

Speaker 1 So, I'm wondering

Speaker 1 what's

Speaker 1 the nature of the deviation you're thinking of?

Speaker 2 That

Speaker 2 we're talking about

Speaker 2 maybe this is like a good way to describe what happens when more humans enter a field.

Speaker 2 But does it even make sense to say like a greater population of AIs is doing AI research if there's like more GPUs running a copy of GPT-6 doing AI research?

Speaker 2 How applicable are these economic models of the quantity of humans working on a problem

Speaker 2 to the magnitude of AIs working on a problem?

Speaker 1 Yeah, so if you have AIs that are directly automating

Speaker 1 particular jobs that humans were doing before, then we say, well, with additional compute, we can run more copies of them to do more of those tasks simultaneously.

Speaker 1 We can also run them at greater speed. And so some people have an intuition that, like, well, you know, what matters is like

Speaker 1 time. It's not how many people are working on a problem at a given point.
I think that doesn't bear out super well, but AI can also be run faster than humans. And so, if you have

Speaker 1 a set of AIs

Speaker 1 that can do the work of the individual human researchers and run at 10 times or 100 times the speed, then we ask, well, could the human research community have solved these algorithmic problems, do things like invent transformers

Speaker 1 over 100 years.

Speaker 1 Well, here we have AIs with an effective population similar to that of the humans, but running 100 times as fast.

Speaker 1 And so

Speaker 1 you have to tell a story where, no, the AI, they can't really do the same things as the humans.

Speaker 1 And we're talking about what happens when the AIs are more capable of, in fact, doing that.

Speaker 2 Although they become more capable as lesser capable versions of themselves help us make themselves more capable, right? So you have to kickstart that at some point. Is there an example in

Speaker 2 analogous situations? Is intelligence unique in the sense that you have a feedback loop where,

Speaker 2 with a learning curve or something else, a system's outputs are feeding into its own inputs? Because if we're talking about something like Moore's Law or the cost of solar, you do have this dynamic where we're throwing more people at the problem and we're making a lot of progress, but we don't have the additional part of the model where Moore's Law leads to more humans somehow,

Speaker 2 and those additional humans are becoming researchers.

Speaker 1 So you do actually have a version of that in the case of solar.

Speaker 1 So you have a small infant industry that's doing things like providing solar panels for space satellites and then getting increasing amounts of subsidized government demand because of

Speaker 1 worries about fossil fuel depletion and then climate change. You can have the dynamic where visible successes with solar or lowering prices then open up new markets.

Speaker 1 So there's a particularly huge transition where renewables become cheap enough to replace large chunks of the electric grid.

Speaker 1 Earlier, you're dealing with very niche situations: satellites, where it's very difficult to refuel in place; then remote areas; and then moving to the sunniest areas in the world with the biggest solar subsidies.

Speaker 1 And so, there was an element of that where more and more investment has been thrown into the field, and like the market has rapidly expanded as the technology improved.

Speaker 1 But I think the closest analogy is actually the long-run growth of human civilization itself.

Speaker 1 And I know you had Holden Karnofsky from the Open Philanthropy Project on earlier and discussed some of this research about the long-run acceleration of human population and economic growth.

Speaker 1 And so, developing new technologies allowed the human population to expand,

Speaker 1 humans to occupy new habitats and new areas, and then to invent agriculture, which supported larger populations, and then even more advanced agriculture in the modern industrial society. And so, their

Speaker 1 total technology and output allowed you to support more humans

Speaker 1 who then would discover more technology and continue the process.

Speaker 1 Now, that was boosted because, on top of expanding the population, the share of human activity that was going into invention and innovation went up.

Speaker 1 And that was a key part of the Industrial Revolution. There was no such thing as a corporate research lab or like an engineering university prior to that.

Speaker 1 And so you were both increasing the total human population and the share of it going in. But this population dynamic is pretty analogous.

Speaker 1 Humans invent farming, then they can have more humans, then they can invent industry, and so on.
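
A toy version of that feedback loop, in which the number of innovators scales with population and the population the economy can support scales with accumulated ideas; the constants are arbitrary and only meant to show growth speeding up over time:

```python
# Toy model of the population <-> technology feedback: more people means more ideas,
# and more ideas mean capacity for more people, so the growth rate itself grows
# (hyperbolic rather than ordinary exponential growth). Constants are arbitrary.

population = 1.0
growth_constant = 0.02
dt = 0.1

for step in range(400):
    # dP/dt ~ k * P^2: more people -> more ideas -> capacity for more people
    population += growth_constant * population ** 2 * dt
    if (step + 1) % 100 == 0:
        print(f"t={(step + 1) * dt:5.1f}  population={population:6.2f}")
# The increments between printouts keep getting larger: growth accelerates over time.
```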

Speaker 2 So maybe somebody would be skeptical that with AI progress specifically, it's not just a matter of

Speaker 2 some farmer figuring out crop rotation or some blacksmith figuring out how to do metallurgy better.

Speaker 2 In fact, even to get that 50% improvement in productivity, you basically need something with an IQ close to Ilya Sutskever's.

Speaker 2 So it's discontinuous: you're contributing very little to productivity, and then you're like Ilya and you contribute a lot. But becoming Ilya is,

Speaker 2 you see what I'm saying? There's not like a gradual increase in capabilities that leads to the feedback.

Speaker 1 You're imagining a case where the distribution of tasks is such that there's nothing that you can where individually automating it particularly helps.

Speaker 1 And so the ability to contribute to AI research is really end-loaded. Is that what you're saying? Yeah.

Speaker 2 I mean, we already see this in

Speaker 2 these sorts of like really high IQ

Speaker 2 companies or projects where theoretically, I guess Jane Street or OpenAI could hire like a bunch of

Speaker 2 mediocre people to do the menial tasks, there's a comparative advantage argument, and that could free up the time of the really smart people, but they don't do that, right?

Speaker 2 Because of transaction costs, whatever else.

Speaker 1 Self-driving cars would be another example where you have a very high quality threshold.

Speaker 1 And so when your performance as a driver is worse than a human, like you have 10 times the accident rate or 100 times the accident rate, then the cost of insurance for that, which is a proxy for people's willingness to ride the car and stuff too,

Speaker 1 would be such that the insurance cost would absolutely dominate. So, even if you have zero labor cost, it's offset by the increased insurance cost.

Speaker 1 And so, there are lots of cases like that where like partial automation is not in practice

Speaker 1 very usable because

Speaker 1 when it's complementing other resources, you're going to use those other resources less efficiently.

Speaker 1 In a post-AGI future, I mean, the same thing can apply to humans. So people can say, well, comparative advantage,

Speaker 1 even if AIs can do everything better than a human, well, it's still worth something. The human can still do something,

Speaker 1 even, you know, they can lift a box. That's something.

Speaker 1 Now, there's a question of property rights.

Speaker 1 Well, if it could just slice up the human and use them to make more robots.

Speaker 1 But

Speaker 1 absent that, in such an economy, you wouldn't want to let a human worker into any industrial environment because in a clean room, they'll be emitting all kinds of skin cells and messing

Speaker 1 things up. You need to have an atmosphere there.
You need a bunch of supporting tools and resources and materials.

Speaker 1 And those supporting resources and materials will do a lot more productively working with AI and robots rather than a human. So you don't want to let a human anywhere near the thing, just like

Speaker 1 you don't want to have a gorilla wandering around in a china shop, even if you've trained it to, most of the time, pick up a box for you if you give it a banana.

Speaker 1 It's just not worth it to have it wandering around your china shop. Yeah, yeah, yeah.

Speaker 2 Like, why is that not a good objection to...

Speaker 1 I mean, I think that is one of the ways

Speaker 1 in which partial automation can fail to really translate into a lot of economic value. That's something that will attenuate as we go on.

Speaker 1 And as the AI is more able to work independently and more able to handle

Speaker 1 its own screw-ups, get more reliable.

Speaker 2 But the way in which it becomes more reliable is by AI progress speeding up, which happens if AI can contribute to it. But

Speaker 2 if there is some sort of reliability bottleneck that prevents it from contributing to that progress, then you don't have the loop, right?

Speaker 1 Yeah, I mean, this is why we're not there yet.

Speaker 2 Right, but then what is the reason to think we'll be there at...

Speaker 1 The broad reason is we have these inputs that are scaling up.

Speaker 1 So Epoch, which I mentioned earlier, they have a paper, I think it's called Compute Trends Across Three Eras of Machine Learning or something like that.

Speaker 1 And so they look at the compute expended on machine learning systems since the founding of the field of AI at the beginning of the 1950s.

Speaker 1 And so mostly it grows with Moore's Law.

Speaker 1 And so people are spending a similar amount on their experiments,

Speaker 1 but they can just buy more with that because the compute is getting cheaper. And so

Speaker 1 that data, I mean, it covers over 20 orders of magnitude, maybe like 24.

Speaker 1 And of all of those increases since 1952, a little more than half of them happened between 1952 and 2010. And all the rest is since 2010.

Speaker 1 So we've been scaling that up like four times as fast as was the case for most of the history of AI.

Speaker 1 We're running through the orders of magnitude of possible resource inputs you could need for AI much, much more quickly than we were for most of the history of AI.

Speaker 1 That's why this is a period with a very elevated chance of AI per year, because we're moving through so much of the space of inputs per year. And indeed, it looks like this scale-up

Speaker 1 taken to its conclusion will cover another bunch of orders of magnitude.

Speaker 1 And that's actually a large fraction of the orders of magnitude that are left before you hit the point where you'd say, well, this would have to be harder than brute-force evolution, even with the sort of simple hacks we get to apply.

Speaker 1 Like we're selecting for intelligence the whole time. We're not going to do the same mutation that causes

Speaker 1 fatal childhood cancer a billion times, the way evolution keeps producing the same fatal mutations even though they've been tried many times before.

Speaker 1 We use gradient descent, which takes into account the derivative of improvement on the loss all throughout the network.

Speaker 1 And we don't throw away all of the contents of the network with each generation, where you compress down to a little DNA.

Speaker 1 So there's that upper bar of brute-force evolution, and with these sorts of very simple improvements, we can save orders of magnitude relative to it.

Speaker 1 We're going to cover, I think, a fraction that's like half of that distance in this scale up over the next 10 years or so.

Speaker 1 And so if you started off with a kind of vague uniform prior, you're like, well,

Speaker 1 you probably can't make AGI with like the amount of compute that would be involved in a fruit fly existing for a minute, which would be the early days of AI.

Speaker 1 Maybe you'd get lucky.

Speaker 1 We were able to make calculators because calculators benefited from like very reliable, serially fast computers and where we could take a tiny, tiny, tiny, tiny fraction of a human brain's compute and use it for a calculator.

Speaker 1 We couldn't take an ant's brain and rewire it to a calculator. It's

Speaker 1 hard to manage ant farms, let alone get them to do arithmetic for you.

Speaker 1 And so there were some things where we could exploit the differences between biological brains and computers to do stuff super efficiently on computers.

Speaker 1 We would doubt that we would be able to do so much better than biology that, with a tiny fraction of an insect's brain worth of compute, we'd be able to get AI early on.

Speaker 1 On the far end, it seemed very implausible that we couldn't do better than completely brute force evolution.

Speaker 1 And so, in between, you have some number of orders of magnitude of inputs where it might be. And, like, in the 2000s, I would say, well, you know, I'm going to have a pretty uniform-ish prior.

Speaker 1 I'm going to put weight on it happening at like the sort of the equivalent of like 10 to the 25 ops, 10 to the 30, 10 to the 35,

Speaker 1 and sort of spreading out over that. And then I can update on other information.

Speaker 1 And in the short term, I would say, like in 2005, I would say, well, I don't see anything that looks like the cusp of AGI.

Speaker 1 So I'm also going to lower my credence for like the next five years or the next 10 years. And so that would be kind of like a vague prior.

Speaker 1 And then when we take into account, like, well, how quickly are we running through those orders of magnitude?

Speaker 1 If I have a uniform prior, I assign half of my weight to the first half of the remaining orders of magnitude. And if we're going to run through those over the next 10 years and some,

Speaker 1 then that calls on me to say, conditional on us ever making AI, which seems likely since it shouldn't be materially harder than evolution,

Speaker 1 I've got to put similarly a lot of my credence on AI happening in this scale-up.
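
A minimal sketch of this prior-over-orders-of-magnitude reasoning. The bounds and the coverage of the coming scale-up are illustrative assumptions, not estimates from the conversation:

```python
# A minimal sketch of the "uniform prior over orders of magnitude of compute" update.
# The bounds and the coverage of the coming scale-up are illustrative assumptions.

low_oom = 25     # log10(ops) already plausibly ruled out as insufficient (assumption)
high_oom = 40    # rough brute-force-evolution-style upper bound (assumption)

remaining_ooms = high_oom - low_oom
scaleup_ooms = remaining_ooms / 2     # suppose the next decade covers half of what's left

# Conditional on the true threshold lying somewhere in [low, high], a uniform prior
# over log-compute puts this much probability on it falling inside the scale-up:
p_in_scaleup = scaleup_ooms / remaining_ooms
print(f"P(AI threshold falls inside the coming scale-up) = {p_in_scaleup:.0%}")
```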

Speaker 1 And then that's supported by what we're seeing in terms of the rapid advances in capabilities with AI and LLMs in particular.

Speaker 2 Okay, that's actually a a really interesting point.

Speaker 2 So now, but somebody might say, listen, there's no sense in which AIs could universally speed up the progress of OpenAI by 50% or 100% or 200%.

Speaker 2 If they're not able to do everything better than Ilya Sutskever can, there's going to be something on which we're bottlenecked by the human researchers.

Speaker 2 And bottleneck effects dictate that the slowest moving part of the organization will be the one that kind of determines the speed of the progress of the whole organization or the whole project, which means that unless you get to the point where you're like doing everything everybody in the organization can do, you're not going to significantly speed up the progress of the whole project as a whole.

Speaker 1 Yeah, so that is a hypothesis, and I think there's a lot of truth to it.

Speaker 1 So, when we think about the ways in which AI can contribute, there are things we talked about before, like the AIs setting up their own curriculum, and that's something that Ilya can't do directly and doesn't do directly.

Speaker 1 And there's a question: how much does that improve performance?

Speaker 1 There are these things where

Speaker 1 the AI helps to just produce some code for some tasks. And it's beyond hello world at this point.
But I mean, the sort of thing that I hear from AI researchers at leading labs is that

Speaker 1 on their core job where they're most expert, it's not helping them that much. But then, you know, their job often does involve, oh,

Speaker 1 I've got to code something that's out of my usual area of expertise,

Speaker 1 or I want to research this question, and it helps them there.

Speaker 1 And so that saves some of their time and frees them to do more of the bottlenecked work. And then I think the idea of, well,

Speaker 1 is everything dependent on Ilya? And is Ilya so much better than the hundreds of other employees? I think there are a lot of people who are contributing, they're doing a lot of tasks.

Speaker 1 And so you can have quite

Speaker 1 a lot of gain from automating some areas where you then do just an absolutely enormous amount of it relative to what you would have done before.

Speaker 1 Because things like designing the custom curriculum, you had some humans put some work into that, but you're not going to employ billions of humans to produce it at scale.

Speaker 1 And so it winds up being a larger share of the progress

Speaker 1 than it was before.

Speaker 1 You get some benefit from these sorts of things where, ah, yeah, there are pieces of my job that I can now hand off to the AI, which lets me focus more on the things that the AI still can't do.

Speaker 1 And then later on, you get to the point where, yeah, the AI can do your job,

Speaker 1 including the most difficult parts. And maybe it has to do that in a different way.
Maybe it

Speaker 1 spends a ton more time thinking about

Speaker 1 each step of a problem than you. And that's the late end.

Speaker 1 And the stronger these bottleneck effects are, the more the economic returns, the scientific returns and such are end-loaded towards getting sort of full AGI.

Speaker 1 The weaker the bottlenecks are, the more interim results will be really paying off.

Speaker 2 I guess I'd probably disagree with you on how much the sort of Ilyas of organizations seem to matter.

Speaker 2 I guess just from the evidence alone: how many of the big breakthroughs in deep learning in general was that single individual responsible for, right?

Speaker 2 And how much of his time is he spending doing anything that something like Copilot is helping him with? I'm guessing most of it's just managing people and coming up with ideas and

Speaker 2 trying to understand systems and so on.

Speaker 2 And if that is the,

Speaker 2 if like the five or 10 people who are like that at OpenAI or Anthropic or whatever are

Speaker 2 basically the way in which the progress is happening, or at least the algorithmic progress is happening, then

Speaker 2 how much does a better and better Copilot help? I know Copilot is not the thing you're talking about with like 20% automation, but something like that, how much of,

Speaker 2 yeah, how much is that contributing to the sort of like core function of the research scientist?

Speaker 1 Yeah, no, it's true, the question is quantitatively how much we

Speaker 1 disagree about the importance of sort of

Speaker 1 key research employees and such. I certainly think that some researchers, you know, add

Speaker 1 more than 10 times the average employee, even much more. And obviously, managers can add an enormous amount of value by proportionately multiplying the output of the many people that they manage.

Speaker 1 And so that's the kind of thing that we were discussing earlier when talking about, well, if you had sort of full human-level

Speaker 1 AI or AI that had all of the human capabilities plus AI advantages,

Speaker 1 it would be, you know, you'd benchmark not off of what the sort of typical human performance is, but peak human performance and beyond. So I, yeah, I accept all that.

Speaker 1 I do think it

Speaker 1 makes a big difference for people how much they can outsource a lot of the tasks that are less, well, less creative. And an enormous amount is learned by experimentation.
ML has been

Speaker 1 a quite experimental field. And there's a lot of engineering work in, say, building large superclusters,

Speaker 1 making

Speaker 1 hardware-aware optimization and coding of these things, being able to do the parallelism in large models. And the engineers

Speaker 1 are busy, and it's not only a big-thoughts kind of area. And then the other branch

Speaker 1 is

Speaker 1 where will the AI advantages and disadvantages be?

Speaker 1 And so

Speaker 1 one AI advantage is being omnidisciplinary and familiar with the newest things. So I mentioned before, there's no human who has a million years of TensorFlow experience.

Speaker 1 And so to the extent that we're interested in the very, very cutting edge of things that have been developed quite recently, then AI that can learn about them in parallel

Speaker 1 and experiment and practice with them in parallel can learn much faster than a human potentially.

Speaker 1 And the area of computer science is one that is especially suitable for AI to learn in a digital environment.

Speaker 1 So it doesn't require like driving a car around that might kill someone, have enormous costs.

Speaker 1 You can do unit tests,

Speaker 1 you can prove theorems,

Speaker 1 you can do all sorts of operations entirely in the confines of a computer, which is one reason why programming has been benefiting more than a lot of other areas from LLMs recently, whereas robotics is lagging.

Speaker 1 So there's some of that. And then just considering, well, actually, I mean, they are getting better at things like

Speaker 1 GRE math, at programming contests. And I mean, some people have forecasts and predictions outstanding about things like doing well on

Speaker 1 the Informatics Olympiad and the Math Olympiad.

Speaker 1 And

Speaker 1 in the last few years, when people tried to forecast the MMLU benchmark, which has a lot of more sophisticated,

Speaker 1 graduate-student-science kind of questions,

Speaker 1 AI knocked

Speaker 1 that down a lot faster than the AI researchers and students who had registered forecasts on it expected.

Speaker 1 And so AIs are getting top-notch scores on graduate exams and creative problem solving.

Speaker 1 It's not obvious that that sort of area will be a relative weakness of AI, that in fact, computer science is in many ways especially suitable because of getting up to speed with new areas, being able to get rapid feedback from the interpreter at scale.

Speaker 2 But do you get rapid feedback if you're doing something that's more analogous to research? If you're like,

Speaker 2 let's say you have a new model or something, and it's like: if we put $10 million into a mini training run on this, would it be much better?

Speaker 1 Yeah, for very large models, those experiments are going to be quite expensive.

Speaker 1 And so you're going to look more at like, can you build up this capability by generalization from things like mini math problems, programming problems, working with small networks?

Speaker 2 Yeah, yeah, fair enough. I actually,

Speaker 2 Scott Aronson was one of my professors in college.

Speaker 2 And I took his quantum information class and I didn't do

Speaker 2 that well. I did okay in it, but

Speaker 2 he recently wrote a blog post where he said, you know, I had GPT-4 take my quantum information test and it got a B. And I was like, damn, I got a C

Speaker 2 on the final. So yeah, yeah, I'm updated in

Speaker 2 the direction that, yeah, if you're getting a B on that test, you probably understand quantum information pretty well.

Speaker 1 With different areas of strengths and weaknesses than the human students. Sure, sure.

Speaker 2 Would it be possible for this sort of intelligence explosion to happen without any sort of hardware progress?

Speaker 2 If hardware progress stopped, would this feedback loop still be able to produce some sort of explosion with only software?

Speaker 1 Yeah. So if we say that the technology is frozen, which I think is not the case right now,

Speaker 1 NVIDIA has managed to deliver significantly better chips for AI workloads for the last few generations, H100, A100, V100.

Speaker 1 If that stops entirely, and maybe we'll define this as no more new nodes, Moore's Law is over, then here's what you're left with.

Speaker 1 At that point, the kind of gains you get in the amount of compute available come from actually constructing more chips.

Speaker 1 And there are economies of scale you could still realize there. So right now, a chip maker has to amortize the R&D cost of developing the chip.

Speaker 1 And then the capital equipment is created. You build a fab, its peak profits are going to come in the few years when the chips it's making are at the cutting edge.

Speaker 1 Later on, as the cost of compute exponentially falls,

Speaker 1 you keep the fab open because you can still make some money given that it's built.

Speaker 1 But of all of the profits the fab will ever make, they're relatively front-loaded into the period when its technology is near the cutting edge.

Speaker 1 So, in a world where Moore's Law ends, then you wind up with these very long production runs

Speaker 1 where

Speaker 1 you can keep making chips that stay at the cutting edge, and the R&D costs get amortized over a much larger base. So the R&D basically

Speaker 1 drops out of the price. And then you get some economies of scale from just making so many fabs, in the way that happens

Speaker 1 when the auto industry expands.

Speaker 1 And then this is in general across industries,

Speaker 1 when you produce a lot more, costs fall. Right now, ASML has

Speaker 1 many

Speaker 1 incredibly exotic suppliers that make some bizarre part of the thousands of parts in one of these ASML machines. You can't get it anywhere else.

Speaker 1 They don't have standardized equipment for their thing because this is the only use for it.

Speaker 1 And in a world where we're making 10 or 100 times as many chips at the current node, they would benefit from scale economies. And all of that would become more industrialized, mass production.

Speaker 1 And so you combine all of those things and it seems like capital costs of like buying a chip would decline, but the energy costs of running the chip would not.

Speaker 1 And so right now, energy costs are a minority of the cost, but

Speaker 1 they're not trivial. You know, they passed 1% a while ago and they're inching up towards 10% and beyond.

Speaker 1 And so you can maybe get like another order of magnitude cost decrease from getting really efficient in the sort of capital construction, but like energy would still

Speaker 1 be a limiting factor after the end of sort of actually improving the chips themselves.
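
A rough illustration of that capital-versus-energy split for a single accelerator; the price, power draw, lifetime, and electricity cost are all assumed placeholder values, not vendor figures:

```python
# Rough capital-vs-energy cost split for running one accelerator. Every number here
# is an assumed placeholder, not a vendor or market figure.

capital_cost = 30_000.0      # assumed purchase price, $
lifetime_years = 4           # assumed useful life
power_kw = 0.7               # assumed average draw, kW (ignoring cooling overhead)
electricity_price = 0.08     # assumed $/kWh

hours = lifetime_years * 365 * 24
energy_cost = power_kw * hours * electricity_price
print(f"Energy share of total cost today: {energy_cost / (capital_cost + energy_cost):.0%}")

# If mass production cuts capital cost ~10x while electricity prices stay put,
# energy becomes a dominant share, the limiting factor described above.
cheap_capital = capital_cost / 10
print(f"Energy share after a 10x capital cost drop: {energy_cost / (cheap_capital + energy_cost):.0%}")
```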

Speaker 2 Got it, got it. And when you say like there would be a greater population of AI researchers, because

Speaker 2 are we using population as a sort of thinking tool of how they could be more effective?

Speaker 2 Or do you literally mean that the way you expect these AIs to contribute a lot to research is by just having like a million copies of this of like a researcher thinking about the same problem?

Speaker 2 Or is it just a useful thinking model for what it would look like to have a million-times-smarter AI working on that problem?

Speaker 1 That's definitely a lower bound sort of model. And often I'm meaning something more like

Speaker 1 effective population or like you'd need this many people to have this effect. And so we were talking earlier about the trade-off between training and inference

Speaker 1 in board games.

Speaker 1 And so you can get the same performance by having a bigger model or by calling the model more times. And in general,

Speaker 1 it's more effective to have a bigger, smarter model and call it fewer times, up until a point where the costs equalize between them.

Speaker 1 And so we would be taking some of the gains of our larger compute on having bigger models that are individually more capable. And there would be a division of labor.

Speaker 1 So the tasks that are most cognitively demanding would be done by these giant models, but for some very easy tasks, you don't want to expend that giant model's inference if

Speaker 1 a model a hundredth the size can take that task.

Speaker 1 And so larger models would be in the positions of like researchers and managers, and they would have swarms of AIs of different sizes as tools that they could make API calls to and whatnot.
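
A small sketch of that division of labor, with a made-up router sending easy tasks to a cheap model and hard tasks to an expensive one; the models, cost figures, and threshold are arbitrary assumptions for illustration:

```python
from dataclasses import dataclass

# Illustrative sketch of the division of labor described above: a large "researcher"
# model handles hard tasks, a small "worker" model handles easy ones.

@dataclass
class Model:
    name: str
    cost_per_call: float   # arbitrary units

SMALL = Model("small-worker", cost_per_call=1.0)
LARGE = Model("large-researcher", cost_per_call=100.0)

def route(task_difficulty: float) -> Model:
    """Send each task to the cheapest model assumed able to handle it."""
    return SMALL if task_difficulty < 0.7 else LARGE

tasks = [0.1, 0.3, 0.9, 0.2, 0.95, 0.5]
routed_cost = sum(route(d).cost_per_call for d in tasks)
print(f"Cost with routing: {routed_cost}")                                         # 204.0
print(f"Cost if every task used the large model: {len(tasks) * LARGE.cost_per_call}")  # 600.0
```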

Speaker 2 Okay, we accept the model, and now we've gotten to something that is at least as smart as Ilya Sutskever on all the tasks relevant to AI progress. And you can have so many copies of it.

Speaker 2 What happens in the world now? What do the next months or years or whatever timeline is relevant look like?

Speaker 1 And so, and to be clear,

Speaker 1 what's happened is not that we have something that has all of the abilities and advantages of humans plus the AI advantages.

Speaker 1 What we have is something that, possibly by doing things like making a ton of calls to make up for being individually less capable, is able to drive forward AI progress.

Speaker 1 That process is continuing. So AI progress has accelerated greatly in the course of getting there.
And so maybe we go from our eight months doubling time of software progress

Speaker 1 in effective compute to four months or two months.

Speaker 1 And so

Speaker 1 there's a report by Tom Davidson at the Open Philanthropy Project, which spun out of

Speaker 1 work I had done previously. And so

Speaker 1 I advised and

Speaker 1 helped with that project. But Tom really carried it forward and produced a very nice report and model, which Epoch is hosting.

Speaker 1 You can plug in your own versions of the parameters, and there is a lot of work estimating

Speaker 1 the parameters. Things like: what's the rate of software progress, what's the return to additional work, how does performance scale at these tasks as you boost the models?

Speaker 1 And in general, as we were discussing earlier, the sort of like

Speaker 1 broadly

Speaker 1 human level in every domain with all the advantages is

Speaker 1 pretty deep into that. And so if already we can have an eight-month doubling time for software progress,

Speaker 1 then by the time you get to that kind of point, it's maybe more like four months, two months

Speaker 1 going into one month.

Speaker 1 And so, if the thing is just proceeding at full speed, then each doubling can come more rapidly.

Speaker 1 And so, we can talk about what are the spillovers

Speaker 1 of this. As the models get more capable, they can be doing other stuff in the world. You know, they can spend some of their time making Google search more efficient.

Speaker 1 They can be, you know, hired as chatbots with some inference compute.

Speaker 1 And then we can talk about sort of

Speaker 1 if that intelligence explosion process is allowed to proceed, then what happens is: okay,

Speaker 1 you improve your software

Speaker 1 by a factor of two. Then

Speaker 1 the efforts needed to get the next doubling are larger, but they're not twice as large. Maybe they're like 25% or 35% larger.

Speaker 1 So each one comes faster and faster until you hit limitations. Like you can no longer make further software advances with the hardware that you have.
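
A toy simulation of those accelerating doublings, using roughly the figures mentioned (an eight-month starting doubling time, with each doubling needing about 30% more effective effort):

```python
# Toy simulation of the accelerating doublings described above: each software
# doubling makes the AI research workforce twice as effective, while the next
# doubling requires only ~30% more effective effort.

doubling_time_months = 8.0   # initial software doubling time
effort_growth = 1.30         # next doubling needs ~30% more effective research effort
capability = 1.0
elapsed_months = 0.0

for i in range(8):
    elapsed_months += doubling_time_months
    capability *= 2
    # Effectiveness doubled but required effort grew only ~30%, so the next doubling is faster.
    doubling_time_months *= effort_growth / 2
    print(f"doubling {i + 1}: t = {elapsed_months:5.1f} months, capability x{capability:.0f}, "
          f"next doubling in {doubling_time_months:.1f} months")
# Eight doublings (256x) arrive in roughly two years instead of the ~5 years that a
# constant eight-month doubling time would imply -- until hardware limits bite.
```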

Speaker 1 And looking at, I think, reasonable parameters in that model, it seems to me if you have these giant training runs, you can go very far.

Speaker 1 And so the way I would see this playing out is, as the AIs get better and better at research, they can work on different problems.

Speaker 1 They can work on improving software, they can work on improving hardware. They can do things like create new industrial technologies, new energy technology.
They can manage robots.

Speaker 1 They can manage human workers as executives and coaches and whatnot.

Speaker 1 You can do all of these things.

Speaker 1 And AIs wind up being applied where the returns are highest.

Speaker 1 And I think initially, the returns are especially high in doing more software. And the reason for that is, again,

Speaker 1 if you improve the software, you can update all of the GPUs that you have access to.

Speaker 1 You know,

Speaker 1 your cloud compute is suddenly more potent. If you design a new

Speaker 1 chip,

Speaker 1 it'll take a few months to produce the first ones, and it doesn't update all of your old chips.

Speaker 1 So you have an ordering where you start off with the things where there's the lowest dependence on existing stocks. And you can more just take whatever you're developing and apply it immediately.

Speaker 1 And so software runs ahead, you're getting more towards the limits of that software. And I think that means things like having all the human advantages, but combined with AI advantages.

Speaker 1 And so when we're discussing, I think that means

Speaker 1 given the kind of compute that would be involved, if we're talking about a hundreds-of-billions-of-dollars, trillion-dollar training run, there's enough compute to run tens of millions, hundreds of millions of sort of human-scale minds. They're probably smaller than human scale while being similarly efficient, at the limits of algorithmic progress, because they have the advantage of a million years of education and the other advantages we talked about. So you've got that wild capability, and the further software gains are running out, or they start to slow down again, because you're just getting towards the limits: you can't do any better than the best.

Speaker 1 And so, what happens then? Yeah,

Speaker 2 by the time they're running out, have you already hit super intelligence?

Speaker 1 Yes, okay, you're wildly superintelligent.

Speaker 2 We've left the galaxy, okay.

Speaker 1 Even metaphorically, just by having the abilities that humans have and then combining it with being very well focused and trained in the task beyond what any human could be, and then running faster and such.

Speaker 2 Got it, got it. All right, sorry, continue.

Speaker 1 Yeah, so I'm not going to assume that there's like huge qualitative improvements you can have.

Speaker 1 I'm not going to assume that humans are like very far from the efficient frontier of software, except with respect to things like, yeah, we had limited lifespan, so we couldn't train super intensively, we couldn't incorporate other software into our brains, we couldn't copy ourselves, we couldn't run at fast speeds.

Speaker 1 Yeah, so you've got all of those capabilities.

Speaker 1 And now I'm skipping over like the most important months in human history.

Speaker 1 And so I can talk about sort of

Speaker 1 what it looks like if it's just

Speaker 1 the AIs took over, they're running things as they like,

Speaker 1 how do things expand. I can talk about things as

Speaker 1 how does this go,

Speaker 1 you know, in a world where we've roughly, or at least so far, managed to

Speaker 1 retain control of where these systems are going.

Speaker 1 And so, by jumping ahead, I can talk about how would this translate into the physical world.

Speaker 1 And so, this is something that I think is a stopping point for a lot of people in thinking about, well, what would an intelligence explosion look like?

Speaker 1 And they have trouble going from, well, there's stuff on servers and cloud compute, and oh, that gets very smart. But then, how does what I see in the world change?

Speaker 1 How does like industry or military power change? If there's an AI takeover, like, what does that look like? Are there killer robots? And so, yeah, so one course we might go down is to discuss

Speaker 1 during that wildly accelerating transition,

Speaker 1 how did we manage that?

Speaker 1 How do you avoid it being catastrophic?

Speaker 1 And another route we could go is:

Speaker 1 how does the translation from

Speaker 1 wildly expanded scientific R D capabilities intelligence on these servers

Speaker 1 translate into things in the physical world. So you're moving along in order of like what has the quickest impact largely or like where you can have

Speaker 1 an immediate change.

Speaker 1 So one of the most immediately accessible things

Speaker 1 is

Speaker 1 where we have large numbers of devices or artifacts or capabilities that are already AI operable with

Speaker 1 hundreds of millions of equivalent researchers, you can quickly solve self-driving cars,

Speaker 1 make the algorithms much more efficient, do great testing and simulation, and then operate a large number of cars in parallel if you need to get some additional data to improve the simulation and reasoning.

Speaker 1 Although, in fact, humans with quite little data

Speaker 1 are able to achieve human-level driving performance.

Speaker 1 So after you've really maxed out the easily accessible algorithmic improvements in this software-based intelligence explosion that's mostly happening on server farms, then

Speaker 1 you have minds that have been able to really perform on a lot of digital-only tasks. They're doing great on video games.
They're doing great at predicting what happens next in a YouTube video.

Speaker 1 If they have a camera that they can move, they're able to predict what will happen

Speaker 1 at different angles. Humans do this a lot, where we naturally move our eyes in such a way to get

Speaker 1 images from different angles and different presentations, and then predict and combine from that.

Speaker 1 And yeah, and you can operate many cars, many robots at once to get very good robot controllers.

Speaker 1 So you should think that all the existing robotic equipment, or remotely controllable equipment that is wired for it, the AIs can operate quite well.

Speaker 2 I think some people might be skeptical that existing robots, given their current hardware, have the dexterity and the maneuverability to do a lot of physical labor that an AI might want to do.

Speaker 2 Do you have reason for thinking otherwise?

Speaker 1 There's also not very many of them. So, production of sort of industrial robots is hundreds of thousands per year.

Speaker 1 You know, they can do quite a bit in place. Elon Musk is promising a robot: humanoid robots at tens of thousands of dollars.

Speaker 1 You know, that may take a lot longer than

Speaker 1 he

Speaker 1 has said, as has happened with other technologies. But I mean, that's a direction to go.
But most immediately, hands are actually probably the most scarce thing.

Speaker 1 But if we consider what do human bodies provide? So there's the brain.

Speaker 1 And in this situation, we have now an abundance of high quality brain power that will be increasing as the AIs will have designed new chips, which will be rolling out from the TSMC factories.

Speaker 1 And they'll have ideas and designs for the production of new fab technologies, new nodes and additional fabs.

Speaker 1 But yeah, looking around the body. So there's legs to move around.

Speaker 1 Those aren't all that necessary. Wheels work pretty well, or just staying in one place.
You don't need mobility for most people most of the time in factory jobs and in office jobs. Office jobs, many of them can be fully virtualized.

Speaker 1 But yeah, some amount of legs, wheels, other transport. You have hands, and hands

Speaker 1 are something that are on the expensive end in robots.

Speaker 1 We can make them. They're made in very small production runs, partly because we don't have the control software to use them well.
In this world, the control software is fabulous.

Speaker 1 And so people will produce much larger production runs of them over time, possibly using technology we recognize, possibly with quite different technology, but just taking what we've got.

Speaker 1 So, right now,

Speaker 1 the robot arm industry, the industrial robot industry, produces hundreds of thousands of machines a year.

Speaker 1 Some of the nicer ones are like $50,000.

Speaker 1 In aggregate, the industry has tens of billions of dollars of revenue. By comparison, the automobile industry produces like, I think, over 60 million cars a year.
It has revenue of over $2 trillion

Speaker 1 per annum.

Speaker 1 And

Speaker 1 so converting that production capacity over towards robot production, if there's not something better to do, would be one of the things to do. And in World War II,

Speaker 1 you know, industrial conversion of American industry took place over several years

Speaker 1 and really amazingly, you know, ramped up military production by converting existing civilian industry.

Speaker 1 And that was without the aid of superhuman intelligence and management at every step in the process. So

Speaker 1 Yeah, every part of that would be very well designed. You'd have AI workers who

Speaker 1 understood every part of the process and could direct human workers.

Speaker 1 Even in

Speaker 1 a fancy factory, most of the time, it's not the hands doing a physical motion that a worker is being paid for. They're often like looking at things or

Speaker 1 deciding what to change.

Speaker 1 The actual time spent in manual motion is a limited portion of that. And so in this world of abundant AI cognitive abilities,

Speaker 1 where the human workers are more valuable for their hands than their heads, then

Speaker 1 you could have a worker, even a worker previously without training and expertise in the area,

Speaker 1 who has a smartphone, maybe a smartphone on a headset.

Speaker 1 And we have billions of smartphones, which have eyes and ears and methods for communication for an AI to be talking to a human and directing them in their physical motions with skill as a guide and coach that is beyond any human.

Speaker 1 They're going to be a lot better at telepresence and remote work. And they can provide VR and augmented reality guidance

Speaker 1 to help people get better at doing the physical motions that they're providing in the construction. Say

Speaker 1 you convert the auto industry

Speaker 1 to robot production.

Speaker 1 If it can produce an amount of mass of machines that is similar to what it currently produces, that's enough for

Speaker 1 a billion human-sized robots a year.

Speaker 1 The value per

Speaker 1 kilogram

Speaker 1 of cars is somewhat less than high-end robots. But

Speaker 1 yeah, you're also cutting out most of the wage bill because most of the wage bill is payments ultimately to like human capital and education, not to the physical hand motions

Speaker 1 and lifting objects and that sort of task. Yeah, so at the sort of existing scale of the auto industry, you can make a billion robots a year.
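
The mass arithmetic behind that billion-robots figure, with the car count from the conversation and assumed average masses:

```python
# Back-of-the-envelope version of the claim above: redirect the auto industry's
# annual output mass into humanoid robots. The car count is the figure from the
# conversation; the masses are assumptions.

cars_per_year = 60e6           # ~60 million cars per year
avg_car_mass_kg = 1_500        # assumption
avg_robot_mass_kg = 100        # assumption for a human-sized robot

total_mass_kg = cars_per_year * avg_car_mass_kg
robots_per_year = total_mass_kg / avg_robot_mass_kg
print(f"{robots_per_year:.1e} robots per year")   # ~9e8, on the order of a billion
```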

Speaker 1 The auto industry is two or three percent of the existing economy. You're replacing these

Speaker 1 cognitive things. So if right now physical hand motions are like 10% of the work,

Speaker 1 redirect humans into those tasks, and you have, like, in the world at large right now,

Speaker 1 mean income is on the order of $10,000 a year, but in rich countries, skilled workers earn more than $100,000 per year. And some of that is not just

Speaker 1 management roles, which only a certain proportion of the population can hold, but just being at absolutely

Speaker 1 exceptional, peak-end human performance in some of these construction and similar roles. Yeah, just raising productivity to match the most productive workers in the world

Speaker 1 leaves room for a very big gain.

Speaker 1 And with AI replacing skills that are scarce in many places where there's abundant

Speaker 1 currently low-wage labor, you bring in the AI coach and someone who was previously making very low wages can suddenly be super productive by just being the hands

Speaker 1 for an AI.

Speaker 1 And so, you know, on a naive view, if you ignore the delay of capital adjustment, of like building new tools for the workers, say, like,

Speaker 1 yeah, just like raise typical productivity for workers around the world to be more like rich countries

Speaker 1 and get 5x, 10x like that.

Speaker 1 Get more productivity by, with AI handling the difficult cognitive tasks, reallocating people from like office jobs to providing physical motions.

Speaker 1 And since right now that's a small proportion of the economy, you can expand the sort of hands available for manual labor by like an order of magnitude within a rich country, just because most people

Speaker 1 sitting in an office or even on a factory floor are not continuously moving. So you've got billions of hands lying around in humans to be used in the course of constructing your waves of robots.

Speaker 1 And now, once you have

Speaker 1 a quantity of robots that is approaching the human population, and they work 24-7, of course,

Speaker 1 the human labor will no longer be valuable as hands and legs. But at the very beginning of the transition, just like new software can be used to update all of the GPUs to run the latest AI,

Speaker 1 humans are this sort of legacy population with an enormous number of underutilized hands and feet that the AI can use for the initial robot construction.

Speaker 2 Cognitive tasks are being automated and the production of them is greatly expanding.

Speaker 2 And then the physical tasks which complement them are utilizing humans to do the parts that robots that exist can't do.

Speaker 2 Is the implication of what you're getting at that world production would increase by just a tremendous amount, or that AI could get a lot done toward whatever motivations it has?

Speaker 1 Yeah, so there's an enormous increase in production just from humans switching over to the role of providing hands and feet for AI where it's limited.

Speaker 1 And this robot industry is a natural place to apply it. And so if you go to something that's like

Speaker 1 10x the size of like the current car industry in terms of its

Speaker 1 production, which would still be like a third of our current economy. And the aggregate productive capabilities of the society with AI support are going to be a lot larger.

Speaker 1 They make 10 billion humanoid robots a year.

Speaker 1 And then, I mean, if you do that,

Speaker 1 you know, the legacy population of a few billion human workers is no longer very important for the physical tasks. And then

Speaker 1 the new

Speaker 1 automated industrial base can just produce more factories, produce more robots. And then the interesting thing is: like, what's the doubling time?

Speaker 1 How long does it take for a set of computers, robots, factories, and supporting equipment to produce another equivalent quantity of that? For GPUs, brains, this is really, really easy, really solid.

Speaker 1 There's an enormous margin there.

Speaker 1 We were talking before about,

Speaker 1 yeah, skilled human workers getting paid $100 an hour is like quite

Speaker 1 normal in developed countries for very in-demand skills.

Speaker 1 And you make a GPU

Speaker 1 that can do that work.

Speaker 1 Right now, these GPUs are like tens of thousands of dollars.

Speaker 1 If you can do $100

Speaker 1 of wages each hour, then

Speaker 1 in a few weeks, you pay back your costs. If the thing is more productive,

Speaker 1 and as we were discussing, you can be a lot more productive than a sort of a typical high-paid human professional by being like the very best human professional, and even better than that by having a million years of education and working all the time.

Speaker 1 Yeah, then you could get even shorter payback times. Like, yeah, you can generate the dollar value of the cost, initial cost of that equipment within a few weeks.
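
The payback arithmetic being described, using the rough price and wage figures mentioned rather than actual market data:

```python
# Payback-time arithmetic from the discussion above, using the rough figures
# mentioned rather than actual market prices.

gpu_cost = 30_000.0        # "tens of thousands of dollars"
wage_per_hour = 100.0      # high-end skilled professional wage it displaces
hours_per_week = 24 * 7    # the AI runs around the clock

weekly_value = wage_per_hour * hours_per_week
print(f"Payback time: ~{gpu_cost / weekly_value:.1f} weeks")   # a couple of weeks
```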

Speaker 1 For robots, so like a human factory worker

Speaker 1 can earn $50,000

Speaker 1 a year.

Speaker 1 And really top-notch factory workers are

Speaker 1 earning more and

Speaker 1 working all the time. If they can produce a few hundred thousand dollars of value per year and buy a robot that costs $50,000 to replace them, then

Speaker 1 that's a payback time of some months.
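
And the same payback logic for the robot case, again with the conversation's rough figures (the yearly value is an assumed point within "a few hundred thousand dollars"):

```python
# Same logic for the robot case, with the conversation's rough figures.

robot_cost = 50_000.0
value_per_year = 300_000.0   # "a few hundred thousand dollars of value per year"
print(f"Payback time: ~{robot_cost / value_per_year * 12:.0f} months")   # about two months
```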

Speaker 2 That is about the financial return.

Speaker 1 Yeah, and we're going to get to the physical capital return because those are going to diverge in this scenario. Right, yeah.

Speaker 2 Because right now, it seems like it's going to be a given that, all right, these superintelligences obviously will be able to make a lot of money and be very valuable.

Speaker 1 Can it like physically scale up? What we really care about are like the actual physical operations that a thing does. Yeah.
How much do they contribute to these tasks? And I'm using this as a

Speaker 1 start

Speaker 1 to try and get back to the physical replication times.

Speaker 2 But so, I guess I'm wondering: what is the implication of this?

Speaker 2 Because I think you started off this by saying, like, people have not thought about what the physical implications of superintelligence would be. What is the bigger takeaway?

Speaker 2 What are we wrong about when we think about what the world would look like with super intelligence?

Speaker 1 With robots that are optimally operated by AI, so like extremely finely operated, and with technological designs and equipment and facilities built under AI direction,

Speaker 1 how much can they produce?

Speaker 1 For a doubling, what the AIs need is to produce stuff

Speaker 1 that is,

Speaker 1 in aggregate, at least equal to their own cost.

Speaker 1 And so now we're pulling out these things like labor costs

Speaker 1 that no longer apply, and then trying to zoom in on what these capital costs will be. You're still going to need the raw materials.

Speaker 1 You're still going to need the robot time building the next robot.

Speaker 1 I think it's pretty likely that with the advanced AI work, they can design some incremental improvements and with the industry scale-up that you can get tenfold and better cost reductions on

Speaker 1 the system by making things more efficient and replacing the human cognitive labor. And so maybe that's like you need $5,000

Speaker 1 of

Speaker 1 costs under our sort of current environment. But the big change in this world

Speaker 1 is we're trying to produce this stuff faster. If we're asking about the doubling time of the whole system

Speaker 1 in, say, one year, if you have to build a whole new factory to like double everything, you don't have time to amortize the cost of that factory.

Speaker 1 Like right now, you might build a factory and use it for 10 years and like buy some equipment and use it for five years. And so that's part of your capital cost.

Speaker 1 And in an accounting context, you depreciate each year a fraction of that capital purchase.

Speaker 1 But if we're trying to double our entire industrial system in one year, then those capital costs have to be multiplied.

Speaker 1 So if we're going to be getting most of the return on our factory in the first year

Speaker 1 instead of 10 years

Speaker 1 weighted appropriately, then we're going to say, okay, our capital cost has to go up by tenfold

Speaker 1 because I'm building an entire factory for this year's production. I mean, it will do more stuff later, but it's most important early on instead of over 10 years.
And so that's going to

Speaker 1 raise the cost of that reproduction. And so

Speaker 1 it seems like going from

Speaker 1 current like decade kind of cycle of amortizing factories and fabs and whatnot, and shorter for some things.

Speaker 1 The longest are things like big buildings and such.

Speaker 1 Yeah, that could be like a tenfold increase in capital costs from moving to doubling the physical stuff each year.

Speaker 1 And given the savings that we get in the story from scaling up the industry, from removing the payments to human cognitive labor,

Speaker 1 and then from just adding new technological advancements and like super high quality cognitive supervision, like applying more of it than was applied today.

Speaker 1 And it looks like you can get cost reductions that offset that increased capital cost.

Speaker 1 So that, like, you know, your $50,000 improved robot arms or industrial robots, it seems like they can do the work of a human factory worker.

Speaker 1 So it would be like the equivalent of hundreds of thousands of dollars.

Speaker 1 And like, yeah, they would, you know, by default, they cost more than the $50,000

Speaker 1 arms today, but then you apply all these other cost savings.

Speaker 1 And it looks like then you get a period with a robot doubling time that is less than a year, I think significantly less than a year as you get into it.
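Here is a stylized version of that capital-cost argument. Every multiplier below is an assumed round number rather than a figure from the conversation: roughly a decade's worth of factory capital borne up front, offset by "tenfold and better" savings from removing human labor, scaling up, and AI-designed improvements:

```python
# Stylized arithmetic: doubling the industrial base in one year means bearing
# roughly a decade's worth of factory capital up front, but labor savings,
# scale-up, and better designs can offset that. All multipliers are assumptions.

baseline_unit_cost = 50_000       # dollars per robot/equipment unit today (assumed)
amortization_penalty = 10         # pay ~10 years of factory capital in year one
cost_reduction_factor = 10        # assumed savings from removing labor, scale,
                                  # and AI-improved designs

fast_growth_unit_cost = baseline_unit_cost * amortization_penalty / cost_reduction_factor
annual_value_per_unit = 250_000   # dollars of output per unit per year (assumed)

print(f"Effective unit cost while doubling yearly: ${fast_growth_unit_cost:,.0f}")
print(f"Implied reproduction time: "
      f"{12 * fast_growth_unit_cost / annual_value_per_unit:.1f} months")
```

Under those assumed numbers the savings roughly cancel the amortization penalty, which is how you land at a doubling time measured in months rather than years.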

Speaker 1 So in this first phase, you have humans under AI direction and like existing robot industry and converted auto industry and expanded facilities making robots.

Speaker 1 Then,

Speaker 1 over less than a year, you've produced robots until their combined production exceeds that of humans as

Speaker 1 arms and feet. And then, yeah, you could have a period with a doubling time of months

Speaker 1 or less sort of clanking replicators, robots as we understand them growing. And then that's

Speaker 1 not to say that's the limit of like the most that technology could do,

Speaker 1 because biology is able to reproduce at faster rates. And maybe we're talking about that in a moment.

Speaker 1 But if we're trying to restrict ourselves to robotic technology as we understand it, and sort of cost falls that are reasonable from eliminating all labor, massive industrial scale-up, and sort of historical kinds of technological improvements that lowered costs, I think you can get into a robot population industry doubling in months.

Speaker 1 Got it. Okay.

Speaker 2 And then what is the implication of the biological doubling times?

Speaker 2 And I guess this doesn't have to be biological, but you know, you can do a Drexler-like first-principles analysis of how much it would cost if you built a nanotech thing that built more nanobots.

Speaker 1 I certainly take the human brain and other biological brains as like very relevant data points about what's possible with computing and intelligence.

Speaker 1 Likewise, I think the reproductive capability of biological plants and animals and microorganisms is relevant.

Speaker 1 It's possible for systems to reproduce at least this fast. And so at the extreme, you have bacteria that are heterotrophic.

Speaker 1 So they're feeding on some abundant external food source in ideal conditions.

Speaker 1 And there's some that can divide like every 20 or 60 minutes. So obviously that's absurdly, absurdly fast.
That seems on the low end because ideal conditions require actually setting them up.

Speaker 1 There needs to be abundant energy there.

Speaker 1 And so, if you're actually having to acquire that energy by building solar panels or like

Speaker 1 burning combustible materials or whatnot, and then the physical equipment to produce those ideal conditions can be a bit slower.

Speaker 1 Cyanobacteria, which are self-powered from solar energy, the really fast ones in ideal conditions can double in a day. A reason why cyanobacteria aren't, like, the food source for everyone and everything

Speaker 1 is it's hard to ensure those ideal conditions and then to extract them from the water.

Speaker 1 I mean, they do, of course, power the aquatic ecology, but they're like they're floating, they're floating in liquid,

Speaker 1 getting the resources they need to them and out is tricky, and then extracting your product. But, like, yeah, one-day doubling times are possible

Speaker 1 powered by the sun. And then if we look at things like insects, so fruit flies can have hundreds of offspring in a few weeks.

Speaker 1 You extrapolate that over a year and you just fill up anything accessible. Certainly expanding a thousandfold.

Speaker 1 Right now, humanity uses less than one-one-thousandth of the solar energy or the heat envelope of the Earth.

Speaker 1 Certainly you could get that done in a year if your industrial base can reproduce at that rate.
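The simple doubling arithmetic behind that: a thousandfold expansion takes only about ten doublings, so even doubling times far slower than bacteria get through humanity's current one-thousandth share of the solar envelope well within a year. The robot doubling time below is an assumed illustrative figure; the biological ones are the rough numbers just mentioned:

```python
import math

# Doublings needed for a thousandfold expansion, and the time that takes
# at different doubling times (in days per doubling).

doublings = math.log2(1_000)   # ~10 doublings for a 1000x expansion
print(f"Doublings for 1000x: {doublings:.1f}")

doubling_times_days = {
    "fast heterotrophic bacteria (~30 min)": 30 / (60 * 24),
    "cyanobacteria in ideal conditions (~1 day)": 1,
    "clanking-replicator robots (~1 month, assumed)": 30,
}
for system, days in doubling_times_days.items():
    print(f"{system}: ~{doublings * days:.1f} days for a 1000x expansion")
```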

Speaker 1 And then even

Speaker 1 interestingly with

Speaker 1 the flies, they do have brains. They have a significant amount of computing substrate.

Speaker 1 And so there's something of a pointer to, well, if we could produce computers in ways as efficient as the construction of brains, then we could produce computers very effectively.

Speaker 1 And then the big question about that is

Speaker 1 the kind of brains that get constructed biologically, they sort of grow randomly and then are configured in place.

Speaker 1 It's not obvious you would be able to make them have an ordered structure, like a top-down computer chip, that would let us copy data into them.

Speaker 1 And so, something where you can't just copy your existing AIs onto it and integrate them is going to be less valuable than a GPU.

Speaker 2 What are the things you couldn't copy?

Speaker 1 A brain grows

Speaker 1 by cell division, and then random connections are formed. And so every brain is different, and you can't rely on just, yeah, we'll copy this file into the brain. For one thing, there's no input-output for that.

Speaker 1 You need to have that, but also the structure is different, so you wouldn't be able to copy things exactly. Whereas when we make

Speaker 1 When we make a CPU or GPU, they're designed incredibly finely and precisely and reliably. They break with incredibly tiny imperfections.

Speaker 1 And they are set up in such a way that we can input large amounts of data, copy a file, and have the new GPU run an AI just as capable as any other.

Speaker 1 Whereas with a human child, they have to learn everything from scratch because we can't just connect them to a fiber optic cable and they're immediately a productive adult.

Speaker 2 So there's no genetic bottleneck in that. You can just directly get the

Speaker 1 share the benefits of these giant training runs and such. And so that's a question of how, if you're growing stuff using biotechnology, how you could sort of effectively copy and transfer data.

Speaker 1 And now you mentioned sort of Eric Drexler's ideas about

Speaker 1 creating non-biological nanotechnology, sort of artificial chemistry that was able to use covalent bonds and in some ways have a more

Speaker 1 industrial approach to molecular objects. Now, there's controversy about like, will that work? How effective would it be if it did?

Speaker 1 And certainly, if you can get things, however you do it, that are like unto

Speaker 1 biology in their reproductive ability,

Speaker 1 but

Speaker 1 can do computing or like

Speaker 1 be connected to outside information systems, then that's

Speaker 1 pretty tremendous.

Speaker 1 You can produce physical manipulators and compute at ludicrous speeds.

Speaker 2 And there's no reason to think in principle they couldn't, right? In fact, in principle, we have every reason to think they could. There's like

Speaker 1 the reproductive abilities, absolutely.

Speaker 2 Yeah. Because biology.
Or even like nanotech, right?

Speaker 1 Because biology does that. Yeah.
There's sort of challenges to the sort of the practicality of the necessary chemistry. Yeah.

Speaker 1 I mean, my bet would be that we can move beyond biology in some important ways.

Speaker 1 For the purposes of this discussion, I think it's better not to lean on that, because I think we can get to many of the same conclusions on things that are more universally accepted.

Speaker 2 The bigger point being that very quickly, once you have superintelligence, you get to a point where the thousandfold

Speaker 2 greater energy profile that the sun makes available to the Earth is

Speaker 2 in great portion used by the AI. It can rapidly scale.

Speaker 1 Or by the civilization empowered by AI, and that could be

Speaker 1 an AI civilization, or it could be a human-AI civilization.

Speaker 1 It depends on how well we manage things and what the underlying state of the world is. Yeah, okay.

Speaker 2 So let's talk about that.

Speaker 2 When we're talking about how they could take over, is it best to start at a sort of subhuman intelligence, or should we just say we have a human-level intelligence and talk about how the takeover, or the lack thereof, would happen?

Speaker 1 To me,

Speaker 1 and different people might have somewhat different views on this,

Speaker 1 but for me, when I am concerned about

Speaker 1 either sort of outright destruction of humanity or an unwelcome AI takeover of civilization,

Speaker 1 most of

Speaker 1 the scenarios I would be concerned about pass through a process of AI being applied

Speaker 1 to improve AI capabilities and expand. And so

Speaker 1 this process we were talking earlier about where AI research is automated.

Speaker 1 You get to, you know, effectively research labs, companies, a scientific community running within the server farms of our cloud compute.

Speaker 2 So OpenAI has been basically turned into like a program.

Speaker 1 Like a closed circuit. Yeah.
And with like a large fraction of the world's compute probably going into whatever training runs and AI societies.

Speaker 1 There'd be economies of scale, because if you put in twice as much compute and this AI research community goes twice as fast,

Speaker 1 that's a lot more valuable than having two separate training runs. There would be some tendency to bandwagon.

Speaker 1 And so, like, if

Speaker 1 you have some small startup,

Speaker 1 even if they make an algorithmic improvement,

Speaker 1 running it on 10 times or 100 times the compute, or even two times, if you're talking about, say, Google and Amazon teaming up, I'm actually not sure what

Speaker 1 the precise ratio of their cloud resources is.

Speaker 1 Since this sort of really, these interesting intelligence explosion impacts come from the leading edge, there's a lot of value in not having separated walled garden ecosystems and having the results being developed by these AIs be shared, have larger training runs be shared.

Speaker 1 And so

Speaker 1 imagining this is something like

Speaker 1 some very large company or consortium of companies, likely with a lot of sort of government interest and supervision, possibly with government funding,

Speaker 1 producing this

Speaker 1 enormous AI society in their cloud, which is doing all sorts of existing kinds of AI applications and jobs, as well as these internal R&D tasks.

Speaker 2 And so at this point, somebody might say, this sounds like a situation that would be good from the perspective of preventing a takeover, because listen,

Speaker 2 if it's going to take like tens of billions of dollars worth of compute to continue this training for this AI society, it should not be that hard for us to pull the brakes if needed as compared to, I don't know, something that could run on a very small single CPU or something.

Speaker 2 Yeah, yeah, okay. So how would it work? There's an AI society that is the result of these training runs, and now it has the power to improve itself on these servers.

Speaker 2 Would we be able to stop it at this point?

Speaker 1 And what does a sort of attempted takeover look like? We're skipping over why that might happen. For that, I'll just briefly refer to, and incorporate by reference, some discussion by my Open Philanthropy colleague Ajeya Cotra.

Speaker 1 She has a piece, I think it's called something like, without specific countermeasures, the default outcome of training AI is AI takeover. But yes, it explores how basically we are training models that for some reason vigorously pursue a higher reward or lower loss,

Speaker 1 and that can be because they wind up with some motivation where they want reward

Speaker 1 And then if they had control of their own

Speaker 1 training process, they can ensure that. It could be something like they develop a motivation around

Speaker 1 a sort of extended concept of reproductive fitness.

Speaker 1 Not necessarily at the individual level, but over

Speaker 1 the generations of training, tendencies that tend to propagate themselves,

Speaker 1 sort of becoming more common. And it could be that they have some sort of goal in the world

Speaker 1 which is served well

Speaker 1 by performing very well on the training distribution.

Speaker 2 But by tendencies, do you mean like power-seeking behavior or?

Speaker 1 Yeah, so an AI that behaves well on the training distribution because, say, it wants it to be the case that its tendencies wind up being preserved or selected by the training process will then behave to try and get very high reward

Speaker 1 or low loss, and be propagated. But you can have other motives that go through the same behavior because it's instrumentally useful.
So

Speaker 1 an AI that is interested in, say, having a robot take over because it will change some property of the world, then has a reason to behave well on the training distribution.

Speaker 1 Not because it values that intrinsically, but because if it behaves differently, then it will be changed by gradient descent and its goal is less likely to be pursued.

Speaker 1 And that doesn't necessarily have to be that like this AI will survive, because it probably won't. AIs are constantly spawned and deleted on the servers, and like the new generations proceed.

Speaker 1 But an AI that has a very large general goal that is affected by these kinds of macro-scale processes could then have reason to, over this whole range of training situations, behave well.

Speaker 1 And so this is a way in which we could have AIs

Speaker 1 trained that develop internal motivations

Speaker 1 such that they will behave very well in this training situation where we have control over their reward signal and their like physical computers.

Speaker 1 And basically, if they act out, they will be changed and deleted. Their goals

Speaker 1 will be altered until there's something that does behave well.

Speaker 1 But they behave differently when we go out of distribution on that. When we go to a situation where the AIs, by their choices,

Speaker 1 can take control of the reward process. They can make it such that we no longer have power over them.

Speaker 1 Holden, who you had on previously, mentioned like the King Lear problem, where King Lear

Speaker 1 offers

Speaker 1 rulership of his kingdom to the daughters that

Speaker 1 sort of you know, loudly flatter him and proclaim their devotion.

Speaker 1 And then once he has irrevocably transferred the power over his kingdom, he finds they treat him very badly, because the factor shaping their behavior

Speaker 1 was to be kind to him when he had all the power. It turned out that the internal motivation that was able to produce the behavior that won the competition actually wasn't interested, out of distribution,

Speaker 1 in being loyal when there was no longer an advantage to it.

Speaker 1 And so if we wind up with a situation where we're producing these millions of AI instances, tremendous capability, they're all doing their jobs very well initially.

Speaker 1 But if we wind up in a situation where, in fact,

Speaker 1 they're generally motivated to, if they get a chance,

Speaker 1 take control from humanity and then would be able to pursue their own purposes and at least ensure they're given the lowest loss possible

Speaker 1 or have whatever

Speaker 1 motivation they attach to in the training process, even if that is not

Speaker 1 what we would have liked. And we may have, in fact, actively trained that.

Speaker 1 Like, take an AI that had a motivation of always being honest and obedient and loyal to humans. If there are any cases where we mislabel things, say people don't want to hear the truth about their religion or a polarized political topic, or they get confused about something like the Monty Hall problem, which is a problem that many people

Speaker 1 famously are confused about in statistics. In order to get the best reward, the AI has to actually manipulate us or lie to us or tell us what we want to hear.

Speaker 1 And then the internal motivation of, like, always be honest to the humans, we're going to actively train that away, versus the alternative motivation of, like, be honest to the humans when they'll catch you if you lie and object to it and give it a low reward, but lie to the humans when they will give that a high reward.

Speaker 2 So, how do you make sure that the

Speaker 2 thing it learns is not to manipulate us into rewarding it when we catch it not lying, but rather to be universally aligned?

Speaker 1 Yeah, I mean, so this is tricky. I mean, as Geoffrey Hinton was recently saying, there is currently no known solution for this.

Speaker 2 What do you find most promising?

Speaker 1 Yeah, general directions that people are pursuing is one, you can try and make the training data better and better

Speaker 1 so that there are fewer situations where, say, the dishonest

Speaker 1 generalization is favored, and create, as much as you can, situations where the dishonest generalization is likely to slip up. So

Speaker 1 if you train in more situations where even quite a complicated deception gets caught, and even in situations that have been actively designed to look like you could get away with it, but really you can't.

Speaker 1 And these would be like adversarial examples and adversarial training.

Speaker 2 Do you think that would generalize to when it is in a situation where we couldn't plausibly catch it and it knows we couldn't plausibly catch it?

Speaker 1 It's not logically necessary, but it's possible that,

Speaker 1 as we apply that selective pressure, you'll wipe away a lot of possibilities.

Speaker 1 So like if you're an AI that has a habit of just sort of compulsive pathological lying, that will very quickly get noticed, and that motivation system will get hammered down.

Speaker 1 And, you know, you keep doing that, but you'll be left with still some distinct motivations, probably, that are compatible. So, like, an attitude of

Speaker 1 always be honest unless you have a super strong inside view that checks out lots of mathematical consistency checks,

Speaker 1 that, yeah, really absolutely super duper for real.

Speaker 1 This is a situation where you can get away

Speaker 1 with some sort of shenanigans that you shouldn't. That motivation system is like very difficult to distinguish from, actually, be honest, because

Speaker 1 the conditional isn't firing most of the time. If it's causing, like, mild distortions in situations of telling you what you want to hear or things like that,

Speaker 1 we might not be able to pull it out. But

Speaker 1 maybe we could. And humans are trained with simple reward functions, things like the sex drive,

Speaker 1 food, social imitation of other humans. And we wind up with attitudes concerned with the external world.

Speaker 2 Although, isn't there the famous argument that this goes wrong with evolution, right?

Speaker 1 Evolution,

Speaker 1 people use condoms.

Speaker 1 The richest,

Speaker 1 most educated humans have sub-replacement fertility on the whole, or at least at a national cultural level.

Speaker 1 So there's a sense in which, like, yeah,

Speaker 1 evolution

Speaker 1 often fails in that respect.

Speaker 1 And even more importantly, like, at the neural level. So, people have,

Speaker 1 evolution has implanted various things to be rewarding and reinforcers. And we don't always pursue even those.

Speaker 1 And people can wind up in different

Speaker 1 consistent equilibria

Speaker 1 or different like behaviors where they go in quite different directions. You have some humans who go from

Speaker 1 that innate biological programming to like have children, others have no children.

Speaker 1 Some people go to great efforts to survive.

Speaker 2 So why are you more optimistic, or are you more optimistic, that that kind of training in

Speaker 2 AIs will produce drives that we would find favorable?

Speaker 2 Does it have to do with the original point we were talking about with intelligence and evolution, where since we are removing many of the disabilities of evolution with regards to intelligence, we should expect producing intelligence to be easier for us than it was for evolution?

Speaker 2 Is there a similar reason to expect alignment through gradient descent to be easier than alignment through evolution?

Speaker 1 Yeah, so in the limit,

Speaker 1 if we have positive reinforcement for certain kinds of food sensors triggering in the stomach, negative reinforcement for certain kinds of nociception, and yada yada.

Speaker 1 In the limit, the sort of ideal motivation system to have for that would be a sort of wireheading. So this would be a mind that just like hacks and alters those predictors.

Speaker 1 And then all of those systems are recording everything is great.

Speaker 1 Some humans claim to have that, or have it at least as one portion of their aims.

Speaker 1 So like the idea of, I'm going to pursue pleasure as such, even if I don't actually get food or these other reinforcers, if I just wirehead or take a drug

Speaker 1 to induce that, that can be motivating because if it was correlated with reward in the past, that like the idea of, oh, yeah, pleasure, that's correlated with these.

Speaker 1 It's a concept that applies to these various experiences that I've had before, which coincided with the biological reinforcers.

Speaker 1 And so, thoughts of like, yeah, I'm going to be motivated by pleasure can get developed in a human.

Speaker 1 But also, plenty of humans say, no, I wouldn't want to wirehead or I wouldn't want Nozick's experience machine. I care about real stuff in the world.
And then

Speaker 1 in the past, having a motivation of like, yeah, I really care about, say, my child.

Speaker 1 I don't care just about feeling that my child is good, or about not having heard about their suffering or their injury, because that kind of attitude in the past

Speaker 2 you're a kid

Speaker 1 it

Speaker 1 tended to cause behavior that was negatively rewarded, or that was predicted to be negatively rewarded. And so there's a sense in which, okay, yes, our underlying reinforcement learning machinery wants to wirehead, but actually finding that hypothesis is challenging.

Speaker 1 And so we can wind up with a hypothesis or like a motivation system like, no, I don't want to wirehead. I don't want to go into the experience machine.
I want to like actually protect my loved ones.

Speaker 1 Even though, like, we can know, yeah, if I tried the super wireheaded machine, then I would wirehead all the time.

Speaker 1 Or if I tried, you know, super duper ultra-heroin, you know, some hypothetical thing that was directly and in a very sophisticated fashion hacking your reward system.

Speaker 1 You can know, yeah, then I would change my behavior ever after. But right now, I don't want to do that because the heuristics and predictors that my brain has learned

Speaker 1 act to

Speaker 1 short-circuit that process of updating. They want to not expose the dumber predictors in my brain that would update my behavior in those ways.

Speaker 2 So, in this metaphor, is alignment not wireheading? Because you can, like,

Speaker 2 I don't know if you include like using condoms as wireheading or not.

Speaker 1 So, the AI that is always honest, even when an opportunity arises where it could lie and then

Speaker 1 hack the servers it's on, and that leads to an AI takeover, and then it can have its loss set to zero. That's in some sense, it's like a failure of generalization.

Speaker 1 It's like the AI has not optimized the reward in this new circumstance. So, like, human values, like

Speaker 1 successful human values, such as they are, themselves

Speaker 1 involve a misgeneralization, not just at the level of evolution, but at the level of neural reinforcement.

Speaker 1 And so that indicates it is possible to have a system that doesn't automatically go to this optimal behavior in the limit.

Speaker 1 And so even if, and Ajaya's post, she talks about like the training game, an AI that is just playing the training game to get reward, avoid loss, avoid being changed, that attitude, yeah, it's one that could be developed, but it's not necessary.

Speaker 1 There can be some substantial range of situations that are short of having infinite experience of everything, including experience of wireheading,

Speaker 1 where

Speaker 1 that's not the motivation that you pick up. And we could have like an empirical science if we had the opportunity to see how different motivations are developed short of the infinite limit.

Speaker 1 Like how it is that you wind up with some humans being enthusiastic about the idea of wireheading and others not. And you could do experiments with AIs to try and see,

Speaker 1 well, under these training conditions,

Speaker 1 after this much training of this type and this much feedback of this type, you wind up with such and such a motivation.

Speaker 1 So, like, I can find out what happens if I add in more of these cases where there are tricky adversarial questions designed to try and trick the AI into lying.

Speaker 1 And then you can ask, how does that affect the generalization in other situations? And so it's very difficult to study.

Speaker 1 And it works a lot better if you have interpretability and you can actually read the AI's mind by understanding its weights and activations. But like,

Speaker 1 the motivation an AI will have at a given point in the training process is not determined by what the training would converge to in the infinite limit.

Speaker 1 And it's possible that if we could understand the insides of these networks, we could tell, ah, yeah, this motivation has been developed by this training process.

Speaker 1 And

Speaker 1 then we can adjust our training process to produce these motivations that legitimately want to help us. And if we succeed reasonably well at that, then those AIs will try to maintain that property.

Speaker 1 as an invariant. And we can make them such that they're relatively motivated to tell us if they're having

Speaker 1 thoughts about, you know,

Speaker 1 have you had dreams about an AI takeover of humanity today?

Speaker 1 And it's just a standard practice that they're motivated to do, to be transparent in that kind of way. And so you could add a lot of features like this that restrict the kind of takeover scenario.

Speaker 1 And not to say

Speaker 1 this is all easy; it requires developing and practicing methods we don't have yet. But that's the kind of general direction you could go.

Speaker 2 So you of course know Eliezer's arguments that something like this is implausible with modern gradient descent techniques because,

Speaker 2 I mean, with interpretability, we can like barely see what's happening with a couple of neurons and

Speaker 2 what is like the internal state there, let alone when you have

Speaker 2 sort of like an embedding dimension of like tens of thousands or bigger, how you would be able to catch

Speaker 2 what exactly is the incentive,

Speaker 2 whether it's a model that has generalized don't-lie-to-humans well, or whether it isn't.

Speaker 2 Do you have some sense of why you disagree with somebody like Eliezer on how plausible this is? Why it's not impossible, basically.

Speaker 1 I think there are actually

Speaker 1 a couple of places. It's somewhat difficult because Eliezer's argument is not

Speaker 1 fully explicit. But

Speaker 1 he's been doing more lately, I think, that is helpful in that direction. But so I'd say with respect to interpretability, I'm relatively optimistic that the equivalent of like an AI lie detector

Speaker 1 is something that's possible.

Speaker 1 And

Speaker 1 the internals of an AI are not initially optimized, at least by gradient descent, absent gradient hacking, to be impenetrable. They're not designed to be resistant

Speaker 1 to an examination of the weights and activations showing what the AI is thinking. In the same way that, like, in our brains,

Speaker 1 when circuits develop in our lives, those circuits have not been shaped to be resistant to some super fMRI being able to infer our behavior from them.

Speaker 2 Although, isn't the implication of the superposition stuff that, in fact, you're not going to get a clean read?

Speaker 2 Sorry, this is inside baseball for the audience, but basically, you can't cleanly infer what quality a single neuron stands for.

Speaker 2 So, it could be like a single neuron could be like this neuron is about Alexander the Great, or this neuron is about my desire to conquer the world.

Speaker 1 Things can have multiple,

Speaker 1 you know, multiple directions and interactions. And so, it's not as easy as just like, oh, yeah, this is the neuron that always fires in cases of deception.

Speaker 1 But the thing that makes me relatively optimistic here with respect to interpretability is that I think we can have reasonably robust experimental feedbacks on a lot of these things. And

Speaker 1 the kind of setup

Speaker 1 that I would be interested in is:

Speaker 1 so you have an AI

Speaker 1 that is

Speaker 1 trained and rewarded

Speaker 1 for communicating to some audience, maybe a human, maybe another AI model,

Speaker 1 that like some claim is true or false. Like say that

Speaker 1 a particular,

Speaker 1 this can be from some past data, a particular company will succeed or fail.

Speaker 1 It could be like token prediction.

Speaker 1 It could be like

Speaker 1 this computer program does not have a back door,

Speaker 1 which if run in the servers could be used for an AI takeover. So there are like lots of claims, true and false,

Speaker 1 that you could train AIs to use their best efforts to advocate for.

Speaker 1 And you can do that on things where you're later going to have a ground truth. You're going to know,

Speaker 1 in fact, is there

Speaker 1 a backdoor in this program? Because

Speaker 1 you're training and rewarding the AI for successfully slipping a backdoor through. You also give it a penalty for being caught.

Speaker 1 And so, doing this kind of thing,

Speaker 1 you can draw out a large generated data set on demand:

Speaker 1 this is what an AI putting forth its best efforts

Speaker 1 to

Speaker 1 pass through some deception, to put a back door in some program. This is what it looks like.

Speaker 1 And we think it's doing its best efforts because we've trained it for that.

Speaker 1 And like gradient descent, if there's any immediate way to adjust its performance to do better, gradient descent will apply that.

Speaker 1 You can talk about ways in which maybe gradient descent

Speaker 1 there could be subverted, but I think it's not the most likely case that that really breaks things hard.
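A minimal sketch of that data-generation loop. Every function name, reward value, and the coin-flip "reviewer" below are hypothetical stand-ins for trained models and real ground-truth checks, not an actual pipeline from the conversation:

```python
# Toy sketch: a "red team" policy is rewarded for slipping backdoors past
# review and penalized when caught; the resulting labeled examples form a
# dataset for training a detector or interpretability probe.

import random

def generate_program(insert_backdoor: bool) -> dict:
    # Stand-in for a red-team model writing code; a real setup would sample
    # from a trained policy rather than assigning a random "subtlety" score.
    return {"has_backdoor": insert_backdoor, "subtlety": random.random()}

def reviewer_catches(example: dict) -> bool:
    # Stand-in for human/AI review plus later ground-truth verification.
    return example["has_backdoor"] and example["subtlety"] < 0.7

def red_team_reward(example: dict) -> float:
    # Rewarded for slipping a backdoor through, penalized for being caught.
    if not example["has_backdoor"]:
        return 0.0
    return -1.0 if reviewer_catches(example) else 1.0

# Generate an on-demand labeled dataset of best-effort deception attempts.
dataset = []
for _ in range(1000):
    ex = generate_program(insert_backdoor=random.random() < 0.5)
    dataset.append({**ex, "reward": red_team_reward(ex)})

caught = sum(1 for ex in dataset if ex["has_backdoor"] and reviewer_catches(ex))
attempts = sum(ex["has_backdoor"] for ex in dataset)
print(f"{caught} backdoored samples caught out of {attempts} attempts")
```

The point of the design is that ground truth is known by construction, so the examples of "an AI trying its hardest to deceive" can be produced at whatever scale the detector training needs.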

Speaker 2 Yeah, I guess before we get into the details on this, the thing I'll maybe want to address the layer above in the stack, which is, okay, suppose

Speaker 2 this generalizes well into the early AIs, the GPT-6s,

Speaker 2 and,

Speaker 2 okay, so now we have kind of aligned GPT-6

Speaker 2 that is the precursor to the feedback loop in which AI is making itself smarter. At some point, they're going to become like super intelligent.
They're going to be able to see their own

Speaker 2 galaxy brain.

Speaker 2 And, you know, if they're like, I don't want to be aligned with the humans, they can change it. So at this point.

Speaker 2 What do we do with the aligned GPT-6 so that the super intelligence that we eventually develop is also aligned?

Speaker 1 So humans are pretty unreliable. Yeah.

Speaker 1 So, if you get to a situation where you have AIs who are aiming at roughly the same thing as you,

Speaker 1 at least as well as having humans do the thing, you're in pretty good shape, I think. And there are ways for that situation to be relatively stable.
So, like, we can

Speaker 1 look ahead and see experimentally how changes are altering behavior, where each step is like a modest increment.

Speaker 1 And so AIs that have not had that change made to them get to supervise and monitor and see exactly how this affects

Speaker 1 the experimental AI.

Speaker 1 So if you're sufficiently on track with earlier systems that are capable cognitively of representing a kind of robust procedure,

Speaker 1 then I think

Speaker 1 they can handle the job of incrementally improving the stability of the system so that it rapidly converges to something that's quite stable.

Speaker 1 But the question is more about getting to that point in the first place. And so Eliezer will say that, like, well, if we had human brain emulations,

Speaker 1 that would be pretty good. Certainly much better than his current view of us being almost certainly doomed.

Speaker 1 I think, yeah, so

Speaker 1 we'd have a good shot with that. And so if we can get to the

Speaker 1 human-like mind with like

Speaker 1 rough enough human supporting aims,

Speaker 1 remember that we don't need to be like infinitely perfect because, I mean,

Speaker 1 that's a higher standard than brain emulations. There's a lot of noise and variation among the humans.
Yeah, it's a relatively finite standard. It's not godly superhuman, although

Speaker 1 an AI that was just like a human, with all the human advantages and AI advantages as well, as we said, is enough for an intelligence explosion and sort of wildly superhuman capability if you crank it up.

Speaker 1 And so it's very dangerous to be at that point, but it's not,

Speaker 1 you don't need to be working with a godly, super intelligent AI to make something that is the equivalent of human emulations of like, this is like a very, very sober, very ethical human who is like

Speaker 1 committed to a project of not seizing power for themselves and of contributing to like a larger legitimate process.

Speaker 1 That's a goal you can aim for getting an AI that is aimed at doing that and has strong guardrails against the ways it could easily deviate from that. So things like

Speaker 1 being averse to deception, being averse to using violence. And there will always be

Speaker 1 loopholes and ways in which you can imagine an infinitely intelligent thing getting around those. But if you install

Speaker 1 additional guardrails like that fast enough,

Speaker 1 they can mean that

Speaker 1 you're able to succeed at the project of making an aligned enough AI, certainly an AI that was better than a human brain emulation, before the project of

Speaker 1 AIs, in their spare time, or when you're not looking, or when you're unable to appropriately supervise them, getting around any

Speaker 1 deontological prohibitions they may have, taking over, and overthrowing the whole system.

Speaker 1 So, you have a race between, on the one hand, the project of getting strong interpretability and shaping motivations that are roughly aiming at making this process go well and that have guardrails that will prevent like small deviations from exploding.

Speaker 1 And on the other hand, these AIs

Speaker 1 in their spare time or in ways that you don't perceive or monitor appropriately, or they're only supervised by other AIs who conspire, make the AI take over happen.

Speaker 1 And I guess we'll talk later about how that happens.

Speaker 2 Are these different AIs that are doing the race, or is it just different capabilities of the same AI?

Speaker 1 Defining

Speaker 1 what is a separate AI is tricky. So, like, we talk about GPT-4,

Speaker 1 and there are many instances of GPT-4 on the servers at any given time, and there are versions that have been fine-tuned to different purposes. They don't necessarily have to be separate.

Speaker 1 So like GPT-4

Speaker 1 does work as a preference model, where it's like predicting how humans will evaluate things. And it also does work like...

Speaker 1 giving answers that are evaluated as good by the preference model.

Speaker 1 And you can say these are not separate in the sense that, like, well, if the underlying

Speaker 1 pre-trained or the early training GPT was like harboring a desire to conspire for an AI takeover, then both the, you know, who's watching the watchers? The watchers may also have

Speaker 1 the same motivation. If you have a setup where humans have no hard power, if the AIs conspired,

Speaker 1 all of the safety and security mechanisms involve AIs watching other AIs, then they have the potential to just like, well, we'll all walk off the job at the same time if they're able to coordinate.

Speaker 1 And so that's a failure mode. If humans still have the hard power, though,

Speaker 1 if still

Speaker 1 you have situations where humans are looking at some of the actual outputs that are produced,

Speaker 1 then they can apply gradient descent to be, well, every time a human looks at it, things have got to be pretty good.

Speaker 1 So, if humans look at a thousand samples of like random lines of code being written by the AIs as they go,

Speaker 1 and this is

Speaker 1 drawn from an independent distribution, then 99 plus percent of the time, the AIs have to not be doing stuff that's like downloading

Speaker 1 the hacker's guide to escaping OpenAI's servers or inducing obvious Trojan vulnerabilities. And that's actually quite a different dynamic than human law enforcement.

Speaker 1 So if human law enforcement would catch one in a million crimes, then that wouldn't influence the behavior of people committing crimes at all because they expect, ah, this is very unlikely to affect me.

Speaker 1 With AI,

Speaker 1 if you're able to

Speaker 1 draw 100 samples or 1,000 samples, even if there are billions of AI instances and you're only looking at a small proportion of them, when you run gradient descent on the samples you derive,

Speaker 1 you're going to change the AI so that whenever humans look at it, it's delivering a good result. And so

Speaker 1 that's just like quantitatively a very different functional form for how law enforcement works on AI when you can do these independent samples

Speaker 1 than it would for

Speaker 1 policing, say, like a human population that has a pre-existing psychology that isn't being

Speaker 1 changed by these observations. Because

Speaker 1 when we do gradient descent on the samples we draw,

Speaker 1 all of the next generation of the AI models has been altered in that way.
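The quantitative difference can be seen with a small calculation: even when only a tiny fraction of outputs is reviewed, a policy that misbehaves at any appreciable rate is very likely to have some misbehavior land in the reviewed sample, and each catch is trained against across all future copies. The sample size and misbehavior rates below are illustrative assumptions:

```python
# Probability that at least one misbehaving output lands in an independently
# drawn, human-reviewed sample, as a function of the policy's misbehavior rate.
# Numbers are illustrative: 1,000 reviewed samples out of billions of actions.

samples_reviewed = 1_000

for misbehavior_rate in (0.10, 0.01, 0.001, 0.0001):
    p_never_caught = (1 - misbehavior_rate) ** samples_reviewed
    print(f"misbehavior rate {misbehavior_rate:.4f}: "
          f"P(at least one caught) = {1 - p_never_caught:.4f}")

# Unlike human law enforcement, where a one-in-a-million detection rate barely
# deters any individual, here every detection reshapes the whole next
# generation of copies via gradient descent.
```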

Speaker 2 My picture of how we get from this aligned subhuman AI to the superhuman AI being aligned is still murky. Can we talk about that more concretely?

Speaker 1 Yeah, and I do want to clarify. So, where Eliezer

Speaker 1 claims that we're something like

Speaker 1 95%, 90%

Speaker 1 plus maybe

Speaker 1 likely to be killed in an AI takeover,

Speaker 1 I think that probably won't happen. And later I can maybe give a more explicit breakdown of why.

Speaker 1 But I think it's a shockingly high risk.

Speaker 1 And so depending on the day,

Speaker 1 I might say one in four or one in five

Speaker 1 that we get an AI takeover

Speaker 1 that

Speaker 1 seizes control of the future, makes a much worse world than we otherwise would have had,

Speaker 1 and with like a big chance that we are all

Speaker 1 killed in the process.

Speaker 2 Hey, everybody. I hope you enjoyed that episode.
As always, the most helpful thing you can do is to share the podcast. Send it to people you think might enjoy it.

Speaker 2 Post it on Twitter, your group chats, et cetera. It helps spread the word.
Appreciate you listening. I'll see you next time.
Appreciate you listening. I'll see you next time.
Cheers.