Joseph Carlsmith - Utopia, AI, & Infinite Ethics

August 03, 2022 1h 31m

Joseph Carlsmith is a senior research analyst at Open Philanthropy and a doctoral student in philosophy at the University of Oxford.

We discuss utopia, artificial intelligence, computational power of the brain, infinite ethics, learning from the fact that you exist, perils of futurism, and blogging.

Watch on YouTube. Listen on Spotify, Apple Podcasts, etc.

Episode website + Transcript here.

Follow Joseph on Twitter. Follow me on Twitter.

Subscribe to find out about future episodes!

Timestamps

(0:00:06) - Introduction

(0:02:53) - How to Define a Better Future?

(0:09:19) - Utopia

(0:25:12) - Robin Hanson’s EMs

(0:27:35) - Human Computational Capacity

(0:34:15) - FLOPS to Emulate Human Cognition?

(0:40:15) - Infinite Ethics

(1:00:51) - SIA vs SSA

(1:17:53) - Futurism & Unreality

(1:23:36) - Blogging & Productivity

(1:28:43) - Book Recommendations

(1:30:04) - Conclusion

Please share if you enjoyed this episode! Helps out a ton!



Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Listen and Follow Along

Full Transcript

Today, I have the pleasure of interviewing Joe Carlsmith, who's a senior research analyst at Open Philanthropy and a doctoral student in philosophy at the University of Oxford. Joe has a really interesting blog that I got to check out called Hands and Cities.
And that's the reason that I wanted to have him on the podcast, because it has a bunch of thought-provoking and insightful posts on there about philosophy, morality, ethics, and the future. And yeah, so I really wanted to talk to you, Joe.
But do you want to give a bit of a longer intro on what you're up to? Sure. So I work at Open Philanthropy on existential risk from artificial intelligence.
And so, you know, I think about what's going to happen with AI, how can we make sure it goes well, and in particular, how can we make sure that advanced AI systems are safe? And then I have a side project, which is this blog where I write about philosophy and the future and things like that. And that emerges partly from my background: before getting into AI and working at Open Philanthropy, I was in academic philosophy.
Okay, yeah, that's quite an ambitious side project. I mean, given the length and the regularity of those posts, it's actually quite stunning.

Do you want to talk more about what you're working on about AI at Open Philanthropy? So it's a mix of things. Right now, I'm thinking about AI timelines and what's called takeoff speed, sort of how fast the transition is from pretty impressive AI systems to AI systems that are kind of radically transformative.
And I'm trying to use that to provide more perspective on the probability that everything goes terribly wrong. I see.
Okay. So what are the implications? Suppose it's higher or lower than I would expect.
I guess if it's higher, maybe I should work on AI alignment. But other than that, what are the implications of that figure changing? I think there are a number of implications just from understanding timelines with respect to how you prioritize. To some extent, the sooner something is coming, the more you need to plan for it coming sooner, cutting more corners and counting less on having more time.
And yeah, I think overall, the higher you think the probability of catastrophe is, the easier it is for this to become kind of the most important priority. I do think there's a range of probabilities where it maybe doesn't matter that much.
But I think the difference between, say, 1% and 10%, I think, is quite substantive. And the difference between 10% and 90% is quite substantive.
And I know people in all of those ranges. Gotcha.
Okay, interesting. Yeah, so let's back up here and talk a bit more about the philosophy motivating this.
So I think you identify as a long-termist. Yeah, so maybe a broad question here is: you have an interesting blog post about what the future, looking back on us, might think about the 21st century, given the risks we're taking.
So what do you think about the possibility that we're potentially giving up resources, that you're dedicating your career to building a future that, given the fact that you're alive now, you might find strange or disturbing or disgusting? I mean, I guess to add more context to the question, from a utilitarian perspective, the present is clearly much, much better than the past.
But somebody from the past might think that there's a lot of things about the present that are kind of disturbing. I mean, they might not like how isolating a modern city can be.
They might find the kinds of free or cheap information that you can access on your phone kind of disturbing. Yeah.
So how do you think about that? So yeah, a few comments there. So one, I do think that for most people throughout history, if you brought them to the present day, my guess is that fairly quickly, depending on exactly the circumstances, they would come to prefer living in the present day to the past, even if there is a bit of future shock and some things are alienating or disturbing.
But that said, I think the gap between historical humans and the present is actually much, much smaller, both in terms of time and other factors, than the gap I envision between present-day humans and future humans who are living, ideally, in a kind of radically better situation. And so I do expect greater distance and possibly greater alienation when you first show up.
My personal view is that the best futures are going to be such that if you really understood them and if you really experienced what they're like, which may be a big step and might require sort of extensive engagement and possibly sort of changes to your capacities to understand and experience, then you would think it's really good. And I think that's the relevant standard.
So for me, I worry less if the future is sort of initially alienating. And the question for me is how do I feel once I've really, really understood what's going on? I see.
So I wonder how much we should value that kind of inside view you would get into the future from being there. If you think about many existing ideologies, like, I don't know, think of an Islamist or something, who might say: listen, if you could just come to Iraq and feel the bliss of fighting for the caliphate, you would understand better than you can understand from the outside view, just sitting on a couch eating Doritos, what it's like to fight for a cause.
And maybe their experience is kind of blissful in some kind of way, but I feel like the outside view is more useful than the inside view there. Well, so I think there's a couple of different questions there.
One is: what would the experience be if you had it from the inside? And then there's a subtly different question, which is: what would your take on this be if you fully understood it, where fully understanding is not just a matter of having the internal experience of being in a certain situation, but is also a matter of understanding what that situation is causing, what sort of beliefs are structuring the ideology, whether those beliefs are true, and all sorts of other factors. And it's the latter thing that I have in mind.
So I'm not just imagining, oh, the future will feel good if you're there, because sort of by hypothesis, the people who are there, at least one hopes they're enjoying it, or one hopes they're thumbs up. If the people who are there aren't thumbs up, that's a strange utopia.
But I'm thinking more that in addition to their perspective, there's a sort of more holistic perspective, which is the sort of full understanding. And that's the perspective from which you would endorse this situation.
I see. And then, yeah, so another respect in which it's interesting to think about what they might think of us is, you know, like, well, what would they think of the crazy risk we're taking by not optimizing against existential risks? And so one analogy you could offer, I think MacAskill does this in his new book, is to think of us as teenagers in our civilization's history.
And then, you know, think of the crazy things you did as a teenager. And yeah, so maybe there is an aspect to which one would wish they could take back the crazy things they did as a teenager. But my impression is that most adults probably think that while the crazy things were kind of risky, they were very formative and important, and they feel nostalgic about the things they did in the past. Do you think that the future, looking back, is going to regret the way we were living in the 21st century? Or will they look back and think, oh, you know, that was kind of a cool time? I mean, I guess this is kind of conditional on there being a future, which takes away a lot of the mystery here, but...

I doubt that they will look back with pleasure at the sort of risks and horrors of the 21st century. I mean, if you just think about how we, or at least I, tend to think about something like the Cuban Missile Crisis or World War II, I don't personally have a kind of nostalgia of, oh, sure, it was risky, but it made me who I am, or something like that.
I also want to say, you know, I think it's true that when you look back on your teenage years, there is often a sort of, let's say you did something crazy, you and your friends used to race around and play chicken or something at the local quarry. And it's like, oh, right, right, right.
But, you know, you survived, right? And the real reason not to do that is the chunk of probability where you just died.
And so I think there's a, you know, to some extent, the ex-post perspective of looking back on certain sorts of risks is not the right one, especially for death risks. That's not the right perspective to use to kind of calibrate your understanding of how to feel about it overall.
I see. Okay, so I think you brought up utopia, and you have a really interesting post about the concept of utopia.
So do you want to talk a little bit more about this concept, why it's important, and also why we have so much trouble thinking of a compelling utopia? Yeah, so utopia for me just means a kind of profoundly better future. And I think it's important because I think it's just actually possible. I just think it's actually something that we could do. If we play our cards right, in non-crazy ways, we could just build a world that is radically better than the world we live in today.
And in particular, I think we often, in thinking about that sort of possibility, underestimate just how big the difference in value could be between our current situation and what's available. And I think often utopias are anchored too hard on the status quo, changing it in small ways but imagining our fundamental situation basically unaltered.
It's a little bit like the difference between having a kind of crappy job and having a beach vacation, and utopia is like, everyone has a beach vacation.
And, you know, I don't know how you feel about beach vacations, but I think the difference is more like being asleep and being awake, or like living in a cave versus living under the open sky. I think it's a really big difference, and that matters a lot.
That's interesting, because I remember in the essay you had a section where you mentioned that you expect utopia to be recognizable to a person alive now. I guess the way you put it just earlier made it seem like it would be a completely different category of experience than we would be familiar with.
Yeah. So is there a contradiction there or am I missing something? So I think there's at least a tension.
And the way I see the tension playing out, or being reconciled, is specifically via the notion I referenced earlier: that you would, if you truly understood, come to see the utopia as genuinely good. But I think that process, I mean, ideally, I think the way we end up building utopia is we go through a long, patient process of becoming wiser and better and more capable as a species. And it's in virtue of that process culminating that we're in a position to build a civilization that is profoundly good and radically, radically different. But that's a long process. And so I do think, as I say, if I just transported you right there and you skipped the process, then you might not like it. And it is quite alien in some sense. But if you went through the process of really understanding and becoming wiser, you would endorse it.

It seems to me that you think the process to get to utopia is more of a sort of, maybe I'm misconstruing it, but when you mentioned it's a process of us getting wiser.
And yeah, so it sounds like it's a more philosophical process rather than, I don't know, we figure out how to convert everything to hedonium and, you know, it's eternal bliss from then on. Yeah.
So am I getting it right that you think it's a more philosophical process, and why is it that you think so? Yeah. So I definitely don't sit around thinking that we know what utopia is right now and that it's hedonium.
I'm not especially into the notion of hedonium, though I think the brand is bad. You know, people talk about pleasure with this kind of dismissive attitude sometimes, and hedonium implies this kind of sterile uniformity. People talk about how they're going to tile the universe with hedonium, and it's like, wow, this sounds rough.
Whereas I think actually, you know, the relevant perspective when you're thinking about something like hedonium is the kind of internal perspective, from which the experience of the subject is something joyful and boundless and energizing. And, you know, whatever pleasure is actually like, pleasure is not a trivial thing. I think pleasure is a profound thing in a lot of ways.
But I really don't assume that that's what utopia is about at all. I think, A, my own values seem to be quite complicated.
I don't think I just value pleasure. I value a lot of different things.
And more broadly, I have a lot of uncertainty about how I will think and feel about things if I were to go through a kind of process of significantly increasing my capacity to understand. I think sometimes when people imagine that, they imagine, oh, we're going to sit around and do a bunch of philosophy, and then we'll have like solved normative ethics, and then we'll implement our solution to normative ethics.
And that's not what I'm imagining by kind of wisdom. I'm imagining something richer and also that involves, importantly, a kind of enhancement to our cognitive capacity.
So, you know, I think we're really limited in our ability to understand the universe right now. And I think there's just a huge amount of uncharted territory in terms of what minds can be and do and see.
And so I want to sort of chart that territory before we start making kind of big and irreversible decisions about what sort of civilization we want to build in the long term. I see.
And then another maybe concerning part of the utopia is that, yeah, as you mentioned in the piece, many, many of the worst ideologies in history have had elements of utopian thinking in them. To the extent that EA and utilitarianism generally are compatible with utopian thinking, maybe they don't advocate utopian thinking, but they are compatible with it.
Do you see that as a problem for the movement's health and potential impact? Is the question something like, is this a red flag? We look at other ideologies throughout history and they've been compatible with utopian thinking and maybe sort of effective altruism or utilitarianism or something is similarly compatible. So should we worry in the same way?

Is that the question?

Yeah, partly. And also another part is: maybe it's still right that, morally speaking, utopia is compatible with this worldview and the worldview is correct. But the implications are that somebody misunderstands what is best, they identify as an EA, and this leads to bad consequences when they try to implement their scheme.

Yeah, so I think there are certainly reasons to be cautious in this broad vein.
I don't see them as very specific to EA or utilitarianism (I don't identify as utilitarian). I see them as better understood as risks that come from believing that something is very important at all.
And I think it's true that many of these risks come from acting from a space of conviction, especially where that conviction has a certain flavor. You know, it's interesting what exactly constitutes an ideology, but I think it's reasonable to look at EA and be like, this looks like an ideology. And I think that's right.
And I think that's sort of important to have the sort of relevant red flags about. I think it's pretty hard to have a view of the world that doesn't, in some sense, imply that it could be a lot better, or at least a plausible view of the world.
And when I say utopia, I don't really mean anything much different from that. I think it's sort of, I'm not saying a perfect thing.
I do have sort of a more specific view about exactly how much better things could be.

But more broadly, it seems to me many, many people believe in the possibility of a much better world and are fighting for that in different ways. And so I wouldn't I wouldn't pin the red flag specifically to the belief that sort of things can be better.
I think it would have more to do with what degree of rigidness you're relating to that belief with, and how you're acting on it in the world. How much are you willing to break things or act in uncooperative ways in virtue of that sort of conviction? And there, I think caution is definitely warranted. I see.
Yes. I'm not sure I agree that most people have a view or an ideology that implies anything close to the kind of utopia that utopian thinking can involve.
If you think of modern political parties in a developed democracy, like in the United States, for example, if you think of what is like a utopian vision that either party has, it's actually quite banal. It's like, oh, we'll have universal healthcare or I don't know, GDP will be higher in the next couple of decades, which doesn't seem utopian to me.
It does seem like a limited worldview, where they're not really thinking about how much better or worse things could be, but it doesn't exactly seem utopian. Yeah, I'll let you react to that.
I think that's a good point. So maybe the relevant notion of utopian here is something like, to what extent is a concept of a radically better world kind of operative in your day-to-day engagement.
To some extent, what I meant is that I think if I sat down and talked with most people, we could eventually, with some kind of constraints on reasonableness, come to agree that things could be a lot better in the world. We could just cure cancer. We could cure X, Y, Z disease. We could just go through a few things like that. We could talk about the degree of abundance that could be available.

But the question is whether that's a kind of structuring or important dimension to how people are relating to the world. I think you're right that it's often not.
And that's part of maybe the thing I'm hoping to kind of push back against with that post is actually I think this is a really important feature of our situation. I think it's true that it can be dangerous.
And if you're wrong about it or if you're acting in a sort of unwise way with respect to it, that can be really bad. But I also think it's just a really basic fact.
And I think we just sort of need to learn to deal with it maturely. And kind of pretending it's not true, I think, isn't the way to do that.
I see. But to me, at least, utopia sounds like some sort of peak.
And maybe you didn't mean it this way. But are you saying, in the essay and generally, that you think there is some sort of carrying capacity to how good things can get? Or can things keep getting indefinitely better beyond a certain point, but at that point we're willing to say that we have reached utopia? Yeah.
So, I mean, I certainly don't have a kind of hard threshold. Here's exactly where I'm going to call it utopia.
You know, I mean something that is profoundly better. I do think that, at a very basic level, if there's only a finite number of states that the affectable universe can be in, and your ranking of these states in terms of how good they are is transitive and complete, then there will be a top.
And, you know, I don't think that's an important thing to focus on from the perspective of just taking seriously that things could be radically better at all. I think talking about, ah, but exactly how good, and what's the perfect thing, is often kind of distracting in that respect.
And it gets into these issues about like, oh, you know, how much suffering is good to have. And a lot of this sort of discourse on utopia, I think, gets distracted from basic facts about like, at the very least, we can do just a ton better.
And that's important to keep in mind. I see.
I see. You point out in the piece that many religions and spiritual movements have done the most amount of thinking on what a utopia could look like.
And there's a very interesting essay by Nick Bostrom from 2008, where he lays out his vision of what somebody speaking from the future utopia, talking back to us, would sound like. And when you read it, it sounds very much like a mystical essay, the kind of thing that, if you changed a few words, a Christian like C.S. Lewis could have written about what it's like to speak down from heaven.
Yeah. So, and I don't mean this pejoratively, but to what extent is there some sort of spiritual or religious dimension to utopian thinking that relies on some amount of faith that things can get indescribably better in some sort of ephemeral, indescribable way?

So I think there are definitely analogs and similarities between some ways of relating to the notion of utopia and attitudes and orientations that are common in religious contexts and spiritual contexts. But personally, I don't think it needs to be like that.
As I say, I don't think it requires faith. I don't think it requires anything mystical.

I think it's just a basic fact about our current cognitive situation, our current civilizational situation, that things could be radically better. And it's ephemeral only in the sense that it's quite hard to imagine. For me, an important source of evidence here is the variance in the quality of human experiences. So if you think about your peak experiences, they're often a really big deal.
You're sitting there going, wow, this is serious, and feeling in touch with something, feeling that this is, in some sense, something you would trade much, much mundane experience for the sake of.
And I think it's important. So the thing that I think we need to do is sort of extrapolate from there.
So you look at the trajectory that your mind moved along as you moved into some experience, or some broader non-experiential improvement, like your community got a lot better, your relationships got a lot better. Look at that trajectory and then stare down, you know, where is that going? And I do think that requires a kind of, I don't want to call it faith.
I think it requires a kind of extrapolation into a zone that is in some sense beyond your experience, but that is deeply worthy and important. And I think that's something that is often associated with spirituality and religion, and I think that's okay.
But I actually think there are a number of really important differences between utopia and something like heaven. Centrally, utopia will be a sort of concrete, limited situation.
There are going to be frictions, there are going to be resource constraints, it's going to be finite. It's still going to be in the real world, whereas most religious visions don't have those constraints.
And that's an important feature of their situation. Yeah.
Speaking of constraints, this reminds me of Robin Hanson's theory that eventually the universal economy would just be made up of these digital people, ems, and that because of competition, their wages will be driven down to subsistence levels. Maybe that's compatible with some engineering of their ability to experience, such that it's still blissful for them to work at subsistence levels of compute or whatever. But yeah, so it seems like this sort of first-order economic thinking implies that there will be no utopia.
In fact, things will get worse on average, but maybe better overall, if you just add up all the experience, but worse on average. Yeah.
So, so I don't know if this vision seems incompatible with yours of a utopia, what do you think? Yeah, I would not call Robin's world a utopia. And so, you know, a thing I haven't been talking about is what should our overall probability distribution be with respect to different quality of futures? And what, you know, exactly how possible is it? And how likely is it that we build something that is sort of profoundly good as opposed to mediocre or much worse? And I would class Robin's scenario in the mediocre or much worse zone.
So do you have a criticism of the logic he uses to derive that? To some extent, I think my main criticism, or the first thing that would come to mind, is that I think competitive pressures are a source of pushing the world in bad directions. But I also think there are ways in which wise forms of coordination and preemptive action can stave off the bad effects of competitive pressures.

And so that's the way I imagine avoiding stuff in the vicinity of what Robin is talking about, though there are a lot of complexities there.

Yeah. The last few years have not reinforced my belief in the possibility of wise coordination.
But yeah. Anyways.
So one thing I want to talk to you about is you have a paper on what it would take to match the human brain's computational capacity. And then associated with that, you have a very good summary on the Open Philanthropy website.
Yeah. So do you want to talk about the approach you took to estimate this and then why this is an important metric to try to figure out? Yeah.
So the approach I took was to look at the evidence from neuroscience and the literature on the kind of computational capacity of the human brain and to talk to a bunch of neuroscientists and to try to, you know, see what we know right now about the number of floating point operations per second that would be sufficient to kind of reproduce the task relevant aspects of human cognition in a computer. And that's important.
I mean, it's actually not clear to me exactly how important this parameter is to our overall picture. I think the way in which it's relevant to thinking that I've been doing, and that Open Phil has been doing, is as an input into an overall methodology for estimating when we might see human-level AI systems. That methodology proceeds by first trying to estimate roughly the computational capacity of the brain, or the size, the overall parameter count and compute capacity, of an AI system that would be analogous to humans. And then you extrapolate from that to the training cost, the cost to create a system of that kind using current methods in machine learning and current scaling laws.
And that methodology, though, brings in a number of additional assumptions that aren't just transparent, like, oh yeah, of course that's how we would do it. And so I think you have to be a little bit more in the weeds to see exactly how it feeds in.
I see. And then, yeah, so I think you said it was 10 to the 15 FLOPS for the human brain.
But did you have an estimate for how many FLOPs it would take to train something like the human brain? I know GPT-3 is only 175 billion parameters or something, which can fit onto something like a microSD card even.
But yeah, it was like $20 million to train. So, yeah.
Were you able to come up with some sort of estimate for what it would cost to train something like this?

Yeah, so my focus in that report was not on the training extrapolation. That was work that Ajeya Cotra at Open Philanthropy did using my report's estimate as an input.
Her methodology involves assigning different probabilities to different ways of using that input to derive an overall training estimate. And in particular, an important source of uncertainty there is the amount of compute required, or the number of times you need to run a system per data point that it gets.
So in the case of something like GPT-3, you get a meaningful data point and a gradient update as to how well you're performing with each token that you output as you're doing GPT-3 style training. So you're predicting text from the internet, you suggest the next token, and then your training process says like, nope, do better next time or something like that.
Whereas if you're, say, learning to play Go and you have to play, I mean, this isn't exactly how, or this isn't how a Go system will work, but it's an example. If you have to play the full game out and that's sort of hundreds of moves, then before you get an update as to whether, you know, you're playing well or poorly, then that's a big multiplier on the compute requirement.
And so that's one of the central pieces, what Ajeya calls the horizon length of training. And that's a very important source of uncertainty in getting to your overall training estimate. But ultimately, you know, she ends up with this big, spread-out distribution, starting from something like, I think GPT-3 was like 10 to the 24, or four times 10 to the 23 or something like that. And she spreads out all the way up to the evolution anchor, which I think is something like 10 to the 41.
And I think her distribution is centered somewhere in the low thirties.
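To make the shape of that extrapolation concrete, here is a minimal sketch in Python. The multiplicative structure (per-run compute, number of data points, and a horizon-length factor) follows the discussion above, but the function, its parameters, and the toy numbers are illustrative assumptions, not figures taken from Joe's report or from Ajeya's.

```python
import math

def training_flops(flops_per_forward_pass, num_data_points, horizon_length, backward_pass_factor=3):
    """Toy bio-anchors-style training estimate (illustrative only).

    flops_per_forward_pass: compute to run the model once on one input
    num_data_points: how many useful feedback signals training needs
    horizon_length: effective forward passes required per feedback signal
    backward_pass_factor: rough overhead for computing gradients
    """
    return flops_per_forward_pass * num_data_points * horizon_length * backward_pass_factor

# Short-horizon case, loosely GPT-3-like: a gradient signal on every token.
# ~3.5e11 FLOPs per token (2 x 175B parameters) over ~3e11 tokens lands
# around 10^23.5, in the same ballpark as the figure mentioned above.
short_horizon = training_flops(3.5e11, 3e11, horizon_length=1)

# Long-horizon case: feedback only arrives after ~100 "moves" (e.g. a full
# game), which multiplies the bill by roughly the horizon length.
long_horizon = training_flops(3.5e11, 3e11, horizon_length=100)

print(f"short horizon: ~10^{math.log10(short_horizon):.1f} FLOPs")
print(f"long horizon:  ~10^{math.log10(long_horizon):.1f} FLOPs")
```

The only point of the sketch is the role of horizon length: holding model size and data fixed, needing a hundred forward passes per useful feedback signal adds roughly two orders of magnitude, which is one reason the overall distribution ends up so spread out.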
Okay. That's still quite a bit, I guess. How much does this rely on the scaling hypothesis? If one thought that the current approach was not likely to scale, or at least not likely to scale in a sample-efficient way, towards human intelligence, it might be analogous to somebody saying we have enough deuterium on Earth to power civilization for millions of years.
But if you haven't figured out fusion, then it may not be a relevant statistic. Yeah, so I think the approach does assume that you can train a human-level or transformative AI system with a non-astronomical amount of compute and data, without major conceptual or algorithmic breakthroughs relative to what's currently available.
Now, the actual methodology that it uses allows you to assign probabilities to that assumption too. So you can, if you want, you know, say I'm only 20% on that.
And then there are a few other options. So you can also kind of rerun evolution, and that's an anchor that she provides. This is often what people will give as a sort of upper bound on how hard it is to create human-level systems: do something analogous to simulating evolution. There are a lot of open questions as to how hard that is.
But I do think this methodology is a lot more compelling and interesting if you are compelled by the available techniques in deep learning and by scaling-hypothesis-like views, at least as an upper bound. I think that's important.
So there are different ways of being interested in algorithmic breakthroughs. One is because you think deep learning isn't enough.
Another is because you think they will provide a lot of efficiency relative to deep learning, such that an estimate like Ajeya's is an overestimate. Because actually, we won't have to do that.
We'll make some sort of breakthrough. It'll happen a lot earlier.
And I put weight on that view as well. Yeah, that's really interesting.
So yeah, that implies that even if you think the current techniques are not optimal, maybe that should update you in favor of thinking it could happen sooner. That's really interesting.
And then how did you go about estimating the amount of FLOPS it would take to emulate the interactions that happen in a brain?

Obviously, it would be unreasonable to say that you have to emulate every single atomic interaction. But then what is your proxy that you think would be sufficient to emulate? So I used a few different methodologies and tried to kind of synthesize them.
So one was looking at the kind of mechanisms of the brain and what we know about the kind of complexity of what they're doing and how hard it is to capture the kind of task relevant or our best guess about the task relevant dimensions of the signaling happening in the brain. And then I also tried to bring in comparisons with existing AI systems that are replicating kind of chunks of functionality that humans, that the human brain has, and in particular in the context of vision.
So sort of how do our current vision systems compare with the parts of the brain that are kind of plausibly doing analogous processing, though they're often doing other things as well. And then I use the third method, which has to do with physical limits on the kind of energy consumption per unit computation that the brain is plausibly doing.
And then a fourth method I sort of gesture at, which tries to extrapolate from the communication capacity of the brain to its computational capacity using comparisons with current computers. So it's sort of a triangulation of like, you look at a bunch of different sources of evidence, all of which, in my opinion, are pretty weak.
Well, the physical limits stuff is maybe more complicated, but it's sort of an upper bound. I think we are significantly uncertain about all of this.
And my distribution is pretty spread out. But the hope is that by looking at a bunch of things at once, you can at least get a sort of educated guess.
And then, yeah, so I'm very curious, is there consensus in neuroscience or other relevant fields that we understand the signaling mechanisms well enough that we can say, basically, this is what's involved, this is what the system is reducible to?
And so this is how many bits you need to represent, I don't know, all the synaptic connections here.
Or is there variance of opinion about just how complicated the enterprise is? There's definitely disagreement. And it was interesting, and in some sense disheartening, to talk with neuroscientists about just how difficult neuroscience is. A consistent message, and I have a section on this in the report, was how far we are from really understanding what's going on in the brain, especially at an algorithmic level.
So in some sense, the report is somewhat opinionated, in that there are experts that I found more compelling than others. There are experts who are much more in an agnosticism mode of, we just don't know, the brain is really, really complicated, who err on the side of very large compute estimates, a lot of emphasis on biophysical detail, a lot of emphasis on mysterious things that could be happening that we aren't capturing.
And then there are other neuroscientists who are more willing to say stuff like, well, we basically know what's going on at a mechanistic level, which isn't the same as knowing the algorithmic organization overall and how to replicate it. I lean towards the latter view, though I give weight to both and try to synthesize the opinions of the people I spoke to overall.
Just looking at the post itself, I haven't really looked deeper into the actual paper it's derived from, but it seems like, to estimate the FLOPS mechanistically, you were adding up the different systems at play here. Should we expect it to be additive in that way? Or maybe it's multiplicative, or there's some more complicated interaction.
Like, the FLOPS grow superlinearly with the inputs. I know that probably sounds really naive to someone who has studied it, but just from a first-glance kind of way, that's a question I had.
Yeah. So the way I was understanding and breaking down the forms of processing that you would need to replicate in the brain made them seem not multiplicative in this way.
So a simple example: suppose we have some neurons, and they're signaling centrally via spikes through synapses or something like that.
And then we have glial cells as well, which are signaling via slower calcium waves, and it's a sort of separate network.
You could think that if, say, the rate of calcium signaling were dependent on the rate of spikes through synapses, then that would be an important interaction. But overall, if you imagine this kind of network processing, you can estimate the components independently and then add them up. They're not actually multiplicative processes on that conception. I do think there are correlations between the estimates for the different parts, but it's sort of additive at a fundamental level.
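A toy version of that additive bookkeeping, with made-up numbers purely to show the structure (estimate each signaling mechanism independently as element count times event rate times FLOPs per event, then sum):

```python
def component_flops(num_elements, events_per_second, flops_per_event):
    # FLOP/s needed to model one signaling mechanism on its own
    return num_elements * events_per_second * flops_per_event

# Placeholder figures for illustration, not estimates from the report.
components = {
    # ~1e14 synapses, ~1 spike/s each, ~10 FLOPs to model each spike's effect
    "synaptic transmission": component_flops(1e14, 1.0, 10),
    # far fewer glial signaling events, on much slower timescales
    "glial calcium waves":   component_flops(1e11, 0.01, 100),
}

total = sum(components.values())
for name, flops in components.items():
    print(f"{name:>22}: {flops:.1e} FLOP/s")
print(f"{'total (additive)':>22}: {total:.1e} FLOP/s")
```

Because the mechanisms are added rather than multiplied, the total stays in the ballpark of the dominant term; a genuinely multiplicative interaction between the networks would instead blow the estimate up by many orders of magnitude, which is the distinction being drawn here.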
I see. Okay. And then, yeah, how much credence do you put in these sort of almost woo-woo hypotheses? I don't know, Roger Penrose had that thing about there being something quantum mechanical happening in the brain that's very important for understanding cognition.
To what extent do you put credence in those kinds of hypotheses? I put very little credence in those hypotheses. Yeah, I don't see a lot of reason to think that.
I see a good amount of reason not to think it, but it wasn't something I dug in on a ton. Okay, gotcha.
All right, so you have this really interesting blog post about infinite ethics. Do you want to talk about why this is an important topic, why it's important to integrate into our worldview, and so on? Sure.
So infinite ethics is ethics that tries to grapple with how we should act with respect to infinite worlds: how should we rank them, how should they enter into our expected utility calculations or attitudes towards risk? And I think this matters for both theoretical and practical reasons. So I think at a theoretical level, when you try to do this with a lot of common ethical theories and constraints and principles, they just break on infinite worlds.
And I think that's an important clue as to their viability, because I think infinite worlds are at the very least possible.

Even if our world is finite, and even if our causal influence is finite or our influence overall is finite, it's possible to have infinite worlds. And we have opinions about them, you know, like an infinite heaven is better than an infinite hell.
And, you know, so I think often in ethics, we expect our ethical principles to extend to kind of ranking scenarios or sort of acting in hypothetical scenarios or overall kind of all possible situations rather than just our actual situation. I think infinities come in there.
But then I think maybe more importantly, I think it's an issue with practical relevance. And a way to see that is that, you know, I think we should have non-zero credence that we live in an infinite world.
And it's a very live physical hypothesis that the universe is infinite, even if the mainstream view is that our causal influence on that universe is finite in virtue of things like entropy and light speed. But the universe itself may well be infinite, and possibly infinite in a number of different ways.
Max Tegmark has some work on all the different ways the universe can be really very large. And there are a number of ways in which I think we should have non-zero credence that we can have infinite influence with our actions now.
So the limitations on our causal influence could be wrong. It may be that in the future we'll be able to do infinite things.
And then I also think, somewhat more exotically, that there are ways of having an acausal influence on an infinite universe, even if you are limited in your causal influence. And that comes from some additional work I've done on decision theory.
And so if you try to incorporate that, if you're a sort of expected value reasoner, it very quickly starts to dominate, or at least break, your expected value calculation. So you mentioned long-termism earlier, and a natural argument for getting interested in long-termism is: oh, in the future there could be all these people, their lives are incredibly important, so if you do the EV calculation, your effect on them is what dominates. But actually, if you have even a tiny credence that you can do an infinite thing, either that dominates or it breaks. And then if you have tiny credences on doing different types of infinite things and you need to compare them, you need to know how to do that. And so I just think this is actually a part of our epistemology now, though I think we often don't treat it that way, because we're often not doing EV reasoning or really thinking about how these questions just apply to us.
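As a bare-bones illustration of that last point (with made-up numbers, and not drawn from Joe's paper): once an expected-value calculation admits any nonzero credence on an outcome of unbounded value, that term swamps everything else, and comparing two such options stops being informative.

```python
finite_longtermist_value = 1e50   # stand-in for "astronomically many future people"
p_infinite = 1e-12                # tiny credence on achieving an infinite outcome

ev_without_infinity = (1 - p_infinite) * finite_longtermist_value
ev_with_infinity = p_infinite * float("inf") + (1 - p_infinite) * finite_longtermist_value

print(ev_without_infinity)   # huge but finite
print(ev_with_infinity)      # inf: the infinite term dominates the whole sum

# Two acts with different tiny credences on different infinite outcomes both
# come out as inf, so naive expected value can no longer rank them.
print(float("inf") == float("inf"))   # True
```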
Yeah, yeah. So that's super fascinating. If it is the case that we can only have an impact on a finite amount of stuff, then maybe it is true that there's infinite suffering or happiness in the universe at large, but the delta between the best-case and worst-case scenarios for what we do is finite.
But yeah, I don't know. That still seems less compelling if the hell or heaven we're surrounded by is infinite overall and that doesn't change. Can you talk a bit more? I think you mentioned your other work on having infinite impact beyond the scope of what light speed and entropy would allow us.
Can you talk a bit more about how that might be possible? Sure. So a common decision theory, though it's not, I think, the mainstream decision theory, it's a contender in the literature, is evidential decision theory, where you should act such that you would be, roughly speaking, happiest to learn that you had acted that way for that reason.
And the reason this allows you a kind of acausal influence, so a way of thinking about it is: suppose that you are a deterministic simulation and there's a copy of you being run too far away for you to ever causally interact with it. But you know that it's a deterministic copy, and so it'll do exactly what you do, absent some sort of computer malfunction.
And now you're deciding whether to, say, you have two options, you could send a million dollars to him.
Well, that's a little complicated because he's too far away. But just in general, if I raise my hand, or if I write something on my whiteboard, or let's say I have to make some ethical decision, like whether I should take an expensive vacation or donate that money to save someone's life, the other guy is going to act just like I do, even though I can't cause him to do that. In some sense, when I make my choice, after doing so, I should think that he made the same choice.
And so evidential decision theory treats his action as in some sense under my control. And so if you imagine an infinite universe where there are an infinite number of copies of you, or even not copies, people whose actions are correlated with you, such that when you act a certain way, that gives you evidence about what they do.
In some sense, their actions are under your control. And so if there are an infinite number of them on evidential decision theory and a few other decision theories, then in some sense, you're having influence on the universe.
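A toy rendering of that contrast between causal and evidential bookkeeping, with arbitrary payoff numbers (this is a sketch of the general idea, not anything from Joe's decision-theory work):

```python
VALUE_IF_DONATE = 10   # value produced wherever a copy donates
VALUE_IF_REFRAIN = 0

def causal_value(action):
    # Causal decision theory: count only the value this agent directly causes.
    return VALUE_IF_DONATE if action == "donate" else VALUE_IF_REFRAIN

def evidential_value(action, n_correlated_copies):
    # Evidential decision theory: condition on the evidence the choice provides.
    # Learning that you donated is evidence that every correlated copy did too.
    per_copy = VALUE_IF_DONATE if action == "donate" else VALUE_IF_REFRAIN
    return per_copy * (1 + n_correlated_copies)

print(causal_value("donate"))                    # 10
print(evidential_value("donate", 1_000_000))     # 10000010
print(evidential_value("donate", float("inf")))  # inf, as the copies go to infinity
```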
Yeah, this sounds really similar to the thought experiment in quantum mechanics called the EPR pair, which you might have heard of. The basic idea is, if you have two entangled bits and you take them very far away from each other, and before they're taken apart you come up with some rule like, hey, if it's plus, we do this, if it's minus, we do the other thing, and then you measure one of them.
It seems at first glance that measuring something yourself has an impact on what the other person does, even though it shouldn't be allowed by light speed. It gets resolved if you take the many worlds view.
But yeah, yeah. So that's very interesting.
Is this just a thought experiment, or is this something that we should anticipate, for some cosmological reason, to actually be a way we could have influence on the world? So I haven't dug into the cosmology a lot, but my understanding is that it's at the very least a very live hypothesis that the universe is infinite, in the sense that it's infinite in extent and, suitably far away, there are copies of us having just this conversation. And then, even further away, there are copies of us having this conversation but wearing raccoons for hats, and all the rest, which is itself something to wonder about and sit with. But my understanding is this is just a live hypothesis.

And more broadly, infinities, you know, infinite universes, are just sort of a part of mainstream cosmology at this point.

And so, yeah, I don't think it's just a thought experiment. I think infinite universes are a live possibility.
And then I think, you know, these sort of non-causal decision theories are actually my sort of best guess decision theories, though that's not a mainstream view. So it's fairly, I think it comes in fairly directly and substantively if you have that combination of views.
But then I also think it comes in, I think everyone should have non-zero credence in all sorts of different infinity involving hypotheses. And so infinite ethics gets a grip regardless.
I see. And then, so taking that example, if you're having an impact on every identical copy of yourself in the infinite universe, it seems that for any such copy, there's an infinite number of other copies that are slightly different.

So it's not even clear if you're increasing anything, and maybe it makes no sense to talk about proportions in an infinite universe.

But, you know, if there is another infinite set of copies that scribbled the exact opposite thing on the whiteboard, then it's not clear that you had any impact on the total amount of good or bad stuff that happened. I don't know.
My brain breaks here, but maybe you can help me understand this. Yeah.
So, I mean, I think there are a couple of dimensions here.
One is trying to understand what sort of difference it actually makes if you're in this sort of infinite situation and you're thinking about acausal influence. What did you even change at an empirical level, before you talk about how to value it? And I think that's a pretty gnarly question.
Even if we've settled that question, though, in terms of the empirical acausal impact, there's a further question of how you rank that, how you deal with the normative dimension here. That's the ethical question, and there things get really gnarly very fast. In fact, there are impossibility results showing that even very basic constraints, which you really would have thought we could satisfy at the same time in our ethical theories, can't all be satisfied at once when it comes to infinite universes. And so we know that something is going to have to give if we're going to extend our ethics to infinities. I see.
But then, is there some reason you've settled on, I guess you mentioned you're not a utilitarian, but on some version of EA or long-termism as your tentative moral hypothesis, despite the fact that this seems unresolved? And then how do you sit with that tension while tentatively remaining in EA? Yeah, so I think there are two dimensions there. One is that I think it's good practice not to totally upend your life if you encounter some destabilizing philosophical idea, especially one that's difficult and, you know, you don't totally have a grip on it.
But isn't that what long-termism is? Yeah, so I think there's a real tension there. How seriously should we take these ideas? At what point should you be making what sorts of changes to your life on the basis of the different things you're thinking and believing? It's a real art, right? And I think some people grab the first idea they see and start doing crazy stuff in an unwise way.
And some people are too sluggish, and they're not willing to take ideas seriously or reorient their life on the basis of changes in what seems true. But nevertheless, I think especially with things that involve, ah, turns out it's fine to do terrible things, or there's no reason to eat your lunch or whatever, things that really holistically break your ethical views, I think one should tread very cautiously.
So that's one aspect. At a philosophical level, the way I resolve it is I think for many of these issues, the right path forward, or at least a path that looks pretty good, is to survive long enough for our civilization to become much wiser.
And then to use that position of wisdom and empowerment to act better with respect to these issues. And that's what I say in the end of the Infinite Ethics post is that I think future civilization, if all goes well, will be much better equipped to deal with this.
And we are at square one in kind of really understanding how these issues play out and how to respond. And so I think both at an empirical level and at a kind of philosophical level.
And so it looks convergently pretty good to me to survive, become wiser, keep your options open, and then act from there. And that ends up pretty similar to a lot of long-termism and existential risk.
It's just that it's focused less on the idea that the main event will be what happens to future people, and more on getting to the point where we are wise enough to understand and reorient in a better way.
Okay. Yeah.
So what I found really interesting about this is that different people tend to have different thresholds for epistemic learned helplessness, where they basically say: this is too weird, I'm not going to think about this, let's just stick with my current moral theories. So for somebody else, it might be before they became a long-termist, where it's just like, yeah, zillions of future people? What are we talking about here? We're not changing my mind on stuff. And then, yeah, for you, maybe it's before the infinite ethics stuff.

Is there some principled reason for thinking that this is where that stopping point should be? Or is it just a matter of temperament and openness?

So I don't think there's a principled reason. And I should say, I don't think of my attitude towards infinite ethics as solely, oh, this has gotten too far down the crazy path, I'm out. This thing about the wisdom of the future is pretty important to me as a reason, as a mode of orientation. A first-pass cut that I use is: when do you feel like it's real? If you feel like a thing is real, as opposed to a kind of abstract, fun argument, then that's important, or that's a real signal. And the mode I'm drawn to, and generally encourage in people, is something like: if there's an idea that seems compelling intellectually, that's a reason to investigate it a lot, think about it, and really grapple with, you know, if this doesn't seem right to you, or if it seems too crazy, why? It's a reason to pay a lot of attention.
But if you've paid a lot of attention, at the end of the day, you're like, well, I guess at an abstract level, that sort of makes sense, but it just doesn't feel to me like the real world. It just doesn't feel to me like wisdom or like a healthy way of living or whatever.
Then I'm like, well, maybe you shouldn't do it, right? I mean, and I think some people will do that wrong and they will end up bouncing off of ideas that are in fact good. But, you know, I think overall, these are sort of sufficiently intense and difficult issues that kind of being actually persuaded and not just sort of chopping off the rest of your epistemology for the sake of some like version of the abstraction is, it seems to me important and it's a sort of a healthier way to relate.
Yeah. So another example of this is that you have this really interesting blog post on ants, and your thoughts after sterilizing a colony of them. So, yeah, this is another example of a thing where almost everybody, other than, I don't know, maybe a Jain who wears a face mask to prevent bugs from going into his mouth, would say: okay, at this point, if we're talking about how many hedons are in a hectare of forest from all the millions of insects there, then you've lost me. But then somebody else might say, okay, well, there's not a strong reason for thinking they have absolutely no capacity to feel suffering.

Yeah. So I wonder how you think about such questions, because you can't stop living, and you're not even going to stop going on road trips, where you're probably killing hundreds of insects just by driving.
But yeah. So what do you think about such conundrums? I have significant uncertainty, and I think this is the appropriate position, about exactly how much consciousness or suffering, or the other properties we associate with moral patienthood, apply to different types of insects. I think it's a strange view to be extremely confident that what happens with insects is totally morally neutral, and I think it actually doesn't fit with our common sense. So let's say you see a child frying ants with a magnifying glass. You could say, ah, well, that just indicates that they're going to be cruel to other things that matter, but I don't think so.
I think, you know, you see the ants and they're twitching around. And so, as in many cases with animal ethics, I think we're a bit schizophrenic about which cases we view as morally relevant and which not.
You know, we have, you know, pet laws, and then we have factory farms and stuff like that. So I don't see it as a radical position that ants matter somewhat.
I think there's a further question of what your overall practical response to that should be. And I do think that, as in a lot of ethical life, there are trade-offs, and you have to make a call about what sort of constraints you're going to put on yourself at the cost of other goals.
And, you know, in the case of insects, it's not my current moral focus, and I don't pay a lot of costs to lower my impact on animals, or sorry, on ants in particular. I don't, you know, sweep the sidewalk or anything like that.
And I think that's, you know, my best-guess response. And that has to do with other ethical priorities in my life.
But I think there's a middle ground between I shall ignore this completely and I shall, you know, be a Jain, which is recognizing that this is a real trade-off, that there's uncertainty here, and taking responsibility for how you're responding to that.

Yeah, this seems kind of similar to the infinite ethics example, where if you put any sort of credence on their having any ability to suffer, then, at least if you're not going to say it doesn't matter, the numbers are like the far future, trillions and trillions. It seems like this should be a compelling thing to think about, but then the upshot isn't even something like becoming a vegan, where it's just a change of diet.
And then, as you might know, this is used as a reductio ad absurdum of veganism, where if you're going to start caring about other non-human animals, why not also care about insects? And even if they're worth like a millionth of a cow, you're probably still killing like a million of them on any given day from all your activities, indirectly maybe. I don't know, the food you're eating, all the pesticides that are used to create that food.
I don't know how you go about resolving that kind of stuff. I mean, I guess I'd want to really hear the empirical case.
I think it's true that there are a lot of insects.
You know, I think if you want to say that taking seriously the idea that there's some reason not to squash a bug leads immediately to kind of Jain-like behavior, absent long-termism or something like that, I really feel like I want to hear the empirical case about exactly what impact you're having and how. And I'm not at all persuaded that that's the practical upshot.
And if that is a really strong case, then I think that's an interesting kind of implication of this view and, you know, worth concern. But it feels to me like it's easy to jump to that almost out of a desire to get to the reductio. I would try to move slower and really see, like, wait, is that right? There are a lot of trade-offs here.
What's the source of my hesitation about that? And not jump too quickly to something that's sufficiently absurd that I can be like, ah, therefore I get to reject this whole mode of thinking, even though I don't know why. I see.
Yeah. Okay.
So let's talk about the two different ways of thinking about observer selection effects and their implications. So do you want to explain, you have a four-part series on this, but do you want to explain the self-indication assumption and the self-sampling assumption? I know it's a big topic, but yeah, as much as possible.
Sure. So I think one way to start to get into this debate is by thinking about the following case.
So you wake up in a white room, and there's a message written on the wall. And let's say you're going to believe this message.
And the message says, it's from God: I, God, flipped a coin.
And if it was heads, I created one person in a white room. And if it was tails, I created a million people all in white rooms.
And now you are asked to assign probabilities to the coin having come up heads versus tails. And so one approach to this question, which is the approach I favor, or at least think is better than the other, is the self-indication assumption.
These names are terrible. But so it goes.

So SIA says that your probability that the coin came up heads should be approximately one in a million. And that's because SIA thinks it's more likely that you exist in worlds where there are more people in your epistemic situation, or more people who have your evidence, which in this case is just waking up in this white room. And so that can be a weird conclusion and go to weird places.
But I think it's a better conclusion than the alternative. SSA, the main alternative I consider in that post, the self-sampling assumption, says that you should think it more likely that you exist in worlds where people with your evidence are a larger fraction of something called your reference class, where it's quite opaque what a reference class is supposed to be.
But broadly speaking, a reference class is the sort of set of people you could have been, or that's kind of how it functions in SSA discourse. So in this case, in both worlds, everyone has your evidence.
And so the fraction is the same. And so you stick with the one half prior.
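To make the two verdicts concrete, here is a minimal sketch of the Bayesian bookkeeping behind them, using the same toy setup of one person versus a million. The world labels and counts are illustrative assumptions rather than anything specified in the conversation.

```python
from fractions import Fraction

# God's coin toss: heads -> 1 person in a white room, tails -> 1,000,000 people.
prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
people = {"heads": 1, "tails": 1_000_000}

# SIA: weight each world by the number of observers who share your evidence.
sia_unnorm = {w: prior[w] * people[w] for w in prior}
sia_total = sum(sia_unnorm.values())
sia = {w: sia_unnorm[w] / sia_total for w in prior}
print(sia["heads"])  # 1/1000001: roughly one in a million

# SSA: weight each world by the fraction of the reference class who share your
# evidence. Here everyone in both worlds shares it, so the fraction is 1 in
# each world and the fair-coin prior is left untouched.
ssa_unnorm = {w: prior[w] * Fraction(people[w], people[w]) for w in prior}
ssa_total = sum(ssa_unnorm.values())
ssa = {w: ssa_unnorm[w] / ssa_total for w in prior}
print(ssa["heads"])  # 1/2
```

The only difference between the two calculations is the weighting step: a count of observers for SIA, a fraction of the reference class for SSA.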
But in other contexts, not everyone has your evidence, and so SSA updates towards worlds where people with your evidence are a larger fraction of the reference class. So famously, SSA leads to what's known as the doomsday argument, where you imagine that there are two possibilities.
Either humanity will go extinct very soon, or we won't go extinct very soon and there will be tons of people in the future. And you imagine everyone is sort of ranked in terms of when they're born.
In the former case, people born at roughly this time are a much larger percentage of all the people who ever lived. And so if you imagine, you know, God first creates a world and then he inserts you randomly into like some group, it's much more likely that you would find yourself in the 21st century if humanity goes extinct soon than if there are tons of people in the future.
If God randomly inserted you into these tons of people in the future, then it's like really, it's a tiny fraction of them are in the 21st century. So SSA in other contexts actually, it has these important implications, namely that in this case, you update very, very hard towards the future being short.
And that matters a lot for long-termism because long-termism is all about the future being big in expectation. Okay.
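The shape of that SSA update can be sketched with made-up numbers. The population counts below are purely illustrative assumptions, not figures from the conversation; the point is only how a birth rank near the present swamps a fair prior.

```python
from fractions import Fraction

# Two hypotheses about how many humans will ever live (illustrative counts).
total_people = {"doom_soon": 200 * 10**9, "doom_late": 200 * 10**15}
prior = {h: Fraction(1, 2) for h in total_people}

# Roughly how many people have a birth rank up to about now (illustrative).
early_people = 100 * 10**9

# SSA likelihood of finding yourself among the early people: the fraction of
# the whole reference class that is early under each hypothesis.
likelihood = {h: Fraction(early_people, total_people[h]) for h in total_people}

unnorm = {h: prior[h] * likelihood[h] for h in total_people}
norm = sum(unnorm.values())
posterior = {h: unnorm[h] / norm for h in total_people}

print(float(posterior["doom_soon"]))  # ~0.999999: a huge update towards doom soon
```

Different counts change the size of the update but not its direction: the bigger the long future would be, the harder SSA pushes towards the short one.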
So what does the SIA take on this? Yeah. So I think a way to think about SIA's kind of story.
So I gave this story about SSA, which is, it's sort of like this. It's like, first, God creates a world.
This is SSA. First, he creates a world.
And then he takes, and he's dead set on putting you into this world. So he's got your soul, right? And he really wants, and your soul is going in there no matter what, right? but the way he's going to insert your soul into the world is by throwing you randomly into some

set of people, the reference class. And so if you wake, so you should expect to end up in the world where the kind of person you end up as is sort of more like a more likely result of that throwing process.

It's a sort of larger fraction of the total people you could have been.

What SIA thinks is different.

The story that I'll use for SIA, though, it doesn't assist in the only gloss,

is God decides he's going to create a world.

And say there's like a big line of souls in heaven.

And he goes and grabs them kind of randomly out of heaven and puts them into the world. Right.
And so in that case, if there are more people in the world, then you've got more shots. And you're one of these souls.
You're sort of sitting in heaven, hoping to get created. On SIA, God has more chances to grab you out of heaven and put you into the world if there are more people like you in that world.
And so you should expect to be in a world where there are more such people. And that's kind of SIA's vibe.
Doesn't this also imply that you should be in the future, assuming there will be more people in the future? Tell me more about why it would imply that. Okay.
In an analogous scenario, maybe go back to the God tossing the coin scenario, where instead of people in white rooms you just substitute being a conscious entity. And if there are going to be more conscious entities in the future, then just as you should expect to be in the scenario with a lot more rooms, maybe you should expect to be in the scenario with a lot more conscious beings, which presumably is the future. So then it's still odd that you're in the present under SIA? Yes.
So, in a specific sense. It's true that on SIA, say we don't know what room you're in at first, right? So you wake up in the white room and you're wondering, am I in room one or am I in rooms two through a million? And on SIA, you woke up and you don't know what room you're in, but there are a lot more people in the world with lots of rooms.
And so you become very, very confident that you're in that world, right? So you're very, very confident on tails. And then you're right that conditional on tails, you think it's much more like you sort of split your credence evenly between all these rooms.
So you are very confident that you're in one of the sort of two through a million rooms and not room one. But that's before you've seen your room number.
Once you see your room number, it's true that you should be quite surprised about your room number. But once you get the room number, you're back to 50-50 on heads versus tails, because you had sort of equal credence in being in room one conditional on tails, or sorry, you had equal credence in being in tails in room one and heads in room one.
And so when you get rid of all of the other tails possibilities, the ones in rooms two through a million, you're left with 50-50 overall on heads versus tails. And so the sense in which SIA leaves you back at normality with the doomsday argument is once you update on being in the 21st century, which admittedly should be surprising.
Like if you didn't know that you were in the 21st century and then you learned that you were, you should be like, wow, that's really unexpected. And fair.
So that's true. But I think once you do that, you're back at whatever your prior was about extinction.
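Continuing the same illustrative one-versus-a-million setup, a short sketch of that bookkeeping shows how, under SIA, learning your specific room number exactly cancels the earlier update towards tails.

```python
from fractions import Fraction

n = 1_000_000  # rooms in the tails world (illustrative, as before)
prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

# SIA weights before you see your room number: count the observers.
weight = {"heads": prior["heads"] * 1, "tails": prior["tails"] * n}
total = sum(weight.values())
p_tails = weight["tails"] / total  # ~0.999999: very confident in tails

# Joint credence in (heads, room 1) and (tails, room 1).
p_heads_room1 = weight["heads"] * Fraction(1, 1) / total  # heads: room 1 is the only room
p_tails_room1 = weight["tails"] * Fraction(1, n) / total  # tails: room 1 is one of n rooms
# Both come out to 1/(n + 1).

# Conditioning on "I am in room 1" therefore restores the fair-coin odds.
p_heads_given_room1 = p_heads_room1 / (p_heads_room1 + p_tails_room1)
print(float(p_tails), p_heads_given_room1)  # ~0.999999, then 1/2
```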
Maybe I'm still not sure on why the fact that you were surprised should not itself be the doomsday argument. Yeah.
I think there's an intuition there, um, which is sort of like, yeah, is SIA making a bad prediction? So you could kind of update against SIA, because SIA would have predicted that you're in the future. I think there's something there.
And I think there are a few other analogs. Like, for example, I think SIA naively predicts that you should find yourself in a situation where there are just tons of people, you know, a situation obsessed with creating people with your evidence.
And this is one of the problems with SIA. So you should expect to find, in every nook and cranny, a simulation of you.
As soon as you open the door, there's actually this giant bank of simulations of you in your previous epistemic state. And when you don't see that, you might be like, well, I should update against the anthropic theory that predicted I would see it.
And I think there are arguments in that vein. Yeah.
So maybe let's back up to go to the original example that was used to distinguish these two theories. Yeah.
So can you help me resolve my intuitions here, where my intuition is very much the SSA one? Because it seems to me that you knew you were going to wake up, right? You knew you were going to wake up in a white room. Before you actually did wake up, your prior should have been one half heads or tails.
So it's not clear to me why having learned nothing new, your posterior probability on either of those scenarios should change. So I think the SIA response to that would be, or at least I think a way of making it intuitive would be to say that you didn't know that you were going to wake up, right? So if we go back to that just so story where God is grabbing you out of heaven, it's not at all.
It's actually incredibly unlikely that he grabs you. There are.
There are so many people. I mean, there's a different thing where SIA is in general very surprised to exist.
And in fact, that's the, so you could make the same arguments like SIA says you shouldn't exist. Isn't that weird that you exist? And I actually think that's a good argument.
So, but once you're in that headspace, I think the way to think about it is that it's not a guarantee that you exist. God is not dead set on creating you. You are a particular contingent arrangement of the world, and so you should expect that arrangement to come about more often if there are more arrangements of that type, rather than sort of assuming that, no matter what, existence will include you. Yes.
Okay. Can you talk more about the problems with SSA, like scenarios where you think it breaks down, like why you prefer SIA? Yeah.
So an easy problem, or sort of one of the most dramatic problems is that SSA predicts that it's possible to have a kind of telekinetic influence on the world. So imagine that there's a puppy, you wake up and you're in an empty universe except for this puppy and you and this boulder that's rolling towards the puppy, right? And the boulder is inexorably going to kill the puppy.
It's a very large boulder. It's basically guaranteed that the puppy is dead meat.
But you have the power to make binding pre-commitments that you will in fact execute. And you also have, to your right, a button that would allow you to create tons of people, like zillions and zillions and zillions of people, all of whom are wearing different clothes from you.
So they would be in a different epistemic state than you if you created them. Now, SSA, so you make the following resolution.
You say: if this boulder does not jump out of the way of this puppy (and the boulder leaping aside would be something very weird and very unlikely), then I will press this button and I will create zillions and zillions of people, all of whom are in a different epistemic state than me, but let's assume they're in my reference class. SSA thinks it's sufficiently unlikely that you, at the very beginning with your different-colored clothes, would be in a world with zillions of those people, because you would be a tiny fraction of the reference class if those people get created, that SSA concludes it's actually more likely, once you've made that commitment, that the boulder will jump out of the way.
And that looks weird, right? It just seems like that's not going to work. You can't just make that commitment and then expect the boulder to jump.
And so that's the sort of exotic example. You get similar analogs even in the God's coin toss case, where, naively, it doesn't actually matter whether God has tossed the coin yet, right? So let's say you wake up and learn that you're in room one, but God hasn't tossed the coin.
It's like he created room one first, and then he's going to toss, and that's going to determine whether or not he creates all the rooms in the future. On SSA, once you wake up and learn that you're in room one, you think it's incredibly unlikely that there are going to be these future people.
So you, now you say before that it's a fair coin, God's going to toss it in front of you. You're still going to say, I'm sorry, God, it's a one in a million chance that this coin lands tails.
Or sorry, one in a million, something like a very small number. I forget exactly.
And that's very weird. That's a fair coin.
It hasn't been tossed. But you, with the power of SSA, have become extremely confident about how it's going to land. So that's another argument.
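The untossed-coin variant yields the same kind of toy calculation, again with illustrative numbers rather than anything stated here: SSA's confidence about the fair coin comes from conditioning on being the room-one person.

```python
from fractions import Fraction

# Room one exists already; tails would create the remaining rooms (illustrative).
n_total_if_tails = 1_000_000
prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}

# SSA likelihood of being the room-one person, as a fraction of everyone who
# would ever be in the reference class under each hypothesis.
likelihood = {"heads": Fraction(1, 1), "tails": Fraction(1, n_total_if_tails)}

unnorm = {h: prior[h] * likelihood[h] for h in prior}
p_tails = unnorm["tails"] / sum(unnorm.values())
print(p_tails)  # 1/1000001: roughly the "one in a million" figure mentioned
```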
There are a number of other, I think, really bad problems for SSA. While I digest that, let me just mention the problems you already pointed out against SIA, in the post and earlier: if one thinks SIA is true, one should be very confident that you're in a universe with many other people who have been sampled just like you. And so then it's kind of surprising that we're in a universe that is not filled to the brim with people. You could imagine, like, Mars is just completely made up of bodies, or, you know, every single star has a simulation of a trillion people inside. The fact that this is not happening seems like very strong evidence against SIA. And then there are other things like the presumptuous philosopher that you might want to talk about as well. But yeah, so do you just bite the bullet on these things, or how do you think about them? My main claim is that SIA is better than SSA.
And I think it's just a horrible situation with anthropics. I think overall, SIA is an update towards bigger, more populated universes. I think the most salient populated universes don't involve, like, hidden people on other planets; it's probably more, I don't know, maybe we're in a simulation and people are obsessed with simulating us, or something like that.
And then, and I think this is actually more important and worrying, the way I see this dialectic is: a big problem with SIA is that it naively becomes certain that you live in an infinite universe, or a universe with an infinite number of people. And then it breaks, because it doesn't know how to compare infinite universes. Now, to be fair, SSA also isn't great at comparing infinite universes. And for both, you can do things that are actually quite analogous to things you can try to do in infinite ethics, where you have expanding spheres of space-time and you count some fraction or some density of people in those spheres.
And there's this general problem in cosmology of like trying to understand what it means to have like a fraction or a density of different types of observers. But, you know, my own take is kind of what happens here is you hit infinite universes fairly fast, and then they kind of break your anthropics in analogous ways to how they break your ethics.
And that's kind of where I'm currently at. And I'm hoping to understand better how to do anthropics with infinities.
And some of my work on the universal distribution, which is a sort of, I have a couple of blog posts on that, was attempting to go a little bit in that direction though. It has its own giant problems.

Okay. Interesting.
Do you know if, just vaguely, it seems to me that Robin Hanson's grabby aliens thing probably uses SSA. Do you know if that's the case, if he's using SSA in there? I haven't looked closely at that work.
Okay. Okay, cool.
I don't know. It's hard for me to think about it.
Maybe it'll take me a few more weeks before I can digest it fully, but yeah. Okay.
So that's really interesting. You have a really interesting blog post about believing in things you cannot see.
And, I mean, this is almost an aside in the post itself, but I thought it was a really interesting comment you make about futurism.
Here's what you say. Much of futurism, in my experience, has a distinct flavor of unreality.
The concepts (mind uploads, nanotechnology, settlement and energy capture in space) are, I think, meaningful, even if loosely defined. But at a certain point, models become so abstracted and incomplete that the sense of talking about a real thing, even a possibly real thing, is lost.
Yeah. So why do you think that is? And is there a way to do futurism better? I think it comes partly because imagination is just quite a limited tool.
And it's just easy, you know, when you're talking about the whole future, which is a big thing to try to model with this tiny mind. And so, of necessity, you need to use these extremely lossy abstractions.
And so, you know, it puts you in a mode of having these like, you know, really sketchy and gappy maps that you're trying to manipulate. I think that's one dimension.
And then I think there's also a way in which, you know, this isn't all that unique to futurism, insofar as, just in general, I think it's hard sometimes to keep our intellectual engagement kind of rooted and grounded in the real world.
And, you know, I think it's just easy to kind of move into a zone. And especially if that zone is inflected with kind of social dynamics, or it's kind of like an intellectual game, or you're enjoying it for its own sake, or it's like a sort of, there's sort of status dimensions and the way people talk and other things that I think start to move our discourse in directions that aren't about like, we're talking about the real world right now, let's actually get it right.
And I think that happens with futurism. And maybe more so because it can feel like, like, I think some people, there's sort of topics that they treat as like, that's a real serious topic.
That's about real stuff. And then there are other topics where it's like, this is the chance to kind of make stuff up.
And, you know, my experience is sometimes people relate to futurism that way. There are other topics where people move into a zone of, like, one can just say stuff here, and there are kind of no constraints. And I think that's actually wrong. With futurism, I think there are important constraints and important things we can say. But I think that vibe can seep in nonetheless.
Yeah. And it's interesting that it's true of the future and the past.
I recently interviewed somebody who wrote a book about the Napoleonic Wars, and it's very interesting to talk about it in a sort of abstract sense. But you can also, which is very seldom done, think of the reality of a million men marching out of Russia, freezing, eating the remains of horses and other people, and then starving.
That concrete reality, when you're not talking about abstractions like, oh, the border changed so much in these few decades, just changes how you think about history so much.
Yeah. Even recently, I was reading this book about the use of meth by the Nazis.
And there's this really cynical part of the book where the leaders in the Nazi regime are talking about how meth is a perfect drug, because it gives soldiers the courage to just blitz through an area without thinking about how cold it is, without thinking about how scary it is to just be in no man's land. And just this idea of this messed-up soldier who's been forced to go out into the middle of nowhere.
And then, yeah, all marching to Russia or something in the winter.
I don't know if that was going to lead up to a question. I don't know if you have a reaction, but yeah.
Yeah. I mean, I think that's a great example of, you know, specifically the image of the difference between relating to history as, how is the border changing, versus the concreteness of these people. And often, I think, engaging with history is horrifying in this respect, when you really bring to mind the lived reality of all these events.
It's a really different experience. And I think, to some extent, one of the reasons that concreteness is often lacking from futurism is that any attempt to specify the thing will be wrong.
So you might be right about some abstract thing. Like you might be like, oh, we will have the ability to manipulate matter at, you know, blah level, blah scale.
But if you try to dig in and say, here's what it's like to wake up in the future, and here's what you're eating, or whatever, then you're wrong immediately. That's not how it's going to be.
And so you don't have the ability to really hone in on concrete details that are actually true. And so, in some sense, there's this back and forth where you need to imagine a concrete thing, and then be like, okay, that's wrong, but then take the flavor of concreteness that you got from that and say: it will be a concrete thing.
It just won't be the specific one I imagined. Um, and then keep that flavor of concreteness, even as you talk in more abstract ways.
And that's, I think a delicate dance. Yeah.
Yeah. As many viewers will know, Peter Thiel has this talking point that he often brings up, that we've become indefinite optimists, and that he prefers a sort of definite optimism, where you have a concrete vision of what the future could be.
Okay. So, I guess to close out, one of the things I wanted to ask you about: you said this blog was a side project.
Actually, before you mentioned that your main work is AI, I thought this was at least part of your main work.
And so it's really surprising to me that you're able to keep up the regularity.
You're basically publishing a small book every, I don't know, every week or so, filled with a lot of insight.
And unlike many other blogs on the internet, it's not just plain style; you've got great prose.
How are you able to maintain such productivity on your side project? I should say, a few of my most recent posts, which were especially long, I had taken some time off from work, and I was working on those partly in an academic context.
But the first year and a half or so of the blog was just on the side, and I've gone back to having it be on the side now. I think one thing that helps is that my blog posts are too long.
So, you know, I have dreams of taking my long blog posts and really crunching them down into this pithy, elegant statement that's really concise and condensed. But that would take more time, so one way I increase my output is by not doing that editing.
And I feel bad about that, but that's one thing at least.
What is that quote? I think it's something like: I didn't have time to write you a short letter, so I wrote you a long letter, or something like that.
Yeah, exactly. I have a friend who says the actual thing should be: I didn't have time to write you a short letter, so I wrote you a bad letter.
And, you know, I'm like, I hope it's not that bad. But I do think, if I had more time for these posts, I would try to cut them down. And that's one time-saving, for better or worse.
Yeah, at least as a reader, it often seems to me that with people like you who write, and maybe this is how you'd describe your process, Scott Alexander says he kind of just writes stream of consciousness, and it just turns out to be really readable.
Your blog posts are really readable. And even with the stuff I write, the things where I'm consciously not trying to make edits as I go end up reading much better than the ones where I'm trying to optimize each sentence, taking two steps back for every step forward. I don't know, it could just be a selection effect, where the things that are harder to convey are the ones you spend more time editing. But yeah, it's kind of interesting.
I wonder. I mean, my feeling is that my writing is quite a bit better if I have a chance to edit it, and it's just a time thing.
But I do think people vary quite a bit. And, you know, it's interesting.
I was recently reading this book by George Saunders, a writer I really admire; he has this book about fiction writing called A Swim in a Pond in the Rain. And the vibe he tries to convey, and I think this is relatively common amongst writer types, is this obsessive focus, even at a sentence-by-sentence level, on really thinking about: where is the reader's mind right now? How are they engaging? Are they interested? Are they surprised? Am I losing them? And his writing is really, really engaging, in ways that aren't even obvious.
You just sort of start reading along and you're like, oh, wow, I'm really into this. But it's also quite a daunting picture of the level of attentiveness required.
And it's like, wow, if I'm going to write everything like that, it's like, that's going to cut down a lot on my kind of overall output. And so I do think there's a balance there.
And, you know, to the extent you're one of these people who can just write stream of consciousness and that's close to what you would get out of editing, which I'm not sure I am, all the better. You're sort of lucky.
Yeah. There's also an additional consideration where if you think there's going to be some kind of power law to how interesting a piece is or how many people see it and how many people find value in it, then it's not clear whether that advises you to spend so much time on each piece to increase the odds that that one piece is going to blow up, given that there's a big difference between the pieces that blow up and don't, or whether you should just do a whole bunch and then just try to sample as often as possible.
Yeah. And I think, I think actually the blog, I started the blog partly as an exercise in just getting stuff out there.
I think I had had the idea that I would one day write up a bunch of stuff that I'd been thinking about, and I would finally write it up in this grand way, and it would be this beautiful thing, and I would take all this time.
And then I had ended up, for various reasons, feeling like I was approaching some aspects of my life with too much perfectionism, and I needed to just get stuff out there faster. And so the blog was an exercise in that. And I think that's paid off in ways, and I don't think I would have done it otherwise.

I see. All right. Final question.
I'm curious if you have three book recommendations that you can give the audience. Probably my primary recommendation, and this is somewhat self-serving because I helped with this project, is the book The Precipice by Toby Ord.
It may be familiar to many of your listeners, but I think it's a book that really conveys the ideas that matter most to me, or that have had close to the biggest impact on my own life. Other books: I love the play Angels in America.
I think it's just, I think it's epic and amazing. And, you know, that's not quite a book, but you can read it.
I actually recommend watching the HBO miniseries, but that's, you know, that's something I recommend. And then, I don't know, last year I read this book, Housekeeping by Marilynne Robinson, and it had this sort of numinous quality that I think a lot of her writing does.
And so I really liked that and recommended it to people. That's also a piece of fiction.
If you're looking for philosophy, a lot of my work is in dialogue with Nick Bostrom and his overall corpus. And I think that's really, really valuable to engage with.
I see. Cool.
Cool. All right.
Yeah. Joe, thanks so much for coming on the podcast.
It's a lot of fun. A lot of fun.
Yeah. Thanks for having me.
Oh, I'll also say, everything I've said here is purely my personal opinion. I'm not speaking for my employer, and I'm not speaking for anyone else, just myself.
So just keep that in mind. Cool.
Cool. And then where can people find your stuff? If you want to go over your blog link, and then your Twitter link and other things.
Yep. So my blog is handsandcities.com, and my Twitter handle is jkcarlsmith. Those are good places to reach me. And then my personal website is josephcarlsmith.com.
Okay. And then we're going to find your stuff on AI and those kinds of things?
The stuff on AI is linked from my personal website. So that's the best place to go.
All right, cool, cool. Thanks for watching.
I hope you enjoyed that episode. If you did and you want to support the podcast, the most helpful thing you can do is share it on social media and with your friends.
Other than that, please like and subscribe on YouTube and leave good reviews on podcast platforms. Cheers.
I'll see you next time.