Data Centers in Space + A.I. Policy on the Right + A Gemini History Mystery

“As you may have noticed, it is not easy to build data centers here on Earth.”

Runtime: 1h 11m

Transcript

Speaker 2 This episode is supported by Blockstars, a podcast from Ripple.

Speaker 4 Join Ripple for blockchain conversations with some of the best in the business.

Speaker 3 Learn how traditional banking benefits from blockchain, or how you're probably already using blockchain technology without even realizing it.

Speaker 2 Join Ripple and host David Schwartz on Blockstars, the podcast.

Speaker 8 Crypto investments are risky and unpredictable.

Speaker 9 Please talk to a financial expert before you make any investment decisions.

Speaker 11 This is not a recommendation by NYT to buy or sell crypto.

Speaker 12 Casey, what's going on?

Speaker 13 Oh my gosh. So the other day I'm walking down Market Street and for context, this is like, you know, maybe like one of the main thoroughfares in San Francisco.

Speaker 13 And over the past year, this is a little bit obnoxious, but I would say four or five times someone has recognized me from the podcast and stopped me and wanted to take a picture.

Speaker 13 It always makes my day. Hard fork listeners are the best.
It had happened to me just the previous week. Well, then this weekend, I'm coming home from the gym.

Speaker 13 And you know how you are when you're coming home from the gym, your face is flushed. Yeah, you're sweaty.

Speaker 12 You're sweaty.

Speaker 13 Your hair's, you know, all over the place. And this very sweet young woman comes up to me and asks for a picture.

Speaker 13 And of course, I'm thinking, I kind of look, you know, gross right now, but anything for a hard fork listener, right?

Speaker 13 And she's there with a guy who I assume is, you know, her boyfriend or her husband.

Speaker 13 And so I, you know, I put on a show and I'm introducing myself, you know, hey, and, you know, what's your name and all that?

Speaker 13 She hands me her phone and they go and they stand up against the street with their backs turned, you know, so they can get kind of San Francisco in the background.

Speaker 13 And that's when I realize these people have no idea who I am. They are just tourists and they want a picture of themselves in San Francisco.

Speaker 12 I'm Kevin Roose, a tech columnist at The New York Times.

Speaker 13 I'm Casey Newton from Platformer.

Speaker 12 And this is Hard Fork.

Speaker 13 This week, Google's crazy new plan to build data centers in space. Is this the final frontier of the AI bubble?

Speaker 13 Then, former Trump White House policy advisor Dean Ball tells us what Republicans really think about AI. And finally, it's a history mystery.

Speaker 13 Professor Mark Humphries is here to talk about how an unidentified new Gemini model offered mind-blowing results on a challenging research problem. It was about Canada.

Speaker 12 It was not about Canada. It was basically about Canada.
It was about sugar.

Speaker 13 It was about the sugar trade in Canada.

Speaker 12 Well, Casey, today we are going to start by talking about space.

Speaker 13 Finally, the final frontier, some call it.

Speaker 12 Yes, because I have been looking into this story that I have become obsessed with, which is that we are going to build freaking data centers and put them in space.

Speaker 13 I'm very excited to talk to you about this.

Speaker 13 I would say I have sort of been skimming the headlines, so I have a lot of questions for you about this, but I think whenever we can start an episode in space, that is a great place to start because I don't know if you've looked around lately, but who wants to be on planet Earth right now?

Speaker 12 Okay, let's,

Speaker 12 I like an alternative. I'll say that much.

Speaker 12 Yes. So this has been a thing that has been quietly percolating in the tech industry.

Speaker 12 Obviously, we have this giant data center build out going on here on Earth.

Speaker 12 Every company wants to build these giant data centers, fill them with these GPUs, use them to train their AI models and do things like that.

Speaker 12 As you may have noticed, it is not easy to build data centers here on Earth. No, I've tried.
I got nowhere. I mean, I felt like I was building IKEA furniture.
It's like, you want me to do what?

Speaker 12 And you need land, you need permits, you need energy to power the data center.

Speaker 12 You need to do all of this relatively quickly. And people sometimes get mad when you try to put up a data center where they live.

Speaker 12 Also, we are facing an energy crunch for these data centers. There is literally like not enough capacity on our terrestrial energy grid to power everything.

Speaker 12 That may get worse as people demand more and more AI and the growth continues exponentially. Yes.

Speaker 12 So a couple of companies, including just recently Google, have now announced that they are exploring a data center in space.

Speaker 13 Which sounds like a joke when you say it, because building anything in space seems so impractical, so expensive, so doomed to failure that it truly does just sound like a joke.

Speaker 12 But what you're saying to me right now, Kevin, is that there is a legitimate, serious plan to try to do this. Yes. I also thought this was like some kind of crazy science fiction moonshot thing. And it is an experimental thing. No one is doing this, like, today.

Speaker 13 But Google has put out a paper on what it calls Project Suncatcher. Yes, Suncatcher, which sounds like a lost Led Zeppelin single but is somehow a project to build data centers in space.

Speaker 12 Yes. So this is, they're calling this a moonshot.
They're saying, you know, this might not happen for several more years, but this is an active area of research for them.

Speaker 12 There are a couple other companies that have been doing this. Um, Jeff Bezos, Eric Schmidt, other sort of big tech folks are really interested in this idea.

Speaker 12 And I think we should talk about it today just to kind of give people a sense of like what the future may hold if we continue to demand all of this power and all of these data centers to run these giant AI models.

Speaker 13 Yes, I think it is so worth talking about because, among other things, it indicates that we are at the stage of this bubble where people have come to feel like we cannot provide enough electricity for the future we want to build on the planet that we live on.

Speaker 13 We actually have to get off the planet to realize our ambitions. So, if nothing else, that just tells you how ambitious these companies are getting and the crazy big swings that they're about to take.

Speaker 13 Totally.

Speaker 12 Where should we begin?

Speaker 13 Well, let's talk about Project Suncatcher first. What exactly is Google proposing to do? And what did it say about it last week?

Speaker 12 So this was a blog post and a paper that came out last week.

Speaker 12 They are calling this a future space-based, highly scalable AI infrastructure system design.

Speaker 12 And basically they have started doing some testing to figure out if a space-based data center would actually be possible. And the problem that they're trying to solve here is twofold.

Speaker 12 One, as we mentioned, it's like very hard to build stuff here on Earth. You need all the permits and approvals and energy.
The second is like the sun is a really freaking good source of energy, right?

Speaker 12 It emits something like 100 trillion times as much power as humanity's total electricity production. But building solar panels on Earth has some issues.

Speaker 12 Mainly the sun sets for half the day, so you can only get power for half the day.

Speaker 13 Which has long been one of people's primary criticisms of the sun.

Speaker 12 Yes.

Speaker 12 But if you put the solar panels and the data centers into low Earth orbit, and you put them on something called the dawn-dusk orbit path, which I did not just look up this week.

Speaker 12 I definitely knew what that was from my high school astronomy class.

Speaker 12 You can effectively give them nearly constant sunlight and the solar panels can be much more productive, up to eight times as productive as solar panels here on Earth.

Speaker 13 So let me ask you this, because when you say data center, I picture one of these, like, giant, anonymous office complexes that's like the size of six football fields that they're building all over the heartland right now. I assume that they are not going to build something like that in space.

Speaker 12 No, these would be different. There's another company called Starcloud, sort of a startup that's got some funding from NVIDIA.

Speaker 12 And if you look at the mock-up that they have made, it kind of looks like a giant bird, but like the wings are these like very thin solar panels, these sort of like arrays of solar panels.

Speaker 12 And the kind of the center of it is kind of this, these clusters of computers, essentially. And it's just kind of out there orbiting in space.

Speaker 12 And the wings are kind of catching all of the sun and they're feeding that energy into the computers at the center of the cluster.

Speaker 13 Got it. So we're in one of these giant, terrifying bird-like structures that are sort of swarming over the earth in this future.
And they're getting so much energy from the sun, and it's so efficient.

Speaker 13 And that is sort of driving all of the compute that's happening inside the computers.

Speaker 13 How does whatever is happening inside the giant terrifying bird get back to us down here on Earth in a timely fashion?

Speaker 12 That's a great question.

Speaker 12 And I asked this to a couple of people I talked to over the past week or so who have been working on this stuff.

Speaker 12 And what they told me is, this is actually not that much different from something like Starlink, right? You're sending data from a satellite or a a series of satellites back to Earth.

Speaker 12 It's not that far away, right? It's not like these are light years away. It's like it might take a couple more milliseconds than you would take to transmit something here on the Earth.

Speaker 12 And that is actually something that we know how to do. Got it.

Speaker 13 Okay, Kevin. So last week, Google puts out a blog post about this.
Give us a sense of where they are in this experiment.

Speaker 12 I would say they feel like they are pretty early in this process. There are still some technical barriers to overcome, and we can talk about those.

Speaker 12 But they have started actually running tests to figure out things like, well, if we send our TPUs, our AI training chips out into space, like, will they just sort of fall apart because of all the radiation out there?

Speaker 12 And they actually did an experiment that they described in this paper where they took just a normal like TPU, like the kind that they would put in their data centers here on Earth, and they like took it to a lab and they hit it with a proton beam that was supposed to like simulate a very intense kind of radiation that these chips would experience if they were floating out in space.

Speaker 12 And they found that their newer TPUs actually withstood radiation much better than they thought.

Speaker 12 So these things can apparently handle radiation well beyond what's expected of them in a five-year mission.

Speaker 13 Now, if you watched The Fantastic Four: First Steps earlier this year, you know that cosmic radiation is what transformed the Richards family and Ben Grimm into the Fantastic Four.

Speaker 13 Has Google addressed that at all about sort of any of those concerns?

Speaker 12 They did not address that, to my knowledge. They did address some other potential hurdles. One of them is, if these chips glitch out or break, how do you fix them if they're in space?

Speaker 12 And I asked a couple people who have worked on similar projects and they basically said, yeah, we got to figure out how to like get robots up there to like fix the data centers.

Speaker 13 Got it. So they'll focus on using robots for that.
I guess that makes sense.

Speaker 13 Now, am I right that Google is actually planning to do some kind of like test launch within the next couple of years on this?

Speaker 12 Yeah, they are planning to test this in 2027

Speaker 12 by launching two prototype satellites in partnership with Planet, which is a company that sort of sends up these little tiny satellites into space for mapping and things like that.

Speaker 12 And that is their plan. There are also other companies, including StarCloud, which is also planning to send up some prototypes pretty soon.
So they are moving forward with testing on this.

Speaker 12 I will say, I think this is probably not

Speaker 12 going to happen in any real way for at least a couple of years, in part because things are still very expensive to send up into space.

Speaker 12 It is not right now economically feasible to send up a whole bunch of chips and a whole bunch of satellites up into space.

Speaker 12 It costs many times more than what you would need to build a comparable data center here on Earth.

Speaker 13 Yeah, and people here on Earth are saying that the data centers we're building here on Earth are not economically feasible, right?

Speaker 13 So I can't imagine how much more out of control the costs are going to be once you leave orbit.

Speaker 13 One thing I thought was interesting in the Google blog post was that the company tried to place Suncatcher in the lineage of its self-driving car project, which is now Waymo, and quantum computing, which hasn't quite become a mainstream technology yet, but has made a lot of strides.

Speaker 13 You know, just within the past year, we did an episode on it, not all that long ago.

Speaker 13 And they're sort of saying, like, Suncatcher is kind of one of those where we are willing to work on this for 8, 10, 12, 15 years to make it into a mainstream technology.

Speaker 13 And so I took that as Google saying, like, hey, this is not just like some crazy little experiment that a couple engineers are working on in their spare time.

Speaker 13 Like, it seems like they're serious about this.

Speaker 12 I think they're serious about this. And I think they are looking out to a future,

Speaker 12 you know, five, 10, 15 years away where kind of the demand for AI and AI related tasks is just essentially infinite, right? It's like, this is not something that 10% of people are using every day.

Speaker 12 This is something that 100% of people are using.

Speaker 12 uh constantly uh that there are like sort of entire companies or sectors of the economy that have been sort of fully turned over to ai and maybe that happens and maybe it doesn't but if it does happen we're gonna need a lot of energy and a lot of data centers and we may run out of land and power here on Earth.

Speaker 13 Now, something that I did not realize until after I had read about Suncatcher is just how many other companies are looking at doing the same thing.

Speaker 13 Can you kind of give me a high-level overview of like who else is playing here? And does it seem like anyone else is further along than Google is right now?

Speaker 12 Yeah.

Speaker 12 So as I mentioned, there's this company, Starcloud, which is a Y Combinator startup that got some funding from NVIDIA. They are sort of the main ones doing this. There's also a company called Axiom Space that is doing this. And we think that there are some Chinese companies, or at least one Chinese effort, to do a space-based data center, although they've been a little bit vague about the details there. And then The Information had an article about some comments that Eric Schmidt and Jeff Bezos have made suggesting that maybe they are also interested in or looking at doing something like this.

Speaker 13 Well, you know, Jeff Bezos just put Lauren Sanchez in space.

Speaker 13 So you have to wonder if that was kind of a first step toward

Speaker 13 something in this vein.

Speaker 12 Yes.

Speaker 13 You know, one thing I think that is interesting about this approach, Kevin, is that, as you know, we've seen an increasing amount of resistance from people in sort of local communities to having data centers put in their towns or near their towns.

Speaker 13 They're worried about how it's going to affect the cost of energy for them, right? They're worried about water usage or the environmental impact. And so I think that, you know,

Speaker 13 if this sort of thing comes to pass, we'll have gone from like, you know, just like the NIMBY saying, not in my backyard to this new group of people that I'm calling the NOMPs that are saying, not on my planet, you know, and they want all the data centers just built.

Speaker 13 up in the sky. So do you think NOMPs are going to become a sort of major political force?

Speaker 12 I do. Although I also think that eventually people may sort of start to not want them in space either.

Speaker 12 but it's going to be harder for them to protest. You got to get in a rocket, go up there into low Earth orbit.
It's very inconvenient. Now, why wouldn't people want them in space?

Speaker 12 Well, there are various people who think that this is going to create a lot of like space debris and things like that that would eventually be bad.

Speaker 12 I talked to some folks who work on this stuff, and they were like, they don't think that's really going to be a big deal. There's all kinds of stuff up in space now.

Speaker 12 We generally don't pay much attention to it.

Speaker 12 But I can see this sort of sounding to people like Elon Musk, you know, proposing to build colonies on Mars or something. Like it's just like, it's like too futuristic.

Speaker 12 It's too sci-fi. And it sounds like these very, you know, rich companies and individuals trying to kind of flee from their problems here on Earth by like sending stuff into space.

Speaker 13 Here's what I would say. I would love to be like living at a time when one of the top 10 concerns I had in my life was space debris.
If I ever get there, Kevin, I will be in heaven.

Speaker 12 Heaven? Well, you'll be in low Earth orbit. That's a little lower.

Speaker 12 Exactly.

Speaker 12 Now, I have a question for you. Yeah, yeah.

Speaker 12 Would you go to space?

Speaker 13 Yes, absolutely.

Speaker 12 Would you go to space to fix a data center?

Speaker 13 I mean, what is the salary for that job?

Speaker 12 Very high.

Speaker 13 I mean, there's probably a certain price for which I would do it. But here's the thing.
You know, I'm not handy around the house. Yeah.

Speaker 13 It's like, if I, you know, if ChatGPT doesn't know what to do, I'm calling the handyman.

Speaker 12 Okay.

Speaker 12 I will just say that I think we should make an offer to Google, which is if you guys get this Project Suncatcher up into low Earth orbit, we will do a podcast episode where we go up there and cut the ribbon.

Speaker 13 You're just dying to be exposed to massive levels of solar radiation.

Speaker 12 You know, I just think it'd be fun.

Speaker 13 When we come back, the ball is in our court. Dean Ball talks about how he crafted the AI Action Plan.

Speaker 14 Over the last two decades, the world has witnessed incredible progress.

Speaker 3 From dial-up modems to 5G connectivity, from massive PC towers to AI-enabled microchips, innovators are rethinking possibilities every day.

Speaker 17 Through it all, Invesco QQQ ETF has provided investors access to the world of innovation with a single investment.

Speaker 6 Invesco QQQ, let's rethink possibility.

Speaker 18 There are risks when investing in ETFs, including possible loss of money.

Speaker 15 ETF risks are similar to those of stocks.

Speaker 21 Investments in the tech sector are subject to greater risk and more volatility than more diversified investments.

Speaker 21 Before investing, carefully read and consider fund investment objectives, risks, charges, expenses, and more in the prospectus at Invesco.com.

Speaker 20 Invesco Distributors Inc.

Speaker 22 Picture this: you land the perfect name for your startup, only to find Peter from Delaware owns the dot-com. Your options: pay up, or settle for a domain that looks like a Wi-Fi password.

Speaker 22 But thanks to .tech domains, there's another solution. With .tech, you get the domain name you want that instantly says you're building tech.

Speaker 22 Tech companies worldwide use .tech domains, like CES.tech and 1x.tech. Don't settle.
Visit a trusted platform like GoDaddy and get your .tech domain today.

Speaker 12 Well, Casey, recently we've been talking about some state-level AI regulations that have been passed and signed into law. But today we're going to have a discussion about national AI policy.

Speaker 13 Yeah, I think that the states have been acting because the federal government has not really passed any legislation related to AI just yet.

Speaker 13 And that's left us with a lot of questions around how the administration has been thinking about AI.

Speaker 12 It's been a little confusing.

Speaker 12 I think especially, you know, in this administration, it has not been particularly clear to me what President Trump and his allies believe about things like whether we are headed towards some kind of an AGI moment or how the federal government should try to protect against some of the risks of very powerful AI systems.

Speaker 12 So the conversation that we're going to have today, I think, will help us answer some of these questions and just kind of get a better sense of like what is happening in Washington, especially on the right, when it comes to AI and AI policy.

Speaker 12 Yeah.

Speaker 12 So earlier this year, Dean Ball spent several months working as the White House's senior policy advisor for artificial intelligence and emerging technology.

Speaker 12 He was brought into the White House in order to lead the drafting of the White House's AI action plan.

Speaker 12 And in that role in the White House, Dean not only got to see how the AI policy sausage was made at the highest levels of government, he actually got to make the sausage himself.

Speaker 12 He was sort of responsible for taking all these different ideas from the various parts of government and putting them together into a document that would represent the administration's sort of official view on AI.

Speaker 13 Yeah. And while he was there, Dean also got a good sense of who are the various factions on the right when it comes to AI policy.
What do they believe? What are the competing incentives?

Speaker 13 Who has whose ear? And I think if you want to understand the likely path forward for AI regulation over the next few years, that's a really important part of the conversation.

Speaker 12 Yeah. So Dean left the White House in August after the AI Action Plan was released.

Speaker 12 And since then, he's become a senior fellow at the Foundation for American Innovation and the author of Hyperdimensional, a newsletter about AI and policy.

Speaker 13 And because we're going to be spending a lot of time in this segment talking about AI, let's do our disclosures.

Speaker 12 I work for The New York Times, which is suing OpenAI and Microsoft over alleged copyright violations.

Speaker 13 And my boyfriend works at Anthropic.

Speaker 12 Let's bring him in.

Speaker 12 Dean Ball, welcome to Hard Fork.

Speaker 23 Thank you both for having me. It's so good to be here.

Speaker 12 So how did you end up at the White House earlier this year working on AI policy? What was your background before that?

Speaker 23 I was a think tanker. A lot of it was not tech policy.
A lot of what I did was state and local policy, but I was always very interested in tech.

Speaker 23 And basically, when the AI policy conversation really took off sort of early 2023, I made the decision to start writing about AI, basically as a part-time gig, just like purely on the side, wasn't being paid for it or anything.

Speaker 23 And then eventually I decided I really liked it and I was finding my voice.

Speaker 23 And I was hired by the Mercatus Center at George Mason University to go spend some time there, spent about a year there and then was recruited to the White House on the basis of primarily my writing on Substack.

Speaker 23 And my Substack is called Hyperdimensional. It's where I talk about, you know, AI stuff.

Speaker 12 The Substack to White House pipeline.

Speaker 23 I feel like you are not the only person who has posted their way into a job in the federal government. You can post your way into the federal government. It's really true. And probably a big chunk of it was my posts on X, really, which is maybe even more scary. But yeah.

Speaker 12 So, okay, you get this call. You go to the White House.

Speaker 12 What did you find there with respect to AI policy? Was there like a coherent single view of how AI should be governed and regulated?

Speaker 23 I would say

Speaker 23 there are coherent intuitions, but the field is so nascent, and there haven't been a lot of fights where dividing lines have really firmed up yet.

Speaker 23 I think, by the way, this is true on the left as well.

Speaker 23 I don't think that those intuitions have formed yet into

Speaker 23 like a lot of different sort of very specific policy positions. I don't think they've concretized yet, is really what I'm saying.

Speaker 23 I think, though, you know, there's a combination of excitement and some worry and some confusion, probably equal parts, which is, you know, in a macro sense, that's probably roughly where I am too, actually.

Speaker 23 And that sounds about right to me.

Speaker 13 You say there were some coherent intuitions about AI in the administration. What were those intuitions?

Speaker 23 I think coherent intuition number one is AI is the most important

Speaker 23 technological, economic, scientific opportunity that this country and probably the world at large has seen in decades and quite possibly ever. I think basically everyone shares the assessment.

Speaker 23 This is going to be extremely powerful and it's going to be really important.

Speaker 23 And

Speaker 23 second intuition that directly follows is there are going to be some risks associated with this that are sort of familiar to us and things that are cognizable under existing sort of policy frameworks and others which might be more alien and might be like risks that we don't really even have concepts for as clearly yet.

Speaker 23 And then, you know, maybe the third intuition is

Speaker 23 regardless of those risks, it feels like AI is going to play a very big role in the future of like American global leadership.

Speaker 12 Yeah, that's really helpful and kind of helps me get a sense of like the lay of the land when you arrived.

Speaker 12 I'm wondering if you can help me understand the kind of intra-right factions when it comes to AI, because I've...

Speaker 12 I think I've identified like at least two different views of AI that I've heard coming from prominent Republicans. And maybe you could call them like the David Sachs view and the Steve Bannon view.

Speaker 12 David Sacks, the president's AI czar,

Speaker 12 is constantly talking online and on his podcast about

Speaker 12 these AI doomers who he thinks are sort of ridiculous and are overhyping the risks of AI and trying to sort of

Speaker 12 get their way on policy, calling them woke, implying that they're sort of trumping up these fears of

Speaker 12 no pun intended, of job loss and things like that to sort of get their way when it comes to policy. Then there's Steve Bannon, who has been

Speaker 12 out there talking about the existential risks from AI.

Speaker 12 And you and I were both at this curve conference.

Speaker 12 Actually, all three of us were there a few weeks ago, where one of Steve Bannon's sort of guys was there and gave this very fascinating talk about how he thought like he was sort of in league with the so-called doomers who believe that this could all go very badly very soon.

Speaker 12 Are there more views on the right than those two? Are those sort of the primary camps?

Speaker 23 No, I think that there's a whole spectrum. I can't speak for either David or Steve, of course, but I would put them on like roughly polar opposites in terms of how, you know,

Speaker 23 about how conservatives talk about this issue. But I think there's a whole spectrum in between.
So, first of all, you've got national security people, people who don't actually know a ton about AI. And this is, again, true on both sides here.

Speaker 23 But, you know, they're just, they think of this as a strategic technology that's important for U.S. competition with China and other things.

Speaker 23 And also, maybe they think there's some national security risks,

Speaker 23 but they're not really thinking about like the domestic policy. They're not really thinking about regulation.
They're not thinking EA versus Doomer. So that would be one.

Speaker 23 I think also,

Speaker 23 you know, related to the sort of Bannon viewpoint, but maybe

Speaker 23 you know, more toward the middle would be like people that are worried about kids' safety primarily.

Speaker 23 There's a lot of conservatives who would distance themselves from the AI Doomer view, but who would also distance themselves from the pure accelerationist view.

Speaker 23 And they would use the lessons we've had with social media as an example. So sort of that kid safety viewpoint.

Speaker 23 For these people, issues like LLM psychosis are very often salient.

Of course, teen suicidality with chatbots is another very salient issue for this group, and for everyone, I hope.

Speaker 12 But yeah, so there are others in between.

Speaker 23 And I guess I would put myself somewhere in kind of the middle in a

Speaker 23 weird fusion.

Speaker 13 Where does industry fit into that spectrum? Like my sense from the outside is that industry groups and lobbyists have had a lot of success in this administration in getting what they want.

Speaker 13 Where are they in those conversations?

Speaker 23 I think it really depends on incentives. People in policy conversations very often will refer to like industry as being this kind of monolithic, coherent entity.
It's of course not.

Speaker 23 And there's different people that have different incentives. So, you know, if you're a U.S.
hyperscaler, you don't hate the export controls.

Speaker 23 You know, you don't want more competition for the same chips that you're trying to make.

Speaker 12 Meaning, like Microsoft or Google or an Amazon.

Speaker 23 Yes, Microsoft, Google, Amazon Web Services, et cetera. You don't hate that because like, A, you don't want Chinese firms competing for your chips.

Speaker 23 But even if it's not the same chips you're competing over, you don't want to be implicitly competing over space at TSMC fabs to make the chips. So, you know, hyperscalers, you know,

Speaker 23 they will definitely have like nuanced positions on export controls, but by and large, like their incentives are not to hate them, and they largely don't.

Speaker 23 Frontier Labs, I mean, they want to make money selling tokens to people. So they want access to chips.

Speaker 23 But, you know, I think there's some people who believe, and it's from a political theory perspective, it's not wrong to believe that ultimately they want to create moats.

Speaker 23 And I think there's a lot of ways you can make moats. It seems to me like the main way they're trying to make moats right now is through infrastructure.

Speaker 23 That they've basically all come to Anthropic today and announced a $50 billion commitment to build their own data centers.

Speaker 23 Google obviously does this. OpenAI does this through Stargate.
Meta does this. XAI does this.
Everyone does this. Everyone's building infrastructure.

Speaker 23 And the basic view is like, well, the models maybe are not your moat per se. Like the parameters of the model are not your moat, but perhaps the infrastructure is.

Speaker 23 And so, you know, these are all competing interests and no one's making illegitimate arguments here. Everyone's operating from incentives.

Speaker 23 And of course, the job of government is to sort of solve for the equilibrium.

Speaker 12 Dean,

Speaker 12 Is there a MAGA view of AGI?

Speaker 12 Not yet. No, not really.

Speaker 23 I don't know that there's any political persuasion view of AGI. I think MAGA might actually be the closest to having one.

Speaker 23 And I think it's at the moment, maybe the persuasion, at least from what I see online, is like maybe it's sort of more doomery.

Speaker 13 I believe we saw a bipartisan bill introduced over the past week that would require reports of job losses due to automation, which suggests that there is some increasing attention to that likelihood.

Speaker 23 Yeah, well, I mean, so there's this big question, you know, in the AI field, like at places like the curve and places like Lighthaven, there are these gatherings of various sort of doyens of the AI community, and they get together.

Speaker 23 And the main question that people talk about is like, when are the pitchforks going to be out for this technology? And what is going to cause the pitchforks to come out?

Speaker 23 And I have come to the conclusion that rather than it being a singular issue, it's going to be this kind of miasma of issues. It's going to be like, you know, it's slopification.

Speaker 23 It's not safe for kids. It's driving up your electricity prices.
It's using all the water. It's

Speaker 12 taking your job.

Speaker 23 It's taking your job and also it's going to kill everyone. And also, by the way, it's fake.
It'll be all those things, kind of this weird sort of vichyssoise.

Speaker 13 The aspect of the AI Action Plan that I find the most annoying is the attention on the ideology of the chatbots and the suggestion, you know, that they should be able to respond in some ways, but not in other ways.

Speaker 13 Can you kind of illuminate the discussions that were being had and what the administration actually wants out of these models?

Speaker 23 Yeah. So I think the main point here, first of all, the most important thing: you're talking about the woke AI executive order, as it's traditionally phrased.

Speaker 23 This is an executive order that deals with federal procurement policy. In other words, this is not an executive order.

Speaker 23 It is not a regulation on the versions of AI models that a company like Anthropic or OpenAI or any other company ships to consumers or private private businesses.

Speaker 23 This is purely about the versions of their models that they ship to the government. And the government is saying in this case, we do not want

Speaker 12 to

Speaker 23 procure models which have top-down ideological biases engineered into them.

Speaker 23 We would like our government employees to have access to models which are, you know, I think objective is a really hard word.

Speaker 23 Obviously, we've been like debating about what is truth for, you know, since there was language, right? So I don't think we're going to resolve that.

Speaker 23 I have a feeling the General Services Administration guidelines will not resolve that issue. You know, I think it's folly to even try.
And I think the executive order doesn't try.

Speaker 23 You know, the executive order steers clear of doing so. The executive order says instead,

Speaker 23 you just, we don't want you as the developer imposing some sort of worldview on top of the model.

Speaker 12 Well, good luck with that, I guess.

Speaker 12 Well, I want to ask one follow-up on that because my sense is that, you know, the Trump administration and Republicans in Congress have been very upset with how the Biden administration sort of jawboned, how they applied pressure to social media companies to take down, you know, misinformation or what they considered misinformation about the COVID vaccines or things like that.

Speaker 12 That was seen as like very inappropriate. In fact, there are like ongoing investigations of the contacts between the Biden White White House and the social media companies over this issue.
Yes.

Speaker 12 And then we turn around and we see this like woke AI executive order where it's like, I understand the subtle point you're making about, you know, this is not regulating the models that the companies are releasing to the public.

Speaker 12 It's just the ones that they're selling to the government.

Speaker 23 But like we all know that there's one set of models, right?

Speaker 12 And they get built and they get sold to various customers.

Speaker 12 And I think, you know, it's reasonable to see that and think, okay, this is the Trump Trump administration doing exactly what it got so mad at the Biden administration for doing, which is to contact the tech companies and tell them, hey, this is how your products should be working.

Speaker 12 This is the kind of things it should be allowing and not allowing. And I don't know, does that seem at all to you hypocritical?

Speaker 23 Well, so, look, I think that there is an inherent tension here.

Speaker 23 And this is a tension that has existed on the right, and it's particularly existed sort of post-Trump 45, post-President President Trump's first term.

Speaker 23 There is this argument that exists of should we stick to our principles that the government shouldn't be doing this kind of job owning,

Speaker 23 or should we accept that the government has this power and now

Speaker 12 we need to throw it back at the left, right?

Speaker 23 And I can tell you that I personally have always definitively been on one side of that argument, which is the former view.

Speaker 23 We should stick to principles. We should not fight.

Speaker 12 We should not. No jawboning from anyone.

Speaker 23 Yeah, you shouldn't do that. I mean, like, you know, you shouldn't do that.
At the same time,

Speaker 23 I think the government totally has a right to say, and again, what we're talking about here, like, I wouldn't think of this as like a model training thing.

Speaker 23 I would think of this as the sort of thing that can be relatively,

Speaker 23 like trivially easily changed by the developer, right? So models that are sold to the government already have compliance burdens that are significantly higher than this executive order, right?

Speaker 23 They have to comply with the Freedom of Information Information Act, they have to comply with the Presidential Records Act if they're sold to the White House.

Speaker 23 There's all sorts of data stewardship laws that are way more difficult than anything in the Woke AI executive order.

Speaker 23 The Woke AI executive order basically says, like, you need to disclose in the procurement process to the agency from whom you're procuring,

Speaker 23 you need to disclose like what the system prompt is. You can change a system prompt for a specific customer.
It's not that hard. And I would only point out that, like,

Speaker 23 I will just say it here right now: that, like,

Speaker 23 if you did try to use federal law to compel a developer to change the way they train the models that they serve to the public, that is unambiguously unconstitutional.

Speaker 23 It is a violation of the First Amendment. You are violating that company's speech rights and you are violating the American citizens' speech rights who might use that model.

Speaker 23 So it would be quite dire and grave for the government to do that. And I am confident that the woke AI executive order was not intended to do that.

Speaker 13 So, Dean, I really enjoy your newsletter. I've been reading it since before you joined the government.
I continue to read it today.

Speaker 13 And one point of view that you advocate for with great frequency is that most

Speaker 13 if not all, AI regulation should be done at the federal level.

Speaker 13 And you spend a lot of very valuable time looking into how states are attempting to regulate AI in ways that I think you believe are mostly bad.

Speaker 13 Could you kind of give us a high-level overview of your interest in this subject and what you see states doing that concerns you so much?

Speaker 23 Yeah, so I come from a state and local policy background, I should say. And so like my view is that a lot of the real governance in this country happens at the state and local level.

Speaker 23 And I mostly, now that I live in DC, I mostly say, thank God that that's the case. That being said, there are some things that inherently implicate interstate commerce.

Speaker 23 And I think that models which are trained to be served to the entire world, which cost a billion dollars to train,

Speaker 23 that the standards by which those models are trained and evaluated and measured, you know, I think those have to be federal standards because you can't have competing standards. Now,

Speaker 23 maybe we don't end up having competing standards. Maybe what happens is the biggest state regulates and that happens all the time in America.

Speaker 23 There's many, many technologies where the state of California or the state of New York or somewhere like that, Texas sometimes, has

Speaker 23 an implicitly federal effect, one state doing lawmaking.

Speaker 23 I think that's a failure mode.

Speaker 23 I think it's a structural issue of our Constitution that the founders couldn't really possibly have contemplated because, like, the notion of economies of scale didn't quite exist for them.

Speaker 23 And so, I think it's a really, really difficult issue of Supreme Court jurisprudence. Right now, it's the case that California, by default, is the central regulator of AI in America.

Speaker 23 Thus far, I think they've done a better job than I would have guessed, but still not a great job.

Speaker 23 So I was broadly supportive of their flagship AI bill from this year, which was called SB 53. It is a transparency bill that applies only to the largest developers of AI models.

Speaker 23 And to me, it seems rather reasonable overall.

Speaker 13 Bring it back to maybe some more like contemporary AI concerns, though, which is, you know, earlier when you were describing some of the kind of, you know, landscape in Washington and who's concerned about what, you mentioned there's this group of Republicans who are very concerned about

Speaker 13 chatbot psychosis, child safety, teen suicidality.

Speaker 13 Those are all harms that are present today that seem to be encouraged on some level by products that are out on the market.

Speaker 13 And we have a Congress that is very loath to pass really any regulation at all when it comes to the tech industry, whether that's for ideological reasons or just logistically, it's very difficult to get Republicans and Democrats to agree.

Speaker 12 Or the government's shut down half the time.

Speaker 13 That's also been increasingly an issue. And so in such a world, I can very much understand the point of view of a state lawmaker who says, well, I don't want the kids in my state to kill themselves.

Speaker 13 Like we're going to do something about this right now. And we're not as dysfunctional as the federal government.

Speaker 12 So we're going to get in there and we're going to try to do something.

Speaker 13 So how do you view that dynamic? And is your desire truly that the states would just say, hey, we're not going to get involved and that's on Congress?

Speaker 23 No. So, I mean, look, I understand the incentives of the state lawmakers, for sure.
I think Congress needs to act. Like, my view is more proactive.

Speaker 23 My view is that Congress needs to deal with this. This is a problem that Congress needs to deal with.
I don't blame the state lawmakers. Well, sometimes I do.

Speaker 23 Sometimes I blame them for poor statute drafting. There's no excuse for that, right? That's their job.

Speaker 23 Like, and I say this sometimes to legislators and they're like, well, we'll let the courts figure that out. And I say, no, you took an oath to the Constitution too, not just the judges.

Speaker 23 But in the general case of like, I want to protect kids in my state, no, of course I don't blame them for that.

Speaker 12 Yeah.

Speaker 12 I want to zoom out a little bit and ask a question about AI and polarization.

Speaker 12 It feels to me right now like AI is kind of in this weird, confusing, pre-polarized state.

Speaker 12 Like there's this sort of machine where, when an issue gets important enough or salient enough to enough people, it kind of gets run through the polarization machine, and it comes out the other side and Republicans take one position and Democrats take another position.

Speaker 12 Do you think something similar is going to happen with AI where like it will become very predictable

Speaker 12 which view you hold on AI based on which party you vote for?

Speaker 23 I think what's more likely is that over time

Speaker 23 it splinters and there's like different things that people talk about. So there's going to be data centers and there's going to be, you know, China competition.
That'll be an issue.

Speaker 23 And there'll be like the software side regulation. There'll be the kids' issues.
Just like today, you know, we don't talk about computer policy or internet policy.

Speaker 23 We talk about internet policy used to be a thing. In the 90s, internet policy was a thing.
But now it's like social media, you know, privacy, whatever else. I think it'll splinter in that way.

Speaker 23 Will those issues themselves be polarized? Yeah, probably, I mean, in some ways they will be. Yeah.
I do hope, though, and this is a very important part of the action plan, in my view, too, that not every single aspect of an issue has to be polarized.

Speaker 23 There are legitimate tail risk type events, national security issues that I think it is the obligation of the federal government to deal with in a mature and responsible way.

Speaker 23 I've heard Ezra Klein before,

Speaker 23 I love this turn of phrase of his.

Speaker 12 Who I've never heard of. Yeah, we're not familiar with his work.

Speaker 13 Yeah.

Speaker 23 I've heard him describe government as a grand enterprise in risk management. And I think that's true.
In a fundamental sense, I think that's very true.

Speaker 23 And so there are certain things that we just do need to deal with. And the action plan tries to make some incremental progress on some of those things.

Speaker 23 And of course, there's a lot of things we need to do to embrace the technology and let it grow and all that too. And I think that's an important part as well.

Speaker 23 But that's less controversial to say as a Republican. I think the maybe more controversial thing right now to say is like, yeah, there are like legitimate risks.

Speaker 23 And I hope those things can be bipartisan, that dealing with those risks can be bipartisan. Because, really,

Speaker 23 if we can't deal with catastrophic tail risk, then we do not have a legitimate government.

Speaker 12 Like,

Speaker 23 the whole point of government is to deal with this issue. And we should just, as Michael Dell said about Apple in the 90s, we should throw the thing out and return the money to the shareholders.
If

Speaker 23 we can't manage these things, I really do believe that.

Speaker 13 So, let's talk about that point specifically.

Speaker 13 When I look at AI policy in America today, I mostly see the big frontier labs getting just about everything they want, right?

Speaker 13 Like it seems like there is a high degree of alignment between the labs and the government.

Speaker 13 And when it comes to like safety restrictions, for example, I don't see a lot that is holding them back from, you know, building their next two or three frontier models.

Speaker 13 So there are components of the AI Action Plan that are meant to address some of those catastrophic risks that you mentioned. Tell us how you envision that actually working.

Speaker 13 Where is the moment where the industry stops getting everything that it wants?

Speaker 23 Well, I would say there's so much you can say here.

Speaker 23 I think the first thing is that many of the people who work at the frontier labs, I can't speak for the labs, of course, but knowing a lot of them personally, including up to very senior levels, I can say that they have an earnest desire to deal with these problems.

Speaker 23 And they invest real resources as companies. And part of the reason they do that is because they have incentives, because their companies would be bankrupt if they, E.G., caused a pandemic.
Right.

Speaker 23 And the other thing is that like a lot of these problems are super tractable. Like we don't have to act as though these things are like the hardest problems we've ever dealt with.

Speaker 23 To me, as someone with experience in public policy, and by the way, this is the posture of like people that I met in government who are 30-year veterans of thinking about tail risks.

Speaker 23 To them, you bring up like AI bio-risk or AI cyber risk and they're like, yeah, sounds like a serious risk. Okay, there's a hurricane that's tracking toward Florida.
Let me go deal with that, right?

Speaker 12 Like

Speaker 23 these things come across your desk every day when you're in government. These are eminently tractable problems in the near term.

Speaker 23 With current technology and technology that I think we're going to have in the near future, without spending a ton of money.

Speaker 23 There's a lot of traction you can get on them that doesn't involve really in any meaningful way slowing down AI development. I want to push back that there's this trade-off between

Speaker 23 sort of mitigating tail risks and slowing down AI development. Now, will that always be the case? No.
At some point, there will be trade-offs. We'll have to make those trade-offs and they'll be hard.

Speaker 23 And it's like hard for me to know where I'll come down on that because it'll depend on the particulars. But

Speaker 23 right now, we have this great opportunity of like, oh, we can accelerate AI development and we can also have better biosecurity, which by the way, was a problem before ChatGPT existed.

Speaker 23 There was a whole pandemic about it.

Speaker 12 So, like, yeah.

Speaker 12 Sometimes I talk to people who work on AI policy or just,

Speaker 12 you know, work on AI and think about policy, and they'll say things like, you know, I don't think we're going to get any meaningful AI regulation until there's a catastrophe.

Speaker 12 Do you, Dean, think that it will take something like that to really catalyze significant movement on AI policy in Congress?

Speaker 23 Possibly. I mean, like,

Speaker 23 I can't say that I like that. Certainly a catastrophe is plausible and could catalyze movement in Congress, for sure.

Speaker 12 I think there are other ways to achieve this.

Speaker 23 I really do. Like, I think you can make incremental advancements in the absence of a catastrophe.

Speaker 23 Now, it depends on like a lot of people in the AI safety community

Speaker 23 will say this, or people that are at labs who care about AI safety also, they will say this. That's like a very anthropic type of position.

Speaker 23 And I don't say that as a pejorative.

Speaker 12 I mean, to me, to be totally transparent, like I've heard this from people

Speaker 12 at lots of different labs where they're sort of like, yeah, I don't really think like we're capable of, and it's not so much a knock on like this particular Congress or anything.

Speaker 12 It's just like, I don't think the government is capable of regulating things in advance.

Speaker 23 I am okay with government being in a mostly reactive posture,

Speaker 23 particularly with respect to things that aren't tail risks. Tail risks are the one exception

Speaker 23 because

Speaker 23 those things can be very, very damaging. And so you want to do some stuff in advance to mitigate that.

Speaker 23 But when it comes to like most other harms from AI, I'm comfortable with government just really reacting to realized harms in areas where it's like, okay, well, it's a realized harm that we've seen.

Speaker 23 We think that's going to continue happening. It doesn't appear to be resolved adequately by the existing system of common law liability that allows people harmed to sue the people who harmed them.

Speaker 23 And it can be meaningfully addressed through a targeted law. And if all those conditions are satisfied, then we should totally pass that law.
I think kids' safety is in this category.

Speaker 13 Yeah.

Speaker 12 Yeah.

Speaker 12 Well, Dean, thanks so much for coming. Really fascinating conversation.
And people should check out your writing. Your Substack is Hyperdimensional.

Speaker 23 It was a real pleasure, guys. Thank you.
Thank you.

Speaker 12 Thanks, Dean.

Speaker 13 When we come back, we'll have more to say about the Canadian fur trade than we've ever said before.

Speaker 12 It was not the Canadian fur trade, it was the upstate New York sugar trade. They're related in ways I don't understand.

Speaker 22 This podcast is supported by AT&T. America's first network is also its fastest and most reliable.

Speaker 22 Based on RootMetrics United States RootScore Report, 1H 2025, tested with best commercially available smartphones on three national mobile networks across all available network types. Your experiences may vary.

Speaker 22 RootMetrics rankings are not an endorsement of AT&T. When you compare, there's no comparison.
AT&T.

Speaker 12 Well, Scooby Gang, it's time to get in the old mystery machine because today we've got a mystery.

Speaker 13 That's right, gum shoes. Grab your notebook and your magnifying glass because there are a few clues and we're about to crack the case wide open.

Speaker 12 And this one is a history mystery. It involves an experiment that a historian ran using an AI model.
And we're going to talk about it all with the historian in just a second.

Speaker 12 But Casey, to set the scene here a little bit, there are a lot of rumors going around right now about this new Google Gemini 3 model.

Speaker 13 There really are.

Speaker 13 Gemini 2 came out almost exactly a year ago, last December.

Speaker 13 And while Google has updated it throughout the year, we have been hearing an increasing number of whispers this fall about Gemini 3 and rumors that it really is pretty great.

Speaker 13 So Alex Heath reported a few weeks back that he expected Gemini 3 to come out in December. And one thing that happens in the run-up to the release of new models is that companies quietly test them.

Speaker 13 And that brings us to our story today.

Speaker 12 Yes. So Mark Humphries is a history professor at Wilfrid Laurier University in Ontario, Canada.

Speaker 12 He does research involving a lot of old documents and trying to decipher the handwriting on these documents. And he is also kind of an AI early adopter.

Speaker 12 He's got a Substack called Generative History, where he's been writing about his experiments using AI to solve some of his research problems.

Speaker 12 And recently he had a post that really caught our attention, called "Has Google Quietly Solved Two of AI's Oldest Problems?"

Speaker 12 in which he explained a really fascinating experiment that he ran using one of these kind of test models inside Google's AI Studio, which is a Google product where you can kind of experiment with different models.

Speaker 12 And he says that the responses that he got back from this mystery model made the hair on the back of his neck stand up.

Speaker 12 Like this was so astounding to him, not just because they were very good, but because they seemed like a different kind of capability than ones he had seen in any other AI model.

Speaker 13 Yeah. And so the mystery is what model was Mark using?

Speaker 13 But I think the bigger story is what does it mean that this historian was as impressed as he was with this very unusual thing that he found a large language model doing?

Speaker 12 Yes, and we should say, like, it is very hard to determine exactly which model anyone is sort of being shown at any given time, the way these pre-release tests go.

Speaker 12 Companies will, you know, show 1% of users one model and another 1% of users a different model and kind of ask them to compare the two.

Speaker 13 And they give them weird code names. They don't tell you what you're using.
Exactly.

Speaker 12 So there's still some uncertainty around this. This may have just kind of been a one-off.
We will obviously need to see what Gemini 3 actually does when it comes out.

Speaker 12 But for now, I think this is a very interesting story because it points to the way that these AI models are starting to do things that surprise even experts in their fields.

Speaker 13 Yes. And so for those reasons, it's time to bring in Mark Humphries and talk about what he found.
Kevin, you know the difference between an American and a Canadian historian?

Speaker 12 What's that?

Speaker 13 Canadian historians process "dat-a" while American historians process "day-ta."

Speaker 12 Is that true? Yeah, that's true.

Speaker 12 Well, let's talk to Mark, and he can pronounce it however he wants.

Speaker 13 Hell yeah, brother.

Speaker 12 Mark Humphries, welcome to Hard Fork.

Speaker 24 Thanks for having me.

Speaker 12 Where are we catching you today? Are you up in Canada? What's going on up there?

Speaker 24 I am. I'm in Waterloo, Ontario, in Canada, in my office at Wilfrid Laurier University.

Speaker 13 So Waterloo, so you must just be surrounded by AI computer scientists at all times.

Speaker 24 There are a lot of startups and a lot of AI researchers and a lot of computer companies in Waterloo. Yes.

Speaker 13 Home of the BlackBerry. That's right.

Speaker 24 That's right. Yes.
RIM Park.

Speaker 12 So before we get into the specifics of your most recent brush with this new mystery AI model, can you just tell us how you've been using AI in your history research over the last year or so?

Speaker 24 Sure. So my research partner, Lianne Leddy, whose lab this all comes out of as well, and I have been working on trying to develop ways of processing huge amounts of data, mostly handwritten, related to the fur trade. And that involves a couple of things.

Speaker 24 It involves trying to recognize the handwriting accurately, but it also involves trying to generate metadata for tens of thousands of records, to try and understand what's in those records and make connections between them.

Speaker 24 So we're kind of operating on tasks that are just at the threshold of what AI models are capable of doing.

Speaker 24 So it's been kind of interesting to watch, over the last couple of years, the models get better and become capable of doing some of these things, and then finding out new limitations as we go along.
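
To make those two tasks concrete, here is a minimal sketch of the kind of record structure and linking step a pipeline like the one Mark describes might target. The fields and the function are illustrative guesses based on his description, not his lab's actual schema or code:

    from dataclasses import dataclass, field

    @dataclass
    class FurTradeRecord:
        """One archival document plus machine-generated metadata (illustrative fields only)."""
        source: str                          # archive and folio reference
        transcription: str                   # the model's reading of the handwriting
        record_type: str                     # e.g. "ledger", "contract", "baptismal"
        people: list[str] = field(default_factory=list)
        places: list[str] = field(default_factory=list)
        date: str = ""                       # as written; often partial or missing

    def link_by_person(records: list[FurTradeRecord]) -> dict[str, list[FurTradeRecord]]:
        """Group records that mention the same name: the 'connections' step Mark describes."""
        index: dict[str, list[FurTradeRecord]] = {}
        for rec in records:
            for name in rec.people:
                index.setdefault(name, []).append(rec)
        return index

The point of the metadata step is exactly this kind of join: once names, places, and dates are extracted, a purchase in one ledger can be lined up with a contract or a baptismal record somewhere else.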

Speaker 13 And tell us a little bit about the kind of work that you do in general. I know you're really focused on using older documents in your work.

Speaker 13 What kind of stories are you trying to put together?

Speaker 24 Yeah, so,

Speaker 24 you know, I've always been really interested in stories of ordinary people.

Speaker 24 So in the fur trade, when you're trying to understand, you know, what happened to ordinary people in the 18th and the 19th centuries, the problem is many of them weren't literate, didn't write.

Speaker 24 And although they kind of appear in a lot of documents that are generated in the course of living, marriage and death records, account books, stuff like that, it's a lot of detective work.

Speaker 24 It's a lot of trying to piece together stories from fragmented documents, what somebody bought in one place, a contract they signed somewhere else, a baptismal record somewhere else.

Speaker 24 And so a lot of this is trying to do that. And that's what Dr. Leddy and I have been trying to do with our graduate students: to piece together what these stories about ordinary people can tell us about the fur trade in the western part of North America, from the period of about 1760 through until the early 19th century.

Speaker 13 You know, it's interesting, Kevin, because every time I go to a Starbucks and they try to give me a receipt, I think, I don't need any paperwork about what just happened here, right?

Speaker 13 I'm just going to take my Mocha and get out of here. But what Mark is saying is that that document could be of huge value to a future historian in understanding our lives.
Exactly.

Speaker 12 Yes, they will want to know.

Speaker 13 Let's get into it.

Speaker 12 So, Mark, tell us about this

Speaker 12 experience that you had with Gemini, the AI model that you were trying to use for this transcription, basically taking this very old document about the fur trade and

Speaker 12 plugging it in and saying, transcribe this, tell me what this says.

Speaker 24 Yeah, so, you know, I think to understand why this is kind of a significant or looks like it could be a fairly significant development, it's important to understand kind of where we've come from in the last two years on this, right?

Speaker 24 So when GPT-4 first came out in 2023, it could kind of sort of read handwritten documents. It would be mostly errors, but you could kind of see that it was beginning to be able to do this.

Speaker 24 And it's been really easy for companies and systems to get up to about 90% accuracy. And then everything above 90% has been pretty difficult.

Speaker 24 And the problem is that that last 10% is the most important part, right?

Speaker 24 So that if you're interested in people's names, you're interested in amounts of money, you're interested in where they were, you've got to get that stuff right in order to make it useful.

Speaker 24 And up till about, I guess,

Speaker 24 you know, when Gemini 2.5 Pro came out back last spring, we were kind of still in that era. And Gemini 2.5 Pro got up to about 95% accuracy.
And that's really good.

Speaker 24 So what I was interested in is when we began to see reports kind of on X that

Speaker 24 there were... you know, new models being tested by Google in AI Studio, which is their kind of playground app.

Speaker 24 I was just curious, how much better would this get?

Speaker 12 So, okay, you are hearing these rumors that there's this new mystery model inside the AI studio that Google tests new models in before they're released. What do you do?

Speaker 24 Yeah, so

Speaker 24 Dr. Leddy and I have a corpus of 50 different documents that we've been using to benchmark how these models improve over time.

Speaker 24 They're all documents that we are pretty sure are not in the training data because we've either taken them ourselves or they've been kind of from sources that are not typically online.

Speaker 24 And you can't be 100% sure, but it seems to be the case. So I started to put a few of those documents in.

Speaker 24 And for your listeners who are maybe not aware, the way that the testing of these types of models often works is you have to put in the document dozens of times before you get a hit on the model you're hoping to test, because it kind of randomly pops up.

Speaker 24 So it's not an easy thing to do. I managed to test about, you know, five of our 50 examples, about a thousand words.

Speaker 24 And

Speaker 24 The results were impressive, to say the least, in the sense that the error rate again declined by about 50% from where it had been with Gemini 2.5 Pro.

Speaker 24 And it got to about a 1% word error rate, which means, obviously, you're getting one in every 100 words wrong, but that can include capitalization errors, punctuation, stuff like that.

Speaker 24 So that in itself is really significant. No models come close to that.

Speaker 24 Human experts who do transcription for a living are at about a 1% error rate. So that itself is fairly important.
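
For context on the metric: word error rate is conventionally the word-level edit distance (substitutions, insertions, deletions) between the model's transcription and a reference transcription, divided by the reference length. A minimal sketch of the computation, ours rather than Mark's benchmark code:

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Levenshtein distance over words, divided by reference length."""
        ref, hyp = reference.split(), hypothesis.split()
        prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
        for i, r in enumerate(ref, 1):
            curr = [i]
            for j, h in enumerate(hyp, 1):
                curr.append(min(prev[j] + 1,              # deletion
                                curr[j - 1] + 1,          # insertion
                                prev[j - 1] + (r != h)))  # substitution (free if words match)
            prev = curr
        return prev[-1] / len(ref)

    # One wrong word out of four is a 25% WER; at the rate Mark reports, it is 1 in 100.
    print(word_error_rate("To 1 Loaf Sugar", "To 1 Loan Sugar"))  # 0.25

Note that if the comparison is run on raw tokens, a capitalization or punctuation difference counts as a substitution, which is why Mark folds those into the 1%.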

Speaker 13 And your sense of having used this new experimental model, did that just come from you're inputting dozens and dozens of queries and every once in a while you would just get a result that was radically better than the others and you thought, aha, I must be getting the new one?

Speaker 13 Or were there any other signs about what Google was showing you?

Speaker 24 Well, it's A-B testing. So what that means is normally in AI Studio, you put in a query and you get a response.

Speaker 24 And when you get the A-B test, you get two responses and it asks you to rate which one's better, right? And

Speaker 24 the labs do this in order to get feedback on, you know, is a model actually better on specific types of tasks than other ones, right?

Speaker 24 So you might have to do that 20 or 30 times until the model you're hoping to test shows up as one of those two responses. And then the differences were pretty notable.
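
Mechanically, what the lab collects from those prompts is a stream of pairwise preferences. A minimal sketch of rolling such votes up into per-task win rates; this illustrates the general idea only, not Google's actual pipeline, and the model and task names are invented:

    from collections import defaultdict

    # Each A-B prompt yields one vote: (task type, winner, loser). Invented data.
    votes = [
        ("handwriting", "candidate-x", "production"),
        ("handwriting", "candidate-x", "production"),
        ("handwriting", "production", "candidate-x"),
        ("summarization", "production", "candidate-x"),
    ]

    wins: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
    totals: dict[str, int] = defaultdict(int)
    for task, winner, _loser in votes:
        wins[task][winner] += 1
        totals[task] += 1

    for task, counts in wins.items():
        for model, n in counts.items():
            print(f"{task}: {model} preferred in {n}/{totals[task]} comparisons")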

Speaker 12 So you said the overall error rate fell by about 50%,

Speaker 12 but that was not actually what impressed you the most about this new model. What impressed you the most?

Speaker 24 Yeah. So first of all, that was, you know, impressive.
And then I was curious, okay, if it's gotten to this point, how is it going to do on tabular data?

Speaker 24 And as historians, one of the things you work with, you know, to go back to your Starbucks example, are receipts and ledgers that come from, you know, merchants in the past.

Speaker 24 And a lot of that's fairly boring.

Speaker 24 But if you want to know where somebody is, where they bought their coffee one morning, and you want to trace that person's movements, you can use these types of documents to do that.

Speaker 24 You can see what they bought and all of those types of things. The thing is, to this point, models have been pretty bad on tabular data.

Speaker 24 It's often kept kind of like a cash register receipt system is kept. So it's kind of just on the fly, and nobody's expecting people to necessarily read it down the road.

Speaker 24 So it's difficult to interpret just by looking at it. It's also sometimes quickly written.
So it's even worse handwriting than people are used to.

Speaker 24 And because these are historical documents, in this case I'm dealing with records from 18th-century New York State, upstate New York, in Albany. And those records are written in pounds, shillings, and pence. So that's the old system, a different base than we're used to using, twelve pence to a shilling and twenty shillings to a pound, basically a different form of currency measurement. And so when I dropped in a page, just kind of at random from this ledger, I was just curious to see what I'd get back.

Speaker 24 And suddenly, it not only came back in a near-perfect transcription, which itself was kind of remarkable given how difficult it is to make sense of what's actually on the page.

Speaker 24 But as I started to go through it, I was looking for errors. I was trying to find errors.

Speaker 24 And I began to realize that some of the things that I was seeing on there that looked like errors were actually clarifications and they required the model to do some really interesting things.

Speaker 12 Give us an example.

Speaker 24 Sure. So in the actual ledger document, right, what we're dealing with is a series of kind of entries that are made in a day book.

Speaker 24 So this is as people come into a store, they're buying things, and it's being recorded just like on a cash register receipt.

Speaker 24 And in the one case that I was in particular looking at here, what it basically says

Speaker 24 in one of the entries is Samuel Slit came in on the 27th of March, and it says: to one loaf of sugar, 14 5, at 1/4, 0 19 1.

Speaker 24 And what that means when you actually break it all out is that this guy named Samuel Slit came into the store. He bought one loaf of sugar.

Speaker 24 If you're not aware, in the 18th century, sugar comes in hard conical shapes and they break off pieces and they sell it to you. And it says 14 5, sold at one shilling, four pence per pound. And then the total is zero pounds, 19 shillings, and one pence.
And this is the old kind of notation, right?

Speaker 24 And what I saw in the actual model's response, though, was that it had figured out that, in fact, it was one loaf of sugar weighing 14 pounds, five ounces, sold at one shilling, four pence per pound, and then the total, right?

Speaker 24 And what's significant about that is that in order to figure out that what was written on the page, this random number, 14 5, was actually 14 pounds and five ounces, the model had to be able to work backwards from a different currency system with a different base.
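
To check the arithmetic the model would have had to do: at 16 ounces to the pound of weight, 12 pence to the shilling, and 20 shillings to the pound of money, 14 lb 5 oz at 1s 4d (16 pence) per pound comes to 229 pence, which is 19s 1d. A short sketch of both directions, the backward one being the inference Mark is describing; this is our illustration of the arithmetic, not anything from his lab:

    # Old English money: 12 pence (d) per shilling (s), 20 shillings per pound.
    # Avoirdupois weight: 16 ounces (oz) per pound (lb). Sugar is priced per lb.
    PENCE_PER_SHILLING, SHILLINGS_PER_POUND, OZ_PER_LB = 12, 20, 16

    def to_pence(pounds=0, shillings=0, pence=0):
        return (pounds * SHILLINGS_PER_POUND + shillings) * PENCE_PER_SHILLING + pence

    def from_pence(d):
        pounds, rest = divmod(d, SHILLINGS_PER_POUND * PENCE_PER_SHILLING)
        return pounds, *divmod(rest, PENCE_PER_SHILLING)

    price = to_pence(shillings=1, pence=4)                # "at 1/4": 16d per lb

    # Forward: 14 lb 5 oz at 16d per lb -> total in pounds, shillings, pence.
    weight_oz = 14 * OZ_PER_LB + 5                        # the bare "14 5" in the ledger
    total = weight_oz * price // OZ_PER_LB                # 229 pence
    print(from_pence(total))                              # (0, 19, 1) -> 0 19 1

    # Backward, what the model inferred: given price and total, recover the weight.
    print(divmod(total * OZ_PER_LB // price, OZ_PER_LB))  # (14, 5) -> 14 lb 5 oz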

Speaker 24 The thing that makes that important is that models shouldn't be able to do that, right? That these models are basically

Speaker 24 the way they're trained is in pattern recognition. What they're trying to do is they're trying to predict the next token.

Speaker 24 And so, the first problem here is that predicting numbers is actually very difficult for models to do, right?

Speaker 24 In the sense that the model has no idea whether Samuel Slit is buying 14 pounds five ounces or 13 pounds six ounces, right? I mean, that's effectively a random number. It's not something you can predict from patterns alone.

Speaker 24 The other problem is that although there would be, you know, a lot of material in the training data that relates to this kind of old currency system, the reality is there's not that much of it as a percentage of the material that's there, because so little of this survives in the overall sum total of all the records that exist.

Speaker 24 And so when we're thinking about it, the model is having to do some interesting things there. What it looks like to me is a form of symbolic reasoning: I have to know in my head that I'm dealing with different units of measurement which don't have a common base to multiply or divide by, and then I have to abstractly realize that these units of measurement are, in fact, comparable as long as we do some conversions.

Speaker 24 And we have to then, you know, move them around in our heads to figure it out. This is something I had to think about for a second before realizing that, in fact, the model had done something that was mathematically correct and unexpected.

Speaker 13 So, what are the implications for you in your work of a model being able to do this kind of abstract reasoning?

Speaker 24 Yeah, and so as an historian, what it means is that

Speaker 24 assuming that this replicates once we start to see the actual model come out,

Speaker 24 you're going to be able to trust the models to do a lot of stuff that historians would normally need to do, right? So it's one thing to transcribe a document.

Speaker 24 It would be another to say, here's a ledger, go through and add up all the sugar that was bought and sold in this ledger. And right now, you can't trust a model to do anything like that, right?

Speaker 24 You can't trust it to necessarily recognize sugar, come up with quantities, do that type of math.
If we're getting to a point where models can begin to do that,

Speaker 24 you can begin to get them to do tasks that would take humans a very long time.
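
To make that kind of task concrete: once a model has turned ledger lines into structured rows, the roll-up Mark describes is mechanical, though it still involves base-12, base-20, and base-16 arithmetic. A minimal sketch with invented rows; only the first comes from the entry discussed above:

    # Hypothetical extracted rows: (item, weight as (lb, oz), total as (pounds, s, d)).
    entries = [
        ("sugar", (14, 5), (0, 19, 1)),   # Samuel Slit, 27 March
        ("sugar", (3, 8), (0, 4, 8)),     # invented second purchase at the same 16d/lb
        ("tea",   (1, 0), (0, 7, 6)),     # invented, ignored by the sugar roll-up
    ]

    sugar_oz = sum(lb * 16 + oz for item, (lb, oz), _ in entries if item == "sugar")
    sugar_d = sum((p * 20 + s) * 12 + d for item, _, (p, s, d) in entries if item == "sugar")

    lb, oz = divmod(sugar_oz, 16)
    pounds, rest = divmod(sugar_d, 240)   # 240 pence to the pound
    shillings, pence = divmod(rest, 12)
    print(f"sugar sold: {lb} lb {oz} oz for {pounds} pounds {shillings}s {pence}d")
    # -> sugar sold: 17 lb 13 oz for 1 pounds 3s 9d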

Speaker 12 Right.

Speaker 12 It sort of sounds like the equivalent of the moment where like AI coding tools went from being a useful assistant for a person who's a professional programmer to like actually being able to go out and program things just on their own with very minimal instruction.

Speaker 12 It's like that for history, right?

Speaker 24 Yeah, and I think that's a really good example.

Speaker 24 But I think that the interesting thing about history here is that I think it's a very typical kind of knowledge work kind of area, right?

Speaker 24 In the sense that a lot of the stuff we're doing is pretty esoteric, and your listeners will probably be wondering, you know, who's really interested in how much sugar people bought in Albany in the 18th century.

Speaker 12 Well, Casey is, but he's a special case. Yeah.

Speaker 24 Yeah, that's fair.

Speaker 13 I'm really interested in this Samuel Slit and why he needed 14 pounds of sugar. Like, take it easy, Sam.

Speaker 24 It's true. Well, he's a merchant.
He also wants to go and sell it to other people, right?

Speaker 13 Oh, he's a dealer. There we go.
Now he's a good man.

Speaker 12 He is a sugar dealer.

Speaker 24 But the interesting thing about this, I think, right, is that the stuff we do as historians with these historical records, this is what all knowledge workers do, right?

Speaker 24 Is that you take information and you synthesize it.

Speaker 24 You take it from one format, you put it into another, you realize the implications of the things that you're reading and you draw conclusions and analysis based on that, right?

Speaker 24 And it can be 18th century sugar, but it can very easily be any other kind of widget that a knowledge worker uses. So what I'm seeing turning on here

Speaker 24 for historians is highly likely to start turning on in other areas as well.

Speaker 24 Up to this point, we've been getting the sense that the models are starting to get good enough that you can feel like, yeah, I think I can trust the outputs on this. But you're getting to the point where it just works.

Speaker 24 And as somebody who uses coding assistants all the time now, that is, it's a very similar situation where you used to have to cut and paste back and forth and it would, you know, it would never run the first time.

Speaker 24 You'd have to run it, you know, three or four times, paste the errors back and forth, and eventually it would work. And now you can just kind of hit the button and it almost always works.
Speaker 24 Right. And that's what you're going to see here with knowledge work.

Speaker 13 So I want to zero in on what makes this so interesting. We don't know at this moment that this is Gemini 3, but I think Kevin and I feel like it's highly likely to be Gemini 3, right? And we also don't know, if it is Gemini 3, exactly how it was trained. But I think we can assume that it was trained the way its predecessors were, which was in part by just feeding it lots more data and lots more compute, right? Just sort of following the scaling laws. And there's been so much debate over the past year about whether we are seeing diminishing returns, right?

Speaker 13 Have we sort of figured out the limits of what we can get out of these scaling laws?

Speaker 13 The story that you're telling us, Mark, is a suggestion that no, we have not gotten everything that there is to be gotten out of this increased scaling.

Speaker 13 And in fact, we should expect to see continued emergent properties from this ongoing scaling. And you've just given us an example of it right there.
So that's why I think this is so fascinating.

Speaker 12 Yeah, and I was fascinated by this experiment and I wanted to see if I could actually get to the bottom of what happened here.

Speaker 12 So I asked some folks who would be in a position to know, like, hey, there's this history professor in Canada. He thinks he, like, stumbled onto this unreleased Gemini 3 A-B test, and it was really good.

Speaker 12 And they said, lose my number. No,

Speaker 12 they were very tight-lipped. They did not want to talk about it.
They are keeping things very secretive over there. But I was able to confirm that Google does test new models in AI Studio

Speaker 12 before they sort of appear elsewhere. And so I think if I were a betting man,

Speaker 12 it's a pretty good bet that what you experienced was in fact

Speaker 12 an unreleased model, probably Gemini 3.

Speaker 13 So, Kevin, I have not been in the AI Studio myself recently to see if I could try this model. Have you made any efforts to try to access whatever this model is?

Speaker 12 Yes. So I use AI Studio.
People don't know this, but, like, Google has, like, you know, 800 AI products right now. There are, like, you know, a billion ways to use Gemini.

Speaker 12 And the most...

Speaker 12 effective way, the best way to use Gemini is inside this product that basically no one except developers and nerds like us uses, which is called the Google AI Studio.

Speaker 12 And if you go in there, I don't know, for whatever reason, Mark, do you find this too? But like the model, like the version of Gemini in AI Studio is better than the one like on the web.

Speaker 12 I don't know why.

Speaker 12 But I'm consistently able to get AI Studio to do things, like transcribing long interviews, that the regular old Gemini won't do.

Speaker 12 So anyway, I was in there this morning actually doing some research for our segment about

Speaker 12 Suncatcher, this, like, Google project about putting AI stuff in space. And I was trying to have it summarize this research paper and give me some ideas and comparisons to what other companies are doing.

Speaker 12 And I got this A-B test, this, like, you know, choose between these two answers. And

Speaker 12 I am looking at it right now. It says, which response do you prefer? And it has these two side-by-side things.

Speaker 12 And they basically both look pretty good.

Speaker 12 I think the problem I'm identifying, Mark, is that unlike you, I am not smart enough to come up with, like, problems that are challenging enough where the difference between one pretty good model and a very good model is readily apparent.

Speaker 12 So maybe you can help me with that.

Speaker 13 Well, I mean, here's an idea. I know, you know, Mark really focuses on the 1700s and the 1800s in the fur trade.
What about the 1500s?

Speaker 13 I bet you could make a dent.

Speaker 12 Yeah. Well, I will, I'll look into that.
All right.

Speaker 12 Well, totally fascinating experience. And I can't wait to hear more about what you're doing with AI and history.
This is a really interesting mystery that I hope we've shed some light on.

Speaker 12 Thank you, Mark.

Speaker 24 Thank you very much for having me.

Speaker 2 This episode is supported by Blockstars, a podcast from Ripple.

Speaker 4 Join Ripple for blockchain conversations with some of the best in the business.

Speaker 3 Learn how traditional banking benefits from blockchain, or how you're probably already using blockchain technology without even realizing it.

Speaker 2 Join Ripple and host David Schwartz on Blockstars, the podcast.

Speaker 8 Crypto investments are risky and unpredictable. Please talk to a financial expert before you make any investment decisions.

Speaker 10 This is not a recommendation by NYT to buy or sell crypto.

Speaker 22 This podcast is supported by AT&T. America's first network is also its fastest and most reliable.

Speaker 22 Based on the RootMetrics United States RootScore Report, 1H 2025, tested with the best commercially available smartphones on three national mobile networks across all available network types. Your experiences may vary.

Speaker 22 RootMetrics rankings are not an endorsement of AT&T.

Speaker 22 When you compare, there's no comparison. AT&T.

Speaker 15 Over the last two decades, the world has witnessed incredible progress.

Speaker 3 From dial-up modems to 5G connectivity, from massive PC towers to AI-enabled microchips, innovators are rethinking possibilities every day.

Speaker 17 Through it all, Invesco QQQ ETF has provided investors access to the world of innovation with a single investment.

Speaker 6 Invesco QQQ, let's rethink possibility.

Speaker 18 There are risks when investing in ETFs, including possible loss of money.

Speaker 15 ETFs' risks are similar to those of stocks.

Speaker 21 Investments in the tech sector are subject to greater risk and more volatility than more diversified investments.

Speaker 21 Before investing, carefully read and consider fund investment objectives, risks, charges, expenses, and more in the prospectus at invesco.com.

Speaker 20 Invesco Distributors, Incorporated.

Speaker 13 Hard Fork is produced by Rachel Cohn and Whitney Jones. We're edited by Jen Poyant.
Today's show was fact-checked by Will Peischel and was engineered by Chris Wood.

Speaker 13 Original music by Elisheba Ittoop, Marion Lozano, and Dan Powell. Video production by Sawyer Roque, Pat Gunther, Jake Nicol, and Chris Schott.

Speaker 13 You can watch this whole episode on YouTube at youtube.com/hardfork. Special thanks to Paula Szuchman, Pui-Wing Tam, Dalia Haddad, and Jeffrey Miranda.

Speaker 13 You can email us at hardfork@nytimes.com with what else you think we should build in space.

Speaker 22 We've all done it: stocked our fridge with good intentions, only to sacrifice nutrition for convenience.
Keep your body and mind nourished with whole-body meal shakes from Ka'Chava.

Speaker 22 It's got 25 grams of protein, 6 grams of fiber, greens, and so much more. But it actually tastes delicious.
Try one of Ka'Chava's indulgent flavors today.

Speaker 22 Shop now through December 2nd to get 30% off your first purchase of two or more bags. Go to kachava.com and use code NYT.
That's K-A-C-H-A-V-A.com, code NYT.