Meet AlphaEvolve: The Autonomous Agent That Discovers Algorithms Better Than Humans With Google DeepMind’s Pushmeet Kohli and Matej Balog

Much of the scientific process involves searching. But rather than continue to rely on the luck of discovery, Google DeepMind has engineered a more efficient AI agent that mines complex spaces to facilitate scientific breakthroughs. Sarah Guo speaks with Pushmeet Kohli, VP of Science and Strategic Initiatives, and research scientist Matej Balog at Google DeepMind about AlphaEvolve, an autonomous coding agent they developed that finds new algorithms through evolutionary search. Pushmeet and Matej talk about how AlphaEvolve tackles the problem of matrix multiplication efficiency, scaling and iteration in problem solving, and whether or not this means we are at self-improving AI. Together, they also explore the implications AlphaEvolve has for other sciences beyond mathematics and computer science.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @pushmeet | @matejbalog

Chapters:

00:00 Pushmeet Kohli and Matej Balog Introduction

00:48 Origin of AlphaEvolve

02:31 AlphaEvolve’s Progression from AlphaGo and AlphaTensor

08:02 The Open Problem of Matrix Multiplication Efficiency

11:18 How AlphaEvolve Evolves Code

14:43 Scaling and Predicting Iterations

16:52 Implications for Coding Agents

19:42 Overcoming Limits of Automated Evaluators

25:21 Are We At Self-Improving AI?

28:10 Effects on Scientific Discovery and Mathematics

31:50 Role of Human Scientists with AlphaEvolve

38:30 Making AlphaEvolve Broadly Accessible

40:18 Applying AlphaEvolve Within Google

41:39 Conclusion

Press play and read along

Runtime: 42m

Transcript

Speaker 1 Hi listeners and welcome back to No Priors. Today we're joined by two of the key folks behind one of the most compelling developments in AI this year, AlphaEvolve.

Speaker 1 Pushmeet Kohli and Matej Balog worked on this autonomous coding agent that uses Gemini models and evolutionary search to discover new algorithms.

Speaker 1 It marks a major leap in AI's ability to contribute to core computer science and math, and perhaps sciences beyond that. It's not just a stochastic parrot or a boilerplate generator.

Speaker 1 It has shown what you might consider technical creativity in the way that Move 37 did with AlphaGo, something humans hadn't done before, even in thousands of years of play.

Speaker 1 It might even be a real step on the path to self-improving AI. Pushmeet, Matej, thank you so much for being here.

Speaker 2 Thank you for having us. It's a pleasure.

Speaker 1 Congratulations on the success and the launch of AlphaEvolve. Can you give me a brief description of what it is broadly?

Speaker 2 Yeah, so in maybe one sentence, AlphaEvolve is an AI coding agent that is able to discover new algorithms and make new discoveries on open scientific problems.

Speaker 2 And at the same time, those algorithms can be so practical that they are already deployed in key parts of Google's own infrastructure.

Speaker 1 And what is the origin story of working on this particular form of coding agent or this problem statement?

Speaker 2 So we are not new to this space of algorithm discovery. As you might know, the mission of all of DeepMind is to build AI responsibly to benefit humanity.

Speaker 2 And the way our particular team has been doing it for years now is to look for ways how AI can discover new algorithms. New algorithms are everywhere around us.

Speaker 2 So, this is a very, very important question and can have very high impact when we can discover algorithms that solve important computational problems with higher efficiency than what we have been able to do so far.

Speaker 2 And kind of the first breakthrough we had in this space was in 2022 when we released a system called Alpha Tensor.

Speaker 2 And that was an AI system using reinforcement learning which, for a very specific but fundamental computational task, multiplying matrices, showed for the first time that AI agents can discover better algorithms than what humans had been able to do before them.

Speaker 2 So this was the first system that gave weight to this idea that indeed with AI, we'll be able to go into the superhuman region of algorithms that we as humans have not been able to discover ourselves.

Speaker 1 How do you differentiate AlphaEvolve from like Alpha Tensor and FunSearch and some other projects in the sort of lineage of this?

Speaker 2 One way to describe what we have done is to look back at the history of DeepMind and at a number of projects that came even before we started working on computer science.

Speaker 2 If we go back to the project on AlphaGo, where the AlphaGo agent was able to beat the world champion in the game of Go, the remarkable thing about that agent was that it was able to explore the amazingly large search space of all possible Go positions in such an efficient manner that it could come up with the optimal move at each point.

Speaker 2 And that really surprised people, both Go professionals as well as scientists. Scientists believed that event would come much, much later because it was a very hard problem.

Speaker 2 And so what that gave evidence for is the ability of these large-scale, neural-network-based systems to reason and do very efficient exploration in these large search spaces and come up with amazing new insights about the particular domain. In the game of Go, there is this move called Move 37, a very creative new move that the agent discovered that was not in the Go literature, right?

Speaker 2 That really surprised the Go professionals.

Speaker 2 So in some sense, we asked ourselves the question: if you have an agent which can do very efficient search in the domain of Go, why can't you use the same kind of philosophy to search in the space of algorithms?

Speaker 2 And in fact, that was the underlying basis of our first attempt at that problem, which culminated in AlphaTensor.

Speaker 2 So the way we structured the algorithmic discovery problem is that we looked first at a very important problem.

Speaker 2 And that problem was matrix multiplication. It is a problem that is ubiquitous in computer science.

Speaker 2 It's one of the key fundamental operators that underlies not only computer science, but also neural networks and machine learning

Speaker 2 and AI. We said, can we find a way to improve matrix multiplication algorithms?
So there's a history of matrix multiplication, which is quite interesting in its own right.

Speaker 2 Even though it's such a fundamental operator, people thought that the complexity, the time it takes to multiply two matrices, is cubic.

Speaker 2 And around 50 years back, more than 50 years back now, a German mathematician, Strassen, came up with this very counterintuitive construction, which showed that in fact the complexity is not n to the power of 3, not cubic, where n is the dimensionality of the matrix; it is lower.
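For readers who want to see what the speakers are referring to, here is a minimal Python sketch (ours, not from the episode or from AlphaTensor) of Strassen's 2x2 scheme: it multiplies two 2x2 matrices with seven multiplications instead of the naive eight, and applying that saving recursively to block matrices is what pushes the exponent below 3, to roughly n^2.81. The function name and test values are purely illustrative.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 multiplications (Strassen's scheme)."""
    (a, b), (c, d) = A          # A = [[a, b], [c, d]]
    (e, f), (g, h) = B          # B = [[e, f], [g, h]]

    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)

    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4,           m1 - m2 + m3 + m6]]

# Sanity check against the naive 8-multiplication product.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
naive = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
assert strassen_2x2(A, B) == naive
```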

Speaker 2 But that result stood for more than 50 years, until AlphaTensor came along and we said, well, can we actually improve this result? And remarkably, we were able to show that AlphaTensor,

Speaker 2 by having this amazing ability to do search in this very large space, even much larger than the space of possible Go moves,

Speaker 2 was able to come up with this amazing new algorithm which improved things. But then the question was: well, we now have proved the thesis that you have these

Speaker 2 super intelligent agents which can go beyond what human computer scientists have been able to do, but can we generalize them?

Speaker 2 AlphaTensor was very smart, but it was purposefully constructed only for the matrix multiplication problem.

Speaker 2 Can we build an agent that is more general, both more general in the sense of it can handle more

Speaker 2 general problems, but can also

Speaker 2 search in the space more naturally, in the space of

Speaker 2 programs rather than in the space of very specific operations that were required for matrix multiplication. And that was the origin of sort of the first

Speaker 2 attempt

Speaker 2 of ours with FunSearch, which was an LLM-based agent which, for the first time, by searching in the space of programs, showed that you can come up with completely new solutions.

Speaker 2 And it made the first scientific discovery from an LLM. And AlphaEvolve is basically an extension of that.

Speaker 1 I'm very inspired by

Speaker 1 the idea, as I think many people are, that AI will actually have creativity, does actually have technical creativity, as you are describing, where you're outside of the patterns that we already know

Speaker 1 as engineers. I want to go back to some of the mechanics here and the limits to generalization and how to think about automated evaluators and a lot of different topics.
But when you think about these problems that are clearly economically valuable and interesting, like matrix multiplication and its potential efficiency, what is your intuition for why those solutions have not been found before?

Speaker 1 Is it simply like the search space is too large or people in this field were complacent in that they believed a certain solution was like the maximum efficiency?

Speaker 1 Because clearly there's value to be had here.

Speaker 2 My opinion on this is basically that if you look at the structure of the algorithm, what Strassen produced was quite ingenious.

Speaker 2 It was not a natural thing that you would think of, and that was for only two-by-two matrices. As you go to larger sizes, the space is so huge.

Speaker 2 The constructions are not something which is very natural. These are very involved and intricate constructions that would be very hard to discover by chance.

Speaker 2 So it's quite interesting that

Speaker 2 it has this very special

Speaker 2 structure, but it's not something that comes naturally

Speaker 2 even to a human computer scientist. Just to add to that, I definitely agree.
The search space is just unbelievably vast. The solutions are maybe non-intuitive.

Speaker 2 And the third thing I want to emphasize is that I really believe the people who worked on this in the past were definitely not complacent.

Speaker 2 And in fact, the problems we chose to apply Alpha Evolve to in the first instance, both on the scientific side and the practical side, we deliberately chose problems which have been worked on for a very long time by the very best people.

Speaker 2 So

Speaker 2 on the scientific side, since we're talking about matrix multiplication, this has been a known open problem for decades, and many people have been working on it.

Speaker 2 And similarly, for the practical applications

Speaker 2 that we mentioned in our Alpha Evolve release in key parts of Google's infrastructure, again, like these are things that have been heavily optimized inside Google because they are so important.

Speaker 2 And so by having a system like Alpha Evolve or any other

Speaker 2 discover something new on these problems, I think that's as strong a demonstration as I can imagine of the fact that this is indeed something that is new because no one found it before.

Speaker 2 And also it is something that was not easy to discover because those results stood for such a long time and have been worked on by

Speaker 2 such strong people.

Speaker 1 I noted that this is not a comment on the

Speaker 1 broad efforts of the computer science industry to date on matrix multiplication or data center optimization.

Speaker 1 I think this is a good moment to try to demystify what's happening under the hood for a broader set of people.

Speaker 1 Can you walk us through a concrete example of how AlphaEvolve actually evolves code? And say, let's take the example of trying to optimize data center scheduling, right?

Speaker 1 What does the step-by-step process look like from initial random code to final solution that saves millions of dollars of power?

Speaker 2 I can walk you through that.

Speaker 2 So the user of a system like AlphaEvolve, they basically specify what is the problem that they are trying to solve. So that's the most important thing.

Speaker 2 And you specify it by providing what is called an evaluation function.

Speaker 2 What this function does is whenever there is a proposed solution for solving the problem, you're able to tell how good this solution is. So you basically define what makes a good solution.

Speaker 2 For discovering an algorithm for scheduling jobs on a data center, this evaluation function could be something like a simulator of jobs in a data center.

Speaker 2 That given an algorithm for doing the scheduling, it simulates how good this algorithm is. So that's what the user provides.

Speaker 1 And this is a simulator you already had.

Speaker 2 Yes, so that's a simulator that we already had.

Speaker 2 And I would say it's something that is quite natural to have

Speaker 2 in many domains because whenever you want to innovate on something, you need to have a way of telling, okay, is the innovation actually good or not?

Speaker 2 So it's a very natural object to have, at least in principle. So you define the what by providing the evaluation function.
And then AlphaEvolve fills in the how. So that's the job of our system.
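As a concrete, and entirely hypothetical, illustration of the "what": below is a toy evaluation function for the data-center scheduling example. The simulator, workload, capacities, and scoring are stand-ins we made up for this sketch; they are not AlphaEvolve's actual setup or Google's simulator.

```python
import random
from typing import Callable, List

def simulate_datacenter(schedule_fn: Callable[[List[int], int], int],
                        num_machines: int = 10,
                        num_jobs: int = 200,
                        capacity: int = 100,
                        seed: int = 0) -> int:
    """Toy simulator: run a candidate scheduler on a random workload and
    return how much work could not be placed (lower is better)."""
    rng = random.Random(seed)
    jobs = [rng.randint(1, 10) for _ in range(num_jobs)]
    machines = [0] * num_machines            # current load per machine
    wasted = 0
    for job in jobs:
        choice = schedule_fn(machines, job)  # candidate picks a machine index
        if machines[choice] + job > capacity:
            wasted += job                    # job could not be placed
        else:
            machines[choice] += job
    return wasted

def evaluate(schedule_fn) -> int:
    """The user-supplied 'what': a single score, higher is better."""
    return -simulate_datacenter(schedule_fn)

# A trivial baseline to start from; in a real setting the "strong initial
# solution" would be the existing production heuristic.
def schedule_least_loaded(machines: List[int], job: int) -> int:
    return min(range(len(machines)), key=lambda i: machines[i])

print(evaluate(schedule_least_loaded))
```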

Speaker 2 And you can do it in two fairly different ways. One is you tell AlphaEvolve, I have no idea how to solve this problem.

Speaker 2 Let's start completely from scratch and let's try to be creative and come up with something completely new. So that's one option you can take.
Another option you can take is:

Speaker 2 actually, we have already worked on this problem for a really long time. Here is a very strong initial solution that we can provide to the system, and you can start from here.

Speaker 2 And that's what we did for the application to discovering new algorithms for scheduling jobs in a data center.

Speaker 2 So AlphaEvolve takes this initial solution, and then, on a high level, it combines the creative power of large language models, which propose creative new ways to improve that solution, with the strictness of the evaluation function provided by the user, which is able to actually filter out the things that work from the ones that don't.

Speaker 2 And then this is wrapped inside an evolutionary algorithm that makes sure that we explore the whole space of algorithms in that region, so that we don't commit to a very specific type of solution early on, but instead maintain a diverse pool of potential solutions.

Speaker 2 Over time, maybe we combine ideas from different solutions that are already strong until we actually have an algorithm that's so strong that we are happy to deploy to a critical part of Google's infrastructure.
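As a rough mental model of that loop, and only that: the sketch below is not AlphaEvolve's implementation. The creative proposal step, which in AlphaEvolve is a Gemini model editing code, is replaced here by a random crossover-and-mutation step so the toy runs end to end, and all names are illustrative.

```python
import random

def evaluate(candidate):
    """Stand-in evaluation function: how close is the candidate to all-ones?"""
    return sum(candidate)

def propose_change(parents):
    """Stand-in for the LLM proposal step: cross two parents, then mutate one bit."""
    a, b = parents
    cut = random.randrange(len(a))
    child = a[:cut] + b[cut:]
    i = random.randrange(len(child))
    child[i] = 1 - child[i]
    return child

def evolve(initial, generations=500, pool_size=20):
    """Toy evolutionary loop: keep a diverse pool, propose children, keep the best."""
    pool = [(evaluate(initial), initial)]
    for _ in range(generations):
        parents = random.choices([c for _, c in pool], k=2)  # sample two parents
        child = propose_change(parents)
        pool.append((evaluate(child), child))                # strict scoring by the evaluator
        pool.sort(key=lambda sc: sc[0], reverse=True)        # keep the strongest candidates
        pool = pool[:pool_size]
    return pool[0]

best_score, best = evolve([0] * 32)
print(best_score, best)
```

The essential shape matches what is described above: a proposer generates candidates, a strict evaluator scores them, and a diverse pool of survivors is carried into the next generation.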

Speaker 2 Let's see.

Speaker 1 And intuitively,

Speaker 1 not in the machine learning sense, but in the evolution sense, you have different generations where you're getting closer to an optimal solution.

Speaker 2 Yeah, that's right.

Speaker 2 Like you would expect that in each iteration of evolution, what you're doing is looking at the previous iteration, looking at maybe the strongest solutions you have, and then trying to be creative about how can I combine ideas from those solutions or maybe bring in completely new ideas to come up with something even better.

Speaker 2 And so yes, each generation gets stronger and stronger.

Speaker 1 How much scaling are we talking about? Like, is there a way to predict how many generations it takes, or how do you constrain the number of iterations that the model can use?

Speaker 2 So there are two parts to your question. One is about, okay, how does scaling work and then how can you predict it?

Speaker 2 So for the first part, this is actually a really nice feature of AlphaEvolve: that it can adapt to the difficulty of the problem.

Speaker 2 If you ask AlphaEvolve to find a solution to a problem that's actually unexpectedly easy, then it will just do it very, very quickly; almost immediately you will have the solution.

Speaker 2 But if you ask it a problem that's really, really difficult, and by really, really difficult, I mean like really difficult, maybe an open question that has stood for decades in the sciences or

Speaker 2 you want a practical algorithm for a really high-value application in Google, then you would, of course, expect this is not an easy problem.

Speaker 2 You might need to spend longer time considering different solutions, exploring the space, combining ideas.

Speaker 2 But what's really nice about Alpha Evolve is that it is able to sustain this scaling in a way that it keeps improving over time.

Speaker 2 And it keeps improving for so long that you can make discoveries on this level of difficulty, like breaking decades-old scientific challenges or discovering high-value algorithms. Now,

Speaker 2 I know it maybe sounds trivial that if you wait longer, you get better results, but in practice it's actually a really difficult thing to build automated agents that are able to sustain this continual improvement without plateauing quite early. This is, I think, a nice feature.

Speaker 2 There was a second part to the question about predicting how many iterations you will need.

Speaker 2 So that is something that is actually not so easy because it's like asking a priori, do you know how difficult this question is going to be?

Speaker 2 And especially in the sciences, that's something that often has a very surprising answer. Very trivial questions can turn out to be extremely, extremely difficult, and vice versa.

Speaker 2 But the nice thing is that you have continual improvement if you run this system. And as long as you can run it, you can expect to get better and better results.
And you just have to see

Speaker 2 where this gets you.

Speaker 1 If you think about the coding agents that general developers have access to and are increasingly using today, one frustration with them is that on relatively trivial problems they are set out to do autonomously, they will get lost and blow themselves up, or plateau, as you said,

Speaker 1 in frustrating ways. Can you talk about if you think there are implications from AlphaEvolve to these other general coding agents?

Speaker 2 While large language models and coding agents are

Speaker 2 getting

Speaker 2 much better in their understanding of code, they're not perfect, right? So they do make mistakes. The other sort of element is to think about what is the task that these agents have been assigned.

Speaker 2 Mostly, if you are asking an agent to solve a particular task or write a particular program, you are providing a specification.

Speaker 2 You are specifying the task either in natural language or you're saying, well, I'm trying to do something completed, right? So it's not a complete characterization of what you want, it's

Speaker 2 a partial specification of what you want. And the agents then try to solve the problem and might get lucky and

Speaker 2 might get the right result, or they might hallucinate and get the wrong result. And the issue is, how do you know

Speaker 2 whether the result is right or wrong? And that depends on having a good evaluator. That's how Alpha Evolve solves the problem.

Speaker 2 So in some sense, we are able to leverage the hallucinations for a beneficial purpose, right?

Speaker 2 So, the creativity and the wrong answers that AlphaEvolve can somehow come up with, how do we know that they're wrong? They might be very good, we just don't see them in that way. Which is why

Speaker 2 the role of the evaluator is really important. And how we even do the evaluation is very important, because when you come up with a new idea,

Speaker 2 should you try to explore that idea much further, right? Or how deep should you go into stress testing that idea? Should you try that idea out on a few different

Speaker 2 instances or a sort of a thousand different instances or

Speaker 2 really stress test that the idea actually works for the whole thing? This is one of the interesting parts of Alpha Evolve.

Speaker 2 Getting that balance right is really important so that you can look at where are the creative solutions?

Speaker 2 How can you sort of filter out the ones that are promising and then use them later to refine the search process to get the final solution?

Speaker 1 If evaluation functions, automated evaluators, really are such a limiting constraint here in terms of what we can get agents to do, any intuition from this project or others on how to overcome that?

Speaker 1 Like, can models get good at helping us create automated evaluators? Should we imagine simulators that are better for lots of different domains?

Speaker 1 If I, you know, lame product manager putting in incomplete natural language spec to coding agent,

Speaker 1 should I work with an assistant to like complete that spec? Do I use traces? How do you think that gets solved?

Speaker 2 That's a really, really great question. And I think you can view it from two perspectives that I think will happen at the same time.

Speaker 2 So one is that, yes, currently the strict evaluation function plays a key role in AlphaEvolve.

Speaker 2 And one takeaway you can take from this, thinking about the future, is that it shows the really high value of having these evaluators available.

Speaker 2 Because in many cases, it might be that you have a really important problem, but you don't actually have a very precise definition of what makes for a good solution.

Speaker 2 And one takeaway you can have from a system like this is that if you actually do build a very precise evaluation function, then this unlocks the possibility of having an agent like AlphaEvolve discover something that's way beyond what, let's say, humans have been able to discover or your best developers have been able to discover.

Speaker 2 So, that's one takeaway. But the other takeaway that I'm maybe even more excited about from the research perspective is that we don't actually think this is a conceptual limitation.

Speaker 2 So today, this was maybe the easiest way to get into this game of discovering new things: by looking at problems that already come with these very precise evaluation functions.

Speaker 2 So that's just a natural first step to take. But I do believe that this assumption can be relaxed in very significant ways.

Speaker 2 And in particular, you already mentioned one example where maybe language models themselves will be able to evaluate whether proposed solutions look promising or not or whether they fail in some particular ways.

Speaker 2 And indeed, there is a parallel work from DeepMind as well called AI Co-Scientist, which demonstrates this very clearly, that

Speaker 2 if you propose ideas in natural language, then you can get language models to provide meaningful critiques and identify the ones that work from the ones that don't.

Speaker 2 So I really do see a lot of hope on relaxing this assumption. And then even in between these two extremes of

Speaker 2 strict evaluation that exactly tells you how good a solution is on one end, and natural language evaluation by a language model on the other end, there is a continuous spectrum of simulators and auxiliary evaluation functions, which are maybe not perfect, but as long as they are correlated with the true signal, we can build the algorithmic scaffolding of the evolutionary algorithm around this in such a way that we still make meaningful progress.
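To picture the middle of that spectrum, here is a small illustrative sketch (not from the paper or the episode): a cheap, noisy proxy evaluator that is merely correlated with the true signal pre-filters candidates, and the expensive exact evaluation is reserved for the shortlist. All names and values are hypothetical.

```python
import random

def select_with_proxy(candidates, proxy_score, exact_score, shortlist_size=3):
    """Pre-filter with a cheap, imperfect evaluator; spend the expensive exact
    evaluation only on the shortlist. This still makes progress as long as the
    proxy is correlated with the true signal, just possibly needing more rounds."""
    shortlist = sorted(candidates, key=proxy_score, reverse=True)[:shortlist_size]
    return max(shortlist, key=exact_score)

# Toy demonstration: the proxy is the exact signal plus noise.
true_quality = {f"candidate_{i}": i for i in range(20)}
exact = lambda c: true_quality[c]
proxy = lambda c: true_quality[c] + random.gauss(0, 2.0)

print(select_with_proxy(list(true_quality), proxy, exact))  # usually a top candidate
```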

Speaker 2 And maybe it will take a few more iterations, but we can still go really, really far. So, just to add to what Matej mentioned, I think

Speaker 2 one of the takeaways is basically that LLM-based agents like AlphaEvolve, especially when we structure them in this way with population-based search, with evolutionary approaches, are extremely effective in searching.

Speaker 2 They can search very convincingly and very effectively in very large spaces and come up with very counterintuitive new solutions for important problems.

Speaker 2 problems that we have studied for many, many years, and in some cases decades. So that's one.

Speaker 2 The other element is the evaluator. As Matej mentioned, there is work on using other sources for evaluation when you don't have the perfect evaluator.

Speaker 2 Even for Alpha Evolve, even if you have a simulator, that's not a perfect evaluator, right? Because you are sort of going to evaluate things on a specific distribution of problem instances.

Speaker 2 You might want to sort of prove certain properties of the solution, right? You might want to say that the solution always has certain performance.

Speaker 2 So, if you want to prove certain properties of the solution,

Speaker 2 that might require sort of other work, right? You might have to have a proof agent which sort of tries to prove certain properties of the solution.

Speaker 2 While on the other hand, you have these LLM-based evaluators which can look at the solution.

Speaker 2 And when you don't have a simulator, when nobody has built one, they can still just make a guess at how good that solution is. And in fact, that approach also works very well.

Speaker 2 And we have shown with this AI Co-Scientist, which we have used for hypothesis generation, that it basically uses a multi-agent setup where LLMs themselves are able to figure out that

Speaker 2 certain hypotheses are better in terms of novelty and significance and impact and should be propagated. Right.

Speaker 2 And that whole process ends up, and this might be surprising and counterintuitive to some, producing much, much better results than the base large language model.

Speaker 2 So you are really able to discover new information beyond what the large language model itself alone was able to produce.

Speaker 1 That begs a question, which I think is like one of the biggest meta questions proposed by this sort of work, which is like, do we get self-improving AI?

Speaker 1 One of the things you demonstrated with AlphaEvolve is you can optimize the systems used to train Alpha Evolve, right? So you have this, you know, 23% speed up in part of the training infrastructure,

Speaker 1 if I recall correctly.

Speaker 1 Are we now witnessing the early stages of recursive self-improvement in AI? And what do you think the implications are, if that's true?

Speaker 2 I think in some senses,

Speaker 2 sort of yes. But at the moment, what we have seen is basically improvements in computation time.
So what Alpha Evolve has been able to do is basically make training more efficient.

Speaker 2 But you can ask the question, can you make the training,

Speaker 2 can you improve the training process such that the underlying model is not only sort of

Speaker 2 trained faster, but is actually fundamentally better in certain cognitive tasks. And that is something that has to be validated still, right?

Speaker 2 But it is a direction that is definitely very appealing and something that is being actively looked at by many people.

Speaker 1 Do you have a reason to believe it won't work?

Speaker 2 It should work. But as we mentioned, having good evaluators is an important element, right?

Speaker 2 And so having an evaluator which can say, this proposal that you have just suggested for improving the training process will yield a good result. So, if you have that kind of evaluator,

Speaker 2 then it will work. And there is no reason why such an evaluator could not exist.
We need to work on building such evaluation functions. Maybe just one thing to add to that is that

Speaker 2 I would also agree that we are maybe seeing the first sign of self-improvement, but one also needs to be very specific about what we have shown so far.

Speaker 2 Like, as Pushmeet mentioned, it's speeding up the training of the next generation of the Gemini model.

Speaker 2 So, the feedback loop is fairly long, at least currently, maybe on the order of months. But yes, you can call it self-improvement for sure.

Speaker 2 Maybe the big question that many people are curious about is how does this extrapolate into the future? And you can have different types of self-improvement.

Speaker 2 One is where you get maybe just a one-off benefit, like the model improves itself once and that's it.

Speaker 2 Another one is, okay, the model keeps improving itself continuously, but maybe the improvements get marginally smaller and smaller and smaller and you converge to some limit.

Speaker 2 Or maybe the improvements will keep accumulating up and up and up. And that's a big open question that

Speaker 2 we don't have an answer to today.

Speaker 1 Let's take that projection to other fields. And obviously these are all interrelated.
But one of the things, Matej, you're really excited about is just how AI applies to these sciences.

Speaker 1 When you think about new mathematical constructions, improved solutions to

Speaker 1 open problems or problems that looked solved to humanity 50 years ago.

Speaker 1 What do you think the implication is in different fields? Like, is it a fundamental shift in how scientific discovery or mathematics gets done?

Speaker 2 First of all, yes, I'm super excited working in this area of using AI to accelerate the sciences, because in a way, it's the most exciting application of AI that I can imagine. Like,

Speaker 2 what could be more valuable or exciting than advancing the frontiers of human knowledge?

Speaker 2 So, yes, that is definitely there. And then of course in different fields of science, the speed of progress or the advance you get from AI might be slightly different.

Speaker 2 So in Alpha Evolve, we've primarily focused on mathematics and computer science, because these are the domains where it's the easiest to get these automated evaluation functions.

Speaker 2 Like you often get them basically for free. That's not to say that you cannot get them in other branches of science, but in maths and computer science, it's just

Speaker 2 most common.

Speaker 2 If you think about biology or chemistry, you want to design a molecule, then you can have an evaluation function again in the form of a simulator or a predictive model that given a candidate molecule will make a meaningful prediction about, okay, is this actually going to work in practice?

Speaker 2 And then, if you are in this regime, then again, AlphaEvolve would be applicable. And we are only talking about the version of AlphaEvolve that we have built today.

Speaker 2 And these are problems that we can address today.

Speaker 2 But we don't think that the journey of Alpha Evolve finishes here. We have many ideas about how to make this system more powerful and more broadly applicable.

Speaker 2 And I'm fairly confident that we will see many applications across many branches of science. And then, this is only talking about Alpha Evolve.

Speaker 2 There are many other agents, Pushmeet mentioned AI Co-Scientist, and many others, that I'm sure will keep transforming how science is being done across the whole spectrum.

Speaker 2 Yeah, so I think broadly, if you look at it, a lot of science involves searching, right? Searching for the right idea, searching for the right construction, the right solution, the right drug candidate, and so on.

Speaker 2 And in some sense, what scientists have been trying to do is make that process repeatable, right? At the moment there is still an element of serendipity to some of the discoveries, but as we move towards rational materials discovery or rational drug discovery, you are seeing computational approaches and very systematic evaluations playing a much more important role in many areas of science.

Speaker 2 And I think as that work propagates, you will have systems like Alpha Evolve, which will be able to search in those spaces and use these evaluations much more effectively.

Speaker 2 So it's like you can sort of see this as a tool that will give scientists a superpower in their ability to search over very complex and sometimes

Speaker 2 counterintuitive sort of solution spaces.

Speaker 1 When I think about one logical extension to this approach, it is,

Speaker 1 let's say, automated evaluation in the real world, right? So a lab, assays, you know, a bunch of robotic arms doing experiments if you're screening molecules or something.

Speaker 1 What do you think the role of the human scientist or engineer is, let's just say very near term, if that vision is true? Is it the problem framing, like determining the evaluation?

Speaker 1 Is it constraining the search, like giving some intuition for a starting point or a search space?

Speaker 2 Like, what should the human scientist be good at from here? There are many elements, right? First of all, as we have been talking about a lot, there is the role of the evaluation function. That needs to be defined: what do we really want, how do we want to assess these solutions? But then there are many other elements as well. When we are trying to find a solution, it has to have certain properties. What are those properties? Giving hints, for example. If you're trying to discover a new drug, you want to make sure that that drug treats the disease but does not kill the patient, right? That its side effects are low. Or, what is the delivery mechanism for it?

Speaker 2 So there are so many different requirements that a solution might need to satisfy, and some of them are encoded in the evaluation function, and some of them you might want to hard-constrain in the solution, right?

Speaker 2 And so, can you specify those so that an agent like Alpha Evolve can take that into account while it is thinking about how it explores the search space or how it constructs the solutions that it will sort of generate?

Speaker 2 These are all sort of very interesting places where human input might be required, but especially as we look at many different types of domains.

Speaker 2 So, yeah, I think we should definitely see this as an amazing tool for scientists, for computer scientists, mathematicians.

Speaker 2 And in fact, this has been our experience as well: in the right hands, it is a very powerful tool. Mathematicians who have tried to explore it, and who have been able to specify what types of solutions they're looking for, can be much more productive and much more effective in finding those solutions.

Speaker 2 I just wanted to highlight that even though we have been describing Alpha Evolve as this kind of autonomous agent that does things on its own, actually, in practice, using this agent often turns out to be surprisingly collaborative.

Speaker 2 We have seen this in particular with mathematicians that we have collaborated with.

Speaker 2 And there are a few reasons for this.

Speaker 2 But one is that AlphaEvolve is an agent that doesn't just give you the solution. It searches for an algorithm that constructs that solution.

Speaker 2 And so depending on how you set up your problem definition, often it's actually the algorithm that's even more valuable than the solution itself.

Speaker 2 Because the algorithm, it tells you how to construct the solution. So that means you understand

Speaker 2 what are the ideas that go into building that solution. And maybe especially, or definitely it's true in mathematics,

Speaker 2 that's what people really care about, to understand the nature of our universe and build up the understanding of fundamental ideas.

Speaker 2 And so, it's actually often not interesting almost at all what the solution is, but what you care about is how you build it.

Speaker 2 And so we had first-hand experience collaborating with multiple mathematicians, and it's been really fascinating. We would share with them the output from AlphaEvolve, and they would be really fascinated looking at the code that it found, trying to understand, okay, what is it actually doing, and then understanding, oh, okay, this is doing this, this is doing that, and now I can see why, if you put it together, it leads to a really good solution.

Speaker 2 Yeah, I can also confirm from my own personal experience that looking at the code or the algorithms that the system finds is often a really interesting experience, because it's code that looks human-like, like it's something that you could have written, but would you have thought of writing it in exactly this way? And then trying to understand, okay, what exactly is it doing?

Speaker 2 That's a really interesting experience.

Speaker 2 But at the same time, it's one of the key strengths of the system, not only for scientific applications where you can look at the code and get some understanding out of it, but also for many of the practical applications.

Speaker 2 It's hugely valuable that the artifact you get out of Alpha Evolve is a piece of code, and then you deploy that piece of code. And so before you do that,

Speaker 2 experts, engineers who have worked on that system can visually inspect that piece of code, understand it, and make the final decision of whether it's going to be deployed.

Speaker 2 So it's in a completely different league from, let's say, considering using a neural network to make decisions in some production system where you kind of need to trust that the neural network is going to always behave in the way that you hope it will.

Speaker 2 With the code, you can look at it, understand it, and make the decision yourself. I might add that, basically, not all code is interpretable by humans, right?

Speaker 2 The solutions and the programs that AlphaEvolve finds are interpretable by human programmers. So,

Speaker 2 this is going to be a very interesting area of work in the future: when you find these solutions, what can we learn from them? As Matej was mentioning, this was a very interesting experience that we had working with Jordan Ellenberg on an earlier version of AlphaEvolve, when we were working on the cap set problem.

Speaker 2 The programs that it discovered had very interesting symmetries that

Speaker 2 mathematicians did not know about. And so not only was the solution, the actual construction, mathematically interesting, but the algorithm for producing that construction, the structure of it, was interesting in itself.

Speaker 1 For listeners who are thinking about accessibility, or about the implications for themselves if they're not professional mathematicians collaborating with AlphaEvolve, what are the considerations in making some of these capabilities more broadly available?

Speaker 2 We want to make these capabilities accessible to

Speaker 2 as many people as we can to the wider community.

Speaker 2 Now, we have started a trusted tester program where we have asked people to submit proposals.

Speaker 2 And what we intend to do with that program is to figure out what are the right ways in which people can really leverage Alpha Evolve.

Speaker 2 So, we have internally used it across Google, but as you know, it requires certain things, such as the need for an evaluation function.

Speaker 2 As part of the Trusted Testers program, we are going to be evaluating Alpha Evolve on a bunch of different types of applications, and that will inform our future release strategy as to how do we make it more broadly applicable.

Speaker 2 The second sort of element is that not only do you need

Speaker 2 the evaluator, but you also need a significant amount of computational resources, right? Because it's not just one single LLM call.

Speaker 2 It requires a significant amount of

Speaker 2 function evaluation, depending on the difficulty of the problem, right? If it's an easy problem, then you can do it very quickly.

Speaker 2 But if you really are going for some very hard problems with a very large extended search space and you want to spend a significant amount of time searching over it, then how do you build the overall system that

Speaker 2 people can use effectively and efficiently? That's the other thing that we'll be thinking about.

Speaker 1 Last question for you both. Is there sort of practical application within Google that you think will be interesting that you haven't tried AlphaEvolve on yet?

Speaker 2 In the white paper, we tried to think holistically: when we look at the computational infrastructure of Google, what are the key parts of this infrastructure where we can demonstrate that AlphaEvolve can make discoveries across the stack, not only in one part of it, and that it can make discoveries that are highly valuable.

Speaker 2 And so we try to cover the entire spectrum. So we show that Alpha Evolve can improve the efficiency of the data center.

Speaker 2 it can contribute to hardware design and it can contribute to improving the efficiency of most important pieces of software that are being run inside Google.

Speaker 2 And one intention here was to demonstrate that this is a really versatile tool that you can apply across the spectrum.

Speaker 2 And as Pushmit was saying, this is a tool that is already available inside Google and it is being used for many, many problems.

Speaker 2 There are quite a few exciting ones.

Speaker 2 I'm not ready to share the particulars yet, but as you can imagine, there are so many exciting computational problems in a place like Google, within AI and also outside it, that I'm sure there will be many, many really cool results coming in the future.

Speaker 1 I think that's a great note to end on. Pushmeet, Matej, anything we didn't cover?

Speaker 2 No, I think that was great.

Speaker 1 Thank you guys so much for being here.

Speaker 2 Congrats. Okay, great.
Thank you very much.

Speaker 1 Find us on Twitter at NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces.
Follow the show on Apple Podcasts, Spotify, or wherever you listen.

Speaker 1 That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.