E175: Elon Musk: 10 Billion Humanoid Robots by 2040? w/NEA Partner Aaron Jacobson

27m
Aaron Jacobson is one of the most insightful thinkers at the intersection of AI, robotics, and cybersecurity—and in this conversation, he separates signal from noise. We explore the future of humanoids, the shifting threat landscape in cybersecurity, and why the next wave of industry-defining companies will be built on infrastructure, not just foundation models.

Aaron Jacobson is a Partner at NEA, where he invests in AI, cybersecurity, and cloud infrastructure. He’s backed companies like Databricks, Horizon3.ai and Veza, and previously worked in tech M&A at Qatalyst Partners. In this episode, we dive into what’s real vs. hype in AI, the future of humanoids, and where the biggest opportunities in infrastructure are emerging.

Listen and follow along

Transcript

Elon Musk has predicted that by 2040, there's going to be over 10 billion, with a B, humanoids on planet Earth.

You've been studying this going back to your undergrad career over two decades ago.

What do you think about these predictions?

Huge fan of all the companies that he's built, as well as him being a technologist and futurist.

But I really view Elon's predictions, all of them, the same way: they're super inspirational, but they're optimistic.

I think he's been promising self-driving cars now for about 10 years.

And they're here, but in a very narrow fashion.

And they're nowhere near as massively distributed as these bold predictions typically imply.

And I think what's interesting about this prediction is he's learned from the past and he's put a much longer time horizon on the prediction now.

But I still think he's off by multiple orders of magnitude because it's just going to take much longer than people expect for us to get enough data and also for us to have breakthroughs in the model architectures behind foundational robotic models to allow for general-purpose humanoids.

And even beyond the AI technology, we're also going to have to think about how we scale up the supply chain.

Because to get to that many robots, we're going to need massive investments in motors and all the various components that you would actually need to even build 10 billion humanoids.

What is the most difficult part about building a general purpose humanoid today?

There are two problems that we're running into.

The first is just the robustness of humanoids.

Robots are historically very good at very narrow, very specific tasks, but as soon as you adjust one small thing in the task, it fails. You might train a robot to be very good at folding shirts, but you give it some jeans and it fails.

Or maybe you put it in a room that has slightly lower light and it fails on actually folding that shirt.

And so the robustness and the reliability of robotics based off of today's foundational models aren't there.

And look, I think there's really three fundamental challenges that we're going to need to overcome if we're going to beat that.

The first is really the scaling law.

I mean, language models improved so fast just because there was a pretty quick understanding of the amount of data and compute required for us to actually get exponential improvement in the performance of these language models.

But we're still quite early in understanding that relationship in the robotics world, just because of how much more complex navigating the physical world is relative to language.

I mean, just think about the human brain and how many years it took for us to evolve language relative to like spatial awareness and walking and moving and navigating a 3D space.

LLMs basically assume that, look, more quantity is better.

And eventually, once we got to 15 trillion tokens, which is the internet and then some, we started to see really magical results.

But with robotics, we still don't really have confidence in the amount of data that matters, or when we're actually going to start to see generalizable behavior emerge like we see with LLMs.

And I think there are other aspects of data with robotics, in that quantity is not going to be the only thing that matters.

Quality is going to be important too, as well as the diversity of data.

Once you go to the real world, the amount of diversity and variance, the combinations of what a robot will need to figure out to actually be able to solve problems, is so huge.

It's so many more orders of magnitude than what you would need for language.

This just starts to be a very, very hard task.

You can get robots great at doing very specific tasks, but again, you vary the task, or you take that and try to have a different robot do a similar task.

It just doesn't work.

You can ask an LLM today, a foundational model, to write a poem in iambic pentameter about a Snuffleupagus, and you'll actually get a pretty good result.

And I bet that that model has never actually seen a poem about a Snuffleupagus.

But you go and ask a robot to pick up an item or navigate a hallway it's never been in before, and it still falls over enough of the time that we're not actually ready to commercialize this technology.

And I think another key part is that we also have a challenge in the fundamental technology underpinning the creation of robotics models.

We need a lot of video data, we need a lot of physical data.

So, how are we actually going to cost-effectively gather that amount of data?

You mentioned that it's so difficult for a robot to complete generalized functions.

Is there a world where we have two or three humanoids in the home doing different types of tasks?

I think the home is an open question because of the safety issue, right?

Unlike LLMs, you know, if your LLM fails, it's funny.

If your humanoid fails, it potentially kills somebody.

And so I think the home, because of the safety requirements, and again, back to a challenge of robotics and even AI and LLMs in general: we don't really have great ways of evaluating these things, knowing how well they perform, knowing how robust they are, knowing how secure they are, which is incredibly important in the physical world.

So I think that that's also a fundamental limiter to consumer and home robots.

But I think your point's accurate in that we are going to see humanoids very much in industrial applications because these are very narrow applications.

You can surround the humanoids with cages.

You can probably tether the humanoids.

Actually, you'll probably see humanoids that don't even need to walk around.

They'll just be humanoid robots with two arms, which can slot into an existing assembly line or actually run a machine.

And they'll be stationary, they'll be tethered, so you won't have security concerns.

And because it's a very specific task, you'll be able to make the economics work in terms of the amount of data that you need and the robustness that you need.

And when it fails, you'll probably still have humans in the factory that can fix it or adjust it or get the robot up and running versus a consumer robot, right?

You're calling customer service, the thing's falling over, it's broken.

Like there's just so many more consumer challenges that need to be solved to actually get humanoids into the home.

The industrial use case and the economics on it scale much better too.

You know from day one how much money you're making, versus with a consumer humanoid, you have to do a marketing campaign.

You have to invest tens of billions of dollars before you even know if there's demand for it.

Absolutely.

I mean, the truth about robotics is that the PowerPoint is always easy, right?

When you're selling to an industrial customer, it's just pure ROI.

How much does it cost me today?

What is the productivity?

And what is the ROI on your robot?

And I think the biggest driver in the U.S. is a combination of labor shortages as well as this pressure to bring manufacturing back into the U.S.

And robotics is perfect for that.

But then when you actually get into production, it's really hard to get the robot to perform at the economics and cost that you say it can do.

Everybody wants robots.

Everyone wants humanoids.

But actually, delivering a product that's reliable, robust, secure, delivers the economics, and is a really great business model as a startup in terms of margins and scalability, that's the hard part.

You alluded to it earlier.

A couple of breakthroughs that we may need before we have humanoids at scale.

What are some engineering or technological problems that we need to solve before we could scale humanoids?

Yeah, I think I mentioned this a bit earlier, right?

It's really the cost of data collection, right?

How are we going to gather video? Is it going to be through teleoperation?

How are we going to get enough robots out there, or potentially people out there, maybe holding some type of robotic hand, or maybe teleoperating robots in enough different situations that's diverse enough, that's seeing a variety of environments, different objects?

How are we going to do that that's cost-effective so we can get enough data?

And then, how much data is that going to be in terms of running it through the GPUs and the underlying compute cost to actually train a model that's reliable?

And I think another part is the underlying model architecture.

I mean, transformers at the end of the day,

they're not that efficient.

They're good enough for LLMs.

We're able to get enough data and compute at a reasonable enough cost to have magic be created through OpenAI and Anthropic and Llama.

But we don't have that magic yet in robotics because it's still anyone's guess in terms of the order of magnitude of the data and compute we need relative to the existing transformers architecture.

If we found an architecture that was 100 times, 1,000 times more efficient, then I think that would really go a long way, because it would start to function like the human brain.

I have twins that are now almost three and a half.

I've been watching their evolution.

I've been watching their brains, their LLMs, work in real time.

And the things that they learn just amaze me.

I'll say, How did you learn how to do that?

How did you even pick that up?

Like, who told you that?

Like, you're asking questions; how have you even seen enough data to be asking questions like this?

Because the human brain is really efficient at building a world model, thinking about the way the world works.

And I think it's an open question whether transformers actually have a world model or whether they're regurgitating what they've seen before, obviously with some adjustment and inference in terms of thinking differently based off of the patterns they've been trained on.

Steelman your thesis a little bit.

What would make you change your mind about your timeline for humanoids?

And what would make you think that the timeline is being accelerated?

I would want to see strong evidence of progress on that scaling law, where we actually see an order of magnitude of improvement on a robot's capabilities across multiple platforms in a variety of different environments, as well as tasks.

Maybe it's picking, packing, folding laundry, loading the dishwasher, being able to introduce new objects as never seen before, and having it figure out and do that with a significant improvement in terms of accuracy.

Like, if you start to see that, that would start to make me believe.

Some are predicting 100x or 1,000x evolution in AI capabilities over just the next two, three years.

Why would that not lead to massive evolution in humanoids and a decrease in cost of development?

Let's talk about the AI space itself.

The jury has been out on whether LLMs are where value will accrue in the ecosystem.

What are your views on this?

An important part of LLMs is whether they're closed models or open models.

Closed models being like OpenAI or Anthropic, where you have to access the model through either ChatGPT or maybe an API.

Open models being, you know, Llama out of Meta, DeepSeek and Qwen out of China, where you can actually take that model and run it yourself.

You have a lot more control and the ability to flex and change and customize that model.

And then you either run it yourself, or you access that model on top of an infrastructure provider.

And so I think it's important to differentiate because there's going to be value capture in both of these at different layers.

When I think about the closed model side of things, I think that there's probably going to be one or two really successful, highly valuable companies.

And the value capture is going to be in those companies because the way you consume the model will be either through their consumer product like ChatGPT or a developer API.

When you get into the open model, it's much harder to commercialize the open model because anybody else can run it.

And so a lot of the value in the world of open models starts to accrue in the infrastructure surrounding the model.

So this starts to get into companies like our portfolio companies Databricks and Together.

They serve open models like Llama on their infrastructure.

They also train and release their own open models.

But the way they make money is when you run open models on their GPUs, or you consume open models through basically an API, and they handle all the underlying infrastructure beneath you.

And so they're able to commercialize and make a lot of money on that model, despite not having trained it.

And then they also surround that model with a bunch of additional value-added services as it relates to evaluating and troubleshooting that model, securing that model, and deploying that model in a distributed fashion so you get super low latency.

There's also a whole bunch of best-in-class startups that we funded, like Martian, which is pioneering model routing: how to actually get the right prompt to the right model in order to decrease cost and increase performance.
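To make the model-routing idea concrete, here is a minimal, hypothetical sketch (not Martian's actual system; the model names and heuristics are made up for illustration): send cheap, simple prompts to a small model and reserve the expensive model for long or complex ones.

```python
# Hypothetical sketch of prompt routing: simple prompts go to a small model,
# long or complex prompts to a stronger (pricier) one. Model names and
# heuristics here are illustrative assumptions, not any vendor's API.
def route(prompt: str) -> str:
    """Return the model tier a prompt should be sent to."""
    complex_markers = ("prove", "derive", "step by step", "debug", "refactor")
    text = prompt.lower()
    if len(prompt) > 500 or any(marker in text for marker in complex_markers):
        return "large-model"   # higher quality, higher cost per token
    return "small-model"       # good enough for simple asks, much cheaper
```

A production router would typically score prompts with a learned classifier rather than keyword heuristics, but the cost-versus-quality trade-off it exploits is the same.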

And so, this world of open models gives way to a much richer ecosystem in terms of who can actually capture the value.

What is Facebook's strategy?

How would you categorize Facebook's Llama engine?

I think it's a really interesting question, right?

Because Facebook's spending a lot of money on putting these open models out.

And historically, it hasn't monetized those models through others.

The work and research that they've done on those models is going back into Facebook products.

So WhatsApp, Instagram, and that improvement factors into advertising and all their various products, which generates a lot more revenue.

But the open question is like, okay, couldn't you do that without releasing the open models?

And so I think for me, it's a few things.

One, it's a bit of an equalizer in the model ecosystem.

It basically forces folks like OpenAI and other closed model providers to become consumer companies.

They have to compete on Facebook's terms, where they really have to innovate on the actual consumer product and also think more about monetizing through a business model, through advertising, subscriptions, the things that Facebook monetizes through.

So I think that's one, to change the playing field.

Two, I think it's also a brand play. It hugely impacts Facebook's brand.

There are a lot of AI engineers that really believe AI should be open, because if it's controlled, that could potentially introduce disparities in terms of value capture and in terms of safety.

And so I think by being a champion of open source, Facebook accrues a ton of brand value.

You're very active in cybersecurity, and you're very bullish on the space.

Why is cybersecurity growing so quickly today?

There are two reasons you continue to see cyber grow.

The first is just the amount of breaches and ransomware attacks.

They just continue to increase.

Cybersecurity is getting worse, not better.

I think like global ransomware attacks went up in 2024 by 11%.

There were over 5,000 incidents.

UnitedHealth, which had a terrible 2024 given some horrible things that happened to them, also had a major ransomware breach, which I think is now approaching $3 billion or so of losses.

And beyond ransomware, we've also had data breaches.

2024 was another banner year for data breaches.

There were over 3,000 incidents.

That's actually slightly down from 2023. But before you get excited: that number almost doubled from 2022 to 2023.

And if you go back 10 years, it's four or five X the number of data breaches.

So the number of data breaches and ransomware attacks is getting worse and worse, which means the cyber risk that folks are facing is also growing.

And so enterprises have to spend more money in order to keep up with the pace of advancement and threats in the world of cybersecurity, which is why you see budgets continue to grow.

And we also have these architectural shifts which happen too.

You've got the rise of new threat surfaces which need to be protected.

So 10 years ago, it was mobile and cloud.

Great, we have all these mobile applications and cloud applications.

All the security tools we have today on-prem aren't securing those applications.

So now we have to build up an entirely new security stack to solve all the cloud and mobile security challenges.

Now we're seeing it with AI.

Now we're deploying models.

How are we going to secure all these models?

Now our employees want to run agents to automate a lot of their workflows.

How are we going to keep all of those agents from running amok, accessing sensitive data and leaking it all over the internet, or leaving S3 buckets open that somebody can steal our data from?

And so the new threat surfaces are also a key driver of the growth in the cyber market.

Tell me about penetration testing, pen testing.

What is that and why is that so important in cybersecurity?

Pen testing is a best practice where you test your network and your IT infrastructure from the outside in.

It's really mimicking the behavior of a hacker to figure out how you might actually be breached.

The whole goal is to actually find all the vulnerabilities and then feed them back into your IT or developers so that they can patch them before a black hat can actually take advantage of them.

This whole world of pen testing is really around proactive cybersecurity practices, which also include things like access reviews and code scanners.

It's all part of a broader category, which has been coined exposure management: how do we figure out all the risks to an organization and minimize them before they're actually exploited?

Just like in healthcare, prevention is the best medicine in cybersecurity.

But the challenge remains today that existing teams are just overwhelmed relative to the amount of things out there that we need to find and fix,

which continually puts the advantage on the side of hackers.

What will cybersecurity look like in five, 10 years from now?

It's going to get a lot worse and scarier before it gets a lot better.

I'm ultimately positive about the future of cybersecurity because of AI.

Certainly in the near term, we're going to see more sophisticated attacks.

We're going to see deep fake attacks, like we're going to be getting voice calls from our family members, from our CFO.

We're potentially going to get on Zoom calls that are fake.

That's already happened: there was a fake Zoom call that led to somebody wiring millions of dollars in a hack.

So we're going to see a lot more of that in the near term.

But in the long term, the reason it's going to get better is because AI is going to fundamentally solve the biggest challenge in cyber, which is the lack of talent.

We don't have enough good people, and we also don't have enough budget to employ those people relative to the asymmetry that exists with cybersecurity.

The offensive side, the black hat side, they can do thousands of attacks a day, and the cost of a failure is very minimal.

They just move on to the next target.

But if you're an enterprise, all you need is one bad day and you get hosed.

And that asymmetry is so wide, so large, that you would have to have 10, 100, 1,000 times more people on the defensive side to actually stand a chance of closing the gap.

But now with AI, we can actually have that.

We can now actually start to train agents that are as good as our best pen testers, our best cyber analysts, our best SOC analysts out there.

And now we can unleash those, first in a human-in-the-loop fashion, and then ultimately in an autonomous fashion, because in the long run, this is ultimately going to be complex AI agents on the offensive side fighting complex defensive AI agents.

We can unleash those, and that's going to actually close the gap between the defensive and the offensive side, so that in real time we're going to have this AI-versus-AI war happening, where things are being tested and fixed as fast as possible before the offensive side can discover something and hack you.

How much does insurance play into this ecosystem and insuring yourself from these kind of long tail consequences?

It's super important.

It's arguably gotten a lot harder because of the amount of breaches that are happening, especially on the ransomware front.

I think one of the things insurance actually does is encourage companies to embrace best practices, because the whole point of insurance is to gather as much information about an organization and their cybersecurity processes to understand risk, to try to create a probabilistic model of the chance of them actually being breached.

You underwrite that model based off the chance you think they're going to be breached and the cost of that breach. Then you can encourage the CISO or SMB to adopt certain tools to lower that risk, and get them to pay a premium such that, if there is a breach, your payout to them ultimately means it doesn't bankrupt the company.

And so, if you do cyber insurance right, if you encourage CISOs to embrace the right tooling stack, you actually do provide a better degree of cybersecurity.
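The underwriting logic he describes can be boiled down to a toy expected-loss calculation. This is a purely illustrative sketch with assumed names and numbers, not how any real carrier prices policies:

```python
# Toy sketch of cyber-insurance pricing: premium as expected loss times a
# loading factor. All parameter names and numbers are illustrative assumptions.
def annual_premium(p_breach: float, expected_loss: float, load: float = 1.5) -> float:
    """Price a policy as modeled expected loss times a loading factor.

    p_breach: modeled probability of a breach this year (e.g. 0.05)
    expected_loss: modeled cost of a breach if it happens
    load: markup covering expenses, model uncertainty, and profit
    """
    return p_breach * expected_loss * load
```

Adopting stronger controls like MFA and least privilege lowers the modeled `p_breach`, which is exactly the lever that lets an insurer offer a CISO a lower premium.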

What could individuals or companies do preventively to decrease the chance of being a victim of a cyber attack?

It's all about the basics.

Let's get back to the basics.

Multi-factor authentication, least privilege access.

Look, this idea of least privilege access is really about who can access what within your organization.

And when you go into an enterprise, you've got thousands of SaaS applications, on-prem applications, you've got multiple clouds, you've got on-prem databases, you've got Databricks, you've got third-party APIs.

We don't do a very good job of understanding who can actually access what in an enterprise and what they can do.

And so you get what's known as access sprawl, where somebody inevitably gets access to something that they shouldn't.

And either that person does something wrong that leads to a cybersecurity breach, or maybe that person gets hacked, and the hacker is able to take advantage of privileges that are escalated and shouldn't be, in order to either exfiltrate data or move laterally within the organization to eventually cause a breach.

And so if we could just get back to the basics of only giving people access to what they need when they need it, and only allowing them to do what they need to do in real time, that would solve a lot of problems.
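To make "who can access what" concrete, here is a minimal sketch of the comparison an access review automates. The role names and entitlement strings are hypothetical examples, not any product's schema:

```python
# Minimal sketch of a least-privilege check: flag entitlements a user holds
# beyond what their role actually needs. Roles and entitlement names are
# hypothetical, for illustration only.
ROLE_NEEDS = {
    "support-rep": {"crm:read"},
    "analyst": {"crm:read", "warehouse:read"},
    "db-admin": {"warehouse:read", "warehouse:admin"},
}

def excess_privileges(role: str, granted: set[str]) -> set[str]:
    """Return entitlements granted beyond what the role requires."""
    return granted - ROLE_NEEDS.get(role, set())
```

An access review would run this comparison across every identity and every system, then feed the excess grants back to IT for revocation.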

Now, the challenge, and you might ask why we don't just do this: it's really, really hard to do at scale because of how fragmented IT is.

This is why our most recent cybersecurity investment is in this company called Veza, which is solving for this access challenge.

They're able to digest all your SaaS applications, all your databases, all your clouds, all your on-prem systems, and custom applications, and give you a single pane of glass so you can actually see who can access what and where the privilege issues are, so that you can fix them and actually enforce least privilege access.

So you don't have to worry about this issue down the road where somebody's able to do something because they somehow got access to something they shouldn't have otherwise had access to.

On a basic level, multi-factor authentication works because it's very unlikely that a hacker could hack both your computer and your phone, because they're two different networks.

Is that a simple way to explain it?

That's the idea, yeah.

It's hard for you to be in two places at once.

To be very clear, I want to say something specific about the phone.

Text-based messages, you actually can hack those through social engineering.

There are a lot of instances of folks actually figuring out how to hack the multi-factor SMS side of things, which is why I really encourage you to use applications and basically virtual keys as the primary way of doing your second-factor authentication, not just text codes.

Social engineering being somebody calls you and pretends they're somebody that they're not.

Somebody calls the telecommunications provider, Verizon or AT&T, and finds a customer support rep.

And like I said, everybody has a bad day.

And they convince that customer support rep to maybe transfer your number, or maybe tell them a code that was just sent to you.

And so with that social engineering, you can hack the SMS side of things. Versus if you're actually generating some type of code within an application like Authy, it's next to impossible to actually get that code unless you actually have my phone.

You'd have to steal my phone, unlock it, put my password into Authy, actually see that code, and then log in, all while you have my phone, all at once.

Plus, if you have things configured right, I may also get a notification when someone else logs in on my email.

So I will now see that.

And if I don't have my phone and somebody's logging in, that also now gives me awareness that I might have potentially been breached.

So many major breaches are just people forgetting to configure MFA properly.

And that's just like a base cybersecurity 101.

And you started your career at Frank Quattrone's Qatalyst.

It's a famous firm that competed directly against the large investment banks.

What did you learn at Qatalyst that you bring to your role at NEA?

Look, my biggest learning is what made Qatalyst successful.

It was the founder, Frank.

Qatalyst was a startup.

It was a different kind of startup, right?

It was a services startup, but ultimately it was a startup.

And startups are so highly dependent on the DNA and ethos and the background of the founder.

Frank taught me a lot about what to look for in a founder to ultimately build a successful company.

Frank was super unrelenting and competitive.

He wanted to win every deal.

He had a chip on his shoulder given his unfair treatment after the fallout of the dot-com bubble.

He was a master of his craft.

He had an amazing, you know, unmatched experience and network.

Frank is also an incredible storyteller and he's a master of sales.

He knows how to connect with people, win them over.

Ultimately, being a founder is about resource aggregation.

How can I convince investors, employees, customers to follow me, to believe in me, to work with me?

It's the same thing with investment banking.

Frank Quattrone is a master of that craft.

And a lot of what I saw in Frank is something I hope to find in the founders that I end up backing at NEA.

Venture itself as an industry is evolving at a hyperscale.

You saw Lightspeed now going into publics.

You saw Coatue launching interval funds.

How do you see venture as an industry evolving over the next decade?

Look, it makes sense to me, right?

I think that there are synergies in terms of being able to invest at multiple stages.

It's why we're a multi-stage investment firm.

You can invest in a seed or series A company, and as that company scales, growth stage, public company, you can invest more capital.

The dollars and returns in venture are so skewed.

Once you have a winner, you want to try to get as many dollars into that company as possible.

Also thinking about risk reward and time to value and all that, but you typically want to get a lot of money into that company.

And so it starts to make sense to have these different ways to win as you build and help create some of the generational companies that are out there.

How should people follow you on social?

Yeah, you can find me on X, Aaron EJ. You can also find me on LinkedIn, which, you know, I'm posting on and spending a lot of time on. You can also reach out to me at ajacobson@nea.com.

Always love meeting founders and talking about the future of cyber, AI, and robotics.

Thank you, Aaron, for taking the time.

Look forward to catching up soon.

Thanks, David.