
Maintaining Robots and Humanoids: Why You Should Prepare for Scaled Humanoid Robot Deployments within a Two-Year Horizon

Published on 15.11.2025

The same forces that brought us ChatGPT overnight are now converging on humanoid robotics. Within two years, we'll see humanoid robots operating at production scale—not just in controlled factory environments, but expanding rapidly into warehouses, retail spaces, and eventually homes. This isn't speculation. It's the inevitable result of exponential curves that are already in motion.

The Deep Learning Revolution: A Small Community That Changed Everything


To understand why humanoids are imminent, we need to revisit how we got here. Neural networks were once considered an outlier technology—a curiosity that most scientists dismissed as a dead-end for artificial intelligence. But a small, interconnected community of researchers saw something others missed. They believed in scaling laws before anyone else did, and they knew each other, collaborated, and drove a revolution that seemed impossible to the mainstream.

The breakthrough moment came with the 2017 paper "Attention Is All You Need" by Vaswani et al., which introduced the Transformer architecture. This paper has been cited more than 173,000 times, placing it among the top ten most-cited papers of the 21st century. The authors—Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Łukasz Kaiser, and Illia Polosukhin—were working at Google when they published this research. Their innovation didn't just improve existing models—it created an entirely new paradigm that powers everything from ChatGPT to code generation to, now, robotics control systems.

The names are now legendary: Geoffrey Hinton, the godfather of deep learning who pioneered backpropagation and won the Turing Award. Yann LeCun, who developed convolutional neural networks and went on to lead Facebook AI Research (now Meta AI). Andrew Ng, who brought deep learning to Stanford and later to massive scale at Google Brain and Baidu. Demis Hassabis, who founded DeepMind and led the charge on reinforcement learning breakthroughs like AlphaGo.

Jeff Dean at Google became the architect of the infrastructure that made deep learning possible at scale, building the distributed systems that could train massive models. Ilya Sutskever, co-founder of OpenAI and chief scientist, was a student of Hinton's and became one of the key voices pushing for ever-larger models. Dario Amodei and Daniela Amodei, who worked at OpenAI before founding Anthropic, were part of this tight-knit community driving the scaling hypothesis.

Then came the entrepreneurs and visionaries who saw the commercial potential: Sam Altman, who took the helm at OpenAI and bet everything on scaling. Elon Musk, an early OpenAI co-founder who later launched his own AI initiatives and saw the connection to robotics early. Jensen Huang at NVIDIA, who recognized that GPUs weren't just for graphics—they were the engine for the AI revolution.

These weren't separate movements. These people knew each other. They shared ideas at conferences, collaborated on papers, and built on each other's work. When AlexNet won ImageNet in 2012—a watershed moment demonstrating that deep learning could outperform traditional computer vision—the community exploded. NVIDIA immediately understood that their GPUs were essential infrastructure. Google, Facebook, and others began massive hiring sprees.

The Scaling Laws: More Compute, More Data, Better Results


The insight that united this community was deceptively simple: scaling works. Add more compute, feed in more data, and the results get better. Not just incrementally better—exponentially better.

This pattern has held for over a decade. From image recognition to language models to game-playing AI, the trajectory has been consistent. Models that seemed impossibly large a few years ago are now routine. GPT-3's 175 billion parameters seemed enormous in 2020; by 2024, frontier labs were reportedly training models with parameter counts in the trillions.

Recently, we've seen scaling laws extend into new domains. Sora, OpenAI's video generation model, demonstrated that the same principles apply to video. Generate enough training data, scale the model, and suddenly you can create photorealistic video from text prompts. This breakthrough is more important for robotics than most people realize.
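The "more compute, better results" relationship is often summarized as a power law. The sketch below is purely illustrative: the function shape loosely follows published LLM scaling-law papers, but the coefficients (`scale`, `exponent`) are made-up assumptions, not measured values.

```python
# Illustrative power-law scaling relation (hypothetical coefficients):
# predicted loss falls smoothly and predictably as compute grows.

def predicted_loss(compute_flops, scale=1e4, exponent=0.05):
    """Toy power law: loss ~ scale * compute^(-exponent)."""
    return scale * compute_flops ** (-exponent)

# Each 100x increase in compute yields a consistent relative improvement.
for flops in (1e18, 1e20, 1e22):
    print(f"{flops:.0e} FLOPs -> predicted loss {predicted_loss(flops):.2f}")
```

The key property investors and researchers latched onto is the predictability: on a log-log plot this is a straight line, so tomorrow's model quality can be forecast from today's compute budget.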

The Convergence: The Same People, Now Building Robots


Here's what's critical: the same people who drove the deep learning revolution are now building humanoid robots.

Sam Altman is backing Figure AI and continues to see robotics as the physical instantiation of AI.
Elon Musk is all-in on Tesla Optimus, treating it as important as autonomous driving.
Jensen Huang declares at every keynote that "physical AI is the next big thing"—and NVIDIA is building the infrastructure to support it.
Google (through DeepMind and Google Robotics) is developing robotic systems powered by their AI breakthroughs.
OpenAI has robotics research as a core focus.

Companies like Figure, 1X, Tesla (Optimus), Sanctuary AI, and others are racing toward production-scale humanoids. Major investors are taking notice: SoftBank is betting heavily on robotics, investing in companies like AutoStore, ABB Robotics, Brain Corp, and ARM (for AI processing chips).

From ChatGPT to Physical AI — the next wave of AI.
This isn't a new industry finding its footing. This is the most successful technology community in history—the one that created the AI revolution—turning its attention to physical embodiment.

The Money Speaks: Hundreds of Billions Flowing Into Physical AI


Capital follows conviction. When the smartest investors and largest tech companies deploy tens of billions into a sector, they're not betting on distant futures—they're positioning for imminent transformation. The funding flowing into physical AI and humanoid robotics in 2024–2025 tells us everything we need to know about the timeline.

Record-Breaking Funding Rounds

Figure AI

  • $675M Series B (February 2024) at $2.6B valuation
  • Over $1B Series C (September 2025) at $39B valuation
  • Total raised: $1.75B+ in less than two years
  • In talks for an additional $1.5B
  • Manufacturing capacity: 12,000 units/year, targeting 100,000 within 4 years

Physical Intelligence

  • $70M seed (March 2024) at $400M valuation
  • $400M Series A (November 2024) at $2.4B valuation
  • Total raised: $470M in under 8 months

Other Major Raises (2024–2025)

  • UBTECH Robotics: $1B
  • 1X Technologies: $125M
  • NEURA Robotics: €120M
  • Agility Robotics: $150M+
  • Sanctuary AI: $140M+
  • Apptronik: $50M+

Capital Deployment Overview

When including corporate R&D, NVIDIA data centers, manufacturing investments, and strategic acquisitions, $50–65B/year is flowing into humanoid robotics and physical AI—possibly up to $100B with adjacent sectors included.

This is comparable to the GDP of entire nations, and more than was invested into the early internet (1995–2000).

Why the Funding Deluge?


These aren't speculative bets. Investors are deploying capital because:

  • Proven scaling laws
  • Massive market demand
  • Under-12-month payback periods
  • Labor shortages
  • Competitive pressure
  • Best AI talent moving to embodied AI

The message is clear: humanoids are not a 2035 technology — they are a 2026–2028 technology.
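The payback-period argument can be made concrete with back-of-the-envelope arithmetic. All figures below are illustrative assumptions for the sake of the calculation, not vendor pricing or audited labor costs.

```python
# Back-of-the-envelope payback calculation behind "under-12-month"
# payback claims. All numbers are illustrative assumptions.

def payback_months(robot_cost, monthly_operating_cost, monthly_labor_saved):
    """Months until cumulative net savings cover the up-front robot cost."""
    net_monthly_saving = monthly_labor_saved - monthly_operating_cost
    if net_monthly_saving <= 0:
        return float("inf")  # the robot never pays for itself
    return robot_cost / net_monthly_saving

# e.g. a $60k humanoid replacing $7k/month of labor at $1.5k/month upkeep:
months = payback_months(60_000, 1_500, 7_000)
print(f"Payback in {months:.1f} months")  # ~10.9 months
```

Under assumptions like these, even a robot several times more expensive than today's cheapest units pays back within the first year, which is why the economics attract capital regardless of how the technology hype plays out.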

The ChatGPT Moment: Exponential Adoption Happens Faster Than Anyone Expects


ChatGPT became the fastest-growing application in history. It reached 100 million users in two months—faster than TikTok, faster than Instagram, faster than any consumer technology ever. Hundreds of millions of people adopted a completely new interface for interacting with computers almost overnight.

This matters for humanoids because it demonstrates how quickly exponential adoption happens. For years, people used Google Translate and barely noticed improvements. Then suddenly, neural machine translation made it dramatically better—and adoption exploded. For years, people ignored voice assistants. Then ChatGPT made AI genuinely useful—and the world changed.

NVIDIA's Jensen Huang talks about a "double exponential curve": exponential growth in the user base, combined with exponential growth in usage per user (longer context windows, more complex tasks, more tokens processed). This compounds into hypergrowth.
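The "double exponential" is just the product of two compounding curves. The toy model below makes that explicit; the growth rates and starting values are illustrative assumptions, not measured figures.

```python
# Toy model of the "double exponential": user count and per-user usage
# each compound monthly, so total demand grows as their product.
# Growth rates are illustrative assumptions.

def total_demand(months, users0=1e6, user_growth=0.10,
                 usage0=100, usage_growth=0.08):
    users = users0 * (1 + user_growth) ** months
    usage_per_user = usage0 * (1 + usage_growth) ** months
    return users * usage_per_user

# Demand after one and two years:
print(f"12 months: {total_demand(12):.3e} tokens/month")
print(f"24 months: {total_demand(24):.3e} tokens/month")
```

Because the two exponentials multiply, total demand grows at the combined rate (here roughly 18.8% per month), which is why infrastructure providers like NVIDIA see demand outpace even optimistic single-curve forecasts.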

Humanoid robots are poised for the same trajectory. Right now, they seem like expensive prototypes. But when they cross the utility threshold—when they can perform economically valuable tasks reliably—adoption will happen faster than anyone expects.

Once AI becomes useful, adoption goes exponential.

Humanoids will follow the same pattern — slow, then suddenly everywhere.

Autonomous Vehicles: The Proof of Concept


We're already seeing this pattern with autonomous vehicles. The miles driven autonomously are growing exponentially. Waymo is operating commercial robotaxi services in multiple cities. The safety data is becoming hard to ignore: reported figures suggest autonomous vehicles are involved in far fewer injury-causing crashes than human drivers in the domains where they operate.

The market cap opportunity is staggering. All car manufacturers and ride-sharing companies combined represent a massive market—and autonomous driving will capture much of that value. We're watching it happen in real-time: slowly at first, then all at once.

This is the template for humanoids. Slow progress, skepticism, then sudden exponential deployment.


Hardware Is Ready. Software Is the Bottleneck.


If you watch recent videos from Boston Dynamics (Atlas), Tesla (Optimus), Figure, Unitree, or Sanctuary AI, the hardware is astonishing. These robots can do backflips—something only elite human athletes can perform. They can manipulate objects with precision, navigate complex environments, and perform tasks that seemed impossible just years ago.

Unitree's humanoids cost roughly $20,000 USD, with some competitors well below that mark. Production-grade humanoids will be more expensive, but this gives a good indication of where hardware costs are heading.

The hardware is beyond human-level capability in many dimensions. The bottleneck is software: perception, planning, decision-making, and task execution.

This is exactly where the deep learning breakthroughs matter most.


Video Generation: The Key to Robot Planning


Here's the insight that connects everything: video generation models like Sora are effectively world models. They understand physics, object permanence, spatial relationships, and temporal dynamics. They can predict what happens next in a scene.

Now flip this around. Instead of generating a video from a prompt, use the same model for robot planning. The robot observes a messy kitchen. You give it a prompt: "clean the kitchen." The model generates a video showing the sequence of steps—dishes moved to the sink, counters wiped, floor swept. This becomes the plan.

The robot executes the plan step-by-step, constantly checking its actions against the generated video. If the outcome doesn't match the desired state, it can self-correct using reinforcement learning. Eventually, the kitchen gets cleaned.

This architecture—using generative models for planning and reinforcement learning for execution—is a breakthrough. It leverages all the scaling progress from language models and video generation and applies it directly to physical tasks.
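The plan-generate-execute-compare loop described above can be sketched in a few lines. This is a minimal sketch, not a real robotics stack: `VideoWorldModel` and `Robot` are hypothetical stand-in interfaces, and the "plan" is a list of symbolic steps rather than predicted video frames.

```python
# Minimal sketch of the plan-with-a-world-model / execute-with-feedback
# loop. `VideoWorldModel` and `Robot` are hypothetical interfaces
# standing in for a real video model and robot control stack.

class VideoWorldModel:
    def generate_plan(self, observation, prompt):
        # A real model would predict future frames conditioned on the
        # scene and prompt; here we return fixed symbolic steps.
        return ["move dishes to sink", "wipe counters", "sweep floor"]

    def matches(self, observed, predicted):
        # A real system would compare observed outcomes to predicted frames.
        return observed == predicted

class Robot:
    def execute(self, step):
        # A real robot would act in the world and re-observe the scene;
        # this stub simply reports the step as completed.
        return step

def run_task(model, robot, observation, prompt, max_retries=3):
    for step in model.generate_plan(observation, prompt):
        for _ in range(max_retries):
            outcome = robot.execute(step)
            if model.matches(outcome, step):
                break  # step succeeded, move to the next one
            # otherwise: self-correct and retry (this is where
            # reinforcement-learning-based correction would apply)
    return "done"

print(run_task(VideoWorldModel(), Robot(), "messy kitchen", "clean the kitchen"))
```

The important design point is the separation of concerns: the generative model supplies the goal trajectory, while a lower-level controller closes the loop against reality and retries on mismatch.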


Industrial Tasks Are Simpler Than They Seem


Cleaning a kitchen is harder than:

  • warehouse picking
  • pallet moving
  • assembly
  • machine tending

Industrial environments are structured, repetitive, and predictable.

Thus humanoids will first scale in:

  • warehouses
  • factories
  • logistics centers

Historical Precedent: Google Translate’s Overnight Transformation


When Google first prototyped neural machine translation, it took 12 hours to translate a single sentence. Then a team of engineers spent a weekend optimizing the code, and suddenly it took 120 milliseconds. Over time, they also reduced the compute requirements dramatically.

We're seeing the same pattern with image and video generation. Models that were impossibly slow and expensive to run are becoming faster and cheaper through algorithmic improvements and hardware advances.

NVIDIA's hardware roadmap illustrates this: the jump from the H100 to the Blackwell architecture delivered over 10x improvement in AI performance. These gains compound. Faster inference means robots can process and act more quickly. Cheaper compute means deployment at scale becomes economically feasible.

It's not a hardware problem—we've seen robots perform complex physical actions quickly (like drumming or backflips). The bottleneck is processing: taking sensory input and generating the right actions. As inference speeds improve exponentially, this bottleneck dissolves.

The Two-Year Timeline: Why It’s Realistic


  • hardware is mature
  • multimodal models improving monthly
  • scaling laws apply to robotics
  • economics irresistible
  • adoption exponential

We will see production deployments, not prototypes.

The Workforce Transformation: With WAKU Care Everyone Becomes a Robot Operator


This transformation will change the nature of work itself. Most workers—perhaps eventually all workers—will become robot operators and supervisors. Instead of performing manual tasks, humans will manage fleets of humanoids, troubleshoot issues, and optimize workflows.

This isn't displacement—it's augmentation. A single human overseeing ten humanoids becomes ten times more productive. The economic value they create multiplies.

But this requires infrastructure: training platforms, maintenance systems, real-time monitoring, and knowledge bases that turn factory workers into robot orchestrators. The companies that build this infrastructure—that make it seamless for workers to transition from manual labor to robot supervision—will capture enormous value.

Humans will shift from:

  • performing tasks

to:

  • supervising fleets
  • managing exceptions
  • guiding robots
  • maintaining systems

This requires:

  • maintenance platforms
  • knowledge systems
  • real-time monitoring
  • operator assistance tools

This is why we built WAKU Care: a maintenance and service platform powering the robot revolution.


Conclusion: The Exponential Curve Is Already Here


Unitree targeted 10,000 humanoids in 2025 — and likely exceeded it.

Humanoid robots at scale are the next chapter of the scaling-law revolution that created ChatGPT.

The hardware is ready.
The models are scaling.
The economics are undeniable.
Adoption will be exponential.

In two years, humanoids will work:

  • in factories
  • in warehouses
  • in logistics hubs

at scale.

The question is no longer if.
It is:

Will your organization lead — or fall behind?


WAKU Care

Software for Maintenance and Service Teams

Double Maintenance Efficiency.
Intuitive, digital and measurable.
Manage knowledge. Avoid chaos.

Contact our robot experts!

WAKU Robotics supports you in choosing the right robot for your application. We take care of robot procurement as well as on-site testing. Our WAKU Care software helps you operate robots across manufacturers and analyze your processes.

Contact WAKU