The next AI era is here, and it requires living in the real world. AI needs us to wear it, move with it, and even drive it. Why? Because the next breakthrough isn't more text, it's massive amounts of real-world video and sensor data. Tech giants are now in a high-stakes race for both profitability and dominance, competing to capture physical data at scale through wearables, vehicles, and cameras everywhere. In this edition, we also share Physical AI tools and learning resources to help you understand and build the next phase of AI. Stay curious.
As large language models (LLMs) hit a training plateau, having already consumed much of the text that exists in the world, inference is becoming the priority for these models.
What are world models? They are neural networks that understand the dynamics of the real world, including physics and spatial properties. They take text, images, video, and movement data as input and generate videos that simulate realistic physical environments. Physical AI developers use world models to generate custom synthetic data or downstream AI models for training robots. Put simply, Physical AI is the system that bridges the digital and physical worlds, allowing machines to perceive, reason, and interact with their surroundings in real time.
Humans and other animals interact with their surroundings almost unconsciously. We walk through spaces without hitting objects, put on our clothes, drive, navigate the world using our senses, and even rearrange our own spaces to make them easier to move through. For now, LLMs only navigate the text, images, and videos they receive as input and produce outputs accordingly. World models are trained to give machines "spatial intelligence": an internal understanding of physics, cause and effect, and 3D space. To build that understanding, they ingest millions of hours of real-world video and learn motion and dynamics. By predicting what happens next, the model can generate simulations, enabling robots to practice tasks virtually before attempting them physically. These learned capabilities are then fine-tuned for specific hardware configurations, such as autonomous vehicles or robotic appendages.
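To make that "predict what happens next" idea concrete, here is a minimal, hypothetical sketch of next-frame prediction, the kind of objective that sits at the core of many world models. Everything in it (the TinyWorldModel class, the frame sizes, the random stand-in clips) is an illustrative assumption written in PyTorch, not the architecture of any production system.

```python
# Toy sketch: a world model as a next-frame predictor.
# Assumed names and shapes; real systems are orders of magnitude larger.
import torch
import torch.nn as nn

class TinyWorldModel(nn.Module):
    """Encodes a short clip of frames and predicts the frame that follows."""
    def __init__(self, frame_dim=64 * 64 * 3, hidden_dim=512):
        super().__init__()
        self.encoder = nn.Linear(frame_dim, hidden_dim)                  # frame -> latent state
        self.dynamics = nn.GRU(hidden_dim, hidden_dim, batch_first=True) # latent dynamics over time
        self.decoder = nn.Linear(hidden_dim, frame_dim)                  # latent state -> predicted frame

    def forward(self, frames):                  # frames: (batch, time, frame_dim)
        latents = torch.relu(self.encoder(frames))
        states, _ = self.dynamics(latents)
        return self.decoder(states[:, -1])      # predict the frame after the clip

model = TinyWorldModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Random tensors stand in for real-world video footage.
clip = torch.rand(8, 16, 64 * 64 * 3)           # 8 clips of 16 flattened 64x64 RGB frames
next_frame = torch.rand(8, 64 * 64 * 3)         # the frame each clip should predict

optimizer.zero_grad()
prediction = model(clip)
loss = nn.functional.mse_loss(prediction, next_frame)  # how wrong was the prediction?
loss.backward()
optimizer.step()
```

In a real pipeline, the random tensors would be replaced by clips from the wearable and vehicle footage discussed above, and the same learned dynamics would then be fine-tuned for a specific robot or vehicle.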
Remember the saying "data is the new oil"? Companies with the most video data (YouTube, Meta, Tesla, and perhaps the ESPNs of the world for sports) now have the upper hand in this new paradigm. And this is just the beginning: as the battle for wearables intensifies, the data these devices generate becomes more valuable, because Meta glasses worn by millions means hours of real-world footage for training spatial models. These wearables may become ubiquitous and relatively cheap because, as with social media, we become the product, supplying the training data (video) to tech companies as we wear these devices and drive cars fitted with multiple cameras.
Major tech companies like NVIDIA, Google DeepMind, and Meta are developing world models to overcome current AI limitations, such as the lack of an intuitive understanding of cause and effect and 3D space. Specialized startups like World Labs and AMI Labs are also working on this "spatial intelligence" to enable robots and autonomous systems to predict physical outcomes before acting, with applications in the automotive, manufacturing, and entertainment industries. Startups and established companies alike are rushing to release wearables to get ahead of the next era: Snap just spun its wearables division into its own company, Google's glasses are making a comeback, we all know Meta's Ray-Ban devices, and OpenAI has been working on its AI device with Jony Ive.
This is just the beginning. In the next edition, I'll break down how spatial computing, world models, and Physical AI will shape decision-making: how machines won't just answer questions, but will tell us what to do next.
Courses & Educational Content
Microsoft spends big, but Meta and Google are emerging as the winners so far.
Why Meta looks like the real winner
Microsoft is spending aggressively on AI infrastructure, but most AI profits today still come from enterprise products like Azure AI and Copilot, not direct consumer monetization. AI-driven capex is rising faster than near-term margins.
Meta, meanwhile, has already cracked AI monetization. AI-driven ranking and recommendation systems across Instagram, Facebook, and WhatsApp have increased engagement and ad efficiency, directly lifting ad impressions and revenue. Ads account for ~98% of Metaâs revenue, and AI is now core to that engine.
Google is embedding AI directly into Search, Chrome (AI Mode), Workspace (Gemini), and Cloud. These upgrades improve retention, ad performance, and cloud growth, keeping AI tightly linked to revenue instead of standalone products.
Meta is monetizing AI immediately, Google is reinforcing its ads-plus-cloud flywheel, and Microsoft is still working to convert its massive AI investment into consumer-level profit.
Physical AI


