
Integrating Agentic AI in Computer Vision: Enhancing Video Analytics



Joerg Hiller
Nov 13, 2025 19:05

Explore three ways to integrate agentic AI into computer vision, enhancing video analytics with dense captions, VLM reasoning, and automatic scenario analysis, according to NVIDIA.

Agentic AI is revolutionizing computer vision applications by introducing advanced techniques to enhance video analytics, according to NVIDIA. The integration of vision language models (VLMs) into these systems is transforming how visual content is processed, making it more searchable and insightful.

Making Visual Content Searchable With Dense Captions

Traditional convolutional neural networks (CNNs) fall short in video search tasks: they recognize only the classes they were trained on and capture little of a scene's semantics. By embedding VLMs, businesses can generate detailed captions for images and videos, converting unstructured content into rich, searchable metadata. This approach enables more flexible visual search capabilities, surpassing the constraints of file names or basic tags.
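The caption-to-metadata flow can be sketched as a small captioning index. This is a minimal illustration, not NVIDIA's implementation: `caption_fn` is a placeholder for a real VLM call, and the asset IDs and captions are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class CaptionIndex:
    """Minimal inverted index over VLM-generated captions.

    `caption_fn` stands in for a real VLM call that returns a dense
    caption for a frame; here it is a simple placeholder.
    """
    caption_fn: callable
    index: dict = field(default_factory=dict)  # token -> set of asset ids

    def add(self, asset_id: str, frame) -> str:
        # Caption the frame, then index every token of the caption.
        caption = self.caption_fn(frame)
        for token in caption.lower().split():
            self.index.setdefault(token, set()).add(asset_id)
        return caption

    def search(self, query: str) -> set:
        """Return asset ids whose captions contain every query token."""
        tokens = query.lower().split()
        if not tokens:
            return set()
        hits = [self.index.get(t, set()) for t in tokens]
        return set.intersection(*hits)

# Placeholder "VLM": in practice this would be a model inference call.
fake_vlm = lambda frame: frame["mock_caption"]

idx = CaptionIndex(caption_fn=fake_vlm)
idx.add("cam1/0001", {"mock_caption": "silver sedan with dented rear bumper"})
idx.add("cam2/0042", {"mock_caption": "red truck entering loading dock"})
print(idx.search("dented bumper"))  # {'cam1/0001'}
```

A production system would replace the token index with embedding-based retrieval, but the shape is the same: captions become queryable metadata instead of opaque pixels.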

For instance, UVeye, an automated vehicle-inspection system, processes over 700 million high-resolution images monthly. By applying VLMs, it converts visual data into structured reports, detecting defects with exceptional accuracy. Similarly, Relo Metrics uses VLMs to quantify the value of media investments in sports marketing, providing real-time monetary valuations of high-impact moments.

Augmenting Alerts with VLM Reasoning

While CNN-based systems typically generate binary detection alerts, they often lack contextual understanding, leading to false positives. VLMs can augment these systems, providing contextual insights into alerts. For example, Linker Vision uses VLMs to verify critical city alerts, reducing false positives and enhancing municipal response during incidents.
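The augmentation pattern described above amounts to a second-stage check: the CNN raises a candidate alert, and a VLM is asked to confirm it before it reaches an operator. The sketch below is illustrative only; `vlm_ask`, the frame IDs, and the canned answers are all stand-ins for a real model call.

```python
def verify_alert(alert, vlm_ask):
    """Second-stage check: ask a VLM a yes/no question about the frame
    that triggered a CNN alert, and flag likely false positives.

    `vlm_ask(frame, question)` is a placeholder for a real VLM call
    returning a free-text answer.
    """
    question = f"Does this frame actually show {alert['event']}? Answer yes or no."
    answer = vlm_ask(alert["frame"], question).strip().lower()
    return {**alert, "confirmed": answer.startswith("yes"), "vlm_answer": answer}

# Stubbed VLM for illustration: returns a canned answer per frame.
canned = {
    "f1": "Yes, a vehicle is blocking the intersection.",
    "f2": "No, this is a shadow on the road, not flooding.",
}
stub_vlm = lambda frame, question: canned[frame]

alerts = [
    {"frame": "f1", "event": "a blocked intersection"},
    {"frame": "f2", "event": "street flooding"},
]
verified = [verify_alert(a, stub_vlm) for a in alerts]
confirmed = [a for a in verified if a["confirmed"]]
print(len(confirmed))  # 1
```

Only confirmed alerts are escalated, which is how a contextual second opinion cuts false positives without retraining the underlying detector.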

The integration of VLMs enables cross-department coordination, turning observations into actionable insights. This capability is crucial for smart city implementations, where rapid and informed responses are necessary.

Automatic Analysis of Complex Scenarios

Agentic AI systems, combining VLMs with reasoning models, LLMs, and computer vision, can process complex queries across various modalities. This integration allows for deeper and more reliable insights beyond surface-level understanding.
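An agentic system of this kind typically uses a planner (an LLM) to decompose a free-form query into calls against specialized tools such as detectors, captioners, and summarizers. The toy loop below shows only the control flow; the planner and tool bodies are hypothetical placeholders, not any particular vendor's API.

```python
def run_agent(query, tools, plan_fn):
    """Tiny agent loop: a planner (stand-in for an LLM) decomposes the
    query into named tool calls, and the results are collected in order.
    """
    steps = plan_fn(query)  # e.g. [("detect", "frame_07"), ("caption", "frame_07")]
    return [tools[tool_name](arg) for tool_name, arg in steps]

# Placeholder tools standing in for real CV / VLM components.
tools = {
    "detect":  lambda frame: f"detections({frame})",
    "caption": lambda frame: f"caption({frame})",
}

# Toy planner: always runs detection, then captioning, on one frame.
toy_plan = lambda query: [("detect", "frame_07"), ("caption", "frame_07")]

out = run_agent("What happened near gate 3?", tools, toy_plan)
print(out)  # ['detections(frame_07)', 'caption(frame_07)']
```

In a real deployment the planner chooses tools dynamically per query and may iterate on intermediate results; the value of the agentic layer is exactly that routing and synthesis step.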

Levatas, for instance, uses VLMs in visual-inspection solutions for critical infrastructure. By automating video analytics, it accelerates the inspection process, providing detailed reports and enabling swift responses to detected issues. This integration ensures reliable and efficient operations in sectors like energy and logistics.

Powering Agentic Video Intelligence with NVIDIA Technologies

Developers can leverage NVIDIA’s multimodal VLMs, such as NVCLIP and Nemotron Nano V2, to build metadata-rich indexes for advanced search and reasoning. The NVIDIA Blueprint for video search and summarization (VSS) allows for the integration of VLMs into computer vision applications, enabling smarter operations and real-time process compliance.
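A metadata-rich index of the kind mentioned above is usually built on embeddings: each clip is encoded to a vector, and queries are matched by similarity. The snippet below sketches that retrieval step with hand-made 3-dimensional vectors in place of real model embeddings; it is a conceptual illustration, not the NVCLIP or VSS API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=1):
    """index: list of (asset_id, embedding). Rank assets by similarity."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [asset_id for asset_id, _ in ranked[:k]]

# Toy 3-d "embeddings" standing in for real encoder outputs.
index = [
    ("clip_a", [0.9, 0.1, 0.0]),
    ("clip_b", [0.0, 0.2, 0.9]),
]
print(top_k([1.0, 0.0, 0.0], index, k=1))  # ['clip_a']
```

With a real encoder, the query vector would come from embedding the user's text, so the same index serves both search and downstream reasoning.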

These advancements demonstrate NVIDIA’s commitment to enhancing AI capabilities within video analytics, fostering more intelligent and efficient systems across various industries.

For more details, visit the NVIDIA blog.

Image source: Shutterstock

Source: https://blockchain.news/news/integrating-agentic-ai-computer-vision-enhancing-video-analytics

