Revolutionizing Data Analytics: GPU-Native Velox and NVIDIA cuDF Integration

Rongchai Wang
Oct 06, 2025 06:01

NVIDIA and IBM collaborate to integrate GPU-native Velox with NVIDIA cuDF, enhancing data analytics performance on platforms like Presto and Apache Spark.

As demand for data-driven insight grows, NVIDIA and IBM have partnered to integrate GPU-native Velox with NVIDIA cuDF. According to NVIDIA, the collaboration aims to deliver significant performance improvements over traditional CPU-based systems by exploiting the high memory bandwidth and thread count of GPUs. These enhancements are particularly beneficial for compute-heavy workloads involving multiple joins, complex aggregations, and string processing.

Velox and cuDF: A Powerful Combination

Integrating NVIDIA cuDF into the Velox execution engine allows GPU-native query execution on widely used platforms like Presto and Apache Spark. This open-source project aims to address performance bottlenecks, enabling real-time insights from massive datasets. Velox acts as an intermediary, translating query plans from systems like Presto and Spark into executable GPU pipelines powered by cuDF.

Accelerating Presto with GPU Power

By moving the entire Presto query plan to the GPU, the integration aims to boost execution speed significantly. Enhancements to GPU operators in Velox such as TableScan, HashJoin, and HashAggregation enable end-to-end GPU execution in Presto. Initial benchmarks show Presto on NVIDIA GPUs achieving significantly lower runtimes than its CPU counterparts.
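
To make the operator names above concrete, here is a minimal sketch of what a hash join followed by a hash aggregation computes. It is plain Python on the CPU, not the Velox or cuDF implementation, and the function names, table names, and columns are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Inner join: build a hash table on one side, then probe it with the other."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    joined = []
    for row in probe_rows:
        # .get() avoids inserting empty entries for keys with no match.
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined


def hash_aggregate(rows, group_key, value_key):
    """Group rows by group_key and sum value_key within each group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


# Hypothetical sample data.
customers = [{"cust_id": 1, "region": "EU"}, {"cust_id": 2, "region": "US"}]
orders = [
    {"cust_id": 1, "amount": 10.0},
    {"cust_id": 2, "amount": 5.0},
    {"cust_id": 1, "amount": 2.5},
]

joined = hash_join(customers, orders, "cust_id")
revenue_by_region = hash_aggregate(joined, "region", "amount")
print(revenue_by_region)  # {'EU': 12.5, 'US': 5.0}
```

A GPU engine runs the same build, probe, and group steps across many threads at once, which is why join-heavy and aggregation-heavy queries are the ones expected to benefit most.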

Multi-GPU Execution for Enhanced Performance

The collaboration introduces a UCX-based Exchange operator that keeps the entire execution pipeline on GPUs, using high-bandwidth NVLink and RoCE or InfiniBand for connectivity. This setup allows for substantial performance gains, with Presto on GPU showing more than a sixfold speedup in data exchange.
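
As a rough illustration of what an exchange step does in a distributed query, the sketch below hash-partitions rows so that every row with the same key is routed to the same worker. It is an assumption-laden toy: the function name and data are hypothetical, and the actual transport (UCX over NVLink, RoCE, or InfiniBand) is not modeled at all.

```python
from collections import defaultdict
from zlib import crc32


def partition_for_exchange(rows, key, num_workers):
    """Assign each row to a destination worker by hashing its key."""
    outgoing = defaultdict(list)
    for row in rows:
        # crc32 is stable across runs, unlike Python's salted hash() for strings.
        dest = crc32(str(row[key]).encode("utf-8")) % num_workers
        outgoing[dest].append(row)
    return dict(outgoing)


# Hypothetical sample data.
rows = [
    {"k": "a", "v": 1},
    {"k": "b", "v": 2},
    {"k": "a", "v": 3},
    {"k": "c", "v": 4},
]
batches = partition_for_exchange(rows, "k", num_workers=4)

for worker, batch in sorted(batches.items()):
    print(worker, batch)
```

Because rows sharing a key always land on one worker, each worker can join or aggregate its partition independently. The cost of the step is dominated by moving the batches between devices, which is where a high-bandwidth interconnect matters.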

Hybrid Execution in Apache Spark

In Apache Spark, the integration with Apache Gluten and cuDF focuses on offloading compute-intensive query stages to GPUs in hybrid clusters. This strategy puts GPU resources to work where they help most while keeping CPUs available for other tasks, resulting in significant performance improvements.
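
The offloading idea can be sketched as a simple placement rule: send a stage to the GPU only if every operator in it has a GPU implementation and the stage is large enough to be worth the transfer. Everything below is hypothetical (the `Stage` class, the operator set, the row threshold) and is not how Gluten or Velox actually decides placement.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    operators: list
    estimated_rows: int


# Assumed set of GPU-capable operators, for illustration only.
GPU_OPERATORS = {"TableScan", "HashJoin", "HashAggregation"}
# Assumed cutoff below which a stage stays on the CPU.
MIN_ROWS_FOR_GPU = 1_000_000


def choose_device(stage):
    """Return 'gpu' for supported, compute-heavy stages, otherwise 'cpu'."""
    supported = all(op in GPU_OPERATORS for op in stage.operators)
    heavy = stage.estimated_rows >= MIN_ROWS_FOR_GPU
    return "gpu" if supported and heavy else "cpu"


# Hypothetical query stages.
stages = [
    Stage("join_orders", ["TableScan", "HashJoin"], 50_000_000),
    Stage("small_lookup", ["TableScan"], 10_000),
    Stage("custom_udf", ["PythonUDF"], 80_000_000),
]

placement = {s.name: choose_device(s) for s in stages}
print(placement)
# {'join_orders': 'gpu', 'small_lookup': 'cpu', 'custom_udf': 'cpu'}
```

Stages that fall back to the CPU keep the query running even when an operator has no GPU version, which is the practical appeal of a hybrid cluster.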

Community Involvement and Future Prospects

The open-source nature of this project encourages community involvement, aiming to drive further innovations across the data processing ecosystem. By implementing reusable GPU operators in Velox, the collaboration seeks to reduce duplication and simplify maintenance while accelerating various systems.


Source: https://blockchain.news/news/revolutionizing-data-analytics-gpu-native-velox-nvidia-cudf-integration
