This article presents experimental benchmarks for PyJuice, highlighting its efficiency in both compilation and runtime. Tests show that even models with nearly 1 billion parameters can be compiled in about 30 seconds, and PyJuice consistently outperforms baseline methods across different GPUs (RTX 4090, NVIDIA A40) and batch sizes. These results underline PyJuice’s speed, scalability, and advantage in real-world machine learning workloads.

How Fast Is PyJuice? Testing Compilation Speed Across GPUs and Batch Sizes

Author: Hackernoon

Source: Hackernoon

2025/08/26 04:00



Table of Links

Abstract and 1. Introduction

2. Preliminaries and Related Work

3. Key Bottlenecks in PC Parallelization

4. Harnessing Block-Based PC Parallelization

4.1. Fully Connected Sum Layers

4.2. Generalizing To Practical Sum Layers

4.3. Efficient Implementations by Compiling PC Layers

4.4. Analysis: IO and Computation Overhead

5. Optimizing Backpropagation with PC Flows

6. Experiments

6.1. Faster Models with PyJuice

6.2. Better PCs At Scale

6.3. Benchmarking Existing PCs

7. Conclusion, Acknowledgements, Impact Statement, and References

A. Algorithm Details

B. Additional Technical Details

C. Experimental Details

D. Additional Experiments

D.1. Speed of the Compilation Process

Table 5 reports the compilation speed of probabilistic circuits (PCs) with different structures and sizes. Experiments were conducted on a server with an AMD EPYC 7763 64-core processor and 8 RTX 4090 GPUs, of which only one was used. The results show that compilation is efficient: even the PD model, with close to 1B parameters, compiles in around 30 seconds.

Table 5. Average (± standard deviation of 3 runs) runtime (in seconds) of the compilation process of four PCs.
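A "mean ± standard deviation over 3 runs" figure such as the one in the Table 5 caption can be collected with a small timing harness. The sketch below is only an illustration: `time_runs`, `summarize`, and `fake_compile` are hypothetical names, `fake_compile` stands in for whatever compilation call is being measured, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics
import time
from typing import Callable, List, Tuple


def summarize(durations: List[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of a list of durations in seconds."""
    # Assumption: sample (n - 1) standard deviation; the source does not specify.
    return statistics.mean(durations), statistics.stdev(durations)


def time_runs(fn: Callable[[], object], n_runs: int = 3) -> List[float]:
    """Call fn n_runs times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_compile() -> int:
    """Hypothetical stand-in for a compilation step; replace with the real call."""
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    mean, std = summarize(time_runs(fake_compile, n_runs=3))
    print(f"{mean:.4f} ± {std:.4f} s")
```

Keeping the summary separate from the measurement makes it easy to reuse the same statistics for the 5-run averages reported in the later tables.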

D.2. Runtime on Different GPUs

In addition to the RTX 4090 GPU used for the experiments in Table 1, we compare the runtime of PyJuice with the baselines on an NVIDIA A40 GPU. As shown in Table 6, PyJuice is still significantly faster than all baselines for PCs of different sizes.

Table 6. Average (± standard deviation of 5 runs) runtime (in seconds) per training epoch of 60K samples for PyJuice and the baselines on five RAT-SPNs (Peharz et al., 2020b) with different sizes. All other settings are the same as described in Section 6.1.

D.3. Runtime on Different Batch Sizes

As a supplement to Table 1, we report the runtime for a RAT-SPN (Peharz et al., 2020b) with 465K nodes and 33.4M edges using batch sizes {8, 16, 32, 64, 128, 256, 512}. To isolate the cost of inference and gradient computation, we record only the time spent in the forward and backward passes, not the time used for EM updates. Results are shown in Table 7.

Table 7. Average (± standard deviation of 5 runs) runtime (in seconds) per training epoch (excluding EM updates) of 60K samples for PyJuice and the baselines on a RAT-SPN (Peharz et al., 2020b) with 465K nodes and 33.4M edges. All other settings are the same as described in Section 6.1. OOM denotes out-of-memory.
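To make the measurement protocol concrete, the sketch below shows how many iterations one 60K-sample epoch takes at each batch size, and one way to time only the forward and backward passes while still running the EM update. It is a minimal illustration under stated assumptions: `batches_per_epoch`, `PhaseTimer`, and `timed_epoch` are hypothetical names, the two callables are placeholders for the real training steps, and keeping a final partial batch (rounding up) is an assumption the text does not confirm.

```python
import math
import time
from typing import Callable

N_SAMPLES = 60_000
BATCH_SIZES = [8, 16, 32, 64, 128, 256, 512]


def batches_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of iterations per epoch, assuming a final partial batch is kept."""
    return math.ceil(n_samples / batch_size)


class PhaseTimer:
    """Accumulates wall-clock time spent inside `with timer:` blocks only."""

    def __init__(self) -> None:
        self.total = 0.0

    def __enter__(self) -> "PhaseTimer":
        self._start = time.perf_counter()
        return self

    def __exit__(self, *exc) -> bool:
        self.total += time.perf_counter() - self._start
        return False  # do not swallow exceptions


def timed_epoch(
    n_samples: int,
    batch_size: int,
    forward_backward: Callable[[], None],
    em_update: Callable[[], None],
) -> float:
    """Run one epoch and return seconds spent in forward_backward only."""
    timer = PhaseTimer()
    for _ in range(batches_per_epoch(n_samples, batch_size)):
        with timer:
            # On a GPU, the device would also need to be synchronized before
            # the clock is read, because kernels are launched asynchronously.
            forward_backward()
        em_update()  # executed, but deliberately left out of the measurement
    return timer.total


if __name__ == "__main__":
    for b in BATCH_SIZES:
        print(b, batches_per_epoch(N_SAMPLES, b))
```

Under the round-up assumption, batch size 8 gives 7,500 iterations per epoch and batch size 512 gives 118, so any fixed per-iteration overhead is paid roughly 64 times more often at the smallest batch size.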

:::info Authors:

(1) Anji Liu, Department of Computer Science, University of California, Los Angeles, USA (liuanji@cs.ucla.edu);

(2) Kareem Ahmed, Department of Computer Science, University of California, Los Angeles, USA;

(3) Guy Van den Broeck, Department of Computer Science, University of California, Los Angeles, USA;

:::

:::info This paper is available on arXiv under the CC BY 4.0 license.

:::


