
Offline experiments confirm the feasibility of gradient-based contribution estimation by analyzing noise effects on CIFAR-10.

Offline Generative Active Learning: Feasibility and Limitations

Author: Hackernoon

Source: Hackernoon

2025/12/05 11:00

Table of Links

Abstract and 1. Introduction

2. Related work

2.1. Generative Data Augmentation

2.2. Active Learning and Data Analysis

3. Preliminary

4. Our method

4.1. Estimation of Contribution in the Ideal Scenario

4.2. Batched Streaming Generative Active Learning

5. Experiments and 5.1. Offline Setting

5.2. Online Setting

6. Conclusion, Broader Impact, and References

A. Implementation Details

B. More ablations

C. Discussion

D. Visualization

5. Experiments

First, we run analytical experiments in an offline setting (as discussed in Remark 4.6) to verify the feasibility of our method and to help readers understand it. We then conduct the main experiments in the online setting and compare against our baseline. Key ablation studies further substantiate the efficiency of our method. Implementation details are given in Appendix A.

5.1. Offline Setting

5.1.1. CIFAR-10

As shown in Figure 2, the distribution of contributions shifts progressively to the left as the noise scale increases, indicating that excessive noise tends to harm the model. Note that the split with a noise of 0 is our training set, so the contribution values of these samples are concentrated around zero. In other words, these samples can no longer benefit the model because they have already been fully utilized in previous training. This observation is consistent with previous active learning work (Cai et al., 2013; Ash et al., 2021; Saran et al., 2023), which also estimates the amount of information or the difficulty of samples through gradients. However, those works do not distinguish positive from negative contributions and only select samples with larger absolute values. We further conduct quantitative experiments, shown in Table 1, to show that using our method to select data can effectively improve the performance of the model.

Figure 2. The distribution of contributions under different noise scales.

Table 1. Using our method to select samples brings improvement to the model.
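The role of the sign of a contribution can be illustrated with a small sketch. It is not the authors' implementation: it assumes that a sample's contribution is approximated to first order by the dot product between that sample's loss gradient and the mean gradient on a reference set, so that a positive value means one SGD step on the sample is expected to lower the reference loss. The logistic model, the toy data, and every function name below are hypothetical, and a flipped label stands in for a corrupted sample.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_grad(w, x, y):
    """Gradient of the binary cross-entropy loss of a linear logistic model w.r.t. w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def mean_grad(w, data):
    """Average per-sample gradient over a list of (x, y) pairs."""
    grads = [sample_grad(w, x, y) for x, y in data]
    return [sum(col) / len(grads) for col in zip(*grads)]


def contribution(w, sample, reference, lr=1.0):
    """First-order estimate of the reference-loss decrease after one SGD step on `sample`.

    Assumption: L_ref(w - lr * g) ~= L_ref(w) - lr * <g, g_ref>.
    """
    g = sample_grad(w, *sample)
    g_ref = mean_grad(w, reference)
    return lr * sum(a * b for a, b in zip(g, g_ref))


# Hypothetical toy setup: untrained weights and a two-point reference set.
w = [0.0, 0.0]
reference = [([1.0, 0.0], 1), ([-1.0, 0.0], 0)]

candidates = {
    "clean": ([1.0, 0.0], 1),      # agrees with the reference data
    "corrupted": ([1.0, 0.0], 0),  # same input, flipped label
    "unrelated": ([0.0, 1.0], 1),  # gradient orthogonal to the reference gradient
}

scores = {name: contribution(w, s, reference) for name, s in candidates.items()}
print(scores)

# Signed selection prefers the clean sample; selection by absolute value
# cannot tell the clean and corrupted samples apart in this toy case.
best_signed = max(scores, key=scores.get)
```

In this toy case the clean and corrupted samples have gradients of equal magnitude and opposite sign, which is the situation where selecting by absolute value alone would be misleading.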

5.1.2. LVIS

This gradient then serves to estimate each instance’s contribution. We then rank the instances in decreasing order of contribution, which facilitates per-image analysis. As an illustrative example, we use the ‘bun’ category from LVIS, because we find that Stable Diffusion performs poorly on this category: it often confuses ‘bun’ with ‘bunny’ and therefore generates ambiguous data. As depicted in Figure 3, the instances with the largest contributions are nearly unambiguous, whereas the instances with the smallest contributions are mostly incorrect and show rabbits. Our method can therefore effectively filter out ambiguous generated data.
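The ranking step described above can be sketched as follows. The instance ids and contribution values are made up purely for illustration and are not results from the paper; the helper names are hypothetical as well.

```python
def rank_by_contribution(scores):
    """Return instance ids sorted by decreasing contribution."""
    return sorted(scores, key=scores.get, reverse=True)


def best_and_worst(scores, k):
    """Return the k highest-ranked and the k lowest-ranked instance ids."""
    ranked = rank_by_contribution(scores)
    return ranked[:k], ranked[-k:]


def keep_positive(scores):
    """Keep only instances whose estimated contribution is positive, in ranked order."""
    return [i for i in rank_by_contribution(scores) if scores[i] > 0]


# Hypothetical contribution scores for generated 'bun' instances.
scores = {
    "gen_001": 0.31,
    "gen_002": 0.12,
    "gen_003": -0.27,
    "gen_004": 0.02,
    "gen_005": -0.40,
}

best, worst = best_and_worst(scores, k=2)
kept = keep_positive(scores)
print(best, worst, kept)
```

Inspecting `best` and `worst` corresponds to the per-image analysis, while `keep_positive` is one simple filtering rule; the threshold of zero is an assumption, not a value taken from the paper.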

Figure 3. The best and worst samples found using our contribution estimation function for the LVIS class ‘bun’.

Table 2. Main results on LVIS. “+CLIP” means using CLIP to filter the generated data.

To verify that online learning is indispensable, we first use the offline method to filter the generated data for training and compare it with our baseline. As shown in Figure 4, the offline method brings only a slight improvement to the final model performance. Moreover, the improvement is quite noticeable in the early stage of training but gradually diminishes as training proceeds. We conjecture that this trend arises because the offline contribution estimation relies on the initial model: as the model is trained, its parameters change significantly, which makes the offline estimates inaccurate. Online contribution estimation is therefore necessary.
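Why an estimate tied to the initial model can go stale is easy to see in a toy setting. The sketch below again assumes a first-order, gradient-dot-product contribution and a hypothetical logistic model; it is an illustration of the conjecture, not the paper's experiment. The same candidate is scored once with the initial weights (the offline score) and once with the weights after training (a score an online method would recompute).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sample_grad(w, x, y):
    """Gradient of the binary cross-entropy loss of a linear logistic model w.r.t. w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
    return [(p - y) * xi for xi in x]


def mean_grad(w, data):
    grads = [sample_grad(w, x, y) for x, y in data]
    return [sum(col) / len(grads) for col in zip(*grads)]


def contribution(w, sample, reference):
    """Assumed first-order contribution: <sample gradient, mean reference gradient>."""
    g = sample_grad(w, *sample)
    return sum(a * b for a, b in zip(g, mean_grad(w, reference)))


# Hypothetical data: the reference set doubles as the training set here.
data = [([1.0, 0.0], 1), ([-1.0, 0.0], 0)]
candidate = ([1.0, 0.0], 1)

w_init = [0.0, 0.0]
offline_score = contribution(w_init, candidate, data)  # computed once, before training

# Plain gradient descent; the parameters drift away from w_init.
w = list(w_init)
for _ in range(200):
    g = mean_grad(w, data)
    w = [wi - 1.0 * gi for wi, gi in zip(w, g)]

online_score = contribution(w, candidate, data)  # recomputed with current parameters
print(offline_score, online_score)
```

The recomputed score is far smaller than the offline one, so a selection made once from the initial model overstates how useful the candidate still is later in training. The same effect is consistent with the earlier observation that samples already used for training have contributions concentrated around zero.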

:::info Authors:

(1) Muzhi Zhu, Zhejiang University, China (equal contribution);

(2) Chengxiang Fan, Zhejiang University, China (equal contribution);

(3) Hao Chen, Zhejiang University, China (haochen.cad@zju.edu.cn);

(4) Yang Liu, Zhejiang University, China;

(5) Weian Mao, Zhejiang University, China and The University of Adelaide, Australia;

(6) Xiaogang Xu, Zhejiang University, China;

(7) Chunhua Shen, Zhejiang University, China (chunhuashen@zju.edu.cn).

:::

:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 Deed (Attribution-NonCommercial-NoDerivs 4.0 International) license.

:::

