This article compares the proposed IIL method against SOTA incremental learning techniques on Cifar-100 and ImageNet-100.

SAGE Net Ablation Study: Analyzing the Impact of Input Sequence Length on Performance

2025/11/06 02:00
3 min read

Abstract and 1. Introduction

  2. Related works

  3. Problem setting

  4. Methodology

    4.1. Decision boundary-aware distillation

    4.2. Knowledge consolidation

  5. Experimental results and 5.1. Experiment Setup

    5.2. Comparison with SOTA methods

    5.3. Ablation study

  6. Conclusion and future work and References

Supplementary Material

  1. Details of the theoretical analysis on KCEMA mechanism in IIL
  2. Algorithm overview
  3. Dataset details
  4. Implementation details
  5. Visualization of dusted input images
  6. More experimental results

5.2. Comparison with SOTA methods

Tab. 1 shows the test performance of different methods on Cifar-100 and ImageNet-100. The proposed method achieves the best performance promotion after ten consecutive IIL tasks by a large margin, together with a low forgetting rate. Although ISL [13], which was proposed for a similar setting of learning from new sub-categories, has a low forgetting rate, it fails on the new requirement of model enhancement. Attaining better performance on the test data matters more than avoiding forgetting on any particular data.

In the new IIL setting, none of the rehearsal-based methods, including iCarl [22], PODNet [4], Der [31], and OnPro [29], performs well. Old exemplars can cause memory overfitting and model bias [35]; thus, a limited set of old exemplars does not always have a positive influence on stability and plasticity [26], especially in the IIL task. The forgetting rate of rehearsal-based methods is also high compared to other methods, which further explains their performance degradation on the test data. Detailed performance at each learning phase is shown in Fig. 4. Compared to other methods, which struggle to resist forgetting, ours is the only one that stably promotes the existing model on both datasets.
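For reference, the forgetting rate discussed here is commonly computed from the per-phase accuracy matrix used in incremental learning evaluations. A minimal sketch under that standard definition (the function name and toy numbers are illustrative, not the paper's):

```python
def forgetting_rate(acc):
    """acc[t][i]: test accuracy on phase i's data after training phase t.
    Forgetting on phase i = best accuracy ever reached on i (before the
    final phase) minus the final accuracy on i, averaged over old phases."""
    T = len(acc)
    drops = []
    for i in range(T - 1):
        best_past = max(acc[t][i] for t in range(i, T - 1))
        drops.append(best_past - acc[T - 1][i])
    return sum(drops) / len(drops)

# Toy example with three learning phases (accuracies in %):
acc = [
    [80.0,  0.0,  0.0],
    [75.0, 82.0,  0.0],
    [70.0, 78.0, 85.0],
]
print(forgetting_rate(acc))  # → 7.0, i.e. ((80-70) + (82-78)) / 2
```

A low value means later phases barely degraded accuracy on earlier data, which is the behavior claimed above for the proposed method.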

Following ISL [13], we further apply our method to incremental sub-population learning, as shown in Tab. 2. Sub-population incremental learning is a special case of IIL in which new knowledge comes from new subclasses. Compared to the SOTA ISL [13], our method is notably superior at learning new subclasses over long incremental steps, with a comparably small forgetting rate. Notably, ISL [13] uses the Continual Hyperparameter Framework (CHF) [3] to search for the best learning rate for each setting (as low as 0.005 in the 15-step task), whereas our method starts from the ISL-pretrained base model and uses a fixed learning rate (0.05). The low learning rate in ISL reduces forgetting but hinders the learning of new knowledge. The proposed method balances learning from unseen subclasses against forgetting on seen classes.
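The balance between plasticity and stability rests on the knowledge consolidation step (the KCEMA mechanism analyzed in the supplementary material), which fuses newly learned weights back into the retained model. A generic sketch of EMA-style weight consolidation, assuming a flat parameter list and an illustrative momentum value (neither is taken from the paper):

```python
def ema_consolidate(old_weights, new_weights, momentum=0.99):
    """Element-wise exponential moving average of model parameters:
    w_old <- momentum * w_old + (1 - momentum) * w_new.
    High momentum keeps old knowledge; the remainder admits new knowledge."""
    return [momentum * w_o + (1.0 - momentum) * w_n
            for w_o, w_n in zip(old_weights, new_weights)]

# Toy usage: consolidate a single parameter with momentum 0.9.
fused = ema_consolidate([1.0], [2.0], momentum=0.9)
print(fused)  # close to [1.1]
```

In this sketch the momentum directly trades off forgetting against new-knowledge uptake, mirroring the learning-rate trade-off ISL faces.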


:::info Authors:

(1) Qiang Nie, Hong Kong University of Science and Technology (Guangzhou);

(2) Weifu Fu, Tencent Youtu Lab;

(3) Yuhuan Lin, Tencent Youtu Lab;

(4) Jialin Li, Tencent Youtu Lab;

(5) Yifeng Zhou, Tencent Youtu Lab;

(6) Yong Liu, Tencent Youtu Lab;

(7) Chengjie Wang, Tencent Youtu Lab.

:::


:::info This paper is available on arxiv under CC BY-NC-ND 4.0 Deed (Attribution-Noncommercial-Noderivs 4.0 International) license.

:::
