This article presents an ablation study confirming that an adaptive, per-step-per-layer learning rate is essential for the RECKONING framework.This article presents an ablation study confirming that an adaptive, per-step-per-layer learning rate is essential for the RECKONING framework.

Ablation Study Confirms Necessity of Dynamic Rates for RECKONING Performance

Abstract and 1. Introduction

  1. Background

  2. Method

  3. Experiments

    4.1 Multi-hop Reasoning Performance

    4.2 Reasoning with Distractors

    4.3 Generalization to Real-World knowledge

    4.4 Run-time Analysis

    4.5 Memorizing Knowledge

  4. Related Work

  5. Conclusion, Acknowledgements, and References

\ A. Dataset

B. In-context Reasoning with Distractors

C. Implementation Details

D. Adaptive Learning Rate

E. Experiments with Large Language Models

D Adaptive Learning Rate

Prior works [3, 4] show that a fixed learning rate shared across steps and parameters does not benefit the generalization performance of the system. Instead, [3] recommends learning a learning rate for

\ Table 8: An example of 6-hop reasoning from the CLUTRR-SG dataset.

\ Table 9: Example of distractors (black) and relevant knowledge (red) in the ProofWriter dataset.

\ each network layer and each adaptation step in the inner loop. The layer parameters can learn to adjust the learning rates dynamically at each step. To control the learning rate α in the inner loop adaptively, we define α as a set of adjustable variable: α = {α0, α1, …αL}, where L is the number of layers and for every l = 0, …, L, αl is a vector with N elements given a pre-defined inner loop step number N. The inner loop update equation then becomes

\

\

\ Are dynamic learning rates necessary for RECKONING’s performance? Following prior works on meta-learning [3, 4], we dynamically learn a set of per-step-per-layer learning rates for RECKONING. In this ablation study, we analyze whether dynamic learning rates for the inner loop effectively improve the outer loop reasoning performance. Similarly, we fix other experimental settings and set the number of inner loop steps to 4. As Figure 8 shows, when using a static learning rate (i.e., all layers and inner loop steps share a constant learning rate), the performance drops by a large margin (average drop of 34.2%). The performance drop becomes more significant on questions requiring more reasoning hops (45.5% drop for 4-hop and 39.5% drop for 6-hop), demonstrating the importance of using a dynamic learning rate in the inner loop of our framework.

\ Figure 8: We study how much the dynamic learning rate in the inner loop contributes to the outer loop performance. We fix all the hyperparameters except the option of using the dynamic or fixed learning rate. We conduct the analysis using the CLUTRR-SG dataset since it is more complex and difficult (lower random performance).

\

:::info Authors:

(1) Zeming Chen, EPFL (zeming.chen@epfl.ch);

(2) Gail Weiss, EPFL (antoine.bosselut@epfl.ch);

(3) Eric Mitchell, Stanford University (eric.mitchell@cs.stanford.edu)';

(4) Asli Celikyilmaz, Meta AI Research (aslic@meta.com);

(5) Antoine Bosselut, EPFL (antoine.bosselut@epfl.ch).

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::

\

Market Opportunity
Solayer Logo
Solayer Price(LAYER)
$0.1787
$0.1787$0.1787
+1.30%
USD
Solayer (LAYER) Live Price Chart
Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact service@support.mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

Coinbase Data Breach Fallout: Former Employee Arrest in India Over Customer Data Case Raises Bitcoin Security Concerns

Coinbase Data Breach Fallout: Former Employee Arrest in India Over Customer Data Case Raises Bitcoin Security Concerns

The post Coinbase Data Breach Fallout: Former Employee Arrest in India Over Customer Data Case Raises Bitcoin Security Concerns appeared on BitcoinEthereumNews.
Share
BitcoinEthereumNews2025/12/27 10:36
Burmese war amputees get free 3D-printed prostheses, thanks to Thailand-based group

Burmese war amputees get free 3D-printed prostheses, thanks to Thailand-based group

PROSTHETIC FEET. Silicon foot covers fitted with metal rods found in the prosthetic production unit in Mae Tao Clinic. A good prosthetic foot must absorb impact
Share
Rappler2025/12/27 10:00
China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise

China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise

The post China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise appeared on BitcoinEthereumNews.com. China Blocks Nvidia’s RTX Pro 6000D as Local Chips Rise China’s internet regulator has ordered the country’s biggest technology firms, including Alibaba and ByteDance, to stop purchasing Nvidia’s RTX Pro 6000D GPUs. According to the Financial Times, the move shuts down the last major channel for mass supplies of American chips to the Chinese market. Why Beijing Halted Nvidia Purchases Chinese companies had planned to buy tens of thousands of RTX Pro 6000D accelerators and had already begun testing them in servers. But regulators intervened, halting the purchases and signaling stricter controls than earlier measures placed on Nvidia’s H20 chip. Image: Nvidia An audit compared Huawei and Cambricon processors, along with chips developed by Alibaba and Baidu, against Nvidia’s export-approved products. Regulators concluded that Chinese chips had reached performance levels comparable to the restricted U.S. models. This assessment pushed authorities to advise firms to rely more heavily on domestic processors, further tightening Nvidia’s already limited position in China. China’s Drive Toward Tech Independence The decision highlights Beijing’s focus on import substitution — developing self-sufficient chip production to reduce reliance on U.S. supplies. “The signal is now clear: all attention is focused on building a domestic ecosystem,” said a representative of a leading Chinese tech company. Nvidia had unveiled the RTX Pro 6000D in July 2025 during CEO Jensen Huang’s visit to Beijing, in an attempt to keep a foothold in China after Washington restricted exports of its most advanced chips. But momentum is shifting. Industry sources told the Financial Times that Chinese manufacturers plan to triple AI chip production next year to meet growing demand. They believe “domestic supply will now be sufficient without Nvidia.” What It Means for the Future With Huawei, Cambricon, Alibaba, and Baidu stepping up, China is positioning itself for long-term technological independence. Nvidia, meanwhile, faces…
Share
BitcoinEthereumNews2025/09/18 01:37