This deep-dive analysis proves why SymTax works, showing its 'symbiotic' enricher and taxonomy fusion are essential for its state-of-the-art performance.

A Quantitative and Qualitative Analysis of the SymTax Citation Recommendation Model

Abstract and 1. Introduction

  2. Related Work

  3. Proposed Dataset

  4. SymTax Model

    4.1 Prefetcher

    4.2 Enricher

    4.3 Reranker

  5. Experiments and Results

  6. Analysis

    6.1 Ablation Study

    6.2 Quantitative Analysis and 6.3 Qualitative Analysis

  7. Conclusion

  8. Limitations

  9. Ethics Statement and References

Appendix

6 Analysis

We conduct an extensive analysis to further assess the modularity of SymTax, the importance of its individual modules, the combinatorial choice of LM and taxonomy fusion, and the use of hyperbolic space over Euclidean space. We also analyse the effect of using the section heading as an additional signal (shown in Appendix A).


6.1 Ablation Study

We perform an ablation study to highlight the importance of Symbiosis, taxonomy fusion, and hyperbolic space. We consider two variants of SymTax, namely SciBERT_vector and SPECTER_graph. For each of these two variants, we conduct three further experiments: (i) removing the Enricher module, which works on the principle of Symbiosis, (ii) discarding the taxonomy attribute associated with the citation context, and (iii) using Euclidean space instead of hyperbolic space to calculate the separation score.
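The Euclidean-versus-hyperbolic contrast in experiment (iii) can be illustrated with a minimal sketch. The Poincaré ball is a common model for hyperbolic embeddings, so we use its geodesic distance here as an assumed stand-in; the paper's exact separation score may be defined differently:

```python
import math

def euclidean_dist(u, v):
    """Ordinary L2 distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def poincare_dist(u, v):
    """Geodesic distance in the Poincare ball (all points have norm < 1).

    Distances blow up near the boundary, which lets hierarchies
    (general concepts near the origin, specific ones near the rim)
    embed with low distortion.
    """
    nu = sum(a * a for a in u)          # ||u||^2
    nv = sum(b * b for b in v)          # ||v||^2
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * d2 / ((1 - nu) * (1 - nv)))

# Near the origin the two distances are comparable ...
print(euclidean_dist((0.0, 0.0), (0.5, 0.0)))   # 0.5
print(poincare_dist((0.0, 0.0), (0.5, 0.0)))    # ~1.10
# ... but near the boundary a tiny Euclidean gap becomes a large
# hyperbolic one, separating fine-grained taxonomy leaves.
print(euclidean_dist((0.9, 0.0), (0.95, 0.0)))  # 0.05
print(poincare_dist((0.9, 0.0), (0.95, 0.0)))   # ~0.72
```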

As evident from Table 3, excluding Symbiosis drops Recall@5 and NDCG by 21.40% and 24.45% respectively for SciBERT_vector, and by 17.84% and 20.32% respectively for SPECTER_graph. Similarly, excluding taxonomy drops Recall@5 and NDCG by 34.94% and 27.88% respectively for SciBERT_vector, and by 14.81% and 12.51% respectively for SPECTER_graph. Table 3 also makes clear that using Euclidean space instead of hyperbolic space degrades performance across all metrics in both variants. Excluding Symbiosis affects the higher recall metrics more than excluding taxonomy fusion or hyperbolic space does.
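The metrics behind these drops can be sketched in a few lines. This assumes binary relevance and the standard log-discounted NDCG; the paper's exact NDCG variant (cutoff, gain function) may differ:

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k ranking."""
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance NDCG: discounted gain of the ranking over the
    ideal ranking that places all relevant items first."""
    relevant = set(relevant_ids)
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(ranked_ids[:k]) if doc in relevant)
    ideal = sum(1.0 / math.log2(i + 2)
                for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal else 0.0

# Hypothetical top-5 ranking against three gold citations.
ranked = ["p3", "p7", "p1", "p9", "p4"]
gold = ["p1", "p4", "p8"]
print(recall_at_k(ranked, gold, 5))  # 2 of 3 gold papers retrieved
print(ndcg_at_k(ranked, gold, 5))    # penalised for ranking them low
```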

Table 4: Analysis of the choice of LM and taxonomy fusion on 10k random samples from ArSyTa. Best results are highlighted in bold and second best are italicised.


6.2 Quantitative Analysis

We consider two available LMs, SciBERT and SPECTER, and the two types of taxonomy fusion, graph-based and vector-based. This yields four variants, as shown in Table 4. As evident from the results, SciBERT_vector and SPECTER_graph are the best-performing variants, so the combinatorial choice of LM and taxonomy fusion plays a vital role in model performance. This can be attributed to SciBERT being an LM trained on plain scientific text, whereas SPECTER is trained with a triplet loss that uses 1-hop neighbours of the positive sample in the citation graph as hard negative samples. SPECTER thus already embodies graph information, whereas SciBERT does not.
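The triplet objective mentioned above can be sketched minimally. This uses squared Euclidean distances on toy vectors purely for illustration; SPECTER computes such distances over transformer title-and-abstract embeddings, and its margin and distance choices are assumptions here:

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Margin-based triplet loss: push the anchor closer to the positive
    than to the negative by at least `margin`, else incur a penalty."""
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)

# Satisfied triplet: negative is already far away, so the loss is zero.
print(triplet_loss((0.0, 0.0), (1.0, 0.0), (0.0, 2.0)))  # 0.0
# Violated triplet: a hard negative (e.g. a 1-hop neighbour of the
# positive) sits closer to the anchor than the positive does.
print(triplet_loss((0.0, 0.0), (2.0, 0.0), (1.0, 0.0)))  # 4.0
```

Hard negatives drawn from the citation graph make the second case common during training, which is what forces graph structure into the embedding space.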


6.3 Qualitative Analysis

We assess the quality of the recommendations produced by different algorithms on a randomly chosen example. Though chosen at random, we require the example to contain multiple citations in its context so that we can examine the top-10 ranked predictions meaningfully. As shown in Table 5, we consider an excerpt from Liu et al. (2020) that contains five citations. SymTax correctly recommends three citations in the top-10, whereas HAtten recommends only one correct citation (at rank 1) and BM25 only one (at rank 10). The title is crucial to performance: many recommendations contain the words "BERT" and "Pretraining", which are keywords present in the title. Taxonomy also plays a vital role in the recommendations. The taxonomy category of the query is "Computation and Language", and most of the recommended articles are from the same category. SymTax gives only one recommendation from a different category (Deep Residual Learning for Image Recognition, from "Computer Vision"), whereas HAtten recommends three citations from different categories: Deep Residual Learning for Image Recognition from "Computer Vision", and Batch Normalization and Adam from "Machine Learning".

Table 5: Top-10 citation recommendations given by various algorithms for a randomly chosen example from ArSyTa. Valid predictions are highlighted in bold. SymTax (SciBERT_vector) recommends three valid articles in the top-10, whereas HAtten and BM25 each recommend only one valid article for the given citation context. # denotes the rank of the recommended citation.


:::info Authors:

(1) Karan Goyal, IIIT Delhi, India (karang@iiitd.ac.in);

(2) Mayank Goel, NSUT Delhi, India (mayank.co19@nsut.ac.in);

(3) Vikram Goyal, IIIT Delhi, India (vikram@iiitd.ac.in);

(4) Mukesh Mohania, IIIT Delhi, India (mukesh@iiitd.ac.in).

:::


:::info This paper is available on arxiv under CC by-SA 4.0 Deed (Attribution-Sharealike 4.0 International) license.

:::
