SymTax is a novel AI for citation recommendation. It mimics human behavior by using a "symbiotic" model and hyperbolic geometry to improve accuracy.

How Symbiotic AI Can Find Your Paper's Next Great Citation

:::info Authors:

(1) Karan Goyal, IIIT Delhi, India (karang@iiitd.ac.in);

(2) Mayank Goel, NSUT Delhi, India (mayank.co19@nsut.ac.in);

(3) Vikram Goyal, IIIT Delhi, India (vikram@iiitd.ac.in);

(4) Mukesh Mohania, IIIT Delhi, India (mukesh@iiitd.ac.in).

:::

Abstract and 1. Introduction

  2. Related Work

  3. Proposed Dataset

  4. SymTax Model

    4.1 Prefetcher

    4.2 Enricher

    4.3 Reranker

  5. Experiments and Results

  6. Analysis

    6.1 Ablation Study

    6.2 Quantitative Analysis and 6.3 Qualitative Analysis

  7. Conclusion

  8. Limitations

  9. Ethics Statement and References

Appendix

Abstract

Citing pertinent literature is pivotal to writing and reviewing a scientific document. Existing techniques mainly focus on either the local context or the global context for recommending citations, but fail to consider actual human citation behaviour. We propose SymTax[1], a three-stage recommendation architecture that considers both the local and the global context, and additionally the taxonomical representations of query-candidate tuples and the Symbiosis prevailing amongst them. SymTax learns to embed the infused taxonomies in hyperbolic space and uses hyperbolic separation as a latent feature to compute query-candidate similarity. We build a novel and large dataset, ArSyTa, containing 8.27 million citation contexts, and describe its creation process in detail. We conduct extensive experiments and ablation studies to demonstrate the effectiveness and design choices of each module in our framework. In addition, combinatorial analysis of our experiments sheds light on the choice of language models (LMs) and fusion embedding, and on the inclusion of section headings as a signal. Our proposed module that captures the symbiotic relationship alone leads to performance gains of 26.66% and 39.25% in Recall@5 w.r.t. SOTA on the ACL-200 and RefSeer datasets, respectively. The complete framework yields a gain of 22.56% in Recall@5 w.r.t. SOTA on our proposed dataset. The code and dataset are available at https://github.com/goyalkaraniit/SymTax.
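Recall@5 is the headline metric in the figures above. As a minimal sketch, assuming the common convention that Recall@K is the fraction of a query's ground-truth citations found in the top K recommendations, it could be computed as follows. The function names and paper IDs are hypothetical and are not taken from the SymTax code base.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant papers that appear in the top-k of a ranked list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs, k):
    """Average Recall@K over (ranked_ids, relevant_ids) pairs, one pair per query."""
    if not runs:
        return 0.0
    return sum(recall_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Toy example with hypothetical paper IDs: one query hits within the top 2, one misses.
runs = [
    (["p3", "p1", "p7"], ["p1"]),
    (["p3", "p1", "p7"], ["p7"]),
]
print(mean_recall_at_k(runs, k=2))
```

When each citation context has exactly one ground-truth paper, this reduces to a hit rate: the share of contexts whose cited paper is ranked within the top K.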


1 Introduction

Citing has always been the backbone of scientific research. It enables trust and supports the claims made in a scientific document. The ever-growing volume of scientific literature makes it imperative to ease the author's task of finding a list of suitable papers to follow and cite (Johnson et al., 2018; Bornmann et al., 2021; Nane et al., 2023). Citation recommendation is one such process: it helps researchers stay aware of the relevant research in their respective domains. There are two different approaches to recommending citations: local (Dai et al., 2019; Ebesu and Fang, 2017; Huang et al., 2012; He et al., 2010) and global (Xie et al., 2021; Ali et al., 2021; Bhagavatula et al., 2018; Guo et al., 2017). Local citation recommendation is the task of finding and recommending the most relevant prior work for a specific text passage (also known as the citation context), which makes it context-aware. Global citation recommendation, on the other hand, recommends a list of suitable prior art for the entire document, typically given the title and abstract or the whole document. In this paper, we address the task of local citation recommendation, which is more fine-grained and solves the actual challenge the author faces.

Figure 1: The proposed method consists of three essential modules. The Prefetcher and Reranker take as input a query consisting of the citation context, title, abstract, and taxonomy of the citing paper. For each candidate paper (Ci), the Enricher uses knowledge from the citation network, and the Reranker generates the final top-K recommendations.
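The three-stage flow in Figure 1 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `Query` class, the function names, the word-overlap score (a stand-in for the learned models), and the rule that enrichment adds a candidate's neighbours from the citation graph. It is not the authors' implementation, and the taxonomy field is carried along but not used by the placeholder scorer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    context: str          # local citation context
    title: str            # title of the citing paper
    abstract: str         # abstract of the citing paper
    taxonomy: tuple = ()  # taxonomy labels (unused by the toy scorer below)


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets; a placeholder for a learned similarity."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def prefetch(query: Query, corpus: dict, k: int) -> list:
    """Stage 1: retrieve the k candidates most similar to the full query text."""
    text = " ".join([query.context, query.title, query.abstract])
    ranked = sorted(corpus, key=lambda pid: token_overlap(text, corpus[pid]), reverse=True)
    return ranked[:k]


def enrich(candidates: list, citation_graph: dict) -> list:
    """Stage 2 (assumed rule): extend the pool with citation-graph neighbours."""
    pool, seen = list(candidates), set(candidates)
    for pid in candidates:
        for neighbour in citation_graph.get(pid, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                pool.append(neighbour)
    return pool


def rerank(query: Query, pool: list, corpus: dict, top_k: int) -> list:
    """Stage 3: reorder the enriched pool and keep the final top-K."""
    ranked = sorted(pool, key=lambda pid: token_overlap(query.context, corpus[pid]), reverse=True)
    return ranked[:top_k]


def recommend(query, corpus, citation_graph, prefetch_k=1, top_k=2):
    pool = enrich(prefetch(query, corpus, prefetch_k), citation_graph)
    return rerank(query, pool, corpus, top_k)


# Toy corpus and citation graph with hypothetical paper IDs.
corpus = {
    "p1": "adversarial attacks on image classifiers",
    "p2": "adversarial attacks and defenses on autonomous driving models",
    "p3": "protein folding with deep learning",
}
citation_graph = {"p1": ["p2"], "p2": ["p3"]}
query = Query(
    context="This can have extreme consequences in real-life scenarios such as autonomous cars",
    title="Towards Consistency in Adversarial Classification",
    abstract="",
)
print(recommend(query, corpus, citation_graph))
```

The point of the middle stage is that the reranker sees a larger pool than the prefetcher alone would return, so a relevant paper missed by the first stage can still be recommended.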

For example, consider the following citation excerpt:[2]

This can have extreme consequences in real-life scenarios such as autonomous cars CitX.

Examining the above context in isolation makes it challenging to predict the specific article cited at CitX. However, leveraging global information such as the title, abstract, and taxonomy narrows down the search space, while the symbiotic relationship provides the model with an enriched pool of the most suitable candidates. Unlike the ACL-200 and RefSeer datasets, which have curated contexts of fixed size, we curate richer contexts by including the complete sentences adjoining the citation sentence. To summarise, we make the following contributions:

• Dataset: We construct ArSyTa, a dataset comprising 8.27 million comprehensive citation contexts across diverse domains, with richer density and relevant features, including taxonomy concepts, to facilitate the task of citation recommendation.

• Conceptual: We borrow the concept of Symbiosis from biology, draw an analogy with human citation behaviour in the scientific research ecosystem, and use it to select a better pool of candidates.

• Methodological: We propose a novel taxonomy-fused reranker that learns projections of the fused taxonomies in hyperbolic space and utilises hyperbolic separation as a latent feature.

• Empirical: We perform extensive experiments, ablations, and analysis on five datasets and six metrics, demonstrating that SymTax consistently outperforms SOTA by large margins.
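To make "hyperbolic separation" in the methodological contribution concrete, here is a minimal sketch of a distance between two points in hyperbolic space. This section does not say which model of hyperbolic space SymTax uses or how the embeddings are learned; the sketch assumes the Poincaré ball model, and `poincare_distance` is a hypothetical helper, not a function from the SymTax code.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball.

    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_norm_u = sum(c * c for c in u)
    sq_norm_v = sum(c * c for c in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs with the same Euclidean gap (0.1) but different hyperbolic separation:
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.8, 0.0], [0.9, 0.0])
print(near_origin < near_boundary)  # True: distances grow towards the boundary
```

Distances in the ball grow rapidly towards the boundary, which is why hyperbolic space is often used for tree-like structures such as taxonomies. In a reranker, a value like this could be supplied as one extra feature alongside the text-based similarity; how SymTax combines them is not described in this section.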


:::info This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::

[1] Accepted at ACL 2024.

[2] The excerpt is borrowed from "Towards Consistency in Adversarial Classification" (Meunier et al., 2022). The cited article is "An analysis of adversarial attacks and defenses on autonomous driving models" (Deng et al., 2020).
