WormHole is a novel algorithm that bridges the gap between traversal-based and index-based graph search methods. It enables fast, scalable shortest-path queries on massive real-world networks without requiring full graph access or large preprocessing overhead. By exploiting core-periphery structures, WormHole constructs a compact sublinear index, delivers near-exact paths, and can be combined with existing methods for even faster results.

Bridging the Gap Between BFS and Indexing for Large Graphs

:::info Authors:

(1) Talya Eden, Bar-Ilan University (talyaa01@gmail.com);

(2) Omri Ben-Eliezer, MIT (omrib@mit.edu);

(3) C. Seshadhri, UC Santa Cruz (sesh@ucsc.edu).

:::

Abstract and 1. Introduction

1.1 Our Contribution

1.2 Setting

1.3 The algorithm

  2. Related Work

  3. Algorithm

    3.1 The Structural Decomposition Phase

    3.2 The Routing Phase

    3.3 Variants of WormHole

  4. Theoretical Analysis

    4.1 Preliminaries

    4.2 Sublinearity of Inner Ring

    4.3 Approximation Error

    4.4 Query Complexity

  5. Experimental Results

    5.1 WormHole𝐸, WormHole𝐻 and BiBFS

    5.2 Comparison with index-based methods

    5.3 WormHole as a primitive: WormHole𝑀

References

ABSTRACT

Computing distances and finding shortest paths in massive real-world networks is a fundamental algorithmic task in network analysis. There are two main approaches to solving this task. On one hand are traversal-based algorithms like bidirectional breadth-first search (BiBFS), which have no preprocessing step but are slow on individual distance inquiries. On the other hand are indexing-based approaches, which create and maintain a large index. This allows for answering individual inquiries very fast; however, index creation is prohibitively expensive even for moderately large networks. For a graph with 30 million edges, the index created by the state-of-the-art method is about 40 gigabytes. We seek to bridge these two extremes: quickly answer distance inquiries without the need for costly preprocessing.

\ WormHole offers several additional advantages over existing methods: (i) it does not require reading the whole graph and can thus be used in settings where access to the graph is rate-limited; (ii) unlike the vast majority of index-based algorithms, it returns paths, not just distances; and (iii) for faster inquiry times, it can be combined effectively with other index-based solutions, by running them only on the sublinear core.

1 INTRODUCTION

Scalable computation of distances and shortest paths in a large network is one of the most fundamental algorithmic challenges in graph mining and graph learning tasks, with applications across science and engineering. Examples of such applications include the identification of important genes or species in biological and ecological networks [18], driving directions in road networks [1–3], redistribution of task processing from mobile devices to cloud [59], computer network design and security [24, 30, 31, 58], and identifying a set of users with the maximum influence in a social network [32, 56], among many others. Thus, a long line of work [5, 21, 25, 28, 62] has developed over the years, constructing scalable algorithms for distance computation for a variety of real-life tasks.

\ The simplest methods for answering a shortest path inquiry (𝑠,𝑡) use traversals, among which the most basic is a breadth-first search (BFS) starting from 𝑠 until we reach 𝑡. However, the inquiry time for BFS is linear in the network size, which is much too slow for real-world networks (see footnote [1]). A popular modification, bidirectional BFS (BiBFS), runs BFS from both 𝑠 and 𝑡, alternating between the two, until both ends meet. It has been well observed in the literature that BiBFS performs surprisingly well for shortest path inquiries on a wide range of networks (see, e.g., [8, 12, 62] and the many references within). Because BiBFS does not require any prior knowledge of the network structure, it is suitable when the number of shortest path inquiries being made is relatively small. However, pure traversal-based approaches do not scale well when one is required to answer a large number of shortest path inquiries. As we show in Figure 1, BiBFS ends up seeing the whole graph within just a few hundred inquiries.
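The bidirectional traversal described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the paper: it assumes an undirected, unweighted graph stored as a dictionary of neighbor lists, and the names `bibfs_shortest_path` and `_walk` are hypothetical. The two searches alternate strictly, one full level at a time, and the search stops at the first node reached from both sides.

```python
def _walk(parent, node):
    """Follow parent pointers from node back to the root of one search tree."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def bibfs_shortest_path(adj, s, t):
    """Return one shortest s-t path as a list of nodes, or None if none exists.

    adj: dict mapping each node to an iterable of its neighbors
         (assumed undirected and unweighted).
    """
    if s == t:
        return [s]
    # The parent maps double as the visited sets of the two searches.
    parent_s, parent_t = {s: None}, {t: None}
    frontier_s, frontier_t = [s], [t]
    expand_s = True  # strict alternation between the two sides

    while frontier_s and frontier_t:
        if expand_s:
            frontier, parent, other = frontier_s, parent_s, parent_t
        else:
            frontier, parent, other = frontier_t, parent_t, parent_s

        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v in parent:
                    continue
                parent[v] = u
                if v in other:
                    # The searches met at v: stitch the two half-paths together.
                    left = _walk(parent_s, v)[::-1]   # s ... v
                    right = _walk(parent_t, v)[1:]    # node after v ... t
                    return left + right
                next_frontier.append(v)

        if expand_s:
            frontier_s = next_frontier
        else:
            frontier_t = next_frontier
        expand_s = not expand_s

    return None  # one side ran out of nodes: s and t are disconnected
```

Because each side expands a whole level before the other side moves, the first meeting node lies on a shortest path. No state is kept between calls, so every inquiry pays the traversal cost again, which is the scaling problem the paragraph points out.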

\ A long line of modern approaches tackles the distance computation problem in a fundamentally different manner, by preprocessing the network and creating an index. The index, in turn, supports extremely fast real-time computation of distances. This line of work has been investigated extensively in recent years, with Pruned Landmark Labeling (PLL, Akiba et al. [5]) being perhaps the most influential approach.

\ Figure 1: We illustrate the average running time per shortest path inquiry for three variants of WormHole, as compared to index-based (MLL [62] and PLL [5]) and traversal-based (BiBFS) competitors. PLL only finds distances, not paths. DNF marks that the preprocessing (index construction) step did not finish. All three of our variants outperformed BiBFS consistently. Index-based solutions, on the other hand, generally failed on medium to large graphs as the index construction phase timed out. We note that even in smaller graphs where the index construction of MLL and PLL completed successfully, our fastest variant WormHole𝑀 has comparable per-inquiry running time.

\ In virtually all index-based methods, preprocessing involves choosing a subset L of nodes, called landmarks; computing all shortest paths among them; and keeping an index of the distance of every node in the network to every landmark. Thus, the space requirement for the index is at least of order 𝑛 · |L|, where 𝑛 is the total number of nodes. Naively, this memory requirement can be as bad as quadratic in 𝑛. Despite several improvements to beat the quadratic footprint, existing hub and landmark-based methods continue to have high cost, and can become infeasible even for moderately-sized graphs [40].
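To make the 𝑛 · |L| space cost concrete, here is a minimal sketch of a plain landmark index. It is not PLL or MLL. It assumes an undirected, unweighted graph given as a dictionary of neighbor lists, picks the highest-degree nodes as landmarks (a common heuristic, used here only for illustration), and runs one BFS per landmark. All function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Distances from source to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_landmark_index(adj, num_landmarks):
    """One full BFS per landmark: the index maps landmark -> {node: distance}."""
    # Assumption: landmarks are simply the highest-degree nodes.
    landmarks = sorted(adj, key=lambda u: len(adj[u]), reverse=True)[:num_landmarks]
    return {landmark: bfs_distances(adj, landmark) for landmark in landmarks}


def estimate_distance(index, s, t):
    """Upper bound on d(s, t): the best detour through any single landmark."""
    if s == t:
        return 0
    best = None
    for dist in index.values():
        if s in dist and t in dist:
            through = dist[s] + dist[t]
            if best is None or through < best:
                best = through
    return best  # None if no landmark reaches both endpoints


def index_entries(index):
    """Number of stored distances: n * |L| on a connected graph."""
    return sum(len(dist) for dist in index.values())
```

An inquiry costs only |L| lookups, but the index stores one distance per (node, landmark) pair, and building it reads the entire graph once per landmark. The estimate is exact only when some landmark lies on a shortest path between the two endpoints; otherwise it is an upper bound.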

\ Notably, most index-based approaches only return distances in the graph, and not the paths themselves. The first concrete systematic investigation of solutions outputting shortest paths was made by Zhang et al. [62]. Their work points out that while existing index-based solutions can be adapted to also output shortest paths, these adaptations incur a very high additional space cost on top of that required for distance computations. The authors of [62] then proposed a new approach called monotonic landmark labelling (MLL) for saving on the index construction space cost. While their algorithm is the current state of the art for this problem, it is again index-based, meaning that the preprocessing cost is still rather expensive. Improving the computational complexity of the construction phase remains a fundamental challenge.

\ Beyond the computational constraints, it is sometimes simply unrealistic to assume access to the whole network. Scenarios where the graph can only be reached through limited query access include social network analysis through external APIs [10], page downloads in web graphs [14], modern lightweight monitoring solutions in enterprise security [26, 60], and state space exploration in software testing, reinforcement learning and robotics [27], among many others. Existing indexing-based approaches are unsuitable for these scenarios since they require reading the whole graph as a prerequisite. Traversal-based methods such as BiBFS are suitable, but as mentioned they do not scale well if one requires multiple distance computations.
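The limited-access setting can be pictured as a graph hidden behind an oracle that reveals one node's neighbors per query. The sketch below is an assumption made for illustration (the paper's precise query model is given in §1.2, which is not reproduced here). The class `QueryOracle` and the function `bfs_distance_via_oracle` are hypothetical names. The oracle records which nodes have been queried, so one can measure how much of the graph a sequence of inquiries ends up touching.

```python
from collections import deque


class QueryOracle:
    """Hypothetical access layer: the graph is only visible through query(u)."""

    def __init__(self, adj):
        self._adj = adj
        self.queried = set()  # distinct nodes whose neighbor lists were fetched

    def query(self, u):
        """Fetch the neighbor list of u (e.g. one rate-limited API call)."""
        self.queried.add(u)
        return self._adj[u]


def bfs_distance_via_oracle(oracle, s, t):
    """Plain BFS from s that learns about the graph only through the oracle."""
    if s == t:
        return 0
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in oracle.query(u):
            if v in dist:
                continue
            dist[v] = dist[u] + 1
            if v == t:
                return dist[v]
            queue.append(v)
    return None  # t is not reachable from s
```

Reusing one oracle across many inquiries and inspecting `len(oracle.queried)` gives the number of distinct nodes seen so far. An index-based method would need every node's neighbor list before answering its first inquiry.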

\ The limitations of indexing-based and traversal-based methods give rise to a natural question of whether there is a middle ground solution, with preprocessing that is more efficient than in index-based approaches and inquiry time that is faster than in traversal-based approaches. Namely, we ask:

\ Is it possible to answer shortest-path inquiries in large networks very quickly, without constructing an expensive index, or even seeing the whole graph?

\ A general solution for any arbitrary graph is perhaps impossible; however, real-world social and information networks admit order and structure that can be exploited. In this work we address this question positively for a slightly relaxed version of the shortest path problem on such networks. Inspired by the core-periphery structure of large networks [43, 50, 52, 63], we provide a solution which constructs a sublinearly-sized index and answers inquiries by querying a strictly sublinear subset of vertices. In particular, our solution does not need to access the whole network. The algorithm returns approximate shortest paths, where the approximation error is additive and very small (almost always zero or one). In practice, the setup time is negligible (a few minutes for billion-edge graphs), and inquiry times improve on those of BiBFS. Moreover, it can be easily combined with existing index-based solutions to further improve on the inquiry times. We also include theoretical results that shed light on the empirical performance.

:::info This paper is available on arXiv under a CC BY 4.0 license.

:::

[1] To avoid confusion, throughout this paper we use the term inquiry to indicate a request (arriving as an input in real-time to our data structure) to compute a short path SP(𝑠,𝑡) between 𝑠 and 𝑡. The term query refers to the act, taken by the algorithm itself, of retrieving information about a specific node. For more details on the query model we consider, see §1.2.
