The article outlines the fundamental limitations of researching code review as a communication network—highlighting data gaps, potential bias in qualitative falsification, and the inherent challenges of modeling information diffusion within engineering teams.

Spotify Study Flags Key Limits in Measuring Information Flow in Code Reviews

ABSTRACT

1 INTRODUCTION

2 BACKGROUND

2.1 Code Review As Communication Network

2.2 Code Review Networks

2.3 Measuring Information Diffusion in Code Review

3 RESEARCH DESIGN

3.1 Hypotheses

3.2 Measurement model

3.3 Measuring system

4 LIMITATIONS

ACKNOWLEDGMENTS AND REFERENCES


4 LIMITATIONS

In general, the chain of evidence of our study depends on two main factors: (1) the measurement model, the measuring system, and the actual measurement, and (2) the thoroughness of our discussion in qualitatively rejecting the hypotheses and, thereby, falsifying the theory of code review as a communication network.

Although we cannot provide the complete raw data and offer only a prototypical extraction pipeline for Backstage, we believe that our thorough description of the measurement model, measuring system, and actual measurement at Spotify provides a solid foundation for this line of research. Our replication package will contain the necessary, anonymized data to reproduce and replicate our study beyond the context of Spotify.
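Replication packages of this kind typically pseudonymize identifiers before release so that the network structure survives while identities do not. As an illustrative sketch only (not the authors' actual pipeline; the salt and field names are assumptions), salted hashing can map developer identifiers to stable anonymous tokens:

```python
import hashlib

# Hypothetical salt; a real replication package would keep it secret
# and discard it after anonymization to prevent re-identification.
SALT = "replication-2024"

def pseudonymize(identifier: str, salt: str = SALT) -> str:
    """Map a raw identifier to a stable, non-reversible token."""
    digest = hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()
    return digest[:12]

# The same input always yields the same token, so edges between the
# same participants remain intact in the anonymized network.
review = {"author": "alice@example.com", "reviewers": ["bob@example.com"]}
anonymized = {
    "author": pseudonymize(review["author"]),
    "reviewers": [pseudonymize(r) for r in review["reviewers"]],
}
```

Because the mapping is deterministic, repeated interactions by the same developer collapse onto one anonymous node, which is what makes the anonymized data usable for replication.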

However, as with every data-driven study, missing, incomplete, faulty, or unreliable data may significantly affect the validity of our study. To mitigate these risks, we conducted a pilot study in October 2023. Although we did not encounter such threats to validity, we cannot rule out data-related limitations. Once data collection is complete, this section will therefore also cover limitations arising from excluded or missing data.
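Missing-data risks of this kind are often mitigated with simple completeness checks during extraction. A minimal sketch of such a check (the field names are assumptions for illustration, not the study's actual schema):

```python
# Fields every extracted code-review record is assumed to need.
REQUIRED_FIELDS = {"id", "author", "created_at", "participants"}

def completeness_report(records):
    """Count records with missing or empty required fields."""
    incomplete = 0
    for record in records:
        present = {k for k, v in record.items() if v}
        if REQUIRED_FIELDS - present:
            incomplete += 1
    return {"total": len(records), "incomplete": incomplete}

sample = [
    {"id": 1, "author": "t1", "created_at": "2023-10-01", "participants": ["t2"]},
    {"id": 2, "author": "", "created_at": "2023-10-02", "participants": ["t3"]},
]
report = completeness_report(sample)  # second record lacks an author
```

Running such a report during a pilot phase gives an early, quantified picture of how much of the dataset would have to be excluded.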

However, we believe the two most critical limitations of our study lie in the nature of a qualitative falsification of theories. First, although traditional statistical hypothesis tests have their own limitations and ultimately also amount to an implicit, qualitative discussion, we believe an explicit discussion remains more prone to bias, most importantly because there are no clear upfront criteria for rejecting the hypotheses.

Such clear rejection and falsification criteria are neither possible nor meaningful upfront for this research; any thresholds, values, or estimates would be arbitrary. However, we believe that a comprehensive discussion makes potential bias explicit and allows other researchers to reach different conclusions. Additionally, we will publish our measuring system and all intermediate anonymized data to enable other researchers to replicate our work.

Second, even if our data and a thorough discussion suggest falsifying our theory by rejecting one of the hypotheses, our modelling approach may not capture the (relevant) information diffusion in code review. Although we have strong indications that explicitly referencing code reviews constitutes active, explicit information diffusion triggered by human assessment, we are not aware of empirical evidence supporting this assumption.
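The modelling assumption described above can be made concrete with a toy sketch: if review A explicitly references review B, information is assumed to flow from B's participants to A's participants. The data and helper below are purely illustrative (the review IDs, participant names, and function are assumptions, not the study's measuring system):

```python
# Hypothetical toy data: each review has a set of participants and
# explicit references to earlier reviews (the assumed diffusion trigger).
reviews = {
    "r1": {"participants": {"p1", "p2"}, "references": {"r2"}},
    "r2": {"participants": {"p2", "p3"}, "references": set()},
    "r3": {"participants": {"p4"}, "references": {"r1"}},
}

def diffusion_edges(reviews):
    """Directed participant-level edges implied by explicit references:
    information flows from the referenced review's participants to the
    referencing review's participants."""
    edges = set()
    for data in reviews.values():
        for target in data["references"]:
            for src in reviews[target]["participants"]:
                for dst in data["participants"]:
                    if src != dst:  # no self-loops
                        edges.add((src, dst))
    return edges

edges = diffusion_edges(reviews)
```

The limitation noted in the text is exactly that edges of this kind only capture diffusion that leaves an explicit reference trace; any information passed without a reference is invisible to the model.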

Although already discussed in Section 3, we emphasize again that our findings on the extent of information diffusion will not be generalizable. We do not consider this a major limitation of our research design, since our argumentation is based on contradiction (reductio ad absurdum).

ACKNOWLEDGMENTS

We thank Spotify for supporting this research and the anonymous reviewers for their valuable and extensive feedback. This work was supported by the KKS Foundation through the SERT Project (Research Profile Grant 2018/010) at Blekinge Institute of Technology.




:::info Authors:

  1. Michael Dorner
  2. Daniel Mendez
  3. Ehsan Zabardast
  4. Nicole Valdez
  5. Marcin Floryan

:::

:::info This paper is available on arxiv under CC by 4.0 Deed (Attribution 4.0 International) license.

:::
