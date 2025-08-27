The Challenges of Data Collection for Software Engineering Research

Abstract and 1. Introduction

  2. Background and 2.1. Related Work

    2.2. The Impact of XP Practices on Software Productivity and Quality

    2.3. Bayesian Network Modelling

  3. Model Design

    3.1. Model Overview

    3.2. Team Velocity Model

    3.3. Defected Story Points Model

  4. Model Validation

    4.1. Experiments Setup

    4.2. Results and Discussion

  5. Conclusions and References

4.1. Experiments Setup

Collecting data from real projects to validate our model was difficult for several reasons. Because XP values simplicity, it is hard to find a company that records information about its activities and practices. Moreover, most real XP projects are developed by private companies that restrict publication of their internal development process. In addition, there is no guarantee that the available data are sufficient for model validation.

Two XP projects provided enough data to test our model. The first is the Repo Margining System project [4]. The second is a controlled case study reported by Pekka Abrahamsson [17], which we refer to in the rest of this paper as the Abrahamsson Case Study. The model input data for the two projects are shown in Tables 2 and 3. The model's internal parameters are summarized in Table 4.

Table 2 Repo Margining System input data

Table 3 Abrahamsson Case Study input data

Table 4 Model internal parameters (U(a,b) denotes a uniform distribution from a to b, and N(µ,σ) denotes a normal distribution with mean µ and standard deviation σ)
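To make the U(a,b) and N(µ,σ) notation in the Table 4 caption concrete, the following is a minimal Python sketch of how parameters specified this way could be represented and sampled. It is not the authors' implementation. The parameter names and numeric values are hypothetical placeholders, since the table entries themselves are not reproduced here, and the `Uniform`, `Normal`, and `draw_parameters` helpers are assumptions introduced only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Uniform:
    """U(a, b): uniform distribution from a to b."""
    a: float
    b: float

    def sample(self, rng: random.Random) -> float:
        return rng.uniform(self.a, self.b)


@dataclass(frozen=True)
class Normal:
    """N(mu, sigma): normal distribution with mean mu and standard deviation sigma."""
    mu: float
    sigma: float

    def sample(self, rng: random.Random) -> float:
        # random.gauss takes the standard deviation, not the variance.
        return rng.gauss(self.mu, self.sigma)


# Hypothetical parameter names and values, for illustration only.
# They are NOT the entries of Table 4.
EXAMPLE_PARAMETERS = {
    "example_uniform_param": Uniform(0.0, 1.0),
    "example_normal_param": Normal(10.0, 2.0),
}


def draw_parameters(params, seed=None):
    """Draw one value for each parameter from its specified distribution."""
    rng = random.Random(seed)
    return {name: dist.sample(rng) for name, dist in params.items()}


if __name__ == "__main__":
    print(draw_parameters(EXAMPLE_PARAMETERS, seed=0))
```

Passing a fixed `seed` makes a draw reproducible, which is convenient when repeating a validation run with the same sampled parameters.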

:::info Authors:

(1) Mohamed Abouelelam, Software System Engineering, University of Regina, Regina, Canada;

(2) Luigi Benedicenti, Software System Engineering, University of Regina, Regina, Canada.

:::

:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

:::

