This article examines potential validity threats in a controlled software engineering experiment, outlining risks to conclusion, internal, construct, and external validity.

Assessing Validity Threats in Controlled Software Engineering Experiments

Abstract

1 Introduction

2 Original Study: Research Questions and Methodology

3 Original Study: Validity Threats

4 Original Study: Results

5 Replicated Study: Research Questions and Methodology

6 Replicated Study: Validity Threats

7 Replicated Study: Results

8 Discussion

9 Related Work

10 Conclusions And References


3 Original Study: Validity Threats

Following the checklist provided by Wohlin et al. [52], we describe the threats relevant to our study below.

3.1 Conclusion Validity

1. Random heterogeneity of participants. The within-subjects experimental design rules out the risk that variation due to individual differences among participants is larger than the variation due to the treatment.

3.2 Internal Validity

  1. History and maturation:

    – Since participants apply different techniques on different artefacts, learning effects should not be much of a concern.
    – Experimental sessions take place on different days. Because grades are tied to performance in the experiment, we expect students to try to do better on each following day, so the technique applied on the last day would show higher effectiveness. To avoid this, different participants apply the techniques in different orders. This cancels out the threat due to order of application and prevents any given technique from benefiting from the maturation effect. In any case, we analyse the chosen techniques per day to study the maturation effect.


  2. Interactions with selection. Different behaviours in different technique application groups are ruled out by randomly assigning participants to groups. Nevertheless, we check this by analysing the behaviour of the groups.


  3. Hypothesis guessing. Before filling in the questionnaire, participants were only partially informed about the goal of the study. We told them that we wanted to know their preferences and opinions, but they were not aware of our research questions. In any case, if this threat did occur, our results for perceptions would be the best possible ones and would therefore set an upper bound.


  4. Mortality. Several participants did not consent to take part in the study, which has affected the balance of the experiment.

  5. Order of training. Techniques are presented in the following order: CR, BT and EP. If this threat had materialised, CR would be the most effective technique (or the participants' favourite).
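
The counterbalancing and random assignment described under history and maturation and under interactions with selection can be illustrated with a minimal Python sketch. It is not the procedure used in the study: the source only states that different participants apply the techniques in different orders and that participants are randomly assigned to groups. The use of all six possible orders, the round-robin balancing, the participant IDs, and the function names `assign_counterbalanced` and `last_day_counts` are assumptions made for illustration. The assignment of artefacts (programs) is not modelled.

```python
import itertools
import random


def assign_counterbalanced(participants, techniques, seed=None):
    """Randomly assign participants to technique orders with balanced groups.

    Assumption: every permutation of the techniques is used as an order.
    """
    rng = random.Random(seed)
    orders = list(itertools.permutations(techniques))
    shuffled = list(participants)
    rng.shuffle(shuffled)  # random assignment of participants to groups
    # Deal the shuffled participants round-robin across the orders, so
    # group sizes differ by at most one.
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}


def last_day_counts(assignment):
    """Count how many participants apply each technique in the last session."""
    counts = {}
    for order in assignment.values():
        counts[order[-1]] = counts.get(order[-1], 0) + 1
    return counts


# Hypothetical participant IDs; CR, BT and EP are the techniques named above.
participants = [f"P{i:02d}" for i in range(1, 13)]
assignment = assign_counterbalanced(participants, ["CR", "BT", "EP"], seed=42)
print(last_day_counts(assignment))
```

With 12 participants and six orders, each technique is applied on the last day by four participants, so a last-day improvement would be spread evenly across techniques rather than favouring one. If some participants drop out (the mortality threat above), the counts become unequal, which is one way to see how the balance is affected.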

3.3 Construct Validity

  1. Inadequate preoperational explanation of cause constructs. Cause constructs are clearly defined thanks to the extensive training received by participants on the study techniques.
  2. Inadequate preoperational explanation of effect constructs. The question being asked is clear and should not be open to misinterpretation. However, since perception is subjective, different participants may interpret the question differently, in which case their perceptions would depend on how the question is interpreted. This issue should be further investigated in future studies.

3.4 External Validity

  1. Interaction of setting and treatment. We tried to make the faults seeded in the programs as representative as possible of reality.
  2. Generalisation to other subject types. As already mentioned, our sample represents developers with little or no previous experience in testing techniques, and junior programmers. The extent to which the results of this study can be generalised to other subject types needs to be investigated. Of all the threats listed, the only one that could affect the validity of the results of this study in an industrial context is the one related to generalisation to other subject types.

:::info Authors:

  1. Sira Vegas
  2. Patricia Riofrío
  3. Esperanza Marcos
  4. Natalia Juristo

:::

:::info This paper is available on arXiv under the CC BY-NC-ND 4.0 license.

:::
