@phdthesis{12785, author = {Hadi Hemmati}, title = {Similarity-Based Test Case Selection: Toward Scalable and Practical Model-Based Testing}, abstract = {The growing complexity and size of software systems, along with the increasing role of software in everyday life, make verification and validation, and testing in particular, essential in software engineering. High fault-revealing power at minimum cost is the ultimate goal in software testing. Model-based testing (MBT) targets this goal by systematically automating test generation from abstract models of the software under test. Automation reduces the test generation cost dramatically, but the total testing cost also includes the cost of test execution and evaluation. In practice, the testing budget is limited in terms of both time and testing resources. A testing approach cannot be scalable and practical for large industrial systems unless it addresses all dimensions of the testing cost. However, the systematic nature of test generation in MBT can result in large test suites whose execution costs far exceed the testing budget. Therefore, a mechanism for adjusting the size of the output test suite in MBT is an absolute necessity to ensure success in industry. This thesis proposes techniques that minimize the test suite size while preserving (to the maximum extent) its fault detection rate. The proposed family of techniques, called similarity-based test case selection, hypothesizes that the more diverse the test cases, the higher the fault detection rate. The thesis begins with a systematic review of search-based techniques for test case generation, which serves as a starting point for identifying search-based approaches that can be used in similarity-based test case selection. Finding the most effective ways of defining similarity measures and selection algorithms constitutes the core of the thesis.
The best selection techniques among different variants of the proposed similarity-based techniques are identified through rigorous empirical analyses on two industrial case studies. The cost-effectiveness of the approach is also compared with that of existing selection techniques in the literature. Then, factors that influence the effectiveness of the technique are examined through controlled experiments in order to gain insight into the analyzed problem and confidence in the reliability of the results. The main contribution of this thesis is the proposal and evaluation of highly effective similarity-based test case selection techniques, which proved extremely beneficial in two industrial contexts (up to 50\% reduction in the number of test cases required to detect the same number of faults as the best current alternatives). Furthermore, the technique proved to be scalable with test suite size and robust to variations in the fault detection rate of the test suite. Another contribution is a complementary study on estimating the best size for a test suite based on similarity comparisons among test cases. From a practical standpoint, this significantly increases the usability of the proposed techniques, since testers are no longer required to select an arbitrary test suite size. In conclusion, similarity-based test case selection showed promising results on two industrial case studies with respect to minimizing the testing cost in MBT. In these case studies, the proposed technique was far more effective than existing techniques. It helps make MBT applicable to larger systems by adjusting the output test suite to the test budget. This research therefore has the potential to significantly impact how MBT is performed today.}, year = {2011}, school = {University of Oslo}, }