Cyber-Physical Systems (CPSs) are a new generation of systems that integrate software with physical processes. The increasing complexity of these systems, combined with the uncertainty in their interactions with the physical world, makes the definition of effective test oracles especially challenging, a manifestation of the well-known test oracle problem. Metamorphic testing has shown great potential to alleviate the test oracle problem by exploiting relations among the inputs and outputs of different executions of the system, so-called metamorphic relations (MRs). In this article, we propose an MR pattern called Performance Variation (PV) for the identification of performance-driven MRs, and we show its applicability in two CPSs from different domains: automated navigation systems and elevator control systems. For the evaluation, we assessed the effectiveness of this approach at detecting failures in an open-source simulation-based autonomous navigation system, as well as in an industrial case study from the elevation domain. We derive concrete MRs based on the PV pattern for both case studies and evaluate their effectiveness with seeded faults. Results show that the approach is effective, detecting over 88% of the seeded faults while keeping the ratio of false positives at 4% or lower.
Authors: Jon Ayerdi, Pablo Valle, Sergio Segura, Aitor Arrieta, Goiuria Sagardui and Maite Arratibel
Title of the source: IEEE Transactions on Reliability
Year: 2022
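The article does not include code, but the core idea of a Performance Variation MR can be sketched as a small check: when a follow-up input increases the load on the system, a performance metric should not improve beyond a tolerance. The `simulate` callable and the waiting-time metric below are illustrative assumptions, not the paper's actual setup.

```python
def check_pv_relation(simulate, base_demand, increased_demand, tolerance=0.05):
    """Check a Performance Variation (PV) metamorphic relation:
    increasing demand should not make the performance metric (e.g.
    average passenger waiting time) noticeably *better*. `simulate`
    is a stand-in for a run of the system under test."""
    base_wait = simulate(base_demand)
    follow_wait = simulate(increased_demand)
    # The follow-up output may be worse (higher), but a clearly lower
    # value signals a violated MR, i.e. a likely failure.
    return follow_wait >= base_wait * (1 - tolerance)

# Toy model in which waiting time grows linearly with demand.
toy_simulate = lambda demand: 2.0 * demand
print(check_pv_relation(toy_simulate, 100, 150))  # True: relation holds
```

A system where higher demand yielded shorter waits would fail this check, flagging a suspicious execution without needing an exact expected output.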
Evaluating System-Level Test Generation for Industrial Software: A Comparison between Manual, Combinatorial and Model-Based Testing
Adequate testing of safety-critical systems is vital to ensure correct functional and non-functional operation. Previous research has shown that testing such systems requires considerable effort, so automated testing techniques have found a certain degree of success. However, automated testing has not replaced the need for manual testing; rather, common industrial practice strikes a balance between automated and manual testing. In this respect, comparing manual testing with automated testing techniques continues to be an interesting topic to investigate. The need for this investigation is most apparent in system-level testing of industrial systems, where there is a lack of results on how different testing techniques perform with respect to both structural and system-level metrics such as Modified Condition/Decision Coverage (MC/DC) and requirement coverage. In addition to coverage, the cost of these techniques also determines their efficiency and thus their practical viability. In this paper, we have developed cost models for efficiency measurement and performed an experimental evaluation of manual testing, model-based testing (MBT) and combinatorial testing (CT) in terms of MC/DC and requirement coverage. The evaluation is done in the industrial context of a safety-critical system that controls several functions on board passenger trains. We report the dominant conditions of MC/DC affected by each technique while generating MC/DC-adequate test suites. Moreover, we investigated differences and overlaps among the test cases generated by each of the three techniques. The results showed that all test suites achieved 100% requirement coverage except the test suite generated by the pairwise testing strategy. However, MBT-generated test suites were more MC/DC-adequate and provided a higher number of both similar and unique test cases.
Moreover, unique test cases generated by MBT had an observable effect on MC/DC, which can complement manual testing to increase MC/DC coverage. The least dominant MC/DC condition fulfilled by the test cases generated by all three techniques is the ‘independent effect of a condition on the outcomes of a decision’. Lastly, the evaluation also showed CT to be the most efficient of the three testing techniques in terms of time.
Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Paul Enoiu
Title of the source: The 3rd ACM/IEEE International Conference on Automation of Software Test 2022
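The combinatorial (pairwise) testing compared in this paper can be illustrated with a minimal greedy all-pairs generator: every pair of parameter values must appear in at least one test, usually with far fewer tests than the exhaustive cartesian product. The parameter names below are invented for illustration; industrial CT tools use more sophisticated algorithms.

```python
from itertools import combinations, product

def pairwise_suite(params):
    """Greedy all-pairs suite generation: repeatedly pick the candidate
    test covering the most still-uncovered value pairs."""
    names = list(params)
    uncovered = {(a, b, va, vb)
                 for a, b in combinations(names, 2)
                 for va in params[a] for vb in params[b]}
    candidates = [dict(zip(names, vals)) for vals in product(*params.values())]
    suite = []
    while uncovered:
        def gain(t):
            return len({(a, b, t[a], t[b])
                        for a, b in combinations(names, 2)} & uncovered)
        best = max(candidates, key=gain)  # candidate covering most new pairs
        suite.append(best)
        uncovered -= {(a, b, best[a], best[b]) for a, b in combinations(names, 2)}
    return suite

# Hypothetical train-function parameters, 2 values each.
params = {"door": ["open", "closed"],
          "speed": ["low", "high"],
          "load": ["empty", "full"]}
suite = pairwise_suite(params)
print(len(suite), "pairwise tests vs", 2 * 2 * 2, "exhaustive")  # 4 vs 8
```

Even in this tiny example the suite halves the number of tests while still exercising every value pair, which is the efficiency argument behind CT.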
Software testing is key for quality assurance of embedded systems. However, with increased development pace, the amount of test results data risks growing to a level where exploration and visualization of the results are unmanageable. This paper covers a tool, Tim, implemented at a company developing embedded systems, where software development occurs in parallel branches and nightly testing is partitioned over software branches, test systems and test cases. Tim aims to replace a previous solution with problems of scalability, requirements and technological flora. Tim was implemented with a reference group over several months. For validation, data were collected both from reference group meetings and logs from the usage of the tool. Data were analyzed quantitatively and qualitatively. The main contributions from the study include the implementation of eight views for test results exploration and visualization, the identification of four solution patterns for these views (filtering, aggregation, previews and comparisons), as well as six challenges frequently discussed at reference group meetings (expectations, anomalies, navigation, integrations, hardware details and plots). Results are put in perspective with related work and future work is proposed, e.g., enhanced anomaly detection and integrations with more systems such as risk management, source code and requirements repositories.
Authors: Per Erik Strandberg, Wasif Afzal, Daniel Sundmark
Title of the source: International Journal on Software Tools for Technology Transfer
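One of the solution patterns identified for Tim's views, aggregation, amounts to rolling raw nightly results up into per-branch summaries. The tuple schema below is an assumption for illustration, not Tim's actual data model.

```python
from collections import Counter, defaultdict

def aggregate_results(results):
    """Aggregate raw test results into per-branch verdict counts.
    `results` holds (branch, test_system, test_case, verdict) tuples;
    the field names are illustrative, not Tim's real schema."""
    summary = defaultdict(Counter)
    for branch, system, case, verdict in results:
        summary[branch][verdict] += 1
    return summary

nightly = [
    ("main", "rig-1", "tc-boot", "pass"),
    ("main", "rig-1", "tc-net", "fail"),
    ("main", "rig-2", "tc-boot", "pass"),
    ("feature-x", "rig-1", "tc-boot", "pass"),
]
summary = aggregate_results(nightly)
print(dict(summary["main"]))  # {'pass': 2, 'fail': 1}
```

Filtering and comparison views would follow the same shape: select a subset of tuples, then aggregate or diff the resulting counters.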
A novel methodology to classify test cases using natural language processing and imbalanced learning
Detecting the dependency between integration test cases plays a vital role in the area of software test optimization. Classifying test cases into two main classes – dependent and independent – can be employed for several test optimization purposes such as parallel test execution, test automation, test case selection and prioritization, and test suite reduction. This task can be seen as an imbalanced classification problem due to the test cases’ distribution. Often the number of dependent and independent test cases is uneven, which is related to the testing level, testing environment and complexity of the system under test. In this study, we propose a novel methodology that consists of two main steps. Firstly, by using natural language processing we analyze the test cases’ specifications and turn them into a numeric vector. Secondly, by using the obtained data vectors, we classify each test case into a dependent or an independent class. We carry out a supervised learning approach using different methods for handling imbalanced datasets. The feasibility and possible generalization of the proposed methodology is evaluated in two industrial projects at Bombardier Transportation, Sweden, which indicates promising results.
Authors: Sahar Tahvili, Leo Hatvani, Enislay Ramentol, Rita Pimentel, Wasif Afzal, Francisco Herrera
Title of the source: Engineering Applications of Artificial Intelligence
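The two steps of the methodology, vectorizing test-case specifications and handling class imbalance, can be sketched with stdlib-only stand-ins: a bag-of-words vectorizer in place of the paper's NLP pipeline, and naive random oversampling as one standard imbalance remedy. The specification texts and labels are invented examples.

```python
import random
from collections import Counter

random.seed(0)  # reproducible oversampling

def vectorize(spec, vocabulary):
    """Bag-of-words: map a test-case specification to word counts
    (a stand-in for the paper's richer NLP feature extraction)."""
    words = Counter(spec.lower().split())
    return [words[w] for w in vocabulary]

def oversample(samples, labels):
    """Random oversampling of minority classes until all classes
    reach the majority-class size."""
    by_label = {l: [s for s, sl in zip(samples, labels) if sl == l]
                for l in set(labels)}
    target = max(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        extra = [random.choice(group) for _ in range(target - len(group))]
        balanced += [(s, label) for s in group + extra]
    return balanced

specs = ["start engine then verify brake signal",
         "start engine then verify door interlock",
         "check door sensor independently",
         "verify brake release after engine start"]
labels = ["dependent", "dependent", "independent", "dependent"]
vocab = sorted({w for s in specs for w in s.lower().split()})
vectors = [vectorize(s, vocab) for s in specs]
balanced = oversample(vectors, labels)
print(sorted(Counter(l for _, l in balanced).items()))
```

After oversampling, both classes contribute equally to training, which is the effect the imbalanced-learning methods in the paper aim for by more principled means.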
Model-based testing (MBT) has previously been used to validate embedded systems. However, (i) creation of a model conforming to the behavioural aspects of an embedded system, (ii) generation of executable test scripts and (iii) assessment of test verdicts require a systematic process. In this paper, we present a three-phase, tool-supported MBT workflow for the testing of an embedded system that spans from requirements specification to test verdict assessment. The workflow starts with a simplistic, yet practical, application of a Domain-Specific Language (DSL) based on a Gherkin-like style, which allows the requirements engineer to specify requirements and to extract information about model elements (i.e. states and transitions). This is done to assist the graphical modelling of the complete system under test (SUT). Later stages of the workflow generate an executable test script that runs on a domain-specific simulation platform. We have evaluated this tool-supported workflow by specifying the requirements, extracting information from the DSL and developing a model of a subsystem of the train control management system developed at Alstom Transport AB in Sweden. The C# test script generated from the SUT model is successfully executed on the Software-in-the-Loop (SIL) execution platform and test verdicts are visualized as a sequence of passed and failed test steps.
Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Enoiu
Title of the source: A-TEST 2021: Proceedings of the 12th International Workshop on Automating TEST Case Design, Selection, and Evaluation
Publisher: Association for Computing Machinery
Relevant pages: 33-40
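The first workflow phase, extracting states and transitions from Gherkin-like requirement steps, can be sketched as a small parser. The `Given/When/Then` keyword pattern below is an assumed concrete syntax, not Alstom's actual DSL.

```python
import re

def extract_transitions(dsl_text):
    """Extract model elements (states and transitions) from
    requirement steps of the assumed form
    'Given <state> When <event> Then <state>'."""
    pattern = re.compile(r"Given (\w+) When (\w+) Then (\w+)")
    transitions = []
    for line in dsl_text.splitlines():
        match = pattern.search(line)
        if match:
            transitions.append((match.group(1), match.group(2), match.group(3)))
    states = {s for src, _, dst in transitions for s in (src, dst)}
    return states, transitions

dsl = """\
Given idle When start Then running
Given running When stop Then idle
Given running When overheat Then halted
"""
states, transitions = extract_transitions(dsl)
print(sorted(states))    # ['halted', 'idle', 'running']
print(len(transitions))  # 3
```

The extracted states and transitions would then seed the graphical SUT model from which the executable test script is generated.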
Elevate is an industrial elevator simulator that can be applied to examine the performance of elevator installations and test scheduling algorithms in a realistic environment. Elevate is a GUI-based application and requires manual steps to perform simulations. In this thesis, we have designed, developed and implemented a program (Simuloop) to facilitate a simulation loop with Elevate. Simuloop enables automatic simulations without manual interference. We propose two experiments that apply Simuloop to generate demanding passenger traffic to test an industrial elevator dispatcher, applying a genetic algorithm and reinforcement learning, respectively. Simuloop gives feedback to the algorithms by simulating the passenger traffic. The experiment with the genetic algorithm performs stochastic updates on a lunch-peak profile with 948 passengers. The updates vary passenger weights and entry/exit times to identify patterns that yield a high waiting time. The algorithm is able to increase the average waiting time from 20 to 44.5 seconds. The experiment with reinforcement learning places higher demands on Simuloop, since it depends on frequent feedback to guide the learning. We design a small-scale experiment to train the algorithm to select the arrival and destination floors for passengers. The algorithm is able to increase the cumulative waiting time, suggesting that the experiment is applicable for the use case. Due to the limitations of the pipeline, we conclude that using reinforcement learning is impractical. The experiments show that Simuloop can successfully be applied for automatic testing of an elevator system. This opens up the opportunity for exhaustive testing without the need for manual steps.
Authors: Torbjørn Ruud
Title of the source: Master’s thesis
Publisher: University of Oslo
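The genetic-algorithm experiment, stochastically mutating a traffic profile so that simulated waiting time grows, can be sketched as a minimal (1+λ) evolutionary loop. The toy fitness function below is an invented surrogate for an Elevate run (clustered arrivals cause queueing); a real setup would invoke the simulator through Simuloop instead.

```python
import random

random.seed(1)  # reproducible search

def waiting_time(profile):
    """Toy surrogate for a simulation run: arrivals closer together
    than 5 time units contribute to queueing, i.e. longer waits."""
    arrivals = sorted(profile)
    gaps = [b - a for a, b in zip(arrivals, arrivals[1:])]
    return sum(max(0.0, 5.0 - g) for g in gaps)

def mutate(profile, sigma=2.0):
    """Stochastic update: jitter each passenger's arrival time."""
    return [max(0.0, t + random.gauss(0, sigma)) for t in profile]

def evolve(profile, generations=50, offspring=10):
    """(1+lambda) evolutionary loop searching for passenger traffic
    that maximizes waiting time, in the spirit of the thesis's GA."""
    best, best_fit = profile, waiting_time(profile)
    for _ in range(generations):
        for child in (mutate(best) for _ in range(offspring)):
            fit = waiting_time(child)
            if fit > best_fit:
                best, best_fit = child, fit
    return best, best_fit

initial = [float(t) for t in range(0, 100, 5)]  # evenly spread arrivals
best, best_fit = evolve(initial)
print(round(waiting_time(initial), 1), "->", round(best_fit, 1))
```

Replacing `waiting_time` with a call into the simulator loop is exactly where a pipeline like Simuloop would slot in, which is why its automation of the GUI steps matters.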