Software test results exploration and visualization with continuous integration and nightly testing

Software testing is key for quality assurance of embedded systems. However, with increased development pace, the amount of test results data risks growing to a level where exploration and visualization of the results are unmanageable. This paper covers a tool, Tim, implemented at a company developing embedded systems, where software development occurs in parallel branches and nightly testing is partitioned over software branches, test systems and test cases. Tim aims to replace a previous solution with problems of scalability, requirements and technological flora. Tim was implemented with a reference group over several months. For validation, data were collected both from reference group meetings and logs from the usage of the tool. Data were analyzed quantitatively and qualitatively. The main contributions from the study include the implementation of eight views for test results exploration and visualization, the identification of four solutions patterns for these views (filtering, aggregation, previews and comparisons), as well as six challenges frequently discussed at reference group meetings (expectations, anomalies, navigation, integrations, hardware details and plots). Results are put in perspective with related work and future work is proposed, e.g., enhanced anomaly detection and integrations with more systems such as risk management, source code and requirements repositories.

Authors: Per Erik Strandberg, Wasif Afzal, Daniel Sundmark

Title of the source: International Journal on Software Tools for Technology Transfer

Publisher:  Springer

Relevant pages:  

Year: 2022

A novel methodology to classify test cases using natural language processing and imbalanced learning

Detecting the dependency between integration test cases plays a vital role in the area of software test optimization. Classifying test cases into two main classes – dependent and independent – can be employed for several test optimization purposes such as parallel test execution, test automation, test case selection and prioritization, and test suite reduction. This task can be seen as an imbalanced classification problem due to the test cases’ distribution. Often the number of dependent and independent test cases is uneven, which is related to the testing level, testing environment and complexity of the system under test. In this study, we propose a novel methodology that consists of two main steps. Firstly, by using natural language processing we analyze the test cases’ specifications and turn them into a numeric vector. Secondly, by using the obtained data vectors, we classify each test case into a dependent or an independent class. We carry out a supervised learning approach using different methods for handling imbalanced datasets. The feasibility and possible generalization of the proposed methodology is evaluated in two industrial projects at Bombardier Transportation, Sweden, which indicates promising results.

Authors: Sahar Tahvili, Leo Hatvani, Enislay Ramentol, Rita Pimentel, Wasif Afzal, Francisco Herrera

Title of the source: Engineering Applications of Artificial Intelligence

Publisher:  ELSEVIER

Relevant pages:  

Year: 2020

Towards a workflow for model-based testing of embedded systems

Model-based testing (MBT) has been previously used to validate embedded systems. However, (i) creation of a model conforming to the behavioural aspects of an embedded system, (ii) generation of executable test scripts and (iii) assessment of test verdict, re-quires a systematic process. In this paper, we have presented a three-phase tool-supported MBT workflow for the testing of an embedded system, that spans from requirements specification to test verdict assessment. The workflow starts with a simplistic, yet practical, application of a Domain-Specific Language (DSL) based on Gherkin-like style, which allows the requirements engineer to specify requirements and to extract information about model elements(i.e. states and transitions). This is done to assist the graphical modelling of the complete system under test (SUT). Later stages of the workflow generates an executable test script that runs on a domain-specific simulation platform. We have evaluated this tool-supported workflow by specifying the requirements, extracting information from the DSL and developing a model of a subsystem of the train control management system developed at Alstom Transport AB in Sweden. The C# test script generated from the SUT model is successfully executed at the Software-in-the-Loop (SIL) execution platform and test verdicts are visualized as a sequence of passed and failed test steps.

Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Enoiu

Title of the source: A-TEST 2021: Proceedings of the 12th International Workshop on Automating TEST Case Design, Selection, and Evaluation

Publisher:  Association for Computing Machinery

Relevant pages:  33-40

Year: 2021

Simuloop – Testing Framework for an Industrial Elevator System

ELEVATE is an industrial elevator simulator that can be applied to examine the performance of elevator installations and test scheduling algorithms in a realistic environment. Elevate is a GUI-based application and requires manual steps to perform simulations. In this thesis, we have designed, developed and implemented a program (Simuloop) to facilitate a simulator loop with Elevate. Simuloop enables automatic simulations without manual interference. We propose 2 experiments that apply Simuloop to generate demanding passenger traffic to test an industrial elevator dispatcher. We apply a genetic algorithm and reinforcement learning, respectively. Simuloop is used to give feedback to the algorithms by simulating the passenger traffic. The experiment with the genetic algorithm performs stochastic updates on a lunch peak profile with 948 passengers. The updates are based on varying the passenger weight, entry/exit time to identify patterns that yield a high waiting time. The algorithm is able to increase the average waiting time from 20 to 44.5 seconds. The experiment with reinforcement learning has higher requirements to Simuloop since it depends on frequent feedback to guide the learning. We design a small-scale experiment to train the algorithm to select the arrival and destination floors for passengers. The algorithm is able to increase the cumulative waiting time, suggesting that the experiment is applicable for the use case. Due to the limitations of the pipeline, we conclude that using reinforcement learning is unpractical. The experiments prove that Simuloop can successfully be applied for automatic testing of an elevator system. This opens up the opportunity for performing exhaustive testing without the need for manual steps.

DOI:

Authors: Torbjørn Ruud

Title of the source: Master’s thesis

Publisher:  University of Oslo

Relevant pages:  

Year: 2021

Understanding Digital Twins for Cyber-Physical Systems: A Conceptual Model

Digital Twins (DTs) are revolutionizing Cyber-Physical Systems (CPSs) in many ways, including their development and operation. The significant interest of industry and academia in DTs has led to various definitions of DTs and related concepts, as seen in many recently published papers. Thus, there is a need for precisely defining different DT concepts and their relationships. To this end, we present a conceptual model that captures various DT concepts and their relationships, some of which are from the published literature, to provide a unified understanding of these concepts in the context of CPSs. The conceptual model is implemented as a set of Unified Modeling Language (UML) class diagrams and the concepts in the conceptual model are explained with a running example of an automated warehouse case study from published literature and based on the authors’ experience of working with the real CPS case study in previous projects.

Authors: Tao Yue, Paolo Arcaini and Shaukat Ali

Title of the source: 2021 10th International Symposium On Leveraging Applications of Formal Methods, Verification and Validation (ISoLA)

Publisher:  Springer

Relevant pages:  54-71

Year: 2021

Genetic Algorithm-based Testing of Industrial Elevators under Passenger Uncertainty

Elevators, as other cyber-physical systems, need to deal with uncertainty during their operation due to several factors such as passengers and hardware. Such uncertainties could affect the quality of service promised by elevators and in the worst case lead to safety hazards. Thus, it is important that elevators are extensively tested by considering uncertainty during their development to ensure their safety in operation. To this end, we present an uncertainty testing methodology supported with a tool to test industrial dispatching systems at the Software-in-the-Loop (SiL) test level. In particular, we focus on uncertainties in passenger data and employ a Genetic Algorithm (GA) with specifically designed genetic operators to significantly reduce the quality of service of elevators, thus aiming to find uncertain situations that are difficult to extract by users. An initial experiment with an industrial dispatcher revealed that the GA significantly decreased the quality of service as compared to not considering uncertainties. The results can be used to further improve the implementation of dispatching algorithms to handle various uncertainties.

Authors:
Joritz Galarraga, Aitor Arrieta Marcos, Shaukat Ali, Goiuria Sagardui, Maite Arratibel

Title of the source: 2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW)

Publisher:  IEEE

Relevant pages:  353-358

Year: 2021

Machine Learning-based Test Oracles for Performance Testing of Cyber-Physical Systems: an Industrial Case Study on Elevators Dispatching Algorithms

The software of systems of elevators needs constant maintenance to deal with new functionality, bug fixes or legislation changes. To automatically validate the software of these systems, a typical approach in industry is to use regression oracles, which execute test inputs both in the software version under test and in a previous software version. However, these practices require a long test execution time and cannot be re-used at different test phases. To deal with these issues, we propose DARIO, a test oracle that relies on regression machine-learning algorithms to detect both functional and non-functional problems of the system. The machine-learning algorithms of this oracle are trained by using data from previously tested versions to predict reference functional and non-functional performance
values of the new versions. An empirical evaluation with an industrial case study demonstrates the feasibility of using our approach. A total of five regression learning algorithms were validated by using mutation testing techniques. For the context of functional bugs, the accuracy when predicting verdicts by DARIO ranged between 95% to 98%, across the different scenarios proposed. For the context of non-functional bugs, were competitive too, having an accuracy when predicting verdicts by DARIO ranging between 83% to 87%.

Authors:
Aitor Gartziandia, Aitor Arrieta, Jon Ayerdi, Miren Illarramendi, Aitor Agirre, Goiuria Sagardui, Maite Arratibel

Title of the source: Journal of Software: Evolution and Process

Publisher: 

Relevant pages: 

Year: 2022