DevOps4CPS-Testing 2021​

DevOps4CPS-Testing 2021

Blending best practice DevOps solutions with the development processes used for CPS to deliver software more rapidly and in a more secure manner are emerging and critical open challenges of both contemporary and future CPS development. 

The Adeptness project is hosting the DevOps4CPS-Testing Workshop on 16 April 2021, in conjunction with ICST 2021, to discuss current state-of-the-art in addressing these challenges and a path forward to move CPS testing to the next level within new innovations in DevOps tools and technologies.

Calls for papers have been published and more information is available from the DevOps4CPS-Testing Workshop website. The workshop will be chaired by Aitor Arrieta, from Mondragon University, and Skaukat Ali, from the Simula Research Laboratory, in conjunction with Sebastiano Panichella, from Zurich University of Applied Sciences, who is the Technical Director of the COSMOS project. 

The Adeptness project is pleased to collaborate in co-organizing the workshop with another Horizon 2020 research project called COSMOS, focused on designing and developing novel DevOps methodologies, techniques, and tools that will enable effective, continuous development and evolution of cyber-physical systems.

Anomaly Detection with Digital Twin in Cyber-Physical Systems

Cyber-Physical Systems (CPSs) are susceptible to various anomalies during their operations. Thus, it is important to detect such anomalies. Detecting such anomalies is challenging since it is uncertain when and where anomalies can happen. To this end, we present a novel approach called Anomaly deTection with digiTAl twIN (ATTAIN), which continuously and automatically builds a digital twin with live data obtained from a CPS for anomaly detection. ATTAIN builds a Timed Automaton Machine (TAM) as the digital representation of the CPS, and implements a Generative Adversarial Network (GAN) to detect anomalies. GAN uses a GCN-LSTM-based module as a generator, which can capture temporal and spatial characteristics of the input data and learn to produce realistic unlabeled fake samples. TAM labels these fake samples, which are then fed into a discriminator along with real labeled samples. After training, the discriminator is capable of distinguishing anomalous data from normal data with a high F1 score. To evaluate our approach, we used three publicly available datasets collected from three CPS testbeds. Evaluation results show that ATTAIN improved the performance of two state-of-art anomaly detection methods by 2.413%, 8.487% and 5.438% on average on the three datasets, respectively. Moreover, ATTAIN achieved on average 8.39% increase in the anomaly detection capability with digital twins as compared with an approach of not using digital twins.

Authors: Qinghua Xu, Shaukat Ali, Tao Yue

An Evaluation of Monte Carlo-Based Hyper-Heuristic for Interaction Testing of Industrial Embedded Software Applications

Hyper-heuristic is a new methodology for the adaptive hybridization of meta-heuristic algorithms to derive a general algorithm for solving optimization problems. This work focuses on the selection type of hyper-heuristic, called the exponential Monte Carlo with counter (EMCQ). Current implementations rely on the memory-less selection that can be counterproductive as the selected search operator may not (historically) be the best performing operator for the current search instance. Addressing this issue, we propose to integrate the memory into EMCQ for combinatorial t-wise test suite generation using reinforcement learning based on the Q-learning mechanism, called Q-EMCQ. The limited application of combinatorial test generation on industrial programs can impact the use of such techniques as Q-EMCQ. Thus, there is a need to evaluate this kind of approach against relevant industrial software, with a purpose to show the degree of interaction required to cover the code as well as finding faults. We applied Q-EMCQ on 37 real-world industrial programs written in Function Block Diagram (FBD) language, which is used for developing a train control management system at Bombardier Transportation Sweden AB. The results show that Q-EMCQ is an efficient technique for test case generation. Additionally, unlike the t-wise test suite generation, which deals with the minimization problem, we have also subjected Q-EMCQ to a maximization problem involving the general module clustering to demonstrate the effectiveness of our approach. The results show the Q-EMCQ is also capable of outperforming the original EMCQ as well as several recent meta/hyper-heuristic including modified choice function, Tabu high-level hyper-heuristic, teaching learning-based optimization, sine cosine algorithm, and symbiotic optimization search in clustering quality within comparable execution time.

Authors: Bestoun S. Ahmed, Eduard Enoiu, Wasif Afzal, Kamal Z. Zamli

Towards a Taxonomy for Eliciting Design-Operation Continuum Requirements of Cyber-Physical Systems

Software systems that are embedded in autonomous Cyber-Physical Systems (CPSs) usually have a large life-cycle, both during its development and in maintenance. This software evolves during its life-cycle in order to incorporate new requirements, bug fixes, and to deal with hardware obsolescence. The current process for developing and maintaining this software is very fragmented, which makes developing new software versions and deploying them in the CPSs extremely expensive. In other domains, such as web engineering, the phases of development and operation are tightly connected, making it possible to easily perform software updates of the system, and to obtain operational data that can be analyzed by engineers at development time. However, in spite of the rise of new communication technologies (e.g., 5G) providing an opportunity to acquire Design-Operation Continuum Engineering methods in the context of CPSs, there are still many complex issues that need to be addressed, such as the ones related with hardware-software co-design. Therefore, the process of Design-Operation Continuum Engineering for CPSs requires substantial changes with respect to the current fragmented software development process. In this paper, we build a taxonomy for Design-Operation Continuum Engineering of CPSs based on case studies from two different industrial domains involving CPSs (elevation and railway). This taxonomy is later used to elicit requirements from these two case studies in order to present a blueprint on adopting Design-Operation Continuum Engineering in any organization developing CPSs.

Authors: Jon Ayerdi , Aitor Garciandia , Aitor Arrieta , Wasif Afzal, Eduard Paul Enoiu, Aitor Agirre , Goiuria Sagardui , Maite Arratibel , Ola Sellin

Detecting Inconsistencies in Annotated Product Line Models

Model-based product line engineering applies the reuse practices from product line engineering with graphical modeling for the specification of software intensive systems. Variability is usually described in separate variability models, while the implementation of the variable systems is specified in system models that use modeling languages such as SysML. Most of the SysML modeling tools with variability support, implement the annotation-based modeling approach. Annotated product line models tend to be error-prone since the modeler implicitly describes every possible variant in a single system model. To identifying variability-related inconsistencies, in this paper, we firstly define restrictions on the use of SysML for annotative modeling in order to avoid situations where resulting instances of the annotated model may contain ambiguous model constructs. Secondly, inter-feature constraints are extracted from the annotated model, based on relations between elements that are annotated with features. By analyzing the constraints, we can identify if the combined variability- and system model can result in incorrect or ambiguous instances. The evaluation of our prototype implementation shows the potential of our approach by identifying inconsistencies in the product line model of our industrial partner which went undetected through several iterations of the model.

Authors: Damir Bilic, Jan Carlson, Daniel Sundmark, Wasif Afzal, Peter Wallin

Intermittently Failing Tests in the Embedded Systems Domain

Software testing is sometimes plagued with intermittently failing tests and finding the root causes of such failing tests is often difficult. This problem has been widely studied at the unit testing level for open source software, but there has been far less investigation at the system test level, particularly the testing of industrial embedded systems. This paper describes our investigation of the root causes of intermittently failing tests in the embedded systems domain, with the goal of better understanding, explaining and categorizing the underlying faults. The subject of our investigation is a currently-running industrial embedded system, along with the system level testing that was performed. We devised and used a novel metric for classifying test cases as intermittent. From more than a half million test verdicts, we identified intermittently and consistently failing tests, and identified their root causes using multiple sources. We found that about 1-3% of all test cases were intermittently failing. From analysis of the case study results and related work, we identified nine factors associated with test case intermittence. We found that a fix for a consistently failing test typically removed a larger number of failures detected by other tests than a fix for an intermittent test. We also found that more effort was usually needed to identify fixes for intermittent tests than for consistent tests. An overlap between root causes leading to intermittent and consistent tests was identified. Many root causes of intermittence are the same in industrial embedded systems and open source software. However, when comparing unit testing to system level testing, especially for embedded systems, we observed that the test environment itself is often the cause of intermittence.

Authors: Per Erik Strandberg, Thomas J. Ostrand, Elaine J. Weyuker, Wasif Afzal, Daniel Sundmark

Model-Based Testing in Practice: An Industrial Case Study using GraphWalker

Model-based testing (MBT) is a test design technique that supports the automation of software testing processes and generates test artefacts based on a system model representing behavioural aspects of the system under test (SUT). Previous research has shown some positive aspects of MBT such as low-cost test case generation and fault detection effectiveness. However, it is still a challenge for both practitioners and researchers to evaluate MBT tools and techniques in real, industrial settings. Consequently, the empirical evidence regarding the mainstream use, including the modelling and test case generation using MBT tools, is limited. In this paper, we report the results of a case study on applying GraphWalker, an open-source tool for MBT, on an industrial cyber-physical system (i.e., a Train Control Management System developed by Bombardier Transportation in Sweden), from modelling of real-world requirements and test specifications to test case generation. We evaluate the models of the SUT for completeness and representativeness, compare MBT with manual test cases written by practitioners using multiple attributes as well as share our experiences of selecting and using GraphWalker for industrial application. The results show that a model of the SUT created using both requirements and test specifications provides better understanding of the SUT from testers’ perspective, making it more complete and representative than the model created based only on the requirements specification alone. The generated model-based test cases are longer in terms of the number of test steps, achieve better edge coverage and can cover requirements more frequently in different orders while achieving the same level of requirements coverage as manually created test cases.

Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Paul Enoiu, Athanasios Stratis, Aitor Arrieta, Goiuria Sagardui

Industrial Scale Passive Testing with T-EARS

Passive testing continuously observes the system or system execution logs without any interference or instrumentation to test diverse combinations of functions, resulting in a more thorough evaluation over time. However, reaching a working solution to passive testing is not without challenges. While there have been some efforts to extract information from system requirements to create passive test cases, to our knowledge, no such efforts are mature enough to be applied in a real, industrial safety-critical context. Our passive testing approach uses the Timed – Easy Approach to Requirements Syntax (T-EARS) specification language and its accompanying tool-chain. This study reports challenges and solutions to introducing system-level passive testing for a vehicular safety-critical system through industrial data analysis, including 116 safety-related requirements. Our results show that passive testing using the T-EARS language and its tool-chain can be used for system-level testing in an industrial setting for 64% of the studied requirements. We identified several sources of false positive results and show how to tune test cases to reduce such false positives systematically. Finally, we show the requirement coverage achieved by a manual test session and that passive testing using T-EARS can find a set of injected faults that are considered hard to find with other test techniques.

Authors: Daniel Flemström, Henrik Jonsson, Eduard Paul Enoiu, Wasif Afzal