Uncertainty-aware Robustness Assessment of Industrial Elevator Systems

Industrial elevator systems are commonly used software systems in our daily lives, which operate in uncertain environments such as unpredictable passenger traffic, uncertain passenger attributes and behaviors, and hardware delays. Understanding and assessing the robustness of such systems under various uncertainties enable system designers to reason about uncertainties, especially those leading to low system robustness, and consequently improve their designs and implementations in terms of handling uncertainties. To this end, we present a comprehensive empirical study conducted with industrial elevator systems provided by our industrial partner Orona, which focuses on assessing the robustness of a dispatcher, i.e., a software component responsible for elevators’ optimal scheduling. In total, we studied 90 industrial dispatchers in our empirical study. Based on the experience gained from the study, we derived an uncertainty-aware robustness assessment method (named UncerRobua) comprising a set of guidelines on how to conduct the robustness assessment and a newly proposed ranking algorithm, for supporting the robustness assessment of industrial elevator systems against uncertainties.

DOI: TBD

Authors: Liping Han, Shaukat Ali, Tao Yue, Aitor Arrieta and Maite Arratibel

Title of the source: ACM Transactions on Software Engineering and Methodology

Publisher:  ACM Journals

Relevant pages:  

Year: 2022

Uncertainty-Aware Transfer Learning to Evolve Digital Twins for Industrial Elevators

Digital twins are increasingly developed to support the development, operation, and maintenance of cyber-physical systems such as industrial elevators. However, industrial elevators continuously evolve due to changes in physical installations, introducing new software features, updating existing ones, and making changes due to regulations (e.g., enforcing restricted elevator capacity due to COVID-19), etc. Thus, digital twin functionalities (often built on neural network-based models) need to evolve themselves constantly to be synchronized with the industrial elevators. Such an evolution is preferred to be automated, as manual evolution is time-consuming and error-prone. Moreover, collecting sufficient data to re-train neural network models of digital twins could be expensive or even infeasible. To this end, we propose unceRtaInty-aware tranSfer lEarning enriched Digital Twins (RISE-DT), a transfer learning based approach capable of transferring knowledge about the waiting time prediction capability of a digital twin of an industrial elevator across different scenarios. RISE-DT also leverages uncertainty quantification to further improve its effectiveness. To evaluate RISE-DT, we conducted experiments with 10 versions of an elevator dispatching software from Orona, Spain, which are deployed in a Software in the Loop (SiL) environment. Experiment results show that RISE-DT, on average, improves the Mean Squared Error by 13.131% and the utilization of uncertainty quantification further improves it by 2.71%.

DOI: https://doi.org/10.1145/3540250.3558957

Authors: Qinghua Xu, Shaukat Ali, Tao Yue and Maite Arratibel

Title of the source: ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Publisher:  Association for Computing Machinery

Relevant pages: 

Year: 2022
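
As an illustration of the kind of Mean Squared Error comparison reported in the RISE-DT abstract above, the following minimal sketch computes the MSE of two sets of waiting-time predictions and the relative improvement between them. The waiting times, the function names, and the relative-improvement formula are assumptions made for illustration; the abstract does not state how the reported percentages were computed.

```python
def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mse, new_mse):
    """Percentage by which new_mse is lower than baseline_mse (assumed formula)."""
    return 100.0 * (baseline_mse - new_mse) / baseline_mse


# Hypothetical waiting times in seconds; not data from the paper.
actual = [20, 30, 40]
baseline_predictions = [24, 26, 44]   # e.g., a model trained without transfer
transfer_predictions = [22, 28, 42]   # e.g., a model trained with transfer

baseline_mse = mean_squared_error(actual, baseline_predictions)
transfer_mse = mean_squared_error(actual, transfer_predictions)
improvement = relative_improvement(baseline_mse, transfer_mse)
print(baseline_mse, transfer_mse, improvement)
```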


Are Elevator Software Robust against Uncertainties? Results and Experiences from an Industrial Case Study

Industrial elevator systems are complex Cyber-Physical Systems operating in uncertain environments and experiencing uncertain passenger behaviors, hardware delays, and software errors. Identifying, understanding, and classifying such uncertainties are essential to enable system designers to reason about uncertainties and subsequently develop solutions for empowering elevator systems to deal with uncertainties systematically. To this end, we present a method, called RuCynefin, based on the Cynefin framework to classify uncertainties in industrial elevator systems from our industrial partner (Orona, Spain), results of which can then be used for assessing their robustness. RuCynefin is equipped with a novel classification algorithm to identify the Cynefin contexts for a variety of uncertainties in industrial elevator systems, and a novel metric for measuring the robustness using the uncertainty classification. We evaluated RuCynefin with an industrial case study of 90 dispatchers from Orona to assess their robustness against uncertainties. Results show that RuCynefin could effectively identify several situations for which certain dispatchers were not robust. Specifically, 93% of such versions showed some degree of low robustness against uncertainties. We also provide insights on the potential practical usages of RuCynefin, which are useful for practitioners in this field.

DOI: https://doi.org/10.1145/3540250.3558955

Authors: Liping Han, Tao Yue, Shaukat Ali, Aitor Arrieta and Maite Arratibel

Title of the source: ESEC/FSE 2022: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Publisher:  Association for Computing Machinery

Relevant pages: 

Year: 2022

Multi-Objective Metamorphic Test Case Selection: an Industrial Case Study

Metamorphic testing is a technique that has shown great potential to alleviate the test oracle problem by exploiting the relations among the inputs and outputs of different executions of a system. However, this approach requires multiple test executions. In applications like Cyber-Physical Systems (CPSs), where the test executions can be very expensive in terms of time and resources needed, this can pose a problem. Therefore, it is paramount to optimize the test suite to reduce the costs of verifying the system. Test case selection is an optimization technique which accomplishes this by selecting a subset of test cases while aiming to preserve the effectiveness of the original test suite as much as possible. While there are many approaches for test case selection in the existing literature, none of them has been proposed for the metamorphic test case selection problem, where each metamorphic test case consists of a source test case and at least one follow-up test case.

In this work, we present an evolutionary multi-objective approach for the metamorphic test case selection problem, adapting existing multi-objective test selection techniques and proposing new evolutionary operators and objective functions. Furthermore, we evaluate our approach with a set of metamorphic tests developed for an industrial case study from the elevation domain. The results suggest that our approach outperforms both Random Search and the same metaheuristic algorithm without the new evolutionary operators we propose.

DOI: TBD

Authors: Jon Ayerdi, Aitor Arrieta, Ernest Bota Pobee and Maite Arratibel

Title of the source: IEEE 33rd International Symposium on Software Reliability Engineering

Publisher:  IEEE

Relevant pages:  

Year: 2022
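
The selection problem described in the abstract above can be illustrated with a minimal sketch. It assumes two objectives that are not taken from the paper: execution cost (counting each distinct source or follow-up test once, since metamorphic test cases may share a source) and the number of distinct metamorphic relations covered. The data layout, the names, and the exhaustive enumeration of subsets are hypothetical; an evolutionary algorithm would replace the enumeration for realistic suite sizes.

```python
from itertools import combinations


def objectives(selection, metamorphic_cases, execution_time):
    """Return (cost, relations covered) for a set of selected case indices."""
    tests, relations = set(), set()
    for index in selection:
        case = metamorphic_cases[index]
        tests.add(case["source"])
        tests.update(case["follow_ups"])
        relations.add(case["relation"])
    # A test shared by several metamorphic test cases is executed only once.
    cost = sum(execution_time[t] for t in tests)
    return cost, len(relations)


def dominates(a, b):
    """True if a is no worse than b in both objectives and differs from b."""
    return a[0] <= b[0] and a[1] >= b[1] and a != b


def pareto_front(points):
    """Keep only the objective vectors that no other vector dominates."""
    return {p for p in points if not any(dominates(q, p) for q in points)}


# Hypothetical suite: three metamorphic test cases, two of which share a source.
cases = [
    {"source": "s1", "follow_ups": ["f1"], "relation": "MR1"},
    {"source": "s1", "follow_ups": ["f2"], "relation": "MR2"},
    {"source": "s2", "follow_ups": ["f3"], "relation": "MR1"},
]
times = {"s1": 10, "s2": 20, "f1": 5, "f2": 5, "f3": 5}

subsets = [set(c) for r in range(1, len(cases) + 1)
           for c in combinations(range(len(cases)), r)]
front = pareto_front({objectives(s, cases, times) for s in subsets})
print(sorted(front))
```

In this toy suite, selecting the two cases that share source `s1` covers both relations while executing `s1` only once, which is why pairing information matters when computing cost.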

Performance-Driven Metamorphic Testing of Cyber-Physical Systems

Cyber-Physical Systems (CPSs) are a new generation of systems which integrate software with physical processes. The increasing complexity of these systems, combined with the uncertainty in their interactions with the physical world, makes the definition of effective test oracles especially challenging, facing the well-known test oracle problem. Metamorphic testing has shown great potential to alleviate the test oracle problem by exploiting the relations among the inputs and outputs of different executions of the system, so-called metamorphic relations (MRs). In this article, we propose an MR pattern called Performance Variation (PV) for the identification of performance-driven MRs, and we show its applicability in two CPSs from different domains: automated navigation systems and elevator control systems. For the evaluation, we assessed the effectiveness of this approach for detecting failures in an open source simulation-based autonomous navigation system, as well as in an industrial case study from the elevation domain. We derive concrete MRs based on the PV pattern for both case studies and we evaluate their effectiveness with seeded faults. Results show that the approach is effective at detecting over 88% of the seeded faults, while keeping the ratio of false positives at 4% or lower.

DOI: https://doi.org/10.1109/TR.2022.3193070

Authors: Jon Ayerdi, Pablo Valle, Sergio Segura, Aitor Arrieta, Goiuria Sagardui and Maite Arratibel

Title of the source: IEEE Transactions on Reliability

Publisher:  IEEE

Relevant pages:  

Year: 2022
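
To make the idea of a performance-driven metamorphic relation from the abstract above concrete, here is a minimal sketch. It is not the PV pattern as defined in the article: the relation used (adding one elevator should not increase the average waiting time), the toy first-come-first-served model standing in for the system under test, and all names are assumptions chosen for illustration.

```python
import heapq


def average_waiting_time(num_elevators, calls):
    """Toy system under test: serve (arrival, service) calls in arrival order."""
    free_at = [0.0] * num_elevators  # min-heap of times each elevator is free
    heapq.heapify(free_at)
    total_wait = 0.0
    for arrival, service in sorted(calls):
        start = max(arrival, heapq.heappop(free_at))
        total_wait += start - arrival
        heapq.heappush(free_at, start + service)
    return total_wait / len(calls)


def relation_holds(system, calls, num_elevators, tolerance=0.0):
    """Source test uses num_elevators; the follow-up test adds one elevator.

    No expected waiting time is needed: only the two outputs are compared.
    """
    source = system(num_elevators, calls)
    follow_up = system(num_elevators + 1, calls)
    return follow_up <= source + tolerance


def faulty_system(num_elevators, calls):
    """Seeded fault (hypothetical): above two elevators, only one is used."""
    return average_waiting_time(1 if num_elevators > 2 else num_elevators, calls)


# Hypothetical calls: (arrival time, service duration) in seconds.
calls = [(0, 10), (1, 10), (2, 10), (3, 10)]
print(relation_holds(average_waiting_time, calls, 2))  # relation satisfied
print(relation_holds(faulty_system, calls, 2))         # relation violated
```

The `tolerance` parameter reflects that performance measurements in a real CPS are noisy, so a strict comparison could raise false positives.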

Evaluating System-Level Test Generation for Industrial Software: A Comparison between Manual, Combinatorial and Model-Based Testing

Adequate testing of safety-critical systems is vital to ensure correct functional and non-functional operations. Previous research has shown that testing of such systems requires a lot of effort, thus automated testing techniques have found a certain degree of success. However, automated testing has not replaced the need for manual testing, rather a common industrial practice exhibits a balance between automated and manual testing. In this respect, comparing manual testing with automated testing techniques continues to be an interesting topic to investigate. The need for this investigation is most apparent at system-level testing of industrial systems, where there is a lack of results on how different testing techniques perform with respect to both structural and system-level metrics such as Modified Condition/Decision Coverage (MC/DC) and requirement coverage. In addition to the coverage, the cost of these techniques will also determine their efficiency and thus practical viability. In this paper, we have developed cost models for efficiency measurement and performed an experimental evaluation of manual testing, model-based testing (MBT) and combinatorial testing (CT) in terms of MC/DC and requirement coverage. The evaluation is done in an industrial context of a safety-critical system that controls several functions on-board the passenger trains. We have reported the dominant conditions of MC/DC affected by each technique while generating MC/DC adequate test suites. Moreover, we investigated differences and overlaps of test cases generated by each of the three techniques. The results showed that all test suites achieved 100% requirement coverage except the test suite generated by pairwise testing strategy. However, MBT-generated test suites were more MC/DC adequate and provided a higher number of both similar and unique test cases. 
Moreover, unique test cases generated by MBT had an observable effect on MC/DC, which will complement manual testing to increase MC/DC coverage. The least dominant MC/DC condition fulfilled by the generated test cases by all three techniques is the ‘independent effect of a condition on the outcomes of a decision’. Lastly, the evaluation also showed CT as the most efficient testing technique amongst the three in terms of time ted outcomes.

Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Paul Enoiu

Title of the source: The 3rd ACM/IEEE International Conference on Automation of Software Test 2022

Publisher:  IEEE

Relevant pages:  

Year: 2022

Software test results exploration and visualization with continuous integration and nightly testing

Software testing is key for quality assurance of embedded systems. However, with increased development pace, the amount of test results data risks growing to a level where exploration and visualization of the results are unmanageable. This paper covers a tool, Tim, implemented at a company developing embedded systems, where software development occurs in parallel branches and nightly testing is partitioned over software branches, test systems and test cases. Tim aims to replace a previous solution with problems of scalability, requirements and technological flora. Tim was implemented with a reference group over several months. For validation, data were collected both from reference group meetings and logs from the usage of the tool. Data were analyzed quantitatively and qualitatively. The main contributions from the study include the implementation of eight views for test results exploration and visualization, the identification of four solution patterns for these views (filtering, aggregation, previews and comparisons), as well as six challenges frequently discussed at reference group meetings (expectations, anomalies, navigation, integrations, hardware details and plots). Results are put in perspective with related work and future work is proposed, e.g., enhanced anomaly detection and integrations with more systems such as risk management, source code and requirements repositories.

Authors: Per Erik Strandberg, Wasif Afzal, Daniel Sundmark

Title of the source: International Journal on Software Tools for Technology Transfer

Publisher:  Springer

Relevant pages:  

Year: 2022

A novel methodology to classify test cases using natural language processing and imbalanced learning

Detecting the dependency between integration test cases plays a vital role in the area of software test optimization. Classifying test cases into two main classes – dependent and independent – can be employed for several test optimization purposes such as parallel test execution, test automation, test case selection and prioritization, and test suite reduction. This task can be seen as an imbalanced classification problem due to the test cases’ distribution. Often the number of dependent and independent test cases is uneven, which is related to the testing level, testing environment and complexity of the system under test. In this study, we propose a novel methodology that consists of two main steps. Firstly, by using natural language processing we analyze the test cases’ specifications and turn them into a numeric vector. Secondly, by using the obtained data vectors, we classify each test case into a dependent or an independent class. We carry out a supervised learning approach using different methods for handling imbalanced datasets. The feasibility and possible generalization of the proposed methodology are evaluated in two industrial projects at Bombardier Transportation, Sweden, with promising results.

Authors: Sahar Tahvili, Leo Hatvani, Enislay Ramentol, Rita Pimentel, Wasif Afzal, Francisco Herrera

Title of the source: Engineering Applications of Artificial Intelligence

Publisher:  ELSEVIER

Relevant pages:  

Year: 2020
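
The two steps named in the abstract above (turning test specifications into numeric vectors, then handling class imbalance before classification) can be sketched as follows. The sketch swaps in two simple, well-known techniques, a bag-of-words count vector and random oversampling of the minority class; the abstract does not say which NLP representation or imbalance-handling methods the authors used. The specification texts and labels are hypothetical.

```python
import random
from collections import Counter


def bag_of_words(texts):
    """Turn each text into a vector of word counts over a shared vocabulary."""
    vocabulary = sorted({word for text in texts for word in text.lower().split()})
    vectors = []
    for text in texts:
        counts = Counter(text.lower().split())
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


def random_oversample(vectors, labels, rng=None):
    """Duplicate randomly chosen minority-class samples until classes are even."""
    rng = rng or random.Random(0)
    counts = Counter(labels)
    target = max(counts.values())
    out_vectors, out_labels = list(vectors), list(labels)
    for label, count in counts.items():
        pool = [v for v, y in zip(vectors, labels) if y == label]
        for _ in range(target - count):
            out_vectors.append(rng.choice(pool))
            out_labels.append(label)
    return out_vectors, out_labels


# Hypothetical test specifications and labels; not data from the study.
specs = [
    "check door opens",
    "check brake signal",
    "check light turns on",
    "check door closes after door opens",
]
labels = ["independent", "independent", "independent", "dependent"]

vocabulary, vectors = bag_of_words(specs)
balanced_vectors, balanced_labels = random_oversample(vectors, labels)
print(Counter(balanced_labels))
```

The balanced vectors could then be passed to any supervised classifier; oversampling should be applied to the training split only, so that duplicated samples do not leak into evaluation.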

Towards a workflow for model-based testing of embedded systems

Model-based testing (MBT) has been previously used to validate embedded systems. However, (i) creation of a model conforming to the behavioural aspects of an embedded system, (ii) generation of executable test scripts and (iii) assessment of test verdict, requires a systematic process. In this paper, we have presented a three-phase tool-supported MBT workflow for the testing of an embedded system, that spans from requirements specification to test verdict assessment. The workflow starts with a simplistic, yet practical, application of a Domain-Specific Language (DSL) based on Gherkin-like style, which allows the requirements engineer to specify requirements and to extract information about model elements (i.e., states and transitions). This is done to assist the graphical modelling of the complete system under test (SUT). Later stages of the workflow generate an executable test script that runs on a domain-specific simulation platform. We have evaluated this tool-supported workflow by specifying the requirements, extracting information from the DSL and developing a model of a subsystem of the train control management system developed at Alstom Transport AB in Sweden. The C# test script generated from the SUT model is successfully executed at the Software-in-the-Loop (SIL) execution platform and test verdicts are visualized as a sequence of passed and failed test steps.

Authors: Muhammad Nouman Zafar, Wasif Afzal, Eduard Enoiu

Title of the source: A-TEST 2021: Proceedings of the 12th International Workshop on Automating TEST Case Design, Selection, and Evaluation

Publisher:  Association for Computing Machinery

Relevant pages:  33-40

Year: 2021

Simuloop – Testing Framework for an Industrial Elevator System

Elevate is an industrial elevator simulator that can be applied to examine the performance of elevator installations and test scheduling algorithms in a realistic environment. Elevate is a GUI-based application and requires manual steps to perform simulations. In this thesis, we have designed, developed and implemented a program (Simuloop) to facilitate a simulator loop with Elevate. Simuloop enables automatic simulations without manual interference. We propose two experiments that apply Simuloop to generate demanding passenger traffic to test an industrial elevator dispatcher. We apply a genetic algorithm and reinforcement learning, respectively. Simuloop is used to give feedback to the algorithms by simulating the passenger traffic. The experiment with the genetic algorithm performs stochastic updates on a lunch peak profile with 948 passengers. The updates vary the passenger weight and entry/exit time to identify patterns that yield a high waiting time. The algorithm is able to increase the average waiting time from 20 to 44.5 seconds. The experiment with reinforcement learning places higher demands on Simuloop since it depends on frequent feedback to guide the learning. We design a small-scale experiment to train the algorithm to select the arrival and destination floors for passengers. The algorithm is able to increase the cumulative waiting time, suggesting that the experiment is applicable for the use case. Due to the limitations of the pipeline, we conclude that using reinforcement learning is impractical. The experiments show that Simuloop can successfully be applied for automatic testing of an elevator system. This opens up the opportunity for performing exhaustive testing without the need for manual steps.

DOI:

Authors: Torbjørn Ruud

Title of the source: Master’s thesis

Publisher:  University of Oslo

Relevant pages:  

Year: 2021
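
The genetic-algorithm experiment described in the Simuloop abstract above can be illustrated with a minimal sketch that evolves a passenger profile toward a higher average waiting time. Everything here is an assumption for illustration: the single-elevator toy model stands in for the Elevate simulator, only the entry time of each passenger is varied, and the constants, operators, and names are hypothetical rather than taken from the thesis.

```python
import random

ARRIVAL_GAP = 4.0  # hypothetical: one passenger arrives every 4 seconds
TRIP_TIME = 3.0    # hypothetical fixed travel time per passenger


def average_waiting_time(entry_times):
    """Toy fitness function: one elevator serving passengers in arrival order."""
    free_at, total_wait = 0.0, 0.0
    for i, entry in enumerate(entry_times):
        arrival = i * ARRIVAL_GAP
        start = max(arrival, free_at)
        total_wait += start - arrival
        free_at = start + entry + TRIP_TIME
    return total_wait / len(entry_times)


def mutate(profile, rng, low=1.0, high=5.0):
    """Perturb one passenger's entry time, clamped to [low, high] seconds."""
    child = list(profile)
    i = rng.randrange(len(child))
    child[i] = min(high, max(low, child[i] + rng.uniform(-1.0, 1.0)))
    return child


def crossover(a, b, rng):
    """One-point crossover between two profiles of equal length."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(seed_profile, generations=50, population_size=20, rng=None):
    """Search for a profile that maximizes the average waiting time."""
    rng = rng or random.Random(0)
    population = [list(seed_profile)]
    population += [mutate(seed_profile, rng) for _ in range(population_size - 1)]
    for _ in range(generations):
        population.sort(key=average_waiting_time, reverse=True)
        parents = population[: population_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children  # parents survive, so the best is kept
    return max(population, key=average_waiting_time)


seed_profile = [2.0] * 10  # hypothetical starting profile of entry times
best_profile = evolve(seed_profile)
print(average_waiting_time(seed_profile), average_waiting_time(best_profile))
```

Because the best half of each generation is carried over unchanged, the best fitness never decreases, which mirrors the goal of steadily finding more demanding traffic. In a setup like the one the abstract describes, each fitness evaluation would be a full simulator run, so the population size and number of generations would be bounded by simulation time.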