Streamlining performance verification through automation and design

27 October 2016


While driving the change necessary to achieve continuous delivery capability and DevOps practices, one of the first obstacles we had to face was the overhead that the traditional way of performance verification put on delivery time. Environment creation and long test executions - mostly the result of insufficient practices - heavily limited the timeframe and even the content of each release. The problems originated in the low level of automation in SUT (system under test) creation and configuration, and in the difficulty of automating test case execution, especially result analysis. To establish a baseline, we collected data from the past two years on the number of environment bring-ups, bring-up times, the number of test case executions and the number of test results left in the 'Not analysed' state.

To tackle this complex problem we took two parallel paths. On one hand, we started redefining how we do performance tests, reducing test and SUT complexity to the smallest possible increment. On the other hand, we started heavy automation work to cover configuration and execution. We created a scalable, dynamic deployment framework (AvED - Automated virtual Environment Deployment) with which we can carry out on-demand SUT and non-SUT deployment in a fraction of the original configuration time. This Python-based application carries out the deployment through a REST API and then triggers the configuration system. The configuration can come from three different sources: a predefined XML, a description derived from the test requirements, or manual selection (for special circumstances). We connected AvED to Jenkins, effectively fitting performance verification into our SCM/CI pipeline. Using AvED and scalable systems, we are currently able to deploy 48 parallel test environments onto our available cloud capacity, on both VMware and OpenStack infrastructure.

To test the performance of different possible cloud variants, we created a Python-based measurement system to establish the performance baseline on a reference configuration, and developed a performance predictor tool - built with Python, PowerCLI and Bash - which utilizes IPSL and Gatling. The results of the performance prediction can be further refined and expanded with the Clover (Cloud Verification) tool. With the predictor we can already estimate VNF performance on a never-before-tested infrastructure without having to actually deploy the VNF itself. To automate test result analysis, we are developing a SUT behaviour and data discrepancy recognition framework along with selective log analytics.

With our renewed test cases and increased automation, we reduced performance verification turnaround time from months to weeks or days, and we are able to provide load/mass-traffic testing feedback to development even within the desired two-hour CI cycle. With shorter testing cycle times we can introduce new types of tests into our delivery process, such as new kinds of chaos and robustness tests with Pumba or Faulty Cat, which ultimately leads to higher coverage and quality for all stakeholders throughout our VNF delivery.
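As a flavour of how the deployment side works, the sketch below shows an AvED-style driver in Python: it reads a predefined XML description (one of the three configuration sources), requests an on-demand deployment through a REST API and then hands the environment over to the configuration system. The endpoint paths, XML fields and response fields are illustrative assumptions, not the actual AvED interface.

```python
"""Minimal sketch of an AvED-style deployment driver (hypothetical names and endpoints)."""
import xml.etree.ElementTree as ET
import requests

DEPLOY_API = "https://cloud.example.com/api/v1"   # assumed REST endpoint, not the real one


def load_config(xml_path):
    """Read a predefined XML environment description (one of the three config sources)."""
    root = ET.parse(xml_path).getroot()
    return {
        "name": root.findtext("name"),
        "flavor": root.findtext("flavor"),
        "image": root.findtext("image"),
        "networks": [n.text for n in root.findall("networks/network")],
    }


def deploy_environment(config):
    """Request an on-demand SUT/non-SUT deployment through the REST API."""
    resp = requests.post(f"{DEPLOY_API}/deployments", json=config, timeout=60)
    resp.raise_for_status()
    return resp.json()["deployment_id"]   # assumed response field


def trigger_configuration(deployment_id):
    """Hand the freshly deployed environment over to the configuration system."""
    resp = requests.post(f"{DEPLOY_API}/deployments/{deployment_id}/configure", timeout=60)
    resp.raise_for_status()


if __name__ == "__main__":
    cfg = load_config("environments/sut_small.xml")   # hypothetical config file
    dep_id = deploy_environment(cfg)
    trigger_configuration(dep_id)
    print(f"Environment {cfg['name']} deployed as {dep_id}")
```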
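The Jenkins connection can be as thin as triggering a parameterized job through Jenkins' remote build API. The sketch below assumes a hypothetical job name, parameters and credentials rather than our real pipeline; depending on Jenkins security settings, a CSRF crumb may also be required.

```python
"""Sketch: queueing a parameterized Jenkins job that runs the deployment and the tests.
Job name, parameters and credentials are illustrative, not our actual pipeline."""
import requests

JENKINS_URL = "https://jenkins.example.com"
JOB = "performance-verification"      # hypothetical job name
AUTH = ("ci-user", "api-token")       # Jenkins user and API token


def trigger_perf_job(env_xml, parallel_envs=48):
    """Queue a build via Jenkins' buildWithParameters remote API."""
    resp = requests.post(
        f"{JENKINS_URL}/job/{JOB}/buildWithParameters",
        params={"ENV_XML": env_xml, "PARALLEL_ENVS": parallel_envs},
        auth=AUTH,
        timeout=30,
    )
    resp.raise_for_status()
    # Jenkins reports the queue item URL in the Location header
    return resp.headers.get("Location")


if __name__ == "__main__":
    print(trigger_perf_job("environments/sut_small.xml"))
```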
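On the analysis side, the core idea of data discrepancy recognition is to compare each run's KPIs against the recorded baseline and flag outliers automatically. The minimal sketch below uses a simple sigma threshold with illustrative KPI names and numbers; it is one way such a check could look, not our actual recognition framework.

```python
"""Sketch of an automated data discrepancy check: flag test runs whose KPIs
deviate too far from the recorded baseline. KPI names and thresholds are illustrative."""
from statistics import mean, stdev


def find_discrepancies(baseline, current, max_sigma=3.0):
    """Return KPIs whose current value is more than max_sigma standard
    deviations away from the baseline history."""
    suspicious = {}
    for kpi, history in baseline.items():
        mu, sigma = mean(history), stdev(history)
        value = current.get(kpi)
        if value is None or sigma == 0:
            continue
        if abs(value - mu) > max_sigma * sigma:
            suspicious[kpi] = {"value": value, "baseline_mean": mu, "sigma": sigma}
    return suspicious


if __name__ == "__main__":
    baseline = {"throughput_rps": [980, 1010, 995, 1002], "p95_latency_ms": [120, 118, 125, 121]}
    current = {"throughput_rps": 640, "p95_latency_ms": 119}
    for kpi, info in find_discrepancies(baseline, current).items():
        print(f"Discrepancy in {kpi}: {info['value']} vs baseline mean {info['baseline_mean']:.1f}")
```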
In my presentation, I will provide a general overview of how we redesigned our test cases to open up automation possibilities, and will introduce our dynamic deployment framework and its connections to the SCM/CI pipeline and the automated test execution framework. I will also demonstrate the key benefits of simplification in automation, through which we were able to create a modular, generic system that can be applied company-wide or even outside Nokia.