Green Configuration: Determining the Influence of Software Configurations on Energy Consumption (funded by the DFG: 590,000€, 2018-2021)

Reducing energy consumption is an increasingly important goal of our society. It is predicted that IT energy usage will almost double by the year 2020, not only due to numerous data centers, but also because of billions of end devices. At the same time, software and hardware systems grow ever more complex by providing configuration capabilities, which makes it difficult to select a software and hardware system that not only satisfies the functional requirements but also takes energy consumption into account. In general, there are two important causes of wasted energy in IT systems: sub-optimal configuration of configurable systems (users' perspective) and inefficient implementation (developers' perspective).

Green Configuration will target both causes to reduce the energy consumption of IT systems. In this project, we will develop techniques to measure, predict, and optimize the influence of features and their interactions on the energy consumption of a configurable system. Furthermore, we will develop methods and analysis techniques to support developers in identifying the cause of energy problems in the source code.

The heart of the project is the energy-influence model, which captures the influence of software, hardware, and workload features on energy consumption. We will determine the influence using a combination of black-box and white-box measurements with modern learning strategies (e.g., active learning and transfer learning). We will use tailored sampling techniques and relate them to the results of control-flow and data-flow analysis techniques to determine optimal configurations and trace the origin of energy leaks. The goal is to learn an accurate influence model with few measurements, which can then be used to relate the individual influences to regions of the source code.
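As a rough illustration of what an influence model can look like, the following Python sketch represents a model as a mapping from feature combinations to their energy influence and estimates it with a simple feature-wise and pair-wise measurement scheme. This sampling scheme, the feature names, and the `simulated_measurement` function are assumptions made for this example; they stand in for real measurements and are not the learning strategy of the project.

```python
from itertools import combinations


def learn_influence_model(features, measure):
    """Estimate base, per-feature, and pairwise-interaction influences.

    `measure` is a hypothetical callback that maps a frozenset of enabled
    features to a measured energy value (e.g., in joules).
    """
    base = measure(frozenset())
    model = {frozenset(): base}
    # Feature-wise sampling: enable one feature at a time.
    for f in features:
        model[frozenset([f])] = measure(frozenset([f])) - base
    # Pair-wise sampling: whatever the individual influences do not
    # explain is attributed to the interaction of the two features.
    for a, b in combinations(features, 2):
        pair = frozenset([a, b])
        expected = base + model[frozenset([a])] + model[frozenset([b])]
        model[pair] = measure(pair) - expected
    return model


def predict(model, config):
    """Sum the influences of all terms whose features are all enabled."""
    config = frozenset(config)
    return sum(value for term, value in model.items() if term <= config)


# Synthetic ground truth used only to simulate measurements (assumption).
TRUE_INFLUENCES = {
    frozenset(): 100,
    frozenset({"compression"}): 20,
    frozenset({"encryption"}): 35,
    frozenset({"cache"}): -10,
    frozenset({"compression", "encryption"}): 15,
}


def simulated_measurement(config):
    return sum(v for term, v in TRUE_INFLUENCES.items() if term <= config)


model = learn_influence_model(
    ["compression", "encryption", "cache"], simulated_measurement
)
# Predict a configuration that was never measured during learning.
print(predict(model, {"compression", "encryption", "cache"}))
```

Because each term of the model is tied to a feature or a feature combination, an individual influence (here, the interaction of compression and encryption) can be inspected on its own, which is what makes it possible to relate it to code regions later on.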

The influence model forms the foundation of the Green Configurator, a tool that shows users the consequences of their configuration decisions. The Green Analyzer also builds on the influence model to support developers in maintenance tasks, for example, by highlighting regions in the source code with high energy consumption.


Green Configuration commits to the German goal of reducing energy consumption. By making the consequences of configuration decisions visible to the user, we aim to induce long-term behavioral changes that potentially save more energy than pure software-based optimizations. The foundational character of the project can stimulate further research in related fields. For instance, we investigate how software-hardware configuration affects quality attributes and provide accurate and realistic surrogate models for multi-objective optimization. By combining software analysis with machine learning, we also expect new insights about the effect of program flow and control flow on energy consumption.


Pervolution: Performance Evolution of Highly-Configurable Software Systems (funded by the DFG: 289,000€, 2018-2021)

Almost every complex software system today is configurable. Configuration options allow users to tailor a system according to their requirements. A key non-functional requirement is performance. However, users and even domain experts often lack an understanding of which configuration options influence performance, how they do so, and which combinations of options cause performance interactions. In addition, software systems evolve, introducing a further dimension of complexity. During evolution, developers add new functionality and need to understand which pieces of functionality (controlled by which configuration options) require maintenance, refactoring, and debugging. For example, the number of configuration options has nearly doubled in the Linux kernel (x86), starting from 3284 in release 12 to 6319 in release 32; the increase for the Apache Web server has been from 150 options in 1998 to nearly 600 options in 2014. Identifying the options that are critical for performance becomes infeasible in such scenarios. Without a proper representation of how the performance influence of configuration options evolves, developers have to start from scratch again and again to identify performance bugs, find performance-optimal configurations, and build an understanding of the performance behavior of the system.

In Pervolution, we will develop an approach to facilitate performance-aware evolution of complex, configurable systems, by tracking down evolutionary performance changes and by providing development guidelines based on extracted performance-evolution patterns. Our goal is to provide deep insights into the performance evolution of configuration options and their interactions, so that developers can reason about their decisions regarding the system’s configurability during software evolution, in the light of the changed performance behavior. Technically, we will use performance-influence models that capture the influences of configuration options as well as of their interactions on performance, using a unique combination of machine-learning techniques, sampling heuristics, and experimental designs. For each version of the system (time dimension), we analyze performance-influence models of different variants (configuration dimension). We will relate variability and code changes to the individual influences inside the model and allow users to reason about these models.
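To make the time dimension concrete, the following Python sketch compares the performance-influence models of two versions term by term and reports the options and interactions whose influence changed noticeably. The model representation (tuples of option names mapped to an influence in milliseconds), the option names, the values, and the threshold are all hypothetical and chosen only for illustration.

```python
def influence_changes(old_model, new_model, threshold):
    """Return the terms whose influence changed by at least `threshold`.

    Each model maps a tuple of option names (a term) to its performance
    influence; a term missing from a model counts as influence 0.
    """
    changes = {}
    for term in set(old_model) | set(new_model):
        delta = new_model.get(term, 0) - old_model.get(term, 0)
        if abs(delta) >= threshold:
            changes[term] = delta
    return changes


# Hypothetical models of two consecutive versions (influences in ms).
version_1 = {(): 50, ("cache",): -8, ("logging",): 3}
version_2 = {
    (): 50,
    ("cache",): -8,
    ("logging",): 12,
    ("cache", "logging"): 5,
}

changes = influence_changes(version_1, version_2, threshold=2)
for term, delta in sorted(changes.items()):
    print(" * ".join(term), f"{delta:+d} ms")
```

In this made-up example, the comparison points at the logging option and at a new interaction between caching and logging, which narrows down where to look for the corresponding code changes.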

We will extend the prevailing black-box approach of learning performance-influence models by combining static, variability-aware code analyses and program differencing to spot performance-related changes across different system versions. This unique combination of techniques allows us to tame the exponential complexity of the problem and to provide practical solutions. A distinguishing feature of Pervolution is that it considers both dimensions of variability and time explicitly and equally.



Automated Code Generation via Deep Learning

Deep learning has already revolutionized important fields in Computer Science, such as object recognition, natural language processing, and robotics. The distinctive feature of deep neural nets is that they learn important features of a domain on their own, be it certain edge patterns in a picture for object recognition or certain language characteristics. In software engineering, we have a very similar task: finding the core characteristics (i.e., features) that lead to syntactically and semantically correct code based on a certain input.

The goal of this endeavor is to explore the possibilities of deep learning for automated code generation to, for example, fix bugs automatically, generate skeletons of classes, improve code completion, and optimize the source code. We will address several challenges in this project: What is the right code abstraction for learning? Which learning methodology is suitable? How to encode the context of variables, methods, etc. into a neural net? Which net architecture allows us to encode semantics and syntactic rules? Answering these basic research questions will provide us with deep insights, methods, and tools to subsequently make a large step forward in automating software engineering.


Automated Configuration of Machine-Learning Software

Today, many users from different disciplines and fields have problems that can be solved with machine learning, such as statistical analysis, forecasts, optimization, classification, or categorization. However, applying the right methods is difficult, because one needs expert knowledge to know the preconditions of these techniques and when to use what to obtain accurate and efficient results. 

The goal of this project is to automate the selection of a suitable machine-learning technique based on a given task. We view a machine-learning library as a configurable software system and apply state-of-the-art modeling, configuration, and optimization techniques. Using a domain-specific language and a learning-by-example approach, we can guide the configuration process descriptively, without requiring expert knowledge in machine learning. An inference machine will take care of the appropriate model and algorithm selection as well as parameter tuning. As a result, we make machine learning more tangible to a wide audience.


Transfer Learning of Performance Models

Transfer learning allows us to apply knowledge gained from a certain data source to a new target data source. This way, we can reuse models learned from an outdated system and translate them to a new system, which might diverge in certain aspects but also exhibits similar behavior.

The goal of the project is to use transfer learning for performance models of software systems. This way, we can transfer a performance model learned from data obtained on different hardware or under a different workload to new hardware or a new workload. The benefits are that we require fewer measurements for a new system and do not lose crucial information (e.g., performance trends) of the former system.
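One simple way to picture such a transfer is a linear transfer function: the predictions of the source performance model are mapped to the target environment with a slope and an intercept fitted from a handful of target measurements. The Python sketch below shows this under the assumption that the relationship between the two environments is roughly linear; the function names and all numbers are hypothetical, and this is only one of several possible transfer strategies.

```python
def fit_linear_transfer(source_predictions, target_measurements):
    """Fit target = slope * source + intercept by ordinary least squares."""
    n = len(source_predictions)
    mean_x = sum(source_predictions) / n
    mean_y = sum(target_measurements) / n
    sxx = sum((x - mean_x) ** 2 for x in source_predictions)
    if sxx == 0:
        raise ValueError("source predictions must not all be identical")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(source_predictions, target_measurements)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def transfer(source_prediction, slope, intercept):
    """Translate a source-model prediction to the target environment."""
    return slope * source_prediction + intercept


# Source-model predictions for three configurations (hypothetical, seconds)
# and the few measurements taken for the same configurations on the target.
source = [10.0, 20.0, 40.0]
target = [25.0, 45.0, 85.0]

slope, intercept = fit_linear_transfer(source, target)
# Estimate the target performance of a configuration that was measured
# only on the source system, where the source model predicts 30 seconds.
estimate = transfer(30.0, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(estimate, 3))
```

Only three target measurements are needed here, while the performance trends across configurations are carried over from the source model.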