This list is provisional and current as of 4th June 2015.
Evolutionary Approximation of Software for Embedded Systems: Median Function (Camera-Ready PDF)
This paper deals with genetic programming-based improvement of non-functional properties of programs intended for low-cost microcontrollers. As the objective is to significantly reduce power consumption and execution time, the approximate computing scenario is considered in which occasional errors in results are acceptable. The method is based on Cartesian genetic programming and evaluated in the task of approximation of 9-input and 25-input median function. Resulting approximations show a significant improvement in the execution time and power consumption with respect to the accurate median function while the observed errors are moderate.
Grow and graft genetic programming greatly improves GPGPU dynamic programming software for predicting the minimum binding energy for folding of RNA molecules. The parallel code inserted into the existing CUDA version of pknots was “grown” using a BNF grammar. On an nVidia Tesla K40 GPU GGGP gives a speed up of up to 10 000 times.
We present locoGP, a Genetic Programming (GP) system written in Java for evolving Java source code. locoGP was designed to improve the performance of programs as measured in the number of operations executed. Variable test cases are used to maintain functional correctness during evolution. The operation of locoGP is demonstrated on a number of typically constructed “off-the-shelf” hand-written implementations of sort and prefix-code programs. locoGP was able to find improvement opportunities in all test problems.
Wesley K. G. Assuncao (DINF – Federal University of Paraná, COINF – Technological Federal University of Paraná); Stefan Fischer (ISSE – Johannes Kepler University Linz); Silvia R. Vergilio (DINF – Federal University of Paraná); Alexander Egyed (ISSE – Johannes Kepler University Linz)
Software Product Lines (SPLs) are families of related software systems that provide different combinations of features. Extensive research and application attest to the significant economical and technological benefits of employing SPL practices. However, there are still several challenges that remain open. Salient among them is reverse engineering SPLs from existing variants of software systems and their subsequent evolution. In this paper, we aim at sketching connections between research on these open SPL challenges and ongoing work on Genetic Improvement. Our hope is that by drawing such connections we can spark the interest of both research communities on the exciting synergies at the intersection of these subject areas.
We would all benefit if software were slimmer, thinner, and generally only did what we needed and nothing more. To this end, our research team has been exploring methods for removing unused and undesirable features from compiled programs. Our primary goal is to improve software security by removing rarely used features in order to decrease a program’s attack surface. This paper describes two different approaches for “thinning” binary programs. The first approach removes specific program features using dynamic tracing as a guide. This approach is relatively safe, but is only capable of removing code which is reachable in a trace when an undesirable feature is enabled. The second approach uses a genetic algorithm (GA) to mutate a program until a suitable variant is found. Our GA-based approach can potentially remove any code that is not strictly required for proper execution, but may break program semantics in unpredictable ways. We show results of these approaches on a toy program and real-world software and explore some of the implications for software security.
ECSELR is an ecologically-inspired approach to software evolution that enables environmentally driven evolution at runtime in extant software systems without relying on any offline components or management. ECSELR embeds adaptation and evolution inside the target software system enabling the system to transform itself via darwinian evolutionary mechanisms and adapt in a self contained manner. This allows the software system to benefit autonomously from the useful emergent byproducts of evolution like adaptivity and bio-diversity, avoiding the problems involved in engineering and maintaining such properties. ECSELR enables software systems to address changing environments at runtime, ensuring benefits like mitigation of attacks and memory-optimization among others while avoiding time consuming and costly maintenance and downtime. ECSELR differs from existing work in that, 1) adaptation is embedded in the target system, 2) evolution and adaptation happens online(i.e. in-situ at runtime) and 3) ECSELR is able to embed adaptation inside systems that have already been started and are in the midst of execution. We demonstrate the use of ECSELR and present results on using the ECSELR framework to slim a software system.
The speed with which newly discovered software vulnerabilities are patched is a critical factor in mitigating the harm caused by subsequent exploits. Unfortunately, software vendors are often slow or unwilling to patch vulnerabilities, especially in embedded systems which frequently have no mechanism for updating factory-installed firmware. The situation is particularly dire for commercial off the shelf (COTS) software users, who lack source code and are wholly dependent on patches released by the vendor.
We propose a solution in which the vulnerabilities drive an automated evolutionary computation repair process capable of directly patching embedded systems firmware. Our approach does not require access to source code, regression tests, or any participation from the software vendor. Instead, we present an interactive evolutionary algorithm that searches for patches that resolve target vulnerabilities while relying heavily on post-evolution difference minimization to remove most regressions. Extensions to prior work in evolutionary program repair include: repairing vulnerabilities in COTS router firmware; handling stripped MIPS executable; operating without fault localization information; operating without a regression test suite; and incorporating user interaction into the evolutionary repair process.
We demonstrate this method by repairing two well-known vulnerabilities in version 4 of NETGEAR’s WNDR3700 wireless router before NETGEAR released patches publicly for the vulnerabilities. Without fault localization we are able to find repair edits that are not located on execution traces. Without the advantage of regression tests to guide the search, we find that 80% of repairs of the example vulnerabilities retain program functionality after minimization. With minimal user interaction to demonstrate required functionality, 100% of the proposed repairs were able to address the vulnerabilities while retaining required functionality.
Genetic improvement (GI) has been successfully used to optimise non-functional properties of software, such as execution time, by automatically manipulating program’s source code. Measurement of non-functional properties, however, is a non-trivial task; energy consumption, for instance, is highly dependant on the hardware used. Therefore, we propose the GI4GI framework (and two illustrative applications). GI4GI first applies GI to improve the fitness function for the particular environment within which software is subsequently optimised using traditional GI.
This paper presents a brief outline of a higher-order mutation based framework for Genetic Improvement (GI). We argue that search-based higher-order mutation testing can be used to implement a form of genetic programming (GP) to increase the search granularity and testability of GI.
Bobby R. Bruce (University College London)
The discipline of Software Engineering has arisen during a time in which developers rarely concerned themselves with the energy efficiency of their software. Due to the growth in both mobile devices and large server clusters this period is undoubtedly coming to an end and, as such, new tools for creating energy-efficient software are required. This paper takes the position that Genetic Improvement, a Search-Based Software Engineering technique, has the potential to aid developers in refactoring their software to a more energy-efficient state; allowing focus to remain on functional requirements while leaving concerns over energy consumption to an automated process.
Energy has recently become an objective for Genetic Improvement. Measuring software energy use is complicated which might tempt us to use simpler measurements. However if we base the GI on inaccurate measurements we can not assure any improvements. This paper seeks to highlight important issues when evaluating energy use of programs.
Nathan Burles (University of York); Jerry Swan (University of York); Edward Bowles (University of York); Alexander E. I. Brownlee (University of Stirling); Zoltan A. Kocsis (University of Stirling Nadarajen Veerapen (University of Stirling)
We discuss the useful role that can be played by a subtype of improvement programming, which we term ‘Embedded Dynamic Improvement’. In this approach, developer-specified variation points define the scope of improvement. A search framework is embedded at these variation points, facilitating the creation of adaptive software that can respond online to changes in its execution environment.
We re-examine the central motivation behind Genetic Improvement Programming (GIP), and argue that the most important insight is the concept of applying Genetic Programming to existing software. This viewpoint allows us to make several observations about potential directions for GIP research.
“If you cannot measure it, you cannot improve it.”—Lord Kelvin
Fitness in GP/GI is usually a short-sighted greedy fitness function counting the number of satisfied test cases (or some other score based on error). If GP/GI is to be extended to successfully tackle “full software systems”, which is the stated domain of Genetic Improvement, with loops, conditional statements and function calls, then this kind of fitness will fail to scale. One alternative approach is to measure the fitness gain in terms of the accumulated information at each executed step of the program. This paper discusses methods for measuring the way in which programs accumulate information relevant to their task as they run, by building measures of this information gain based on information theory and model complexity.