Papers

This list is provisional and current as of 4th June 2015.

Full Papers

Evolutionary Approximation of Software for Embedded Systems: Median Function (Camera-Ready PDF)

Vojtech Mrazek, (Brno University of Technology); Zdenek Vasicek (Brno University of Technology); Lukas Sekanina (Brno University of Technology)

This paper deals with genetic programming-based improvement of non-functional properties of programs intended for low-cost microcontrollers. As the objective is to significantly reduce power consumption and execution time, the approximate computing scenario is considered in which occasional errors in results are acceptable. The method is based on Cartesian genetic programming and evaluated in the task of approximation of 9-input and 25-input median function. Resulting approximations show a significant improvement in the execution time and power consumption with respect to the accurate median function while the observed errors are moderate.

Grow and Graft a better CUDA pknotsRG for RNA pseudoknot free energy calculation (Camera-Ready PDF)

William B. Langdon (University College London); Mark Harman (University College London)

Grow and graft genetic programming greatly improves GPGPU dynamic programming software for predicting the minimum binding energy for folding of RNA molecules. The parallel code inserted into the existing CUDA version of pknots was “grown” using a BNF grammar. On an nVidia Tesla K40 GPU GGGP gives a speed up of up to 10 000 times.

locoGP: Improving Performance by Genetic Programming Java Source Code (Camera-Ready PDF)

Brendan Cody-Kenny (Trinity College Dublin); Edgar Galvan Lopez (Trinity College Dublin); Stephen Barrett (Trinity College Dublin)

We present locoGP, a Genetic Programming (GP) system written in Java for evolving Java source code. locoGP was designed to improve the performance of programs as measured in the number of operations executed. Variable test cases are used to maintain functional correctness during evolution. The operation of locoGP is demonstrated on a number of typically constructed “off-the-shelf” hand-written implementations of sort and prefix-code programs. locoGP was able to find improvement opportunities in all test problems.

Genetic Improvement for Software Product Lines: An Overview and a Roadmap (Camera-Ready PDF)

Roberto E. Lopez-Herrejon (ISSE – Johannes Kepler University Linz); Lukas Linsbauer (ISSE – Johannes Kepler University Linz);

Wesley K. G. Assuncao (DINF – Federal University of Paraná, COINF – Technological Federal University of Paraná); Stefan Fischer (ISSE – Johannes Kepler University Linz); Silvia R. Vergilio (DINF – Federal University of Paraná); Alexander Egyed (ISSE – Johannes Kepler University Linz)

Software Product Lines (SPLs) are families of related software systems that provide different combinations of features. Extensive research and application attest to the significant economical and technological benefits of employing SPL practices. However, there are still several challenges that remain open. Salient among them is reverse engineering SPLs from existing variants of software systems and their subsequent evolution. In this paper, we aim at sketching connections between research on these open SPL challenges and ongoing work on Genetic Improvement. Our hope is that by drawing such connections we can spark the interest of both research communities on the exciting synergies at the intersection of these subject areas.

Removing the Kitchen Sink from Software (Camera-Ready PDF)

Jason Landsborough (SPAWAR Systems Center Pacific); Stephen Harding (SPAWAR Systems Center Pacific); Sunny Fugate (SPAWAR Systems Center Pacific)

We would all benefit if software were slimmer, thinner, and generally only did what we needed and nothing more. To this end, our research team has been exploring methods for removing unused and undesirable features from compiled programs. Our primary goal is to improve software security by removing rarely used features in order to decrease a program’s attack surface. This paper describes two different approaches for “thinning” binary programs. The first approach removes specific program features using dynamic tracing as a guide. This approach is relatively safe, but is only capable of removing code which is reachable in a trace when an undesirable feature is enabled. The second approach uses a genetic algorithm (GA) to mutate a program until a suitable variant is found. Our GA-based approach can potentially remove any code that is not strictly required for proper execution, but may break program semantics in unpredictable ways. We show results of these approaches on a toy program and real-world software and explore some of the implications for software security.

Embedding Adaptivity in Software Systems using the ECSELR framework (Camera-Ready PDF)

Kwaku Yeboah-Antwi (INRIA); Benoit Baudry (INRIA)

ECSELR is an ecologically-inspired approach to software evolution that enables environmentally driven evolution at runtime in extant software systems without relying on any offline components or management. ECSELR embeds adaptation and evolution inside the target software system enabling the system to transform itself via darwinian evolutionary mechanisms and adapt in a self contained manner. This allows the software system to benefit autonomously from the useful emergent byproducts of evolution like adaptivity and bio-diversity, avoiding the problems involved in engineering and maintaining such properties. ECSELR enables software systems to address changing environments at runtime, ensuring benefits like mitigation of attacks and memory-optimization among others while avoiding time consuming and costly maintenance and downtime. ECSELR differs from existing work in that,  1) adaptation is embedded in the target system,  2) evolution and adaptation happens online(i.e. in-situ at runtime) and 3) ECSELR is able to embed adaptation inside systems that have already been started and are in the midst of execution. We demonstrate the use of ECSELR and present results on using the ECSELR framework to slim a software system.

Repairing COTS Router Firmware without Access to Source Code or Test Suites: A Case Study in Evolutionary Software Repair (Camera-Ready PDF)

Eric Schulte (GrammaTech); Westley Weimer (University of Virginia); Stephanie Forrest (University of New Mexico)

The speed with which newly discovered software vulnerabilities are patched is a critical factor in mitigating the harm caused by subsequent exploits.  Unfortunately, software vendors are often slow or unwilling to patch vulnerabilities, especially in embedded systems which frequently have no mechanism for updating factory-installed firmware.  The situation is particularly dire for commercial off the shelf (COTS) software users, who lack source code and are wholly dependent on patches released by the vendor.

We propose a solution in which the vulnerabilities drive an automated evolutionary computation repair process capable of directly patching embedded systems firmware.  Our approach does not require access to source code, regression tests, or any participation from the software vendor.  Instead, we present an interactive evolutionary algorithm that searches for patches that resolve target vulnerabilities while relying heavily on post-evolution difference minimization to remove most regressions.  Extensions to prior work in evolutionary program repair include: repairing vulnerabilities in COTS router firmware; handling stripped MIPS executable; operating without fault localization information; operating without a regression test suite; and incorporating user interaction into the evolutionary repair process.

We demonstrate this method by repairing two well-known vulnerabilities in version 4 of NETGEAR’s WNDR3700 wireless router before NETGEAR released patches publicly for the vulnerabilities.  Without fault localization we are able to find repair edits that are not located on execution traces.  Without the advantage of regression tests to guide the search, we find that 80% of repairs of the example vulnerabilities retain program functionality after minimization.  With minimal user interaction to demonstrate required functionality, 100% of the proposed repairs were able to address the vulnerabilities while retaining required functionality.

Position Papers

GI4GI: Improving Genetic Improvement Fitness Functions (Camera-Ready PDF)

Mark Harman (University College London); Justyna Petke (University College London)

Genetic improvement (GI) has been successfully used to optimise non-functional properties of software, such as execution time, by automatically manipulating program’s source code. Measurement of non-functional properties, however, is a non-trivial task; energy consumption, for instance, is highly dependant on the hardware used. Therefore, we propose the GI4GI framework (and two illustrative applications). GI4GI first applies GI to improve the fitness function for the particular environment within which software is subsequently optimised using traditional GI.

Genetic Improvement using Higher Order Mutation (Camera-Ready PDF)

Yue Jia (University College London); Fan Wu (University College London); Mark Harman (University College London); Jens Krinke (University College London)

This paper presents a brief outline of a higher-order mutation based framework for Genetic Improvement (GI). We argue that search-based higher-order mutation testing can be used to implement a form of genetic programming (GP) to increase the search granularity and testability of GI.

Energy Optimisation via Genetic Improvement (Camera-Ready PDF)

Bobby R. Bruce (University College London)

The discipline of Software Engineering has arisen during a time in which developers rarely concerned themselves with the energy efficiency of their software. Due to the growth in both mobile devices and large server clusters this period is undoubtedly coming to an end and, as such, new tools for creating energy-efficient software are required. This paper takes the position that Genetic Improvement, a Search-Based Software Engineering technique, has the potential to aid developers in refactoring their software to a more energy-efficient state; allowing focus to remain on functional requirements while leaving concerns over energy consumption to an automated process.

Genetic Improvement of Energy Usage is only as Reliable as the Measurements are Accurate (Camera-Ready PDF)

Saemundur O. Haraldsson (University of Stirling); John R. Woodward (University of Stirling)

Energy has recently become an objective for Genetic Improvement. Measuring software energy use is complicated which might tempt us to use simpler measurements. However if we base the GI on inaccurate measurements we can not assure any improvements. This paper seeks to highlight important issues when evaluating energy use of programs.

Embedded Dynamic Improvement (Camera-Ready PDF)

Nathan Burles (University of York); Jerry Swan (University of York); Edward Bowles (University of York);  Alexander E. I. Brownlee (University of Stirling); Zoltan A. Kocsis (University of Stirling Nadarajen Veerapen (University of Stirling)

We discuss the useful role that can be played by a subtype of improvement programming, which we term ‘Embedded Dynamic Improvement’. In this approach, developer-specified variation points define the scope of improvement. A search framework is embedded at these variation points, facilitating the creation of adaptive software that can respond online to changes in its execution environment.

Rethinking Genetic Improvement Programming  (Camera-Ready PDF)

David R. White (University of Glasgow); Jeremy Singer (University of Glasgow)

We re-examine the central motivation behind Genetic Improvement Programming (GIP), and argue that the most important insight is the concept of applying Genetic Programming to existing software. This viewpoint allows us to make several observations about potential directions for GIP research.

Fitness as Task-relevant Information Accumulation (Camera-Ready PDF)

Colin G. Johnson (University of Kent); John R. Woodward (University of Stirling)

“If you cannot measure it, you cannot improve it.”—Lord Kelvin

Fitness in GP/GI is usually a short-sighted greedy fitness function counting the number of satisfied test cases (or some other score based on error). If GP/GI is to be extended to successfully tackle “full software systems”, which is the stated domain of Genetic Improvement, with loops, conditional statements and function calls, then this kind of fitness will fail to scale. One alternative approach is to measure the fitness gain in terms of the accumulated information at each executed step of the program. This paper discusses methods for measuring the way in which programs accumulate information relevant to their task as they run, by building measures of this information gain based on information theory and model complexity.