The study of the genotype-phenotype map has recently been spurred by a new development: the advent of evolutionary computation. Evolutionary computation is a burgeoning field in which the principles of selective breeding are applied to optimization and engineering problems. It includes genetic algorithms (Holland, 1992), evolution strategies (Rechenberg, 1973, 1994), evolutionary programming (Fogel, Owens, and Walsh, 1966), and genetic programming (Koza, 1992).
In an evolutionary algorithm for a particular problem (such as producing a neural network that recognizes a face), the space of possible solutions is represented as a data structure upon which certain "genetic" operations, such as mutation or recombination of the data, can act to produce variant "offspring". The offspring are then selected, according to how well they carry out the desired behavior, as parents for subsequent "breeding". The algorithm iterates this procedure, and the population of candidate solutions evolves.
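The loop just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular published algorithm: the candidate solutions are bit strings, the toy objective `fitness` simply counts 1-bits, and the function names, the parameter defaults, and the choice to keep the better half of the population as parents are all hypothetical choices made for brevity.

```python
import random

def fitness(genome):
    """Toy objective (an assumption for illustration): the number of 1-bits."""
    return sum(genome)

def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def recombine(a, b, rng):
    """One-point crossover: a prefix of one parent joined to a suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(length=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        # Selection: the better half of the population become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        history.append(fitness(parents[0]))
        # Breeding: parents are retained and the rest is filled with offspring.
        offspring = [
            mutate(recombine(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + offspring
    return max(population, key=fitness), history

if __name__ == "__main__":
    best, history = evolve()
    print(fitness(best), history)
```

Because the parents are carried over unchanged, the best fitness recorded in `history` cannot decrease from one generation to the next. Everything problem-specific sits in `fitness`, `mutate`, and `recombine`, and therefore in how a candidate solution is encoded as a `genome`.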
In many problems, evolutionary algorithms have been found to produce solutions better than any produced by rational design, or better than those found by other search and optimization algorithms. In other cases, however, evolutionary algorithms fail miserably. The engineer is faced with the practical problem of understanding why. In working on this problem, researchers are gaining experience in a new domain of evolutionary phenomena. Their experience parallels in many ways that of animal and plant breeders, with one great exception: the programmer controls the genetic system.
What turns out to be crucial to the success of the evolutionary algorithm is how the candidate solutions are represented as data structures. This is known as the representation problem, and its appearance in evolutionary computation parallels its appearance in other areas of artificial intelligence (e.g. Lehmann, 1988; Rich and Knight, 1991; Winston, 1992; Jones, 1995). The process of adaptation can proceed only to the extent that favorable mutations occur, and this depends on how genetic variation maps onto phenotypic variation.
Biologists are not confronted by this problem because they study the end-products of evolution, which are prima facie evidence that the favorable mutations have occurred at a sufficient rate. Furthermore, a biologist wanting to study this question faces great methodological hurdles; comparative and experimental approaches to the problem are blocked because one cannot simply pick alternate genetic systems that produce the same phenotype and compare their capabilities to produce adaptive variation. In evolutionary computation, however, this is possible.
Among the earliest experiments in evolutionary computation, Friedberg (1959) attempted to evolve functioning computer programs by mutating and selecting the code, but found that mutations effectively randomized the behavior of the programs, so that adaptive evolution was impossible; there is, in practice, no way to improve the performance of a conventional computer program by randomly altering letters in the source code. It became understood that the mutation/selection process is not universally effective in producing adaptation: it fails if favorable mutations cannot be produced (see for instance Bossert (1967), Bremermann et al. (1966), Eden (1967), or Simon (1965)). In contrast to Friedberg's results, Koza (1992) succeeded in evolving computer programs that perform well on complex tasks (such as prediction of protein structure or random number generation) by recombining branches of parse trees for the programs. Thomas Ray (1992), by careful design of the data structures, succeeded in designing computer programs that exhibit evolution as an emergent property. The difference between Friedberg's and Koza's systems lay in the representation of the computer programs and the way genetic operators act on them.
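The contrast between the two kinds of representation can be made concrete with a toy sketch. It is not a reconstruction of Friedberg's or Koza's actual systems: the arithmetic expression trees, the character alphabet, and every function name below are assumptions introduced only for illustration. The same small program is held once as source text, where a mutation overwrites one character, and once as a parse tree, where recombination swaps in a subtree from another program.

```python
import random

# A program as a parse tree: a leaf is "x" or an int;
# an internal node is a tuple (operator, left, right).
OPS = {"+": lambda a, b: a + b,
       "-": lambda a, b: a - b,
       "*": lambda a, b: a * b}

def render(tree):
    """Write the tree out as fully parenthesized source text."""
    if not isinstance(tree, tuple):
        return str(tree)
    op, left, right = tree
    return f"({render(left)}{op}{render(right)})"

def evaluate(tree, x):
    """Run the program represented by the tree on the input x."""
    if tree == "x":
        return x
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace_at(tree, path, new):
    """Return a copy of the tree with the node at `path` replaced by `new`."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(node)

def tree_crossover(a, b, rng):
    """Replace a random subtree of `a` with a random subtree of `b`."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace_at(a, path, donor)

def char_mutation(text, rng, alphabet="x0123456789+-*()"):
    """Overwrite one randomly chosen character of the source text."""
    i = rng.randrange(len(text))
    return text[:i] + rng.choice(alphabet) + text[i + 1:]

def is_valid_source(text):
    """True if the text is still a syntactically valid Python expression."""
    try:
        compile(text, "<mutant>", "eval")
        return True
    except SyntaxError:
        return False

if __name__ == "__main__":
    rng = random.Random(1)
    a = ("+", ("*", "x", "x"), 3)   # x*x + 3
    b = ("-", "x", ("*", 2, "x"))   # x - 2*x
    mutants = [char_mutation(render(a), rng) for _ in range(200)]
    children = [tree_crossover(a, b, rng) for _ in range(200)]
    print(sum(is_valid_source(m) for m in mutants), "of 200 text mutants parse")
    print(sum(is_valid_source(render(c)) for c in children), "of 200 tree offspring parse")
```

In this sketch every tree offspring is a well-formed program by construction, whereas a text mutant may or may not still parse. Whether a well-formed offspring is any *better* is a separate question that the sketch does not address; it shows only that the choice of representation determines which variants the genetic operators can produce at all.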
Hence, the Darwinian solution of optimization problems is possible if and only if the problem is "coded" in a way that makes the mutation-recombination-selection procedure an effective one. The "representation problem" is how to code a problem such that random variation and selection can lead to a solution. The representation problem underlies the issue of whether selection, mutation, and/or recombination can produce adaptation.
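A standard small illustration of how the coding alone changes what mutation can reach, not drawn from the works cited here, compares two encodings of the same integers: ordinary binary and reflected Gray code. The sketch below assumes a 4-bit genotype and a mutation operator that flips exactly one bit; the helper names are hypothetical.

```python
def to_gray(n):
    """Encode a non-negative integer as its reflected Gray code."""
    return n ^ (n >> 1)

def from_gray(g):
    """Decode a reflected Gray code back to the integer it represents."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def identity(n):
    """Ordinary binary coding: the genotype is the integer itself."""
    return n

def hamming(a, b):
    """Number of bit positions in which two genotypes differ."""
    return bin(a ^ b).count("1")

def one_bit_neighbors(phenotype, bits, encode, decode):
    """Phenotypes reachable from `phenotype` by flipping one bit of its genotype."""
    genotype = encode(phenotype)
    return sorted(decode(genotype ^ (1 << i)) for i in range(bits))

if __name__ == "__main__":
    print(one_bit_neighbors(7, 4, identity, identity))
    print(one_bit_neighbors(7, 4, to_gray, from_gray))
```

Under ordinary binary coding the genotypes of 7 and 8 differ in all four bits, so a step from 7 to 8 requires four simultaneous bit flips; under Gray coding they differ in a single bit, so the same step is one mutation away. The phenotypes and the fitness function are identical in both cases; only the map from genotype to phenotype has changed.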
For biology the "representation problem" has some unsettling implications. If, as evolutionary biology asserts, all adaptations are the result of mutation and selection, organisms have to be evolvable. But once one calls into question the inevitability of organisms being evolvable, one can ask, how and why did an evolvable genome originate in the first place? Is it a fortuitous consequence of physics, or of biochemistry, or a "frozen accident" from life's origin? Are the genetic representations of the phenotype a product of evolution? What, if any, are the evolutionary forces that shape the genotype-phenotype map?
The thesis of this essay is that the genotype-phenotype map is under genetic control and therefore evolvable. Further, we suggest that its evolution explains seemingly unrelated problems of evolutionary biology: the role of epistasis in adaptation, genetic canalization, developmental constraints, developmental and morphological integration, biological versatility, the evolution of complex adaptations, the biological basis of homology, and perhaps the origin of body plans. Evolutionary computation may provide a fertile new source of experience from which these different problems in evolutionary biology can be integrated.