Since my post a couple of weeks ago about NASA and the antenna evolution experiment, I've been meaning to write a followup. In both comments and private emails, I've gotten a number of interesting questions about the idea of fitness landscapes, and some of the things I mentioned in brief throwaway comments. I'm finally finding the time to write about it now.

### Fitness Landscapes

First, let me review what a fitness landscape is.

Suppose you're searching for something. There are N parameters that define the thing you're searching for. Then we can say that you're doing a search for a target *within N dimensions*. The N-dimensional space containing all of the possible values for all of the dimensions is called the *search space*.

You can model that search as an iterative function, F, which takes

as parameters a tuple of N values defining a current position in the search space,

and returns a new position in the search space.

So you start at some arbitrary position in your search space, and that's

the first argument to F. Then each time you call F, it returns a new position in

the search space, which is hopefully closer to the target of the search.

How do you evaluate the quality of a position in the search space?

You define an evaluation function, E. E takes a position in the search space,

and returns a real number which describes the quality of the position in terms

of the particular search. The higher the value returned by E for a point, the

better the quality of the target described by the position in the search space. Expressed using the evaluation, the goal is to find the position p within the space that

maximizes E(p).
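To make the roles of F and E concrete, here's a minimal sketch in Python. The particular E (a single smooth hill peaking at (1, -2)) and the neighbour-stepping rule inside F are assumptions chosen purely for illustration; nothing about the general idea depends on them.

```python
def E(p):
    """Evaluation function: higher is better.
    Illustrative choice: a single hill whose peak is at (1, -2)."""
    x, y = p
    return -((x - 1.0) ** 2 + (y + 2.0) ** 2)


def F(p, step=0.1):
    """One search step: look at the four axis-aligned neighbours and move
    to the best one if it improves on the current position; otherwise stay."""
    x, y = p
    candidates = [(x + step, y), (x - step, y), (x, y + step), (x, y - step)]
    best = max(candidates, key=E)
    return best if E(best) > E(p) else p


def search(start, steps=200):
    """Call F repeatedly, feeding each result back in as the next position."""
    p = start
    for _ in range(steps):
        p = F(p)
    return p


result = search((0.0, 0.0))
print(result, E(result))
```

Starting from (0, 0), each call to F either moves to a strictly better position or stays put, so the search settles near the peak and stops there.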

Now, suppose you want a way of visualizing the search. It's easy: define

a new space with N+1 dimensions, where the N+1th has the value of

calling E on the first N dimensions. Call this new, N+1th dimension "height". Now,

the surface produced by doing this is the evaluation landscape. If your search is

an evolutionary process, the evaluation landscape is called a *fitness landscape*.

As usual, an example makes this easier. Suppose that you're searching

in two dimensions. Then your position at any time is just a position in

a two-dimensional cartesian graph. To visualize the search space, you create

a three dimensional graph, where the quality of the search result at (x,y) is

the value of the Z coordinate.

For example, imagine that the fitness function is something nice and simple, with

a single obvious maximum: E(x,y) = sin(x) + cos(y). The maximum will naturally be 2, and

one of the places where it will occur is at (π/2,0). Then your fitness landscape ends up looking like the graph below.

We can also get some much wilder landscapes, like the one below, which is defined by E(x,y)=sin(xy)cos(xy) + (3-xy)/((xy)^{2} + 1). It's still structured,

but it's a lot more complicated.
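If you want to poke at these two landscapes yourself, a rough sketch like the following computes the "height" dimension on a grid. The function names, the grid bounds of -3 to 3, and the grid resolution are my own assumptions, not anything fixed by the definitions above.

```python
import math


def simple(x, y):
    """E(x,y) = sin(x) + cos(y)"""
    return math.sin(x) + math.cos(y)


def wild(x, y):
    """E(x,y) = sin(xy)cos(xy) + (3-xy)/((xy)^2 + 1)"""
    xy = x * y
    return math.sin(xy) * math.cos(xy) + (3 - xy) / (xy ** 2 + 1)


def heights(E, lo=-3.0, hi=3.0, n=61):
    """Sample E on an n-by-n grid; returns {(x, y): height}."""
    step = (hi - lo) / (n - 1)
    axis = [lo + i * step for i in range(n)]
    return {(x, y): E(x, y) for x in axis for y in axis}


peak = simple(math.pi / 2, 0.0)       # the maximum claimed in the text
grid = heights(simple)
best_point = max(grid, key=grid.get)  # highest sampled point on the grid
print(peak, best_point)
```

The dictionary returned by `heights` is exactly the N+1-dimensional surface described earlier: two coordinates in, one height out. Feeding it to any 3D plotting tool gives the picture.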

And we can get totally unstructured landscapes, where each point is completely random, like this one:

### Static Versus Dynamic Landscapes and Adding Dimensions

I've frequently criticized creationists for talking about a *static* fitness landscape. In a throwaway comment, I mentioned that you *can* use a static landscape by adding dimensions. This comment really confused a lot of people.

Suppose you've got a search of a two-dimensional space. Now, we're going to make it more

complex. Instead of the search function always being the same, we're going to actually have each

invocation of the search function return both a point in the search space, *and* a new

search function. So we start with an initial search function, f_{0} and some initial

point (x_{0}, y_{0}) in the search space. The first search step evaluates

f_{0}(x_{0}, y_{0}), getting (x_{1},y_{1}) and

f_{1}. The second search step evaluates f_{1}(x_{1}, y_{1}),

getting (x_{2},y_{2}) and f_{2}. So the search function is

*changing* over time.

Assuming we're in the world of computable functions, every possible value for

f_{i} can be represented by an integer, F_{i}. So we can re-model this as a three-dimensional search, where instead of evaluating f_{i}(x_{i}, y_{i}) getting f_{i+1} and (x_{i+1}, y_{i+1}), we'll evaluate

a new three-dimensional search function f'(x_{i}, y_{i}, F_{i})

getting back (x_{i+1}, y_{i+1}, F_{i+1}). So we've taken the

way that the search function itself can change, and incorporated it into the

search space, at the cost of making the search space itself more complex.

For any way that a search space or search process changes over time, we can

incorporate that change into the search space by adding dimensions. This can give us

a fixed fitness landscape - but it does so at the cost of adding lots of dimensions.
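Here's a toy sketch of that trick in Python. The two rules (`step_right`, `step_up`) and the `RULES` table are hypothetical stand-ins for the f_{i}; the index into the table plays the role of the integer F_{i}. The point is only that the changing-function version and the fixed three-dimensional version trace out the same path.

```python
def step_right(x, y):
    """A search function that also says which function to use next."""
    return (x + 1, y), 1      # next time, use rule 1


def step_up(x, y):
    return (x, y + 1), 0      # next time, use rule 0


# The position of a rule in this list is its integer code F_i.
RULES = [step_right, step_up]


def run_dynamic(x, y, f, steps):
    """Two-dimensional search where the search function changes each step."""
    trace = []
    for _ in range(steps):
        (x, y), i = f(x, y)
        f = RULES[i]
        trace.append((x, y))
    return trace


def f_prime(x, y, F):
    """One fixed search function over the three-dimensional space (x, y, F)."""
    (nx, ny), nF = RULES[F](x, y)
    return nx, ny, nF


def run_static(x, y, F, steps):
    """Three-dimensional search with a search function that never changes."""
    trace = []
    for _ in range(steps):
        x, y, F = f_prime(x, y, F)
        trace.append((x, y))
    return trace


print(run_dynamic(0, 0, step_right, 4))
print(run_static(0, 0, 0, 4))
```

Both runs visit the same (x, y) positions. The only difference is bookkeeping: in the second version, "which rule comes next" is just another coordinate.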

So now we've got a measure of the quality of the search result at any particular step. How can we define the *performance* of a search over time? There are several ways of doing it; the easiest to understand is as the limit of a function P, such that at time T, P(T) = (Σ_{t=1}^{T} E(p_{t}))/T, where p_{t} is the position at step t; that is, the sum of the evaluation results at each step, divided by the number of steps. So take P(T), and compute its limit as T approaches infinity - that's one reasonable way of describing the performance of a search. What it's really doing is computing the average fitness result of the search. There are other measures that also take the rate of convergence on a solution into account, but for simplicity, the simple measure is adequate.
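As a rough sketch of that performance measure, the snippet below averages E over the positions a search visits. The one-dimensional E and the made-up trajectory (positions 1, 1/2, 1/3, ... closing in on a peak at 0) are assumptions for illustration only.

```python
def performance(E, positions):
    """P(T): the mean evaluation over the T positions visited so far."""
    return sum(E(p) for p in positions) / len(positions)


def E(p):
    """Illustrative one-dimensional evaluation with its peak (value 0) at p = 0."""
    return -abs(p)


def trajectory(T):
    """A made-up search path that creeps toward the peak: 1, 1/2, 1/3, ..."""
    return [1.0 / t for t in range(1, T + 1)]


short_run = performance(E, trajectory(10))
long_run = performance(E, trajectory(1000))
print(short_run, long_run)
```

For this trajectory P(T) climbs toward 0, the value at the peak, as T grows: a search that keeps improving has a performance that approaches the best value it converges on.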

### No Free Lunch

Now, to move on to a bit of NFL-related stuff.

The main fitness-landscape argument used by creationists against evolution is based

on Dembski's No Free Lunch work. NFL is a family of theorems which (stated informally) say:

averaged over all possible fitness landscapes, no search algorithm can possibly do

better than a random walk.

Given a search function and a landscape, you can compute a performance measure. Given a set of multiple landscapes, you can compute the performance of your search function on each landscape separately. Then you can describe the average performance of your search on your set of landscapes. If the set of landscapes is enumerable, then you can (theoretically) compute the average performance of your search function over all possible landscapes. (Even if the set of landscapes isn't enumerable, you can still talk theoretically about the average performance over all landscapes; you just can't compute it.)

So - given any reasonable performance metric, no search algorithm can possibly do better than random guessing over the set of all possible fitness landscapes.
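You can check the flavor of this by brute force on a tiny example. The sketch below assumes a search space of just 4 points, with each point's fitness drawn from {0, 1, 2}, which gives 3^4 = 81 possible landscapes. It compares two deterministic, never-revisiting searchers (both invented for this illustration) and uses "best value seen after m evaluations" as the performance metric.

```python
from itertools import product

POINTS = range(4)   # a four-point search space
VALUES = range(3)   # each point has fitness 0, 1, or 2


def fixed_order(history):
    """Visit points 0, 1, 2, 3 in order, ignoring what was seen."""
    return len(history)


def adaptive(history):
    """Start at point 3, then use the value just seen to choose
    which unvisited point to try next."""
    if not history:
        return 3
    visited = {p for p, _ in history}
    unvisited = [p for p in POINTS if p not in visited]
    last_value = history[-1][1]
    return max(unvisited) if last_value >= 1 else min(unvisited)


def best_after(algorithm, landscape, evaluations):
    """Run the searcher and report the best fitness it has seen."""
    history = []
    for _ in range(evaluations):
        p = algorithm(history)
        history.append((p, landscape[p]))
    return max(v for _, v in history)


def total_over_all_landscapes(algorithm, evaluations):
    """Sum the searcher's result over every one of the 81 landscapes."""
    return sum(best_after(algorithm, landscape, evaluations)
               for landscape in product(VALUES, repeat=len(POINTS)))


for m in (1, 2, 3, 4):
    print(m, total_over_all_landscapes(fixed_order, m),
          total_over_all_landscapes(adaptive, m))
```

The two totals match for every m, even though `adaptive` reacts to what it sees and `fixed_order` doesn't. Summed over all landscapes, the feedback carries no usable information.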

There are places where this theorem is actually interesting. But they're not the places

that creationists like Dembski continually invoke it. (The main interesting application

of it is in game theory, where you can show things like "There's no universal

game winning algorithm.") In something like evolution, it's silly for one simple reason: the "all possible landscapes" clause.

Evolution doesn't work over all possible landscapes, and no sane person would argue

that it could.

When we talk about fitness landscapes, we tend to think of nice, smooth, continuous

surfaces, with clearly defined maxima and minima (hilltops and valley bottoms). But all

possible landscapes doesn't just include smooth landscapes. It includes random ones. In

fact, if you work out the set of all possible landscapes, there are a *lot* more

discontinuous random ones than there are structured, smooth ones.

So in the NFL scenario, you need to be able to work in a nice smooth landscape, like

this one:

It's easy to see how a search function could find the maximum in that one-dimensional

search space - it's got a very clean fitness landscape. On the other hand, NFL also requires

you to be able to find the maximum in a totally random landscape - like the one below:

It should be obvious that there's absolutely no meaningful way to search for the maximum value in that search space. There's nothing in the fitness landscape that can provide the slightest clue of how to identify a maximum.
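To see the difference in practice, here's a small sketch that runs the same naive hill climber on a smooth one-dimensional landscape and on a random one. The `hill_climb` function, the 200-point landscapes, and the fixed random seed are all assumptions made for the sake of the example.

```python
import random


def hill_climb(landscape, start):
    """Move to the higher neighbour until neither neighbour is higher."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(landscape)]
        best = max(neighbours, key=lambda j: landscape[j])
        if landscape[best] <= landscape[i]:
            return i
        i = best


N = 200
smooth = [-(i - 140) ** 2 for i in range(N)]   # one clean peak at index 140
rng = random.Random(0)                         # fixed seed for repeatability
noisy = [rng.random() for _ in range(N)]       # every point independent

# From how many starting points does the climber reach the global maximum?
smooth_hits = sum(hill_climb(smooth, s) == 140 for s in range(N))
noisy_hits = sum(hill_climb(noisy, s) == noisy.index(max(noisy))
                 for s in range(N))
print(smooth_hits, noisy_hits)
```

On the smooth landscape every starting point leads to the peak, because each step's result says which way is up. On the random one the climber stalls at whatever local bump happens to be nearby, so only starts right next to the global maximum ever find it.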

Evolution works in landscapes with structure. Another way of putting that is that

evolution works in landscapes where the result of a search step provides feedback about the

structure of the landscape. But the key takeaway here is that NFL doesn't provide any meaningful rebuttal to evolution, because we don't expect evolutionary search to

work in all possible landscapes!

In terms of evolution, the average performance over all possible landscapes is a strange

idea. And it's obvious on an intuitive level why there's no way that a single search function

can possibly perform well on all possible landscapes. On some landscapes, it will do really

well. On some (the random ones, which make up the majority of all possible landscapes), it

will be effectively the same as a random walk. And on others, it will perform worse than

random, because the adversarial landscape is perfectly structured to make the search strategy

travel towards the worst possible places.

### Smuggling Information

Lately, Dembski and friends have been taking a new tack, which involves talking about

"smuggling information". They've been using the NFL argument for years, but they've

run into a rather serious problem: evolution works.

As evolutionary algorithms have become well known, and have had some astonishing

successes, it's become difficult to make the argument that an evolutionary process

can't possibly succeed. So it became necessary to come up with an excuse for why

evolutionary algorithms work so amazingly well, even though the creationists'

mathematical argument shows that it should be impossible.

Their excuse is to basically accuse the experiments that use evolutionary

algorithms of cheating. They argue that the reason that the evolutionary mechanism can

succeed is because information was "smuggled" into the system - that, essentially, the

search algorithm encodes information about the landscape which allows it to succeed. They

argue that this is cheating - that the evolutionary process really has nothing to do with the

success of the search, because the "solution" is really encoded into the structure

of the problem via the smuggled information.

There are two responses to this.

The first one is, in a word, "Duh!". That is, *of course* there's information

about the landscape in the system. As I discussed above, there's no such thing as a search

algorithm that works on all landscapes, but for landscapes with particular properties, there

are search algorithms that are highly successful. If you look at it from an

information-theoretic viewpoint, *any* search algorithm which can successfully operate

in a particular search space encodes information about the space into its structure. From

the viewpoint of math, this is just totally, blindingly obvious.

And it's not a problem for biological evolution. Biological evolution is based

on mutation, reproduction, and differential success. That is, it's a process

where you have a population of reproducing individuals, where the children are

slightly different from the parents. Some of the children survive and reproduce,

and some don't. This process clearly only works within a particular kind of search space;

that is, a search space where the survival to reproduction of a subset of the population

indicates that that subset of the population has a higher fitness value.

Evolution, modelled as a search, requires certain properties in its search space

for it to work. The information smuggling argument basically claims that that

means that it can't work. But *every* successful search algorithm

has certain requirements for the search space in which it operates. By the

arguments of Dembski and friends, there is no such thing as a successful search

that doesn't cheat.

The second response to the argument is a bit more subtle.

If you look at the evolutionary process, it's most like the iterative search process

described towards the beginning of this post. The "search function" isn't really static over

time; it's constantly changing. At any point in time, you can loosely think of the search

function for a given species as exploring some set of mutations, and selecting the ones that

allow them to survive. After a step, you've got a new population, which is going to have new

mutations to explore. So the search function itself is changing. And how is it changing?

Modelled mathematically, it's changing by *incorporating information about the
landscape*.

There's a really fascinating area of computer science called inductive learning machine

theory. It was invented by a guy named John Case, who's a friend of mine; he was a member of

my PhD committee in grad school. John is interested in understanding how learning works on a

mechanistic level. What he does is model it as an inductive process.

You've got a target function, T. The goal of the learning machine is

to find a program for T. The way that it does that is by continually refining its

program. So you give it T(0), and it outputs a guess at a program for T. Then you give

it T(1), and it outputs a better guess (it should *at least* be correct for

T(0) and T(1)!). Then you give it T(2), and it outputs another guess. Each step, it

generates a new program guessing how to compute T.
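Here's a toy version of that loop, just to show its shape. It is not John's actual formalism: the "programs" are only quadratics with small non-negative integer coefficients, the learner simply returns the first candidate that agrees with everything seen so far, and every name in it is hypothetical.

```python
from itertools import product


def hypotheses(max_coeff=5):
    """Enumerate candidate 'programs': polynomials a*n^2 + b*n + c."""
    for a, b, c in product(range(max_coeff + 1), repeat=3):
        yield (a, b, c)


def run(h, n):
    """Execute candidate program h on input n."""
    a, b, c = h
    return a * n * n + b * n + c


def guess(observations):
    """Return the first candidate consistent with T(0), T(1), ... so far,
    or None if no candidate in the enumeration fits."""
    for h in hypotheses():
        if all(run(h, n) == value for n, value in enumerate(observations)):
            return h
    return None


def T(n):
    """The hidden target function the learner is trying to identify."""
    return 2 * n * n + 3


# Feed the learner T(0), then T(0..1), then T(0..2), and so on.
guesses = [guess([T(i) for i in range(k + 1)]) for k in range(4)]
print(guesses)
# -> [(0, 0, 3), (0, 2, 3), (2, 0, 3), (2, 0, 3)]
```

Each new data point rules out some candidates, so the guesses change until the learner lands on (2, 0, 3), the true program, after which further data leaves the guess alone.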

According to Dembski, the process by which the learning machine improves its guesses is

*cheating*.

Evolution is really doing exactly the same thing as John's inductive learning machines:

each step, it's gaining information from the environment. You can say that the evolutionary

search at step N has *incorporated the results* of steps 0..N-1. Information about the

environment has naturally been incorporated. In fact, if you think about it, it's

*obvious* that this is the case - and that this is exactly *why*, on a

mathematical level, evolution works.

Go back to the NASA experiment for a moment. The search space of possible antennas is

huge - but it's structured and continuous. How does evolution find the best antenna? It tries

lots of options - and based on the results of those options, it discovers something about the

structure of the search space - it discovers that *some set* of solutions work better

than others, and it tries to search forward from those. By doing that, it's pruned out huge

areas of the search space - the areas rooted at the least successful tests. Eliminating a

large part of the search space *is a way of incorporating information about the search
space*: it's adding the information that "the solution isn't over there".
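The antenna experiment itself is far more elaborate, but the bare mechanism can be sketched in a few lines. This is a toy evolutionary loop on the simple sin(x) + cos(y) landscape, not NASA's setup; the population size, number of survivors, mutation width, and seed are arbitrary assumptions.

```python
import math
import random


def E(p):
    """Fitness: the simple landscape sin(x) + cos(y), maximum 2."""
    x, y = p
    return math.sin(x) + math.cos(y)


def evolve(generations=60, population_size=20, survivors=5, sigma=0.3, seed=1):
    rng = random.Random(seed)
    population = [(rng.uniform(-3, 3), rng.uniform(-3, 3))
                  for _ in range(population_size)]
    best_per_generation = []
    for _ in range(generations):
        # Differential success: only the fittest few get to reproduce.
        population.sort(key=E, reverse=True)
        parents = population[:survivors]
        best_per_generation.append(E(parents[0]))
        # Reproduction with mutation: children are slightly altered copies.
        children = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))
                    for x, y in (rng.choice(parents)
                                 for _ in range(population_size - survivors))]
        # Everything descended from the discarded individuals is never
        # explored: that region of the space has been pruned.
        population = parents + children
    return max(population, key=E), best_per_generation


best, history = evolve()
print(best, E(best))
```

Because the parents are carried over unchanged, the best fitness in the population can never go down from one generation to the next. The information gained so far is literally stored in who is still alive.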

Pointing out that an evolutionary process works in a particular set of landscapes, and

that it's got a way of acquiring and integrating information about the landscape that it's

searching isn't a critique of evolution - it's an explanation of how it works. Far from being

a good argument against evolution, it's an argument *for* evolution, because it shows

the mechanism by which evolution is capable of finding solutions.

This seems analogous to the way the creationists have babbled about "irreducible complexity" as an argument against evolution; whereas in fact the same thing, under the name of "interlocking complexity", was predicted as a _consequence_ of evolution, by Muller, almost a century ago.

The worst criticism you can make of evolution as a search algorithm is that it would not work in a world where having more descendants means that you have fewer descendants. Well, yes.

Many complex dynamical systems can't be written in terms of a landscape. The Lorenz system is a simple example. Why do you assume that evolution can be described by a landscape? Is there a fundamental reason or is it a convenient assumption?

Could you say a little more about the NFL theorems? As you've described them, I'm not sure what the problem is even under the creationist account: some random walks in some landscapes are optimizing, no?

You may want to state explicitly that this argument is made without buying the premise that Homo Sapiens is the end-all be-all `function' that is being sought out.

If you look at the search space of biological evolution, you see that the vast majority of it would give a fitness of 0 for any realistic evaluation function. I think this is part of what inspires Dembski to imagine it looking like your random example. Of course, even the anti-science folks admit "microevolution", thus admitting a smooth fitness landscape over small regions.

In Dembski's case, it's just a bunch of hot air and smoke and mirrors. Behe and the irreducible complexity-ites on the other hand are making a specific [and theoretically possible] claim about the fitness landscape: that the regions of nonzero fitness pertaining to most living organisms are not connected to the regions pertaining to most others or the jumps between those regions of nonzero fitness are too far to be jumped in a single generation with the frequency needed for evolution to occur as modern biology sees it.

In other words, God has to keep pushing it along because even God was not powerful enough to create a search algorithm sufficient to traverse the fitness landscape he created.

The question then arises from the "ID" camp, how did the particular mechanism for making such choices arise? This is probably what they mean by "information loading", and probably equates to the question of how you achieve the Biogenic transition from non-"living" material to the LUCA.

The answer as I understand it involves that (a) such a mechanism is potentially POSSIBLE to have to begin with (b) precursors can form, and in biology (c) prevolution, where polymers may catalyze formation rates for simpler "poly"mers, other similar polymers, or even more complex polymers [doi:10.1073/pnas.0806714105]... eventually leading to the mechanism of a polymer which autocatalytically self-replicates in an imperfect manner which allows competitive selection of variations, aka "evolving life".

I'm just a bloody amateur, however.

jbw_jbw@hotmail.com: "Many complex dynamical systems can't be written in terms of a landscape. The Lorenz system is a simple example."

Er? Doesn't it just need to be expressed as a (possibly catastrophic) landscape in about four dimensions? Or are you considering a search for the surface as a whole, rather than finding some point "on" the surface?

"Many complex dynamical systems can't be written in terms of a landscape. The Lorenz system is a simple example."

What does that mean? What is, exactly, this "Lorenz system" and give an example of a "complex dynamic system" (whatever that is, why would -dynamic- be relevant? Mark just showed how to turn a "dynamic" fitness function into a static one of higher dimension. What class of functions wouldn't work?). You can certainly create a fitness landscape using as the landscape function one of the chaotic functions Lorenz played with. You can create one using many chaotic functions as the fitness function. You may be correct, but I am not actually seeing what you are arguing.

You can create a fitness landscape using a nowhere differentiable function. You won't get a fitness function that "works" very well in that case, but you can create it.

@Stephen #4:

No, we're not the end-all and be-all, but we should represent a close approximation of a local peak on the fitness landscape. That doesn't mean that another species with a different starting point on the landscape couldn't climb to a higher peak, but we are the best fit to our environment that could have been attained given our evolutionary history, as I understand it. There's no such thing as a population evolving "toward" some ideal.

@ Uncephalized #8

Your last sentence is the point I think Stephen was trying to make. You also have to remember that if you take a long enough span of time, the evolutionary paths a specific organism takes branch - so humans and the other still-living primates are all successful, and apparently more so than the now-extinct primates.

That's part of it; if you look from any of our ancestors, you can't just say we're the "goal", because the goal is just ... continued reproduction.

@Uncephalized #8:

I wouldn't be surprised if there's some level of time lag involved. That is, if something happens to change the fitness landscape, it may take a while for a population to reflect that. So my guess is that we may not approximate a local maximum of the current function as well as we would a slightly older version.

And, of course, the same would be equally true for almost any species that's not clearly headed towards short-term extinction. Whether our peak is higher than that of any other species isn't something we can really determine.

@Paul #3

Indeed, a random walk will work really well for some fraction of all possible landscapes. It's another way to state the NFL results. As Mark stated, NFL tells us that over all possible landscapes, random search performs exactly as well as any single search algorithm. However, it also tells us that on any single landscape, all possible search algorithms, when averaged together, perform equivalently to random search. A search algorithm, under the restrictions of NFL at least, is simply an ordered, nonrepeating sequence of points drawn from the search space. Some will optimize the function much better than random search and others will optimize it much worse. It's also worth noting that the NFL results are also true regardless of how you measure the efficacy of the search.

So if you know absolutely nothing about your optimization problem, you might as well try a random walk. Of course, in practice, we know that most interesting problems are not random, for example. We often know that a function is convex, for another example. This type of information gives us an advantage. We can relatively safely assume that an algorithm which always takes the worst available next step will be outperformed by a really good tabu search, even though over all possible functions, it will be exactly as good.

Mark wrote

And

As I've said several times in several venues, including possibly in a comment here on GM/BM, evolution by random mutations and natural selection is a process for transferring information about properties of the environment to the genome of a population.

Oran remarked

A characteristic of biological evolution, whose initial conditions include a successfully reproducing population, is that it *starts* on a node (really, a cluster of neighboring nodes, given that it's a population with heritable variability) that is of non-zero fitness -- that's what "successfully reproducing" means. So the subspace from which mutations sample is not a random sample of the whole space but rather is restricted to a region 'near' a node already known to be of non-zero fitness, where 'near' means one mutational step away from the current node. (And yes, the point holds if there are multiple kinds of mutations -- indels, duplications, etc. The sampling is still from a tiny region of the whole N-space and is in a region of known non-zero fitness.)

Stephen Wells, #1: actually, evolution as a search algorithm does have one big disadvantage over other search algorithms: it doesn't tell you when you've reached the optimum. It doesn't even tell you if you've reached a *local* optimum. This also relates to the remark from Stephen at #4.

Of course, this does mean that evolution doesn't suffer too much on search spaces that aren't smooth enough to detect local optimums to begin with, like search spaces with discontinuities, whereas other search algorithms won't work at all if the search space isn't sufficiently smooth.

Also, no search algorithm currently exists that guarantees the global optimum is found within a reasonable amount of time (which is an NP-complete problem), so I can't hold that against evolution either.

As I understand it, even local optima are pretty rare in high-dimensional spaces... in practice, you have to settle for a succession of ridges and saddles.

Another implication of this NFL stuff would seem to be that "universal superiority" is a dubious claim -- no matter how well you've adapted to prior conditions, there will be some possible change that would "clean your clock"....

Beowulff wrote

Again, referring to biological evolution in particular, the system doesn't even *care* if it's on a global (or local) maximum. *It is not a search process!* It 'cares' only that fitness is sufficiently far above zero to ensure the population continues to reproduce at a rate higher than the resources of the environment can support (fecundity). Populations whose reproduction rate falls below the replacement rate ultimately go extinct. That's the definition of fitness -- the "quality of the position" in Mark's words -- in biological evolution. Nothing else matters.

The search metaphor for biological evolution has several significant flaws. First is the one I mentioned above, that the initial conditions include a population that is already successfully reproducing and is therefore on a node with a value of E that is already sufficiently high. In fact, a successfully reproducing population is not 'searching' for anything; it is sitting on a satisfactory solution.

Second, biological evolution is a massively parallel process, with multiple candidates (both organisms within populations and populations themselves) simultaneously tested for fitness. It is a competition that is strictly relative and confined to the immediate context -- it matters only that I can outrun you, not the bear, right now. Outrunning the bear (a local maximum wrt the external environment) is irrelevant: outrunning you is the determinant of my fitness. And whether I am fit depends on how fast you can run, and therefore on who you are; it is not a constant value for all 'you.' As Mark has defined it above, E takes a different value for every instance of 'you.'

Third, biological evolution neither knows nor cares about maxima, be they global or local. It 'cares' only about relative fitness at each point in time. Just so long as the node is characterized by a fitness value (reproduction rate) sufficiently high as to exceed the carrying capacity of the environment, it is happy. While it might find local (or even global) maxima, that is merely a by-product of the evolutionary process and is irrelevant to characterizing that process.

Fourth, biological evolution (like genetic algorithms) is inefficient and massively wasteful of organisms and populations. For every successfully reproducing population there are untold numbers of failed populations -- over 99% of all species (populations) have gone extinct.

Fifth, intuitions derived from the representation of fitness landscapes as two- or three-dimensional surfaces are actively misleading because they encourage the notion that local or global maxima are relevant. As Sergey Gavrilets has shown (see also Tellgren's informal discussion) (PDF), high-dimensioned biological fitness landscapes are dominated by fitness 'ridges' -- nearly-neutral networks, interconnected volumes of space characterized by near-equal fitness -- that enable an evolving population to move about in fitness space with relative freedom via purely stochastic processes; hill-climbing adaptation -- maxima-seeking -- is not the sole or even the dominant variable.

Mark -

Good post, but I do have one criticism. I think you are overstating Dembski's argument. He was not claiming that the NFL theorems show that Darwinian evolution is impossible. He was claiming instead that Darwinian evolution can only work if “complex, specified information” had been built into the biosphere to begin with. He even traces this back to the fine-tuning of the universe. The “information smuggling,” argument was always Dembski's main claim. It was not some desperation move in response to the fact that evolution works.

Of course, your “Duh” argument is right on the mark. I reviewed *No Free Lunch* when it was published, for the academic journal *Evolution*. I made effectively the same point you did:

"But the key takeaway here is that NFL doesn't provide any meaningful rebuttal to information,"

Did you mean "..any meaningful rebuttal to evolution" ?

This is the kind of thing that I frequently struggle with at work. We use a parametrized functional based system that can only be solved discretely. ie f(t) = y for (0

Damn greater than signs didn't work, for obvious reasons.

so the function is f(t) = y for (0 > t > 1)

Mark, you definitely must have a look at

http://www.swimbots.com/index.html

It's a very nice and simple evolution simulation. It took me a couple of hours of fiddling with the system to figure out how to manufacture a decent swim-bot. The simulated evolution took less time to achieve several superior designs.

"it should at least be correct for T(0) and T(1)"

"Evolution is really doing exactly the same thing as John's inductive learning machines"

This is beyond my competence, but it sounds like John's machine might be overfitting. Wouldn't evolution generally avoid making too huge an adjustment over a single generation that might be an outlier?

Another point to keep in mind is that over the long run, natural selection selects for evolvability. For example, one could imagine an organism for which the interdependencies between its various "genes" are so complex that the effect on fitness of the smallest possible change would be effectively random, resulting in a random search landscape. Such an organism would be unable to evolve by natural selection, and would ultimately be outcompeted by organisms with biological processes organized in such a way that small changes at the genetic level typically result in small changes at the phenotypic level. So we are all the distant descendants of the proto-organisms that were capable of evolving. And indeed, mutational studies show that organisms are surprisingly tolerant of mutational changes. Most mutations produce only small effects on fitness, both because only a few amino acids in a protein are absolutely critical, and also because there are homeostatic mechanisms that can often compensate for changes. Often, a gene that is involved in critical processes can be completely "knocked out" (inactivated) with only a modest effect on fitness because some other gene or genes takes over its functions. This is of course exactly what one would expect based on evolutionary theory.

The following new paper on arXiv does for continuous time models what I've previously studied (since grad school 1973-1977) for discrete time. Follow (unless you're a clueless Creationist who doesn't know integral calculus) this hotlink to the abstract, then click on the hotlink to PDF [this indirection will work if and when the PDF is relocated]:

On mathematical theory of selection: Continuous time population dynamics

Authors: Georgy P. Karev

Comments: 28 pages; submitted to J. of Mathematical Biology

Subjects: Populations and Evolution (q-bio.PE); Quantitative Methods (q-bio.QM)

Mathematical theory of selection is developed within the frameworks of general models of inhomogeneous populations with continuous time. Methods that allow us to study the distribution dynamics under natural selection and to construct explicit solutions of the models are developed. All statistical characteristics of interest, such as the mean values of the fitness or any trait can be computed effectively, and the results depend in a crucial way on the initial distribution. The developed theory provides an effective method for solving selection systems; it reduces the initial complex model to a special system of ordinary differential equations (the escort system). Applications of the method to the Price equations are given; the solutions of some particular inhomogeneous Malthusian, Ricker and logistic-like models used but not solved in the literature are derived in explicit form.

Congratulations ORAN, in your post #5 you showed that you understood why Mark is completely wrong. But I don't think Dembski's work is just a bunch of hot air and smoke and mirrors. The fitness of an organism doesn't give any clue of what mutation will increase its fitness. What Mark is saying is that as long as the fitness of an organism increases, the probability that a mutation increases its fitness increases too. It's really absurd.

A study answering another straw man of ID liars and Creationist ignoramuses, who almost always ignore neutral mutations, has been published:

http://www.sciencedaily.com/releases/2010/01/100120131203.htm

How Organisms Can Tolerate Mutations, Yet Adapt to Environmental Change

ScienceDaily (Jan. 25, 2010) — Biologists at the University of Pennsylvania studying the processes of evolution appear to have resolved a longstanding conundrum: How can organisms be robust against the effects of mutations yet simultaneously adaptable when the environment changes?

The short answer, according to University of Pennsylvania biologist Joshua B. Plotkin, is that these two requirements are often not contradictory and that an optimal level of robustness maintains the phenotype in one environment but also allows adaptation to environmental change.

Using an original mathematical model, researchers demonstrated that mutational robustness can either impede or facilitate adaptation depending on the population size, the mutation rate and a measure of the reproductive capabilities of a variety of genotypes, called the fitness landscape. The results provide a quantitative understanding of the relationship between robustness and evolvability, clarify a significant ambiguity in evolutionary theory and should help illuminate outstanding problems in molecular and experimental evolution, evolutionary development and protein engineering.

The key insight behind this counterintuitive finding is that neutral mutations can set the stage for future, beneficial adaptation. Specifically, researchers found that more robust populations are faster to adapt when the effects of neutral and beneficial mutations are intertwined. Neutral mutations do not impact the phenotype, but they may influence the effects of subsequent mutations in beneficial ways.

To quantify this idea, the study's authors created a general mathematical model of gene interactions and their effects on an organism's phenotype. When the researchers analyzed the model, they found that populations with intermediate levels of robustness were the fastest to adapt to novel environments. These adaptable populations balanced genetic diversity and the rate of phenotypically penetrant mutations to optimally explore the range of possible phenotypes.

The researchers also used computer simulations to check their result under many alternative versions of the basic model. Although there is not yet sufficient data to test these theoretical results in nature, the authors simulated the evolution of RNA molecules, confirming that their theoretical results could predict the rate of adaptation.

"Biologists have long wondered how can organisms be robust and also adaptable," said Plotkin, assistant professor in the Department of Biology in Penn's School of Arts and Sciences. "After all, robust things don't change, so how can an organism be robust against mutation but also be prepared to adapt when the environment changes? It has always seemed like an enigma."

Robustness is a measure of how genetic mutations affect an organism's phenotype, or the set of physical traits, behaviors and features shaped by evolution. It would seem to be the opposite of evolvability, preventing a population from adapting to environmental change. In a robust individual, mutations are mostly neutral, meaning they have little effect on the phenotype. Since adaptation requires mutations with beneficial phenotypic effects, robust populations seem to be at a disadvantage. The Penn-led research team has demonstrated that this intuition is sometimes wrong.

The study, appearing in the current issue of the journal Nature, was conducted by Jeremy A. Draghi, Todd L. Parsons and Plotkin from Penn's Department of Biology and Günter P. Wagner of the Department of Ecology and Evolutionary Biology at Yale University.

The study was funded by the Burroughs Wellcome Fund, the David and Lucile Packard Foundation, the James S. McDonnell Foundation, the Alfred P. Sloan Foundation, the Defense Advanced Research Projects Agency, the John Templeton Foundation, the National Institute of Allergy and Infectious Diseases and the Perinatology Research Branch of the National Institutes of Health.

Story Source:

Adapted from materials provided by University of Pennsylvania.

Journal Reference:

1. Draghi et al. Mutational robustness can facilitate adaptation. Nature, 2010; 463 (7279): 353. DOI: 10.1038/nature08694