Much to my professional shame, PZ recently pointed out David Plaisted, a Computer Science professor at
the University of North Carolina, who has an anti-evolution screed on his university
website. Worse, it's typical creationist drivel, which anyone with half a brain should know is utter
rubbish. But worst of all, this computer scientist's article is chock full of bad math. And it's
deliberately bad math: this guy works in automated theorem proving and rewrite systems - there's no way
that he doesn't know what utter drivel his article is.
His argument comes in two parts. The first, which he sees as his real contribution, is
an argument that is a pseudo-scientific formulation of that good old standard, differentiating between micro- and macro- evolution (which he calls small and large evolution), and arguing that there is
a barrier - so that small changes cannot possibly add up to become large changes. From his introduction:
I would like to present an alternative to the theory of evolution that includes the type of evolution that has been observed directly or in the fossil record, but does not require a common origin for all of life. The kind of evolution that has been observed will be called small evolution, but evolution at some larger scale to be specified will be called large evolution. The purpose of this discussion is to provide some kind of a reasonable boundary on what has been observed and what can reasonably be considered possible without being a full-scale evolutionist. Evolutionists often say that if you believe in the origin of new species, you believe in evolution, showing that they do not make a distinction such as that between large and small evolution.
How does he back up this argument?
It is my impression that organisms have a loosely constrained part, consisting of
characteristics like skin color that are easily modified without many effects on the remainder of the
organism. There is also a tightly constrained part consisting of many elements that are tightly
interconnected, and one cannot change anything without significant effects on many other things. For
example, many proteins that have to interact with each other would be tightly constrained, because a major
change to any one of them would prevent its interaction with the others. And, it is difficult for such
groups of genes to evolve because of the many interactions. It may be possible, however, to modify some of
the amino acids in such a protein without much effect on its function. This would be a minor change to the
highly constrained part of the organism, and I am willing to consider such mutations as part of "small
evolution." However, major mutations to the tightly constrained "kernel" of an organism would be "large
evolution" if there were a significant number of them.
Thus we have two parts of the genome, the loosely constrained part and the tightly constrained part,
or kernel. We also have two kinds of mutations, those that have little effect (minor mutations) and those
that have a significant effect (major mutations). Small evolution asserts that there can only be a small
number of major mutations in the tightly constrained part of an organism.
Yep, "It is my impression", followed by a bunch of nonsense trying to justify the idea that there must
be some kind of barrier. No actual evidence, no actual argument. No way of defining the difference between
the "tightly constrained part" and the "loosely constrained part". In fact, as the article goes on, what
his definition really comes down to is this: if we've ever observed a mutation in some trait, it must be
part of the loosely constrained part. So for example, from the section he titles "A Precise Definition":
The highly constrained part of the organism would, as explained above, consist of proteins that
interact with many others, such as in the metabolism of ATP. Probably the best way to define this part of
the organism is to say that it consists of genes coding proteins without which the organism cannot
survive. We still need to specify what are major and minor mutations. Some point mutations substitute
amino acids in a protein, but do not change its shape or appreciably change its function. Some point
mutations do not even change the amino acid. These would be minor mutations. Some point mutations cause
drastic changes in the shape of the protein. This would undoubtedly have a major effect on the function of
the protein, and would be a major mutation. There are thousands of proteins in a cell, but most pairs of
proteins do not interact in any way. This is because their shapes are so carefully segregated. It is
difficult to see how this segregation could have arisen in an organic soup with many nucleic acid pieces
evolving in many different ways.
We consider the probability that a mutation to the highly constrained part of the organism will be
beneficial or fatal. Kimure (cited in ReMine, The Biotic Message, page 246) estimates that mutations which
alter amino acids are ten times more likely to be harmful than neutral or beneficial. It would be a simple
matter to run laboratory tests to see how often a point mutation causes a major change in the shape of a
typical protein. I suspect that over half of the mutations that cause an amino acid substitution would be
major mutations, and that these would almost always be fatal. If the shape of a protein changes
drastically, the chance that the protein will still have a useful function in the cell is extremely small.
Thus the ratio of harmful or fatal mutations to beneficial ones (ignoring neutral mutations) would be very
high. A major mutations to the highly constrained part (kernel) of the organism would almost always be
fatal. Also, changes to the shape of a protein probably occur in large jumps or small increments, because
of how proteins fold. If the folding of a protein is changed due to a point mutation, its shape will
significantly change. Otherwise, the shape will not change appreciably. Thus there are gaps even in the
structure of proteins.
This is fairly carefully disguised, to try to hide the fact that he's got a weasel's argument. But
what it comes down to is an elaborate argument for why any observed mutation to
any part of an organism automatically disqualifies that part from being part of the core constrained
part of the genome. Remarkably handy, that - any mutation that anyone ever points out, he can just
wave his hands - poof! - declare it part of the unconstrained genome, and whoopie! no problem.
You can also see quite clearly where he's going with this whole argument. It's just another
wretched big numbers argument. He's going to pull a bundle of numbers out of his ass, multiply
them together, declare them to be the probability of something, and say "Look, that's just too unlikely to be possible, because it's just too improbable."
Of course, before he gets to that, he needs to do some quote mining. What's a crappy creationist
screed without any quote-mining? He pulls a little bit out of the talk.origins archive (naturally
without linking... Don't want the rubes to be following the link and seeing what it really says, now, do we?)
Here's what he quotes, in context from his article:
Here is a quotation from Introduction to Evolutionary Biology at the talk.origins archive:
"Most mutations that have any phenotypic effect are deleterious. Mutations that result in amino acid substitutions can change the shape of a protein, potentially changing or eliminating its function. This can lead to inadequacies in biochemical pathways or interfere with the process of development"
For evolution to have occurred, it would seem that the structures of proteins must have changed in large jumps due to point mutations, since many different species have substantially different genes. Thus if all life has a common ancestor, large evolution must have occurred. However, the evidence for this is lacking in the fossil record. Even the common structures found in different organisms can argue for a common designer rather than common descent. Furthermore, there are difficulties of plausibility with large evolution. One can imagine the proteins evolving in small increments, but for them to cross the large gaps seems impossible. About the only way I can conceive for this to happen is if the gene for the protein is first copied, and then one of the copies mutates to a new shape, while the original gene continues to preserve its needed function in the cell. However, it would probably be a long time before the new copy would have any function in the cell, so this would entail a useless protein existing in the organism for a significant time. These might correspond to the pseudogenes, whose function we do not know.
Before getting to the proper quote, let me point out that he's playing another trick here. We know perfectly well that one of the common mechanisms by which new functions arise is duplication of an old gene, followed by mutation of one copy. This is common, and observed. He works that into his argument here, trying to wave it away, so that if anyone brings it up, he can say he refuted that.
Now, let's get back to the focus, and look at the original article from talk.origins:
Most mutations are thought to be neutral with regards to fitness. (Kimura defines neutral as |s| < 1/2Ne, where s is the selective coefficient and Ne is the effective population size.) Only a small portion of the genome of eukaryotes contains coding segments. And, although some non-coding DNA is involved in gene regulation or other cellular functions, it is probable that most base changes would have no fitness consequence.
Most mutations that have any phenotypic effect are deleterious. Mutations that result in amino acid substitutions can change the shape of a protein, potentially changing or eliminating its function. This can lead to inadequacies in biochemical pathways or interfere with the process of development. Organisms are sufficiently integrated that most random changes will not produce a fitness benefit. Only a very small percentage of mutations are beneficial. The ratio of neutral to deleterious to beneficial mutations is unknown and probably varies with respect to details of the locus in question and environment.
Mutation limits the rate of evolution. The rate of evolution can be expressed in terms of nucleotide substitutions in a lineage per generation. Substitution is the replacement of an allele by another in a population. This is a two step process: First a mutation occurs in an individual, creating a new allele. This allele subsequently increases in frequency to fixation in the population. The rate of evolution is k = 2Nvu (in diploids) where k is nucleotide substitutions, N is the effective population size, v is the rate of mutation and u is the proportion of mutants that eventually fix in the population.
Mutation need not be limiting over short time spans. The rate of evolution expressed above is given as a steady state equation; it assumes the system is at equilibrium. Given the time frames for a single mutant to fix, it is unclear if populations are ever at equilibrium. A change in environment can cause previously neutral alleles to have selective values; in the short term evolution can run on "stored" variation and thus is independent of mutation rate. Other mechanisms can also contribute selectable variation. Recombination creates new combinations of alleles (or new alleles) by joining sequences with separate microevolutionary histories within a population. Gene flow can also supply the gene pool with variants. Of course, the ultimate source of these variants is mutation.
The original context is one which is leading into a discussion of the actual probability math
of the propagation or elimination of a mutation in the genome of a species. It makes clear that most mutations are neutral; that most (but not all) mutations that have immediate phenotypic
effect are deleterious; and that mutations that are initially neutral can become beneficial over
time. In context, it gives rather a different impression, no?
More importantly, it shows something about what good mathematical studies of evolution predict. And it means that the author of this little screed has seen the valid mathematical work. So he's
got no excuse for slapping together nonsense numbers to create his probability argument. We now know that if he doesn't cite the legitimate mathematics of mutation rates and probabilities, it's not because he's ignorant - it's because he's deliberately not using the valid math.
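To make "the valid math" concrete, here's a minimal sketch of the substitution-rate formula from the talk.origins passage, k = 2Nvu. The function names and the sample mutation rate are mine, purely for illustration. The one thing I'm adding that isn't in the quoted text is the standard neutral-theory result that a brand-new neutral mutation in a diploid population fixes with probability 1/(2N), because it starts out as one copy among 2N.

```python
def substitution_rate(N: float, v: float, u: float) -> float:
    """k = 2*N*v*u for diploids, as given in the quoted talk.origins passage."""
    return 2 * N * v * u


def neutral_fixation_probability(N: float) -> float:
    """Assumption (standard neutral theory, not in the quoted text):
    a new neutral mutant is 1 of 2N gene copies, so it fixes with probability 1/(2N)."""
    return 1 / (2 * N)


# Hypothetical mutation rate per site per generation, chosen only for illustration.
v = 1e-8

for N in (1_000, 1_000_000):
    k = substitution_rate(N, v, neutral_fixation_probability(N))
    # For neutral mutations the population size cancels, so k matches v.
    print(N, k)
```

Under that assumption, the population size drops out: neutral substitutions accumulate at roughly the mutation rate, no matter how large or small the population is.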
If the shape of a protein changes, the protein will likely have no function in the organism. The protein will continue to have no function until enough mutations have accumulated that it again has a function in the cell. All of these intermediate mutations will be neutral ones. In order to obtain the new functional protein, all of the combinations of neutral mutations have to be generated, since evolution has no way to distinguish between them on the basis of fitness. Thus evolution has to do a blind search in this case, which is very inefficient.
All combinations of neutral mutations have to be generated? Where'd that come from, you ask? The answer is "nowhere". But still - ignoring that - watch what he's going to do next. He's claiming that
evolution must perform an exhaustive search over all possible neutral mutations of the gene that
produces a protein.
Typical polypeptide chains have from 50 to 3000 amino acids, so their genes have from 150 to 9000 base pairs. Often several polypeptide chains fit together in a protein, and their shapes have to match very carefully for this to occur. Once a gene has mutated, it will probably take a number of further mutations until it again has a function in the cell. For purposes of illustration, let's start with a gene having 100 base pairs and suppose that at least 5 point mutations are needed until it again has a function in the cell. Now, the more mutations that occur, the more random the gene will become, so we would expect that the density of useful genes decreases with increasing numbers of mutations. Therefore, the most efficient way to discover a new useful gene is to generate all possible combinations of mutations in order of the number of mutations. How many combinations of 5 point mutations are there? This would be 3^5 * (100*99*98*97*96)/(1*2*3*4*5) since there are 3 point mutations at each locus. This is about 18 billion. We need to have at least 18 billion individuals, then, with different alleles, to be able to generate all of these. (We might find a useful gene before all of the combinations were generated, though. This could reduce the number 5 to somewhere near 4.) Anyway, this requires at least 18 billion mutations in a region of 100 base pairs, or about 180 million mutations per base pair. The genome probably has at least 10 million base pairs, so we would need about 2 * 10^15 mutations altogether. This might be feasible in a million years in a population of a billion with about a mutation per year per individual.
Did you notice the switch? In the previous paragraph, he said "all combinations of neutral mutations". Now he's switched it to "all combinations of any mutation". And he's asserted that only point mutations
count, that point mutations occur one at a time, and that there cannot possibly be any beneficial effect
until all 5 points are in their ideal mutated form. He's mutated his argument to boost the size of his numbers.
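His arithmetic, at least, is easy to check. The sketch below reproduces the "about 18 billion" figure for 5 point mutations in a 100-base-pair gene. Then, purely for contrast, it counts how many variants you'd look at if each single point mutation could be kept or discarded on its own. That second model is my own illustrative assumption, not something from his article, and the function names are hypothetical. It's there only to show how much of the big number comes from the claim that nothing has any effect until all 5 changes are in place.

```python
from math import comb


def exhaustive_combinations(gene_length: int, mutations: int) -> int:
    """Choose `mutations` sites out of `gene_length`, with 3 alternative bases at each site."""
    return 3 ** mutations * comb(gene_length, mutations)


def stepwise_candidates(gene_length: int, mutations: int) -> int:
    """Illustrative contrast (my assumption, not the article's model):
    at each of `mutations` steps, every single-point variant of the current
    gene (3 * gene_length of them) is tried and one is kept."""
    return mutations * 3 * gene_length


print(exhaustive_combinations(100, 5))  # 18294867360, his "about 18 billion"
print(stepwise_candidates(100, 5))      # 1500
```

The whole gap between 1500 and 18 billion is produced by the all-or-nothing assumption, which he asserts and never justifies.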
But only 5 point mutations to get a new shape of a functional protein seems very small when there are 150 to 9000 base pairs. If there are 10 mutations, the same calculations lead to about 10^18 combinations, which is about 10^16 mutations per base pair, for a total of 10^23 mutations in the population if the genome has about 10^7 base pairs. A billion individuals for a billion years would give 10^18 mutations with one mutation per individual per year, so we would need a trillion individuals for 100 billion years, or a higher mutation rate.
And again. Earlier, he argued that even small changes in the constrained part of the genome have
drastic effects on protein shapes. But now, 5 point mutations, which were absolutely lethal
before, are too insignificant. He's making an argument to boost his numbers even more. Five base pairs
of change don't create big enough numbers - so he's going to shift the goalposts in order
to make the probability numbers even worse.
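You can see how much mileage he gets out of moving the goalposts by tabulating his own formula as the required number of mutations goes up. This is a minimal sketch using the same counting rule he uses (3 alternatives per site, sites chosen from a 100-base-pair gene). The function name is mine.

```python
from math import comb


def combinations_to_search(gene_length: int, mutations: int) -> int:
    """His counting rule: 3^k times the number of ways to pick k sites from the gene."""
    return 3 ** mutations * comb(gene_length, mutations)


# Print the count for each required number of mutations from 5 through 10.
for k in range(5, 11):
    print(k, f"{combinations_to_search(100, k):.2e}")
```

Going from 5 required mutations to 10 takes the count from about 1.8 * 10^10 to about 10^18, which matches the figure in his text. The number of mutations needed is a free parameter he picks, so the "probability" he ends up with is whatever he wants it to be.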
Of course, even 10 point mutations is quite small, and many genes have many more than 100 base pairs. One would expect many more than 10 point mutations to an allele before it again has a (new) function in the cell. So the numbers soon become astronomical and completely infeasible. In addition, we need to consider that some genes work together with others, so we might need to generate 3 or 4 genes at the same time, making the task even much more difficult. (This corresponds to Behe's "irreducibility." Note that this does not prevent evolution, but makes it astronomically more difficult, if irreducibility can be demonstrated.) For several polypeptide chains that fit together, it would be very hard to imagine how the whole complex could change shape gradually except by very large steps, which we have shown to be impossible. Another problem is that neutral mutations tend to die out of the population, so it may not be possible to generate all these alleles even in a vast amount of time unless the population is even more astronomically large to generate the combinations of neutral mutations rapidly in a single line of individuals.
And once again. See, he wants to make it look like the improbability of evolution is so outrageously large that it's just not imaginable that it would work. So he's got to keep piling on excuses to increase
the numbers. He does it one more time, pulling in the rates of fixation/elimination of mutations from the talk.origins article, to try to make it appear to be even less likely that his required collection of mutations could occur and fix in the population.
The thing is - by this argument, this very argument that he presents, the so-called "small evolution" changes are impossible. What does it take to modify the color of a moth's wings? It's a change in proteins. How much about the protein needs to change to change the color? You can pull out the same nonsense argument that he uses to show that it probably needs at least 5 base pairs of change, and that the genes coding the color have at least 100 base pairs, and that none of the point changes can have any effect until they're all there, and so on. Nothing in this argument has anything to do with
what he's claiming to show. He's arguing for a difference between micro- and macro-evolution; but
he doesn't distinguish them, and his silly little big numbers argument applies as well (or as poorly) to
micro as it does to macro.
What makes me particularly angry about this is that the guy is a computer scientist. One of the
standard curriculum requirements for CS is discrete probability theory and combinatorics. We're well
trained in this area. He must know how stupid this argument is. He knows how dishonest he's being. And he's using his credibility as a member of my profession to make this deliberately dishonest, slimy, pathetic argument.