(Continuing in my series of updates of the GM/BM posts about the bad math of the IDists, I'm posting an update of my original critique of Dembski and his No Free Lunch nonsense. This post has more substantial changes than my last repost; based on the large numbers of comments I've gotten in the months since then, I'm addressing a bit more of the basics of how Dembski abuses NFL.)
It's time to take a look at one of the most obnoxious, duplicitous promoters of Bad Math: William Dembski. I have a deep revulsion for this character, because he's actually a decent mathematician, but he's devoted his skills to creating convincing mathematical arguments based on invalid premises. But he's careful: he does his meticulous best to hide his assumptions under a flurry of mathematical jargon.
One of Dembski's favorite arguments is based on the No Free Lunch (NFL) theorems. In simple language, the NFL theorems say: "Averaged over all fitness landscapes, no search function can perform better than a random walk."
Let's take a moment to consider what Dembski says NFL means when applied to evolution.
In Dembski's framework, evolution is treated as a search algorithm. The search space is a graph. (This is a graph in the discrete mathematics sense: a set of discrete nodes, with a finite number of edges to other nodes.) The nodes of the graph in this search space are outcomes of the search process at particular points in time; the edges exiting a node correspond to the possible changes that could be made to that node to produce a different outcome. To model the quality of a node's outcome, we apply a fitness function, which produces a numeric value describing the fitness (quality) of the node.
The evolutionary search starts at some arbitrary node. It proceeds by looking at the edges exiting that node and computing the fitness of their targets. Whichever edge produces the best result is selected; the search algorithm progresses to that node and then repeats the process.
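Here's a minimal sketch of that search loop. The toy graph, the fitness values, and the name `greedy_search` are all made up for illustration; the only thing taken from the description above is the rule "score every edge target, move to the best one, repeat".

```python
def greedy_search(graph, fitness, start, max_steps):
    """Follow the best-scoring outgoing edge for up to max_steps steps."""
    current = start
    for _ in range(max_steps):
        neighbors = graph[current]
        if not neighbors:  # dead end: no edges left to follow
            break
        # Score every edge target and move to the best one.
        current = max(neighbors, key=lambda node: fitness[node])
    return current


# Hypothetical toy graph: node -> nodes reachable by one change.
graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
# Hypothetical fitness values for each node.
fitness = {0: 1, 1: 5, 2: 3, 3: 2, 4: 9, 5: 4}

print(greedy_search(graph, fitness, start=0, max_steps=1))   # 1
print(greedy_search(graph, fitness, start=0, max_steps=10))  # 5
```

Note that the search only ever looks one edge ahead: in this toy graph it never reaches node 4, the fittest node, because node 1 looks better than node 2 on the first step.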
How do you test how well a search process works? You select a fitness function which describes the desired outcome, and see how well the search process matches your assigned fitness. The quality of your search process is defined by the limit of the following:
- For all possible starting points in the graph:
  - Run your search using your fitness metric for maxlength steps to reach an end point.
  - Using the desired outcome fitness, compute the fitness of the end point.
  - Compute the ratio of your outcome to the maximum result for the desired outcome. This is the quality of your search for this length.
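The procedure above can be sketched roughly like this. Two things here are my assumptions rather than part of the description: I combine the per-start ratios with a plain average, and I use a tiny hypothetical ring graph. The names `search_quality` and `greedy_search` are just illustrative.

```python
def greedy_search(graph, fitness, start, max_steps):
    """Follow the best-scoring outgoing edge for up to max_steps steps."""
    current = start
    for _ in range(max_steps):
        neighbors = graph[current]
        if not neighbors:
            break
        current = max(neighbors, key=lambda node: fitness[node])
    return current


def search_quality(graph, search_fitness, desired_fitness, max_steps):
    """Average, over all starting points, of (end-point fitness / best fitness)."""
    best = max(desired_fitness.values())
    ratios = []
    for start in graph:
        # Run the search using the metric the search actually sees...
        end = greedy_search(graph, search_fitness, start, max_steps)
        # ...but judge the end point by the desired-outcome fitness.
        ratios.append(desired_fitness[end] / best)
    return sum(ratios) / len(ratios)  # assumption: combine by simple averaging


# Hypothetical four-node ring; every node has two neighbors.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
desired = {0: 1, 1: 2, 2: 4, 3: 2}

# Here the search is guided by the desired fitness itself.
print(search_quality(graph, desired, desired, max_steps=1))  # 0.75
```

The search metric and the desired-outcome metric are separate arguments on purpose: the whole NFL discussion is about what happens when the first one varies while the second stays fixed.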
So - what does NFL really say?
"Averaged over all fitness functions": take every possible assignment of fitness values to nodes. For each one, compute the quality of its result. Take the average of the overall quality. This is the quality of the directed, or evolutionary, search.
"Random walk", or blind search: instead of using a fitness function, at each step just pick an edge to traverse at random.
So - NFL says that if you consider every possible assignment of fitness values to nodes, you get the same result as if you didn't use a fitness function at all.
At heart, this is a fancy tautology. The key is that "averaged over all fitness functions" bit. If you average over all fitness functions, then on average every node has the same fitness. So, in other words, if you consider a search in which you can't tell the difference between different nodes, and a search in which you don't look at the difference between different nodes, then you'll get equivalently bad results.
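To see the tautology in action, here's a small brute-force sketch of the informal reading above (it is not the formal Wolpert and Macready statement). The assumptions are all mine: a hypothetical four-node graph, a single search step, and search fitness functions restricted to rankings with distinct values so there are no ties to break. It averages the greedy search over every such ranking and compares that with blind search, using exact fractions.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical toy graph and a fixed "desired outcome" fitness.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
desired = {0: 1, 1: 2, 2: 4, 3: 3}
nodes = list(graph)
best = max(desired.values())


def greedy_quality(search_fitness):
    """One greedy step from every start, judged by the desired fitness."""
    total = Fraction(0)
    for start in nodes:
        end = max(graph[start], key=lambda n: search_fitness[n])
        total += Fraction(desired[end], best)
    return total / len(nodes)


def averaged_greedy_quality():
    """Average greedy_quality over every ranking of the nodes."""
    rankings = list(permutations(range(len(nodes))))
    total = sum(greedy_quality(dict(zip(nodes, r))) for r in rankings)
    return total / len(rankings)


def blind_quality():
    """Expected quality when each outgoing edge is equally likely."""
    total = Fraction(0)
    for start in nodes:
        nbrs = graph[start]
        total += sum(Fraction(desired[n], best) for n in nbrs) / len(nbrs)
    return total / len(nodes)


print(averaged_greedy_quality())  # 2/3
print(blind_quality())            # 2/3
print(greedy_quality(desired))    # 15/16
```

Averaged over every ranking, each neighbor ends up being "the best" equally often, so the greedy step is indistinguishable from a coin flip. The last line shows what the averaging throws away: a search whose fitness function actually tracks the desired outcome does much better.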
Ok. So, let's look at how Dembski responds to critiques of his NFL work. I'm going to focus on his paper "Fitness Among Competitive Agents".
Now, in this paper, he's responding to the idea that if you limit yourself to competitive fitness functions (loosely defined, that is, fitness functions where the majority of times that you compare two edges from a node, the target you select will be the one that is better according to the desired fitness function), then the result of running the search will, on average, be better than a random traversal.
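That loose definition of a competitive fitness function can be made concrete with a small sketch. Everything here is hypothetical: the graph, the two fitness tables, and the name `agreement_rate`. It counts, over every pair of edges leaving every node, how often the search fitness picks the target that the desired fitness says is better.

```python
from itertools import combinations


def agreement_rate(graph, search_fitness, desired_fitness):
    """Fraction of pairwise edge comparisons where the search picks the better target."""
    agree = 0
    total = 0
    for neighbors in graph.values():
        for a, b in combinations(neighbors, 2):
            if desired_fitness[a] == desired_fitness[b]:
                continue  # neither target is better, so nothing to agree on
            total += 1
            picked = a if search_fitness[a] >= search_fitness[b] else b
            better = a if desired_fitness[a] > desired_fitness[b] else b
            if picked == better:
                agree += 1
    return agree / total if total else 0.0


# Hypothetical toy graph and desired-outcome fitness.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
desired = {0: 1, 1: 2, 2: 4, 3: 3}
# A noisy search fitness that badly overrates node 1.
noisy = {0: 1, 1: 5, 2: 4, 3: 3}

print(agreement_rate(graph, noisy, desired))  # 0.625
```

In this toy case the noisy function gets 5 of 8 comparisons right, so it would count as competitive under the loose majority definition even though it is wrong about one node.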
Dembski's response to this is to go into a long discussion of pairwise competitive functions. His focus is on the fact that a pairwise fitness function is not necessarily transitive. In his words (from page 2 of the PDF):
From the symmetry properties of this matrix, it is evident that just because one item happens to be pairwise superior to another does not mean that it is globally superior to the other. But that's precisely the challenge of assigning fitness of competitive agents inasmuch as fitness is a global measure of adaptedness to an environment.
To provide such a global measure of adaptedness and thereby to overcome the intransitivities inherent in pairwise comparisons, fitness in competitive environments needs therefore to factor in average performance of agents as they compete across the board with other agents.
To translate that out of Dembski-speak: in pairwise competition, if A is better than B, and B is better than C, that doesn't mean A is better than C. So to measure competitive fitness, you need to average the performance of your competitive agents over all possible competitions.
The example he uses for this is a chess tournament: if you create a fitness function for chess players from the results of a series of tournaments, you can wind up with results like player A can consistently beat player B; B can consistently beat C, and C can consistently beat A.
That's true. Competitive fitness functions can have that property. But it doesn't actually matter, because that's not what's happening in an evolutionary process. He's pulling the same old trick that he played in the non-competitive case: he's averaging out the differences. In a given situation, a competitor does not have to beat every possible other competitor. It does not have to be the best possible competitor in every possible situation. It just has to be good enough.
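A quick sketch of that chess example, with made-up numbers: the 9-in-10 win rates below are purely hypothetical, chosen only to produce the A beats B beats C beats A cycle. It computes the "global" fitness Dembski asks for by averaging each player's results across all opponents.

```python
from fractions import Fraction

players = ["A", "B", "C"]

# Hypothetical head-to-head results: each listed player wins 9 games in 10.
win_rate = {
    ("A", "B"): Fraction(9, 10),
    ("B", "C"): Fraction(9, 10),
    ("C", "A"): Fraction(9, 10),
}


def p_win(x, y):
    """Probability that x beats y, looked up in either direction."""
    if (x, y) in win_rate:
        return win_rate[(x, y)]
    return 1 - win_rate[(y, x)]


def average_score(player):
    """'Global' fitness: mean win probability over all opponents."""
    opponents = [p for p in players if p != player]
    return sum(p_win(player, o) for o in opponents) / len(opponents)


scores = {p: average_score(p) for p in players}
print(scores)
```

Every player comes out with an average score of exactly 1/2, even though each head-to-head matchup is lopsided. The averaging step is what erases the differences.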
And to make matters worse for Dembski, in an evolutionary process, you aren't limited to picking one "best" path. Evolution allows you to explore many paths at once, and the ones that meet the "good enough" criteria will survive. That's what speciation is. In one situation, A is better, so it "wins". Starting from the same point, but in a slightly different environment, B is better, so it wins. Both A and B win.
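As a toy illustration of the "good enough" point, here's a sketch with two hypothetical environments and an arbitrary survival threshold; none of these names or numbers come from Dembski's paper or from any real data. Instead of keeping only the single best variant, it keeps everything that clears the bar in each environment.

```python
def survivors(environment_fitness, threshold):
    """Every variant whose fitness in this environment clears the bar."""
    return {name for name, score in environment_fitness.items() if score >= threshold}


# Hypothetical fitness of variants A and B in two slightly different environments.
environments = {
    "wet": {"A": 0.8, "B": 0.4},
    "dry": {"A": 0.3, "B": 0.7},
}
GOOD_ENOUGH = 0.6  # arbitrary illustrative threshold

alive = {env: survivors(scores, GOOD_ENOUGH) for env, scores in environments.items()}
print(alive)

# Both variants persist somewhere, even though neither is best everywhere.
all_survivors = set().union(*alive.values())
print(sorted(all_survivors))  # ['A', 'B']
```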
You're still selecting a better result. The fact that you can't always select one as best doesn't matter. And it doesn't change the fundamental outcome, which Dembski doesn't really address: in an evolutionary landscape, competitive fitness functions do produce a better result than random walks.
In my taxonomy of statistical errors, this is basically modifying the search space: he's essentially arguing for properties of the search space that eliminate any advantage that can be gained by the nature of the evolutionary search algorithm. But his only argument for making those modifications has nothing to do with evolution: he's carefully picking search spaces that have the properties he wants, even though they have fundamentally different properties from evolution.
It's all hidden behind a lot of low-budget equations which are used to obfuscate things. (In "A Brief History of Time", Stephen Hawking said that his publisher told him that each equation in the book would cut the readership in half. Dembski appears to have taken that idea to heart, and throws in equations even when they aren't needed, in order to try to prevent people from actually reading through the details of the paper where this error is hidden.)