cs201, Bill Gates, and Intelligent Design

27 April 2008

My shameless self-searching Google alert occasionally turns up interesting things, like this letter to the editor of the Huntington News (West Virginia) by Gary Hurd. It rebuts an op-ed piece that made all sorts of crazy pseudo-scientific arguments for “intelligent design”. To counter one specific claim about the complexity of DNA, the letter uses some material from a lecture for my CS201J course:

And is this notion that human DNA is more complex than “any program ever devised” actually factual? The book by Watson was published in 1965, and the book by Gates that Ashby is misquoting was published in 1995, before the human genome project when we did not even know how many genes humans had! At the time, Gates’ statement was entirely reasonable, even though there was no actual data to test it. But Ashby makes a further claim, “… it is a well known fact that human DNA contains more organized information than the largest set of encyclopedias ever in print.”

David Evans, Professor of Computer Science at the University of Virginia has made some interesting comparisons between DNA and today’s computer software as part of his Computer Science 201: Engineering Software course. Let’s begin with his observation that complexity of computer software has grown at an amazing rate in the last 40 years (about since Watson’s book on the gene was published). The Apollo mission guidance programs had about 36,000 instructions, but today’s Windows XP made by Bill Gates’ Microsoft has about fifty million instructions! Professor Evans then compares this to what we now know about genes. For example, the smallest known set of genes of an organism belong to a bacterial parasite called Nanoarchaeum equitans which has 522 genes representing about 40,000 bytes of information. In other terms, it is slightly larger than the Apollo guidance system. The human genome, or as Evans called it “The Make-Human Program,” has a total of about 3 billion base pairs, which entail about 35 thousand genes. The total information content counting all of the bases is 750 megabytes, or just larger than the 650 megabytes that fit on your CDs at home. But, we have learned that massive amounts of human DNA are genetic “left overs,” non-coding segments and duplications. In short, Human DNA has fewer working instructions than Windows software, and even its total 3 billion bases are tiny compared to Wal-Mart’s 280 terabyte database (the equivalent of 1,120,000 billion DNA bases).

Like most antiscience, Ashby’s “well known facts” are not facts.
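
As a quick sanity check on the arithmetic in the letter, here is a minimal Python sketch. It assumes each base carries 2 bits (four possible bases) and decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes); the function and constant names are made up for illustration and do not come from the lecture slides.

```python
# Assumption: four possible bases (A, C, G, T), so log2(4) = 2 bits per base.
BITS_PER_BASE = 2
BITS_PER_BYTE = 8


def bases_to_bytes(bases: int) -> int:
    """Bytes needed to store `bases` DNA bases at 2 bits each."""
    return bases * BITS_PER_BASE // BITS_PER_BYTE


def bytes_to_bases(num_bytes: int) -> int:
    """Number of DNA bases that fit in `num_bytes` at 2 bits each."""
    return num_bytes * BITS_PER_BYTE // BITS_PER_BASE


# Figures quoted in the letter.
HUMAN_GENOME_BASES = 3_000_000_000   # about 3 billion base pairs
WALMART_DB_BYTES = 280 * 10**12      # 280 terabytes (decimal units assumed)

genome_megabytes = bases_to_bytes(HUMAN_GENOME_BASES) / 10**6
walmart_billion_bases = bytes_to_bases(WALMART_DB_BYTES) // 10**9

print(f"Human genome: {genome_megabytes:.0f} MB")
print(f"280 TB holds about {walmart_billion_bases:,} billion bases")
```

Under those assumptions, the numbers match the letter: 750 megabytes for the genome, and 1,120,000 billion bases for a 280 terabyte database.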

The lecture he is referring to is here: Lecture 23: Everything Else You Should Know (but won’t see on Exam 2) [PPT] (slides 18-26). Although I am happy to have anything I’ve done used to debunk intelligent design, the point I meant to make is a bit different from what Dr. Hurd’s letter claims. I did not intend to suggest that the genome is not a complex program (one could still claim that it results in executions far more complex, resilient, and sophisticated than anything humans have created), just that its encoding must be incredibly expressive for such complex outcomes to be encoded with so little information. Of course, a lot of the information is not in the genome itself, but in the very complex biochemical operating system in which it is interpreted.

The specific claim from the original op-ed piece, that “DNA contains more organized information than the largest set of encyclopedias ever in print”, is, of course, blatantly false. A few image-laden pages of a World Book volume contain far more information than the entire human genome.