Wednesday 14 May 2014

"20 scientific facts seldom taught to students" critically reviewed #12

Collyer's twelfth point was the assertion that the "genetic information encoded in each cell in the DNA, if written out in detail, would require as many as 4,000 large volumes of closely printed text. This is no accident of nature."

Wrong. Collyer has ignored the fact that around 44% of the human genome is made up of mobile genetic elements (DNA transposons and retrotransposons) that copy and paste themselves into the genome randomly, often causing disease in the process. This is very much an unguided, random process. A significant fraction of the human genome owes its origin to ancient retroviral infection. In fact, there is more retroviral genetic material, the evidence of past retroviral infection, in our genome than there is direct protein-coding material. Only a small percentage of the human genome directly codes for protein or has a specific regulatory function.

A better analogy would be 4,000 volumes consisting mainly of gibberish, spelling errors and randomly inserted sentences from books written in other languages, with at best 40 volumes of sensible information scattered randomly through the library.


When we break the human genome down into the various classes of genetic material found there, the sheer amount of parasitic DNA, decayed viral remnants and the genetic equivalent of gibberish [1] is astonishing:

    Transposable Elements: 44% junk

        DNA transposons: functional < 0.1%, defective 3%
        Retrotransposons: active < 0.1%, co-opted < 0.1%, junk 41%

    Viruses: 9% junk

        DNA Viruses: active < 0.1%, defective ~1%
        RNA Viruses: active < 0.1%, co-opted < 0.1%, defective 8%

    Pseudogenes: 1.2% junk

        Derived from protein-coding genes: 1.2% junk
        Co-opted pseudogenes: < 0.1% useful, secondarily acquired new function

    Ribosomal RNA genes: 0.19% junk

        Essential: 0.22%
        Junk: 0.19%

    Other RNA encoding genes

        tRNA genes: < 0.1% essential
        known small RNA genes: < 0.1% essential
        putative regulatory genes: ~2% essential

    Protein-encoding genes: 9.6% junk (intron sequences), 1.8% essential transcribed

    Regulatory Sequences: 0.6% essential

    Origins of DNA replication: < 0.1% essential

    Scaffold attachment regions: < 0.1% essential

    Highly repetitive regions: 1% junk, 2% essential

    Intergenic DNA: 26.3% unknown function, most likely junk, 2% essential

    Essential / Functional DNA: 8.7%
    Junk DNA: 65%
    Unknown: 26.3%
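As a quick sanity check on those totals, here is a minimal Python sketch that simply adds up the percentages listed above. The dictionary names and the `total` helper are my own, the "~2%" entry is treated as exactly 2, and entries given only as "< 0.1%" are left out because no exact value is stated, so treat this as a rough tally rather than a definitive accounting.

```python
# Percentages transcribed from the breakdown above.
# Entries listed only as "< 0.1%" are omitted (no exact value is given),
# and "~2%" is treated as 2.0; both are simplifying assumptions.
JUNK = {
    "transposable elements": 44.0,
    "viruses": 9.0,
    "pseudogenes": 1.2,
    "ribosomal RNA genes": 0.19,
    "introns in protein-encoding genes": 9.6,
    "highly repetitive regions": 1.0,
}
ESSENTIAL = {
    "ribosomal RNA genes": 0.22,
    "putative regulatory genes": 2.0,
    "protein-encoding genes (transcribed)": 1.8,
    "regulatory sequences": 0.6,
    "highly repetitive regions": 2.0,
    "intergenic DNA": 2.0,
}
UNKNOWN = {"intergenic DNA": 26.3}


def total(category):
    """Sum the percentages in one category, rounded to two decimals."""
    return round(sum(category.values()), 2)


junk_total = total(JUNK)
essential_total = total(ESSENTIAL)
unknown_total = total(UNKNOWN)

print("junk:", junk_total)
print("essential:", essential_total)
print("unknown:", unknown_total)
print("all classes:", round(junk_total + essential_total + unknown_total, 2))
```

The junk items sum to 64.99%, in line with the 65% quoted. The essential items with stated values come to 8.62%, and the small shortfall against 8.7% is consistent with the omitted "< 0.1%" entries, although that is my reading rather than something the source states.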

Even if most of the intergenic DNA turns out to have a function, about 65% of our genome is rubbish consisting of remnants of ancient retroviral infection, damaged genes that can no longer work, mobile genetic elements that copy and insert themselves randomly around the genome irrespective of what benefit or harm that action does, and introns, the non-coding sections of DNA that interrupt genes.

Simple illustration of an unspliced mRNA precursor, with two introns and three exons (top). After the introns have been removed via splicing, the mature mRNA sequence is ready for translation (bottom). (Source: Wikipedia)
Let's take a further look at a few of these examples. Evolutionary biologist John Avise notes that the genetic processing required to strip intronic material out of the initial mRNA copy increases the risk of genetic disease considerably:
"Do introns otherwise provide evidence of optimal genomic design? No, because pre-mRNA processing also has opened vast opportunities for cellular mishaps in protein production. Such mishaps are not merely hypothetical. An astonishing discovery is that a large fraction (perhaps one-third) of all known human genetic disorders is attributable in at least some clinical cases to mutational blunders in how pre-mRNA molecules are processed. For example, it has long been known that mutations at intron-exon borders often disrupt pre-mRNA splicing in ways that alter gene products and lead to countless genetic disabilities, including various cancers and other metabolic defects. There is also good evidence that the number of introns in human genes is positively correlated with a gene’s probability of being a disease-causing agent." [2]
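To make the point about intron-exon borders concrete, here is a deliberately crude Python sketch. It assumes a toy rule in which an intron is any stretch that starts with GU and runs to the next AG (the canonical dinucleotides at the ends of most introns). Real splicing depends on far more than these two signals, and the sequences and the `toy_splice` function are invented purely for illustration.

```python
import re

# Toy rule (an assumption, not real spliceosome behaviour):
# an intron starts at "GU" and ends at the next "AG".
INTRON_PATTERN = re.compile(r"GU.*?AG")


def toy_splice(pre_mrna):
    """Remove every stretch that the toy rule recognises as an intron."""
    return INTRON_PATTERN.sub("", pre_mrna)


# Invented sequences: three exons interrupted by two introns.
exon1, intron1 = "AUGGCC", "GUCCCUUUCAG"
exon2, intron2 = "AAACCC", "GUUUCCUAG"
exon3 = "CCCUAA"

normal = exon1 + intron1 + exon2 + intron2 + exon3

# Single-base mutation at the first intron's 5' border: GU becomes AU.
border = len(exon1)
mutant = normal[:border] + "A" + normal[border + 1:]

print(toy_splice(normal))  # exons joined cleanly
print(toy_splice(mutant))  # first intron is no longer recognised
```

In this toy model a single changed base at the border is enough for the first intron to be left in the message, so everything downstream of it is translated from the wrong sequence.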
At the start of this post, I mentioned that nearly 45% of the genome consists of mobile genetic elements, sections of DNA that copy and paste themselves randomly. Over time, this bloats a genome, padding it with repetitive, wasteful junk. Avise notes that the figure of 45% may in fact be an underestimate:
"...the true fraction is probably 75% or more if the tally were to include (i) processed pseudogenes that originated as a byproduct of mobile element activity (51) and (ii) other intergenic DNA regions that probably originated long ago as mobile elements but are no longer identifiable as such because of postformational mutations."[3]
As retrotransposon insertion is random, it is not hard to see how it can cause harm, for example by disrupting the promoter sequence of a gene and rendering the gene non-functional. Retrotransposon activity has indeed been linked with genetic disease. Avise points out:
"Mobile elements have the potential to cause human diseases by several mechanisms. When a mobile element inserts into a host genome, it normally does so at random with respect to whether or not its impact at the landing site will harm the host. If it happens to land in an exon, it can disrupt the reading frame of a functional gene with disastrous consequences. If it jumps into an intron or an intron-exon boundary, it may cause problems by altering how a gene product is spliced during RNA processing. If it inserts into a gene’s regulatory region, it can also cause serious mischief.

"The potential for harm by such insertional mutagenesis is great. It has been estimated, for example, that an L1 or Alu mobile element newly inserts somewhere in the genome in about 1 – 2% and 5%, respectively, of human births. Another problem is that when a mobile element lands in a functional gene, genetic instabilities are sometimes observed that result in deleted portions of the recipient locus. Several genetic disorders have been traced to genomic deletions associated with de novo insertions of mobile elements. Finally, mobile elements (or their immobile descendents that previously accumulated in the human genome) can also cause genomic disruptions via non-allelic homologous recombination. Serious metabolic disorders can result." [4]
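The reading-frame disruption Avise mentions is easy to demonstrate with a short Python sketch. The gene and the inserted element below are made up, and the element is only five bases long (real L1 and Alu elements are hundreds to thousands of bases), so this shows the principle only. The `codons` and `insert_element` helpers are hypothetical names of my own.

```python
def codons(seq):
    """Split a sequence into complete three-base codons, dropping any remainder."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_element(gene, element, position):
    """Return the gene with the element pasted in at the given index."""
    return gene[:position] + element + gene[position:]


# Invented coding sequence: start codon, four codons, stop codon.
gene = "ATGGCCAAACCCGGGTAA"

# Hypothetical 5-base insert; its length is not a multiple of three,
# so every codon downstream of the landing site is shifted.
element = "TTTTA"

disrupted = insert_element(gene, element, 6)

print(codons(gene))
print(codons(disrupted))
```

In this example the shifted frame happens to create a stop codon (TAA) right after the insertion point, so the protein would be cut short as well as altered.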
Collyer's bold assertion that the sheer size of the genome 'is no accident of nature' is clearly ludicrous, given that most of the genome is junk and that perfectly natural explanations exist for how the genome grew in size (random copying and pasting of retrotransposons, retroviral integration). A better metaphor would be several volumes of books with repeated letters extending for scores of pages, randomly pasted sentences with spelling errors, and sentence fragments from books written in foreign languages inserted at random. To use Avise's memorable turn of phrase, what we see in the human genome are footprints of nonsentient design.

References

1. Moran L “What’s in Your Genome?” Sandwalk May 8th 2011
2. Avise JC. "Colloquium paper: footprints of nonsentient design inside the human genome." Proc Natl Acad Sci (2010) 107 Suppl 2: 8969–8976.
3. ibid, p 8975
4. loc cit