Wednesday, 19 June 2013

Genetic Evidence for Human Evolution - Shared Genomic 'Errors' (A)

The evidence for human evolution: Shared genetic 'errors' (A)

If you were marking examination papers, and found that the papers from four students seated next to each other had exactly the same answer to each question, complete with the same spelling errors in the same words, you would conclude that they had cheated. The alternative explanation, that the students had independently arrived at the same answers and made the same mistakes, would be dismissed out of hand as preposterous. When we examine the genomes of humans and other animals, we find plenty of examples of shared genetic 'mistakes' at exactly the same place in their genomes. This is correctly regarded as overwhelming evidence for common ancestry, with the original genetic error occurring in a species ancestral to the currently living ones, and subsequently being inherited.

By genetic errors I mean pseudogenes, retrotransposons and endogenous retroviruses: genetic elements that are evidence, respectively, of loss of function, the copying and pasting of a parasitic genetic element, and prior infection by a retrovirus. For example, when we see the same retroviral element in the same place in the genomes of humans and apes, rather than postulate that the same retrovirus purely by chance inserted itself into the same location in human and ape genomes (at odds of billions to one against), we conclude that the original retroviral infection occurred in the human-ape common ancestor, and has subsequently been inherited by humans and apes.

Likewise, when we see the same pseudogene (the broken remnant of a gene) at the same place in humans and apes, we conclude that the gene was inactivated in a human-ape common ancestor and was subsequently inherited by its descendants. Furthermore, when we see the same retrotransposon (a genetic element that copies and pastes itself randomly throughout the genome) in the same place in human and ape genomes, we conclude that in the human-ape common ancestor, a retrotransposon copied and pasted itself into that part of the genome, and this insertion has been passed on to the descendant species. The alternative, that the same gene independently became a pseudogene in humans and apes, or that the same retrotransposed element independently pasted itself into the same place in both genomes, is regarded as unlikely at best. Shared genomic errors are arguably the most powerful evidence for common descent.
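
The "billions to one" figure quoted above can be checked with some rough arithmetic. The sketch below is only an illustration under two assumptions that are mine rather than anything measured: a genome of about 3 billion base pairs, and insertion sites chosen uniformly at random. Real insertion is not perfectly uniform, so treat the results as order-of-magnitude estimates only.

```python
# Back-of-the-envelope check of the "billions to one" figure.
# ASSUMPTIONS (illustrative only): a genome of about 3 billion base pairs
# and insertion sites chosen uniformly at random.
GENOME_SIZE_BP = 3_000_000_000


def chance_of_matching_sites(genome_size_bp: int, shared_insertions: int = 1) -> float:
    """Probability that independent insertions in a second lineage land on
    exactly the same positions as those already seen in the first lineage."""
    per_site = 1.0 / genome_size_bp
    return per_site ** shared_insertions


one_shared = chance_of_matching_sites(GENOME_SIZE_BP)
three_shared = chance_of_matching_sites(GENOME_SIZE_BP, shared_insertions=3)

print(f"one shared insertion:    about 1 in {1 / one_shared:.1e}")
print(f"three shared insertions: about 1 in {1 / three_shared:.1e}")
```

The point of the second call is that the odds multiply: each additional shared insertion makes an explanation by coincidence dramatically less plausible.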


Most laypeople are unfamiliar with what is meant by pseudogenes, retrotransposons and endogenous retroviruses, so some explanation will be needed. A pseudogene is a genetic element closely resembling a gene, but which in general is not able to code for its intended product. There are three classes of pseudogenes: unitary, duplicate and processed.

1. Unitary pseudogenes occur when a gene suffers a crippling mutation which renders it unable to function. The classic example of a unitary pseudogene is the GULO pseudogene. In most animals, GULO codes for the enzyme L-gulono-γ-lactone oxidase, the terminal enzyme in the biosynthesis of ascorbic acid, or vitamin C. In humans, apes, monkeys, guinea pigs and a few other animals, the GULO gene is a pseudogene, having been inactivated by mutation, which means these animals are unable to synthesise vitamin C and have to rely on dietary ascorbic acid.

2. Duplicated pseudogenes occur when a functional gene is copied and the copy picks up mutations which make it non-functional. As the organism already has a functional copy of the original gene, the presence of a duplicate which has lost function through mutation will not affect the organism in any substantive manner. Examples of duplicated pseudogenes include the ψη-globin pseudogene and the CYP21 pseudogene. The former is a haemoglobin pseudogene; the latter is a pseudogene version of the gene coding for cytochrome P450 C21, which when functional is involved in steroid biosynthesis.

3. Processed pseudogenes occur when an RNA transcript of a gene is reverse transcribed back into the genome at a random location. Normally, after DNA is transcribed to RNA, the introns (long non-coding sections in the gene) are removed, and a section of RNA called a poly A tail (which assists in transporting the RNA out of the nucleus and in translation, where the RNA is used as the template for protein synthesis) is added. Transcription normally creates an RNA copy of a DNA gene, whereas reverse transcription creates a DNA copy of RNA, so the processed RNA transcript can be copied back into the genome. Processed pseudogenes are easily recognised because they lack the introns of the normal gene and possess the poly A tail, which is not present in the DNA original. As they also lack the promoter sequences (regulatory sequences near the gene which are critical to initiate transcription), they are generally not expressed.

Formation of processed and duplicated pseudogenes.
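
The processed-pseudogene pathway described in point 3 can be mimicked with a toy string model. This is a sketch, not real bioinformatics: the gene, its segment sequences, the tail length and the all-"C" background genome are all invented for illustration, and the reverse-transcribed copy is shown as the coding strand for readability.

```python
import random

# A toy gene as (kind, sequence) segments. Every sequence here is invented.
GENE = [
    ("promoter", "TATAAA"),
    ("exon", "ATGGCC"),
    ("intron", "GTAAGTTTTCAG"),
    ("exon", "GGTTAA"),
]


def transcribe_and_process(gene, poly_a_length=10):
    """Return a mature mRNA: the promoter is not transcribed, the introns
    are spliced out, and a poly A tail is added."""
    exons = "".join(seq for kind, seq in gene if kind == "exon")
    return exons.replace("T", "U") + "A" * poly_a_length


def reverse_transcribe(mrna):
    """Make a DNA copy of an RNA sequence (coding strand, for readability)."""
    return mrna.replace("U", "T")


def insert_at_random(genome, element, rng):
    """Paste `element` into `genome` at a random position."""
    position = rng.randrange(len(genome) + 1)
    return genome[:position] + element + genome[position:], position


rng = random.Random(0)           # fixed seed so the run is repeatable
background_genome = "C" * 50     # placeholder stand-in for the rest of the genome

mrna = transcribe_and_process(GENE)
dna_copy = reverse_transcribe(mrna)
new_genome, where = insert_at_random(background_genome, dna_copy, rng)

print("processed copy:", dna_copy)
print("inserted at position", where)
```

The inserted copy carries the hallmarks mentioned above: no promoter, no introns, and a run of A's at the end.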



Retrotransposons are mobile genetic elements which replicate by the creation of an RNA transcript of themselves which is reverse transcribed to a DNA copy which is randomly inserted back into the genome. This process has allowed them to amplify their number to such a degree that approximately 40% of the human genome consists of multiple copies of these genetic parasites.


There are two classes: LTR retrotransposons and non-LTR retrotransposons. The former are related to retroviruses, but unlike them have a completely intracellular life. They will not be considered further. Non-LTR retrotransposons include LINEs and SINEs.

LINEs (Long Interspersed Nuclear Elements) are genetic elements around 7000 base pairs long which carry the code for the reverse transcriptase enzyme. In theory, they are able to reverse transcribe their own RNA copies into DNA and insert this copy randomly back into the genome. Some LINEs are mutated, so they are unable to continue the retrotransposition cycle, while others are still functional. They make up around 20% of the human genome.

SINEs (Short Interspersed Nuclear Elements) are short genetic elements (around 500 base pairs long) which do not have their own copy of reverse transcriptase; they are reliant on other transposable elements to aid in their transposition. Around 13% of the human genome is made up of SINEs.

Retrotransposable elements are genetic parasites: they copy and paste themselves randomly throughout the genome. Uncommonly, the genome co-opts a retrotransposable element and creates a genomic element with a new function. Generally, though, retrotransposons are classic junk DNA, providing no benefit to the host genome and at times being implicated in genetic disease [1-4]. Needless to say, the fact that nearly half of our genome is composed of parasitic DNA which can cause disease is impossible to honestly reconcile with intelligent design, but makes perfect sense under an evolutionary model where these elements copy and paste themselves randomly, causing the genome to grow over time and contributing to genomic instability and disease.
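
To see how copy-and-paste behaviour makes a genome grow, here is a deliberately crude simulation. It is a sketch under assumptions of my own: every existing copy stays active, each has the same chance of pasting one new copy per generation, and nothing is ever deleted or inactivated. The starting genome size, element length and copy probability are arbitrary illustrative values.

```python
import random


def simulate_copy_and_paste(genome_bp, element_bp, generations, copy_probability, rng):
    """Toy model: in each generation every existing copy pastes one new copy
    with probability `copy_probability`. Returns (copies, genome_bp)."""
    copies = 1
    for _ in range(generations):
        new_copies = sum(1 for _ in range(copies) if rng.random() < copy_probability)
        copies += new_copies
        genome_bp += new_copies * element_bp  # each new copy lengthens the genome
    return copies, genome_bp


rng = random.Random(42)  # fixed seed so the run is repeatable
copies, final_bp = simulate_copy_and_paste(
    genome_bp=1_000_000, element_bp=300, generations=20, copy_probability=0.1, rng=rng
)
print(f"{copies} copies; genome grew by {final_bp - 1_000_000} base pairs")
```

Even with a low copy probability the number of elements compounds, because every new copy can itself be copied. Real genomes also lose DNA and silence active elements, which this model ignores.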

Endogenous Retroviruses

Retroviruses are RNA viruses that reproduce intracellularly by using their reverse transcriptase enzyme to produce a DNA copy of their genome which is then inserted into the host genome. Once the DNA copy is part of the host cell, the host cell genetic replication machinery produces new copies of the retrovirus.

Retroviral life cycle

If the retrovirus integrates into the host's germ line, then it can be passed down to the next generation. When that happens, it becomes an endogenous retrovirus (ERV). As the DNA copy does not produce material essential to the well-being of the cell (as one would expect given its viral origins), it will eventually become inactivated by mutation. The presence of endogenous viral elements in an organism's genome is proof of a prior retroviral infection. When two related organisms share the same ERV at exactly the same place in the genome, we have powerful evidence that these organisms share a common ancestor in which the original viral infection took place and was then passed down to the descendant species.
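
The logic of reading ancestry from shared insertions can be sketched in a few lines. The data below are invented purely for illustration: each label stands for a genomic position that is assumed to have already been matched up between the species, and the pattern of sharing is made up rather than taken from real genomes.

```python
# INVENTED data: each label stands for an orthologous genomic position,
# assumed to have been matched up between species beforehand.
ERV_LOCI = {
    "human": {"locus_A", "locus_B", "locus_C"},
    "chimpanzee": {"locus_A", "locus_B", "locus_D"},
    "gorilla": {"locus_A", "locus_E"},
}


def shared_loci(loci_by_species, species):
    """Return the insertion loci present in every one of the named species."""
    return set.intersection(*(loci_by_species[name] for name in species))


human_chimp = shared_loci(ERV_LOCI, ["human", "chimpanzee"])
all_three = shared_loci(ERV_LOCI, ["human", "chimpanzee", "gorilla"])

print("shared by human and chimpanzee:", sorted(human_chimp))
print("shared by all three:", sorted(all_three))
```

In this made-up example, an insertion found in all three species is most simply explained by a single infection in their common ancestor, while one shared only by human and chimpanzee points to an infection after the gorilla lineage had split off.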

Special creationists are fond of comparing the human genome to an encyclopaedia, but the truth is that if your genome were represented by a 100 volume encyclopaedia, most of it would be gibberish, with only around 20-30 volumes at most containing meaningful information:

Image from The Genome by Numbers, the Wellcome Trust.

Over half of our genome consists of parasitic or broken DNA such as LINEs, SINEs, ERVs and pseudogenes. The mere existence of this almost completely non-functional material is impossible to square with intelligent design of the genome, and makes sense only in the light of an evolutionary origin of the genome.

This summarises the basic science required to understand the evidence from shared genetic errors. Part 3 will look at several examples (a tiny selection of the total evidence) to show why many biologists regard the genomic evidence for common descent to be the most powerful demonstration of the reality of evolution.

This article first appeared on my Facebook page.

1. Ostertag, E.M. et al. "SVA Elements Are Nonautonomous Retrotransposons that Cause Disease in Humans." Am J Hum Genet (2003) 73:1444-1451.

2. Callinan, P. and Batzer, M.A. "Retrotransposable elements and human disease." In Genome and Disease, Genome Dynamics Vol. 1 (Volff, J., ed.), Karger (2006) pp. 104-115.

3. Crow, M.K. "Long interspersed nuclear elements (LINE-1): potential triggers of systemic autoimmune disease." Autoimmunity (2009) 43:7-16.

4. Schneider, A.M. et al. "Roles of retrotransposons in benign and malignant hematologic disease." Cellscience (2009) 6:121.