Friday, 16 January 2015

This is how we know the human race did not descend exclusively from two people living 6000 years ago

I've made the point many times that there is too much genetic variability in the human genome for it to have arisen from just two people living six thousand years ago. What population genetics tells us is that the human population has never been smaller than several thousand people. The consequences of this are clear - we did not descend from two people, and any dogma based on this claim is made in defiance of some fairly hard evidence.

Late last year, I commented on an excellent new BioLogos series "Adam, Eve, and Human Population Genetics" by evolutionary geneticist Dennis Venema that covers the relevant scientific issues for the educated layperson. Venema has just posted part 4, which explains how single nucleotide polymorphism variation lets us estimate the minimum human population size needed to account for that variation. The full post can be found at the link at the end of this post, but this quote should give you a feel for the relevant issues:
The advent of genome sequencing, as you might expect, has shed a great deal of light on how much genetic variation is present in modern human populations. One significant source of human genetic variation comes in the form of what are known as single nucleotide polymorphisms, or “SNPs” (pronounced “snips”). “Polymorphism” simply means “having many forms”. SNPs are single DNA letters that are variable among humans, and we have around 300,000 common SNPs in our genome of 3 billion DNA letters. In other words, the majority of our genomes are identical to each other, but a small number of DNA letter positions on our chromosomes are variable. Consider a short section of DNA sequence for six different individuals, with three variable positions:

For any one SNP position, there are a maximum of four possible versions (since there are four DNA letters). Once we consider a few SNPs linked together on the same chromosome, however, the number of possible combinations becomes very large. For example, for just the three SNPs shown above, there are 64 different possible combinations (4 x 4 x 4, or 4^3). Twenty SNPs, on the other hand, would have 4^20 possible combinations, more than the number of people on the planet. For the six individuals above, we can see that there are five different combinations present. The most likely explanation for these five variants is that they were inherited from five different ancestors, and that persons 5 and 6 inherited their identical combination from the same ancestor. There are other, less likely possibilities, however: some of the combinations might result from new mutations, or from mixing and matching between the different SNPs. For example, person 4 and persons 5 and 6 differ by only one letter: person 4 has an “a” for SNP 1 where persons 5 and 6 have a “t”. One possibility that we need to account for is that person 4 might be descended from the same ancestor as persons 5 and 6, but that a new mutation from t → a occurred at the SNP 1 location. Another possibility is that there was recombination, through a process called “crossing over”, that placed a “t” into this position in person 4. So, when using SNP variation to count the number of likely ancestors, we need to factor in mutation and recombination rates, both of which we can measure directly in humans. In practice, the effects of mutation are small on using SNPs to estimate ancestral population sizes, since the mutation rate in humans is very, very low. Direct measurements of the rate have been done by sequencing the entire genomes of parents and offspring, and on average there are only about 100 – 150 new mutations every time we copy our genome of three billion DNA letters.
The effects of recombination can also be minimized by choosing SNPs that are linked closely together on the same chromosome. SNPs that are closely linked together recombine only rarely, since there is so little space for crossing over to occur between them. While scientists factor in mutation and recombination rates, in practice they are not a major issue for SNP-based methods. 
In practice, population size estimates based on SNP variation is simply a matter of sequencing a large number of people from around the globe, cataloging them for various SNPs, and estimating how many ancestors they would need to have the SNP variation we see in the present day. As you might expect, different people groups have characteristic sets of SNP variants within them. This makes sense, of course, because we know that the various groups are more closely related to each other than across groups. Tallying up the number of ancestors using this method consistently returns a total minimum population size of about 10,000 individuals: approximately 8,000 ancestors are needed to explain SNP diversity in sub-Saharan Africa, and about 2,000 ancestors for everyone else. SNP diversity in humans is far too large to result from one ancestral couple at any time in the last 200,000 years – we descend from a population. These values are also in good agreement with older, cruder methods of estimating population size from other types of genetic variation, giving us increased confidence that they are reasonable.
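For anyone who wants to check the arithmetic in the quote, here is a minimal Python sketch. It is only a back-of-envelope check, not Venema's method: the function names are my own, and the world population figure is an approximate value I have assumed for 2015 rather than a number taken from the article.

```python
# Back-of-envelope check of the figures quoted above (illustrative only).

DNA_LETTERS = 4  # a, c, g, t


def haplotype_combinations(num_snps: int) -> int:
    """Maximum number of distinct letter combinations for linked SNPs."""
    return DNA_LETTERS ** num_snps


def per_letter_mutation_rate(new_mutations: float, genome_size: float) -> float:
    """New mutations per DNA letter each time the genome is copied."""
    return new_mutations / genome_size


# Assumption: rough 2015 world population, not a figure from the quote.
WORLD_POPULATION = 7.3e9
# From the quote: a genome of three billion DNA letters.
GENOME_SIZE = 3e9

three_snps = haplotype_combinations(3)
twenty_snps = haplotype_combinations(20)

# From the quote: about 100 to 150 new mutations per genome copy.
rate_low = per_letter_mutation_rate(100, GENOME_SIZE)
rate_high = per_letter_mutation_rate(150, GENOME_SIZE)

print(f"3 SNPs:  {three_snps} possible combinations")
print(f"20 SNPs: {twenty_snps:,} possible combinations")
print(f"More combinations than people? {twenty_snps > WORLD_POPULATION}")
print(f"Mutation rate per letter: {rate_low:.1e} to {rate_high:.1e}")
```

Twenty linked SNPs allow roughly a trillion combinations, and the quoted mutation count works out to only a few new mutations per hundred million letters copied, which is why new mutations contribute so little to the SNP-based estimates.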
The full article is here.