Shortly before the publication of the first Neanderthal genome, a number of researchers had seen hints that there might be something strange lurking in the statistics of the human genome. The publication of the genome erased any doubts about these hints and provided a clear identity for the strangeness: a few percent of the bases in European and Asian populations came from our now-extinct relatives.
But what if we didn't have the certainty provided by the Neanderthal genome? That's the situation we find ourselves in now, as several studies have recently identified "ghost lineages": hints of branches in the human family tree for which we have no DNA sequence but that have left an imprint on the genomes of populations alive today. The case for these ghost lineages rests on statistical arguments, so it depends heavily on the statistical methods and underlying assumptions used, and those are frequent subjects of disagreement within the community that studies human evolution.
Now, researchers at the University of Utah are arguing that they have evidence of a very old ghost lineage contributing to Neanderthals and Denisovans (and so, indirectly, possibly to us). This is a claim that others in the field will undoubtedly contest, in part because the evidence comes from an analysis that would also revise the dates of many key events in human evolution. But it's interesting to look at in light of how scientists deal with a question that may never be answered by definitive data.
Looking for ghosts
Ghost lineages have made their presence known in two ways. In the first, sequences of DNA from different populations can reveal shared ancestry groups. Native Americans, for example, have sequences that descended from an ancestral population that contributed DNA to modern East Asians, as well as another population that contributed to modern Siberians.
In West Africans, we've found a significant contribution from a population that doesn't seem to have contributed to any other existing population (along with contributions from groups that do have current descendants). While that population's contribution is well within the range of normal human variation, we still don't know anything about who they were or where they interacted with the ancestors of West Africans. They're a historical ghost at the moment, though further studies could always provide more details.
But there are hints of additional ghost lineages in our past. In these cases, the contribution comes from something outside the normal range of human variability. Take the Neanderthal DNA, for example. European and Asian populations all share common ancestors that seem to have left Africa about 50,000 years ago and thus have a relatively small range of variations in their DNA. Neanderthals, by contrast, split off from the lineage that produced modern humans hundreds of thousands of years ago and have been largely separated since. They had plenty of time to build up their own variations that are distinct to their lineage and not found in modern human populations. Thus, the DNA Neanderthals contributed to Eurasian populations included variants that fall well outside the range of the variation we see in other parts of the genome.
And while we know about Neanderthals, it's possible you can get a similar contribution from a group we don't know about. The problem is that this sort of branching is impossible to identify at the single-base level.
There's no way to distinguish a variant that has arisen recently due to mutation from one that was brought in from a more distantly related lineage. In the diagram below, we take some known branches of the recent human family tree and add a potential ghost lineage. We can imagine an example where, at a specific location in the genome, modern humans and Neanderthals have an A, while Denisovans have a G. One explanation for this is that modern humans got their A from Neanderthals, which we know interbred with us. But that interbreeding has mostly contributed to non-African populations, so this is unlikely. Another option is that a mutation occurred on the Denisovan lineage. But a third option is that the G came into the Denisovan population thanks to a completely separate human lineage that interbred with them. At the individual base level, these last two options are impossible to tell apart.
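To make that ambiguity concrete, here is a minimal Python sketch of the A/G example above. It is a toy illustration rather than any method used by the researchers: the function names, the dictionary layout, and the assumption that A is the ancestral base are all hypothetical choices made for this example.

```python
# Toy model of ONE genome site in three sampled groups.
# Assumption for illustration only: "A" is the ancestral base.
ANCESTRAL = "A"
GROUPS = ("modern_human", "neanderthal", "denisovan")


def mutation_on_denisovan_lineage():
    """Scenario 1: a new A -> G mutation arises within the Denisovan lineage."""
    site = {group: ANCESTRAL for group in GROUPS}
    site["denisovan"] = "G"  # the change happens on the Denisovan branch itself
    return site


def introgression_from_ghost_lineage():
    """Scenario 2: the G arose in an unsampled ghost lineage, then entered
    the Denisovan population through interbreeding."""
    site = {group: ANCESTRAL for group in GROUPS}
    ghost_base = "G"  # carried by a lineage we have no DNA from
    site["denisovan"] = ghost_base  # interbreeding brings the base into Denisovans
    return site


by_mutation = mutation_on_denisovan_lineage()
by_ghost = introgression_from_ghost_lineage()

# The only thing we can observe is the base in each sampled group,
# and it is the same under both histories.
print(by_mutation)
print(by_ghost)
print("Distinguishable from this site alone?", by_mutation != by_ghost)
```

Both functions return the same observed pattern (A in modern humans and Neanderthals, G in Denisovans), which is why a single base carries no information about which history produced it.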