From wolf to dog: Digging into the genome of Canis lupus familiaris

The domestication of plants and animals many thousands of years ago changed the course of human history. It revolutionised societies, as animals became everything from companions, to means of transport, to deities. Domestication has also had huge, unintentional consequences for the planet as societies started farming plants and animals for food.

But how did this domestication come about? What events happened in ancient human and animal populations that brought them together?

Studies into ancient DNA, taken from fragments of bone or hair and sometimes frozen in permafrost, are giving researchers clues. Modern techniques are enabling ancient DNA to be studied in more detail than ever before.

Such research is complemented by knowledge of the genomes of extant (as opposed to extinct) species – enabling researchers to compare ancient and modern DNA sequences.

Dogs, thought to be the very first animals to be domesticated by humans, are a prime example of the kind of insights genomics can bring.

The wolf family resemblance is more obvious in some domestic dogs than others, of course…

The Greenland wolf

Having a ‘reference’ genome is a critical first step in genome analysis because it provides a basis against which other genetic data can be compared. A reference genome is a gold-standard, high-quality DNA sequence of a species. Despite DNA sequencing being now relatively standardised, and relatively cheap, only 1.5 per cent of known complex organisms (including plants and animals) have had their genome sequenced. And that’s just the known species. If the estimated undiscovered species are added to the total, then only 0.1 per cent have had their genome sequence determined.

One of the genomes recently completed by researchers on the Darwin Tree of Life project is that of the Greenland wolf, Canis lupus orion, providing a new reference genome for research. This wolf is one of 38 subspecies of the grey wolf. Another subspecies is Canis lupus familiaris – the domesticated dog.

The Greenland wolf genome is a particularly interesting reference thanks to the subspecies' distinctive evolutionary history. The subspecies split off from Eurasian grey wolves (Canis lupus lupus), the subspecies from which domestic dogs descend, before prehistoric humans began the domestication process. Finding itself in the remote northeast of the Americas, the Greenland wolf was also cut off evolutionarily from other North American wolf subspecies, which have since interbred with coyotes (Canis latrans). The reference genome from Greenland should therefore, in theory, provide particular clarity when compared with the genomes of both Eurasian wolves and domestic dogs.

There are thought to be just 200 Greenland wolves alive today, mainly living in their namesake country, with a smaller population on Ellesmere Island, Canada. They have a huge range, travelling far across their territories. They hunt arctic hares and muskoxen, and live in small packs of just a few animals.

Greenland wolf (Canis lupus orion) photographed in northern Greenland during an expedition funded by the Swedish Polar Research Secretariat (Image: Love Dalén, Centre for Palaeogenetics)

The reference sequence of the Greenland wolf will be used by researchers such as palaeogenomicist Professor Greger Larson, Director of Palaeogenomics & Bio-Archaeology Research at the University of Oxford. His work is uncovering stories of the ancient world.

“One of our main focuses is to understand the nature and to characterise the changing relationship between people and animals. And so we do that, not exclusively, but primarily by generating [genome] sequence data from the ancestors of modern populations.”

Larson uses DNA sequence data to create phylogenetic trees – similar to a family tree. The genetic similarities between individuals can be used to estimate relatedness. Population structures and lineages of species and subspecies can be mapped. Genome data give a much higher resolution view of evolutionary history than any other way of characterising a dog or a wolf.
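To give a feel for how genetic similarity can be turned into an estimate of relatedness, here is a minimal sketch in Python. It is not the method used by Larson's group. It simply counts the proportion of positions at which two aligned sequences differ, a basic measure sometimes called the p-distance. The sample names and sequences are invented for illustration, and real analyses compare millions of positions with far more sophisticated models.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def closest_pair(sequences: dict) -> tuple:
    """Return the pair of sample names with the smallest p-distance."""
    return min(
        combinations(sorted(sequences), 2),
        key=lambda pair: p_distance(sequences[pair[0]], sequences[pair[1]]),
    )


# Invented toy sequences (hypothetical, not real canid data).
samples = {
    "dog_A": "ACGTACGTAC",
    "dog_B": "ACGTACGTAT",
    "wolf_X": "ACGAACTTAT",
}

if __name__ == "__main__":
    for name_1, name_2 in combinations(sorted(samples), 2):
        print(name_1, name_2, p_distance(samples[name_1], samples[name_2]))
    print("Most similar pair:", closest_pair(samples))
```

In this toy data the two dog samples differ at the fewest positions, so they would sit closest together on a tree built from these distances.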

What genome data can’t do is tell us about the relationship that an animal had with any other species – be that prey, or humans. This kind of information can be gathered from archaeological evidence though. For example, if a dog was buried in a grave, then researchers can start to infer something about the relationship between dogs and people at that time, in that place.

One of Professor Larson’s aims is to bring all this data together, and understand how it was that people and wolves first came together, how those wolves subsequently began to differ from their ancestors, and how wolves interacted with ancient human populations as the two species moved together all over the Earth.

“We try and plot that whole thing and actually watch the movie in real time,” he says.

Getting the reference

Genome data for analysis in evolution studies are generated by sequencing the letters, or bases, of DNA of an individual animal. The sequencing results in long strings of letters, representing fragments of DNA.

The fragments are compared to the reference genome for that species, to put them back together and determine the unique DNA sequence of an individual. This process is much easier and quicker than determining the DNA sequence from scratch every time. Larson likens it to piecing together a jigsaw puzzle – the reference sequence is like a picture on the box that shows you where the pieces go.
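The jigsaw idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a real alignment tool: it places each fragment wherever it exactly matches the reference, whereas real software must tolerate the differences between an individual and the reference. The function names and the toy sequences are invented for this example.

```python
def map_reads(reference: str, reads: list) -> dict:
    """Find the first exact-match position of each read in the reference.

    A position of -1 means the read was not found.
    """
    return {read: reference.find(read) for read in reads}


def rebuild(reference: str, reads: list) -> str:
    """Lay mapped reads onto a blank template as long as the reference.

    Positions not covered by any read are left as 'N' (unknown base).
    """
    rebuilt = ["N"] * len(reference)
    for read in reads:
        position = reference.find(read)
        if position == -1:
            continue  # this toy version simply skips reads it cannot place
        rebuilt[position:position + len(read)] = list(read)
    return "".join(rebuilt)


# Invented toy data: the "picture on the box" and some puzzle pieces.
reference = "ACGTTGCAAGCT"
reads = ["TTGCA", "ACGT", "AGCT"]

if __name__ == "__main__":
    print(map_reads(reference, reads))
    print(rebuild(reference, reads))
    print(rebuild(reference, ["TTGCA", "AGCT"]))  # a missing piece leaves gaps
```

With all three fragments the full sequence is recovered. With one fragment missing, the start of the sequence stays unknown.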

Until 2017, the only reference genome available for dogs or wolves was that of a Boxer dog. With just one reference genome for all wolf subspecies, genetic variation in populations may be underestimated, and analyses can be biased.

A boxer – though not the individual that provided the first dog/wolf reference genome.

Ancient DNA brings an extra difficulty in completing the genome jigsaw puzzle. The dog genome contains a total of 2.5 billion pairs of bases – similar to the human genome. But because DNA degrades over time, ancient DNA comes in very short pieces.

“If we get 60 base pairs, well that’s great!” says Larson.

“It’s the equivalent of getting a puzzle, or two puzzles, they’re both exactly the same. But one consists of 10 pieces, pretty easy to put together. The other one consists of 100 million pieces, suddenly, that’s much harder to piece together.”

There are just four letters of DNA code, and in some places their sequence is repetitive.

“So having a very good, very high quality, very deep reference genome is really important. The more of them that we have, the better, because that means that we can have something to underlay underneath that puzzle. And then when you get that 60 base pair piece, you can much more reliably associate it with a part of the genome.”
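A small Python sketch shows why short fragments are harder to place when the sequence is repetitive. The reference and fragments below are invented, and the function name is hypothetical. The sketch only counts exact matches, which is far simpler than what real alignment software does.

```python
def match_positions(reference: str, fragment: str) -> list:
    """Return every position (including overlapping ones) where the
    fragment exactly matches the reference."""
    positions = []
    start = reference.find(fragment)
    while start != -1:
        positions.append(start)
        start = reference.find(fragment, start + 1)
    return positions


# Invented toy reference with repetitive "AC" stretches at both ends.
reference = "ACACACACGTTAGCACACAC"

short_fragment = "ACAC"         # falls entirely within a repeat
longer_fragment = "ACACGTTAGC"  # extends into non-repetitive sequence

if __name__ == "__main__":
    print(short_fragment, match_positions(reference, short_fragment))
    print(longer_fragment, match_positions(reference, longer_fragment))
```

The short fragment fits in five places, so there is no way to tell where it belongs. The longer fragment fits in only one.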

Some of the canine companions of the humans working on the Darwin Tree of Life project.

Reconstructing history

In 2017, the grey wolf genome sequence was determined for the first time, by Professor Larson and colleagues. In more recent research, they sequenced 27 ancient dog genomes from locations linked to human ancient DNA sites. By analysing these genomes together with other ancient and modern dog genomes, they found that dogs likely arose once, from a now-extinct wolf population. They also found that 11,000 years ago, before agriculture was widespread, domesticated dog populations had already formed at least five genetically distinct groups. Some of those lineages are still visible in today’s dog breeds.

“We can use these data to ask big questions about evolution, but do so on a timeframe that is amenable to looking at genetics. Dinosaurs are amazing, but we’re not getting DNA from them. So by looking at domestication, we can look at very large-scale change at the level of the organism and at the level of genome, because now we have access to ancient DNA. We can timestamp genomes.”

Co-analysis with human genomes showed that some aspects of dog population history mirror humans, while other elements differ. Humans’ best friend has complex ancestry.

“Dogs are the first animal with which we formed this really very tight relationship, but there are dozens of others. And in order to understand all of that, and to be able to compare the evolutionary trajectories of all of these individual species, it will be really great to get a lot of really high-quality reference genomes… not just of one wolf, or one red junglefowl, but the entire suite of closely related organisms that may or may not have contributed DNA to the modern population. And the only way we can know that is by sequencing the bejesus out of everything.”


This article was adapted from an article written by Alison Cranage, science writer at the Wellcome Sanger Institute, and originally published on the Wellcome Sanger Institute blog.