Wednesday, August 13, 2014

Male height in Europe

This open access paper at ScienceDirect is probably the most detailed work on European stature I've ever seen. The conclusion is that male height in Europe is mostly determined by nutrition and genetics, which isn't really earth-shattering. But the authors also point out that Y-chromosome haplogroup I-M170 shows a strong correlation with the highest average stature on the continent, and speculate that the link between the two might be Upper Paleolithic hunter-gatherer ancestry:

The average height of 45 national samples used in our study was 178.3 cm (median 178.5 cm). The average of 42 European countries was 178.3 cm (median 178.4 cm). When weighted by population size, the average height of a young European male can be estimated at 177.6 cm. The geographical comparison of European samples (Fig. 1) shows that above average stature (178+ cm) is typical for Northern/Central Europe and the Western Balkans (the area of the Dinaric Alps). This agrees with observations of 20th century anthropologists (Coon, 1939; Lundman 1977). At present, the tallest nation in Europe (and also in the world) are the Dutch (average male height 183.8 cm), followed by Montenegrins (183.2 cm) and possibly Bosnians (182.5 cm) (Table 1). In contrast with these high values, the shortest men in Europe can be found in Turkey (173.6 cm), Portugal (173.9 cm), Cyprus (174.6 cm) and in economically underdeveloped nations of the Balkans and former Soviet Union (mainly Albania, Moldova, and the Caucasian republics).


The trend of increasing height has already stopped in Norway, Denmark, the Netherlands, Slovakia and Germany. In Norway, military statistics date its cessation to late 1980s.


In contrast, the fastest pace of the height increase (≥1 cm/decade) can be observed in Ireland, Portugal, Spain, Latvia, Belarus, Poland, Bosnia and Herzegovina, Croatia, Greece, Turkey and at least in the southern parts of Italy.


Although the documented differences in male stature in European nations can largely be explained by nutrition and other exogenous factors, it is remarkable that the picture in Fig. 1 strikingly resembles the distribution of Y haplogroup I-M170 (Fig. 10a). Apart from a regional anomaly in Sardinia (sub-branch I2a1a-M26), this male genetic lineage has two frequency peaks, from which one is located in Scandinavia and northern Germany (I1-M253 and I2a2-M436), and the second one in the Dinaric Alps in Bosnia and Herzegovina (I2a1b-M423). In other words, these are exactly the regions that are characterized by unusual tallness. The correlation between the frequency of I-M170 and male height in 43 European countries (including USA) is indeed highly statistically significant (r = 0.65; p < 0.001) (Fig. 11a, Table 4). Furthermore, frequencies of Paleolithic Y haplogroups in Northeastern Europe are improbably low, being distorted by the genetic drift of N1c-M46, a paternal marker of Ugrofinian hunter-gatherers. After the exclusion of N1c-M46 from the genetic profile of the Baltic states and Finland, the r-value would further slightly rise to 0.67 (p < 0.001). These relationships strongly suggest that extraordinary predispositions for tallness were already present in the Upper Paleolithic groups that had once brought this lineage from the Near East to Europe.


Grasgruber et al., The role of nutrition and genetics as key determinants of the positive height trend, Economics & Human Biology, available online 7 August 2014, DOI: 10.1016/j.ehb.2014.07.002
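As a rough sanity check on the significance quoted above for the I-M170 correlation, the reported r-values and the sample of 43 countries are enough to recompute an approximate p-value. This is only a back-of-the-envelope sketch, not the paper's own method: it assumes an ordinary Pearson correlation with n = 43, and it uses the Fisher z-transformation with a normal approximation rather than an exact t-distribution test. The function name is made up for illustration.

```python
import math

def correlation_significance(r, n):
    """Return (t statistic, approximate two-sided p-value) for a Pearson r.

    The p-value uses the Fisher z-transformation with a normal
    approximation, so it is an estimate, not an exact t-test result.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))   # t statistic on n - 2 df
    z = math.atanh(r) * math.sqrt(n - 3)       # Fisher z score
    p = math.erfc(z / math.sqrt(2))            # two-sided normal tail
    return t, p

# Values quoted from the paper: r = 0.65, and r = 0.67 after excluding N1c-M46.
# n = 43 is assumed for both, which the quote only states for the first.
t_all, p_all = correlation_significance(0.65, 43)
t_adj, p_adj = correlation_significance(0.67, 43)

print(f"r = 0.65: t = {t_all:.2f}, approximate p = {p_all:.1e}")
print(f"r = 0.67: t = {t_adj:.2f}, approximate p = {p_adj:.1e}")
```

Under those assumptions both values land well below the 0.001 threshold, which is consistent with what the authors report.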

Tuesday, July 29, 2014

Analysis of Upper Paleolithic Siberian forager Afontova Gora-2

Apparently, this 15,000-year-old genome from Central Siberia is heavily contaminated with modern DNA (see section SI 5.2.3. in Raghavan et al. 2013). However, apart from MA-1, it's the only Ancient North Eurasian (ANE) sample available right now, so I thought I'd take a closer look at it.

The shared drift statistics using f3(Mbuti;AG-2,Test) do suggest contamination from a present-day Eastern European source, with, for instance, Ukrainians from Lviv showing an unexpectedly strong signal (third on the list below just behind Pima Indians). This makes sense since AG-2 was probably mainly handled by Slavic-speaking Soviet archaeologists and museum staff.

Shared drift with AG-2 (spreadsheet)
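For readers unfamiliar with the statistic, here is a minimal sketch of what an outgroup f3 of the form f3(Mbuti;AG-2,Test) measures: the average, across SNPs, of the product of the allele frequency differences between the outgroup and each of the two other populations. The allele frequencies below are invented toy values, not real data, and the function is a simplification. Real software also corrects for finite sample sizes and reports standard errors, which this sketch does not do.

```python
def outgroup_f3(outgroup, pop_a, pop_b):
    """Mean of (o - a) * (o - b) over SNPs, given per-SNP allele frequencies.

    Higher values mean pop_a and pop_b share more drift since they
    split from the outgroup. No sample-size correction is applied.
    """
    if not (len(outgroup) == len(pop_a) == len(pop_b)):
        raise ValueError("all populations need the same SNP list")
    products = [(o - a) * (o - b) for o, a, b in zip(outgroup, pop_a, pop_b)]
    return sum(products) / len(products)

# Hypothetical allele frequencies at five SNPs (illustration only).
mbuti = [0.10, 0.80, 0.50, 0.20, 0.90]    # outgroup
ancient = [0.60, 0.30, 0.50, 0.70, 0.40]  # stands in for AG-2
test_near = [0.55, 0.35, 0.45, 0.65, 0.45]
test_far = [0.20, 0.70, 0.50, 0.30, 0.80]

f3_near = outgroup_f3(mbuti, ancient, test_near)
f3_far = outgroup_f3(mbuti, ancient, test_far)
print(f"near: {f3_near:.3f}, far: {f3_far:.3f}")
```

Ranking test populations by this value is what produces a shared drift list like the one in the spreadsheet.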

Indeed, in the Eurogenes K15 test, the Baltic component is the most important for AG-2, and this component is modal among Balto-Slavic populations. However, AG-2 fails to register any Mediterranean-specific admixture. At the very least, this is interesting, because all present-day Europeans show this influence. In fact, out of the four K15 components typical of the Near East, only the West Asian component appears for AG-2. This component actually peaks in the Caucasus, where today ANE reaches its highest levels in West Eurasia.

Eurogenes K15 results for AG-2

North_Sea 11.3
Atlantic 0.01
Baltic 22.83
Eastern_Euro 20.53
West_Med 0
West_Asian 4.63
East_Med 0
Red_Sea 0
South_Asian 13.9
Southeast_Asian 0
Siberian 5.97
Amerindian 16.07
Oceanian 4.77
Northeast_African 0
Sub-Saharan 0

4 Ancestors Oracle results based on the K15 ancestry proportions suggest that AG-2 might simply be a more westerly ANE sample than MA-1, perhaps with some European forager ancestry. Below are a few examples of the best population approximations; note the strong showing by StoraFörvar11, a Mesolithic genome from near Gotland, Sweden. The full list can be seen here.

1 Brahmin_UP+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.364493
2 Burusho+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.411899
3 MA-1+MA-1+StoraFörvar11+Tatar @ 8.427561
4 Kshatriya+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.437549
5 Gujarati+North_Amerindian+StoraFörvar11+StoraFörvar11 @ 8.45127
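The general idea behind a list like the one above can be sketched as a brute-force search: take every combination of four reference populations (repeats allowed), average their component proportions, and rank the combinations by distance to the target's proportions. This is an assumption about how such an oracle works in principle, not the actual 4 Ancestors Oracle code. The use of Euclidean distance, the three-component profiles and the population names are all hypothetical.

```python
from itertools import combinations_with_replacement
import math

def best_mixtures(target, references, n_ancestors=4, top=3):
    """Rank equal-weight mixes of reference populations by distance to target."""
    results = []
    for combo in combinations_with_replacement(sorted(references), n_ancestors):
        # Average each component across the chosen ancestors.
        mix = [
            sum(references[name][i] for name in combo) / n_ancestors
            for i in range(len(target))
        ]
        results.append((math.dist(target, mix), combo))
    results.sort()
    return results[:top]

# Hypothetical three-component profiles (percentages), for illustration only.
references = {
    "PopA": [80.0, 10.0, 10.0],
    "PopB": [10.0, 80.0, 10.0],
    "PopC": [10.0, 10.0, 80.0],
}
target = [45.0, 27.5, 27.5]

ranked = best_mixtures(target, references)
for distance, combo in ranked:
    print("+".join(combo), "@", round(distance, 6))
```

In this toy case the target is exactly two parts PopA, one part PopB and one part PopC, so that combination comes out on top with a distance of zero. With real data the best fits are never perfect, hence distances like those shown above.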

However, I was only able to use around 13K SNPs that overlapped with my dataset for all of the tests here. So perhaps these markers were much less affected by contamination than the rest? In any case, here are three Principal Component Analyses (PCA) to finish things off. Again, AG-2 basically looks like the genome of a late ANE survivor with a solid contribution from indigenous European foragers. Hopefully this can be confirmed or debunked in the near future with a much higher quality sequence of its genome.

Update 20/08/2014: In the above analysis I used variants from the 1stextraction AG-2 bam file. To try and get more markers I have now also processed the apparently lower quality supernatant bam. Merging the two files has given me just over 30K SNPs to play with, and I think the extra markers have made a positive difference. Below are the updated results, which I'd say appear more accurate because they're much more similar to those of MA-1 (see here and here).

Revised Eurogenes K15 results for AG-2

North_Sea 12.63
Atlantic 0
Baltic 12.77
Eastern_Euro 30.26
West_Med 0
West_Asian 1.13
East_Med 0
Red_Sea 0
South_Asian 18.44
Southeast_Asian 0
Siberian 3.84
Amerindian 17.34
Oceanian 3.6
Northeast_African 0
Sub-Saharan 0

Revised 4 Ancestors Oracle results for AG-2
Revised shared drift with AG-2 (spreadsheet)
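To see at a glance where the two K15 runs differ, the proportions listed in this post can be compared component by component. The snippet below is just a convenience sketch using the numbers given above. It checks that each run sums to roughly 100% and picks out the three largest shifts.

```python
components = [
    "North_Sea", "Atlantic", "Baltic", "Eastern_Euro", "West_Med",
    "West_Asian", "East_Med", "Red_Sea", "South_Asian", "Southeast_Asian",
    "Siberian", "Amerindian", "Oceanian", "Northeast_African", "Sub-Saharan",
]
# K15 proportions for AG-2 as listed in this post (about 13K and 30K SNPs).
first_run = [11.3, 0.01, 22.83, 20.53, 0, 4.63, 0, 0, 13.9, 0, 5.97, 16.07, 4.77, 0, 0]
revised = [12.63, 0, 12.77, 30.26, 0, 1.13, 0, 0, 18.44, 0, 3.84, 17.34, 3.6, 0, 0]

# Rounding in the listed values means each run only sums to approximately 100.
totals = (round(sum(first_run), 2), round(sum(revised), 2))

# Change per component, in percentage points (revised minus first run).
shifts = {
    name: round(new - old, 2)
    for name, old, new in zip(components, first_run, revised)
}
largest = sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)[:3]

print("totals:", totals)
for name, change in largest:
    print(f"{name}: {change:+.2f}")
```

The biggest moves are a drop in Baltic and a rise of similar size in Eastern_Euro, followed by an increase in South_Asian.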

The PCA plots based on the new set of markers look almost identical to those above, so I won't bother posting them. By the way, I updated the Eurogenes ancient genomes datasheet with the revised AG-2 K15 results (see here).

See also...

Analysis of Mesolithic Swedish forager StoraFörvar11

Wednesday, March 26, 2014

The story of R1a: the academics flounder on

There's been a lot of horseshit published over the years about Y-chromosome haplogroup R1a, which just happens to be my haplogroup. That includes academic papers in journals like PLoS ONE and Nature. My advice is, take all of that stuff with a very large pinch of salt and just look here for updates.

Indeed, a new paper on the phylogeography of R1a appeared at the Nature website today: Underhill et al. 2014. It's actually a much better effort than anything else on the topic at academic level thus far, but certainly not without issues.

For instance, the authors failed to include two well known and very important R1a subclades in their analysis: the Northwest European-specific R1a-CTS4385 and the East and Central European-specific R1a-Z280. As a result, the former is lumped with R1a-M417* and the latter with R1a-Z282*. In fact, Z280 is shown to be above Z282 in the topology of R1a-M420 (see Figure 1 here), which is plain wrong. These are major oversights and mean that this study is not a very useful resource as far as the phylogeography of European R1a is concerned.

But the paper does show a couple of interesting things. For instance, the maps below offer the best illustration to date of the dichotomy between the European-specific R1a-Z282 and Asian-specific R1a-Z93.

However, these are very closely related subclades, sharing the Z645 mutation (unfortunately not mentioned in the paper), and both reaching high frequencies among Indo-European speakers. It's therefore plausible that groups carrying these markers expanded to the west and east from a zone between their current hotspots, possibly the Volga-Ural region, rather recently.

Indeed, these migrations had to have happened after 4800-6800 YBP, which is the age of R1a-M417 reported by Underhill et al., and backed up by estimates from genetic genealogists using, among other things, complete R1a sequences (see here). In other words, the rapid expansions of R1a-Z282 and R1a-Z93 appear to have taken place from more or less the same region during the generally accepted early Indo-European timeframe, making them excellent candidates for paternal markers of the early Indo-European dispersals.

At the same time, the paucity of R1a-Z93 and derived lineages in Europe, including Eastern Europe, suggests that historic migrations originating in East and Central Asia, like those of the early Turks, had a negligible effect on the paternal ancestry of modern Europeans. This shows very clearly on the PCA in Figure 4 (see here).


Underhill et al., The phylogenetic and geographic structure of Y-chromosome haplogroup R1a, European Journal of Human Genetics, advance online publication, 26 March 2014; doi:10.1038/ejhg.2014.50

See also...

R1a-Z93 from Bronze Age Mongolia

Afghan Hindu Kush: a genetic sink

Saturday, March 15, 2014

PCA of ancient European mtDNA

The recent Wilde et al. paper on the ancient DNA of Eastern European steppe nomads included mitochondrial DNA (mtDNA) data for just over 60 of the studied individuals. Below is a Principal Component Analysis (PCA) featuring these samples, marked collectively as KGU, alongside the dataset from last year's Brandt et al. study on the genetic origins of Central Europeans.

Note that KGU falls closest to the Bernburg (BEC) and Unetice (UC) samples from Neolithic and Bronze Age eastern Germany, respectively. This is probably because all of these groups have similar levels of mtDNA haplogroups U5a and H. Moreover, UC is thought to be an Indo-European archaeological culture with origins in Eastern Europe. On the other hand, Brandt et al. hypothesized that BEC might have been of Scandinavian origin.

The Central European metapopulation (CEM) is composed of present-day individuals from Austria, Germany, Poland and the Czech Republic. Its position on the PCA plot suggests to me that modern Central Europeans are largely derived from Kurgan nomads, Bell Beakers from Iberia (BBC), and remnants of Neolithic farmers from the Near East, at least in terms of maternal ancestry.

In other words, I'd say the result correlates well with the findings of Brandt et al., who posited that long-range migrations from eastern and western Europe into the heart of the continent, particularly during the late Neolithic, played an important role in the formation of the modern Central European mtDNA gene pool.

Citations and credits...

Thanks to Eurogenes Project member PL16 for the PCA

Wilde et al., Direct evidence for positive selection of skin, hair, and eye pigmentation in Europeans during the last 5,000 y, PNAS, Published online before print on March 10, 2014, DOI: 10.1073/pnas.1316513111

Guido Brandt, Wolfgang Haak et al., Ancient DNA Reveals Key Stages in the Formation of Central European Mitochondrial Genetic Diversity, Science 11 October 2013: Vol. 342 no. 6155 pp. 257-261 DOI: 10.1126/science.1241844

See also...

Extreme positive selection for light skin, hair and eyes on the Pontic-Caspian steppe...or not

Sunday, February 23, 2014

Genetic affinities of Estonian Poles

The Estonian Biocentre has a new genotype dataset available from the recently released "Khazar" preprint (see here). The samples include Poles from Estonia, so I ran a PCA to see whether there was a clear difference between them and their ethnic kin from Poland in terms of genome-wide genetic structure. This doesn't appear to be the case, except for a few individuals who probably have significant Estonian and/or northwest Russian ancestry (the several northernmost and easternmost Polish_Estonian samples on the plots below). It's an interesting result, considering that, as far as I know, most Estonian Poles are not of recent Polish origin, but have roots in the East Baltic dating back to the Polish-ruled Duchy of Livonia of the 1600s. Please note, the plots were rotated and stretched horizontally to fit with geography.


Behar, Doron M.; Metspalu, Mait; Baran, Yael; Kopelman, Naama M.; Yunusbayev, Bayazit; Gladstein, Ariella; Tzur, Shay; Sahakyan, Hovhannes; Bahmanimehr, Ardeshir; Yepiskoposyan, Levon; Tambets, Kristiina; Khusnutdinova, Elza K.; Kusniarevich, Aljona; Balanovsky, Oleg; Balanovsky, Elena; Kovacevic, Lejla; Marjanovic, Damir; Mihailov, Evelin; Kouvatsi, Anastasia; Triantaphyllidis, Costas; King, Roy J.; Semino, Ornella; Torroni, Antonio; Hammer, Michael F.; Metspalu, Ene; Skorecki, Karl; Rosset, Saharon; Halperin, Eran; Villems, Richard; and Rosenberg, Noah A., No Evidence from Genome-Wide Data of a Khazar Origin for the Ashkenazi Jews (2013). Human Biology Open Access Pre-Prints. Paper 41.

Monday, January 27, 2014

Poles more indigenous to Europe than Germans

This has actually been obvious for a while now, thanks to both modern and ancient DNA. But the figure below from the new Olalde et al. paper on the complete genome of a Mesolithic hunter-gatherer from Iberia illustrates it more effectively than anything else I've seen to date. Note that the Polish reference set (PL) shows significantly higher allele sharing with the ancient Iberian, La Brana 1, than do Germans (DE). In fact, only Swedes (SE) manage to better Poles in this regard. But it's also worth noting that Poles show the highest allele sharing with the two partial genomic sequences of Neolithic hunter-gatherers from Gotland, Ajv70 and Ajv52.

On the other hand, compared to Poles, Germans clearly show higher allele sharing with Gok4, the Neolithic farmer from Southern Sweden, and Otzi the Iceman from the Copper Age Tyrolean Alps. Unlike the hunter-gatherers, who are genetically more Northern European than any Europeans alive today, these ancient samples are more Mediterranean, and indeed more Near Eastern, than most present-day Europeans, which is something that can be seen clearly on the main Principal Component Analysis (PCA) from Olalde et al. below. This suggests that most of their ancestors arrived in Europe from the Near East during the Neolithic.

It's an intriguing outcome between these two large neighbouring European countries, but perhaps easily explained by geography and climate? Germany is situated west of Poland, so it has a warmer climate, and thus its territory was more heavily settled by early farmers from the Mediterranean Basin during the Neolithic. Moreover, much of what is now Germany was part of the Roman Empire, which might have facilitated gene flow between the ancestors of present-day Germans and southern Europeans.

Poles, on the other hand, show stronger genetic links to Baltic populations, especially Lithuanians and Estonians, who are arguably the most Mesolithic-like Europeans alive today (see here). In fact, if they were present on the graphs above, they'd probably easily top the allele-sharing list with La Brana 1 and all of the hunter-gatherers from Gotland. This might be due to the almost impenetrable primeval forests that once covered the areas just south and east of the Baltic, as well as the relatively cold climate in these regions.


Olalde et al., Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European, Nature (2014), doi:10.1038/nature12960

See also...

The really old Europe is mostly in Eastern Europe

Prehistoric Scandinavians genetically most similar to modern Poles

Mesolithic genome from Spain reveals markers for blue eyes, dark skin and Y-haplogroup C6