Friday, November 22, 2013

Long awaited fix and update to 23andMe's Ancestry Composition any day now


Update 7/12/2013: 23andMe's Ancestry Composition is now better, but still not great

...

Almost a year ago, amidst great marketing fanfare and expectations from customers, 23andMe rolled out the Ancestry Composition (AC). This wasn't just supposed to be an updated Ancestry Painting (the company's previous main ancestry tool), but a state-of-the-art global and local ancestry deconvolution analysis that would put all competitors to shame.

Alas, not everything went according to plan.

For one, 23andMe's decision to represent Europe as ten ancestral regions, but not break up Sub-Saharan Africa or eastern Asia, was met with disbelief from many of the punters. Secondly, the AC suffered from an overfitting problem.

Overfitting happens when the individuals being tested are also used as reference samples, so they end up with inflated ancestry proportions based on their self-reported ancestry. For more info check out these threads at the 23andMe forums here and here. I was one of the overfitted customers, and it really pissed me off.
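For anyone who wants to see the effect in miniature, here's a rough sketch in Python. It is not 23andMe's method, just a toy nearest-neighbour estimate on made-up two-dimensional "genotype" coordinates. The panel, the labels and the function name `ancestry_proportions` are all hypothetical. Sample X self-reports as "North" but sits genetically next to the "South" samples. When X is left in the reference panel, it matches itself and its self-reported label inflates the result. When X is held out, the inflation goes away.

```python
from collections import Counter
from math import dist

# Hypothetical reference panel: sample id -> (toy coordinates, self-reported label).
PANEL = {
    "A1": ((0, 0), "North"),
    "A2": ((0, 1), "North"),
    "B1": ((3, 3), "South"),
    "B2": ((3, 4), "South"),
    "X": ((2, 3), "North"),  # self-reports North, but lies close to the South samples
}

def ancestry_proportions(query_id, panel, k=2, leave_one_out=False):
    """Toy estimate: share of each label among the k nearest reference samples."""
    query_vec = panel[query_id][0]
    candidates = [
        (dist(query_vec, vec), label)
        for sample_id, (vec, label) in panel.items()
        # With leave_one_out, the tested individual is dropped from the references.
        if not (leave_one_out and sample_id == query_id)
    ]
    nearest = sorted(candidates)[:k]
    counts = Counter(label for _, label in nearest)
    return {label: n / k for label, n in counts.items()}

# X is also a reference sample: it matches itself, so "North" is inflated.
overfitted = ancestry_proportions("X", PANEL)

# X is held out of the references: only other people's genomes are used.
held_out = ancestry_proportions("X", PANEL, leave_one_out=True)

print(overfitted)
print(held_out)
```

The fix in this toy case is simply to hold the tested person out of the reference set before computing their proportions.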

23andMe tried to fix the overfitting problem about six weeks after the launch, but that didn't go too well at all. See here.

Last week 23andMe announced that it would again attempt to fix the overfitting problem, and also break up Sub-Saharan Africa and eastern Asia into three and five regions, respectively. I'm not sure if the process has started yet, but you can get updates on how things are going here. You'll see these question marks next to the new reference sets in the AC until your updated ancestry proportions have been computed.


I was very excited about the AC when it was first announced, but now, after a year of waiting for the overfitting fix, I find the whole thing very underwhelming. I might, at some stage, write a review and user guide, but then again I might not.

8 comments:

Seinundzeit said...

I'm sharing with around 55-60 South Asians, people as diverse and distinct as Punjabi Jatts and Bengali Muslims. All of them are around 100% South Asian! Anyone whose ancestry traces east of the Indus river, and west of Myanmar, is close to being 100% South Asian on AC. The only South Asians who are not around 100% South Asian on their ACs are Pashtuns. I'm 75% South Asian, and the Afghan Pashtuns I'm sharing with range from 80% to 48%, although there seems to be a convergence around (or bias towards) 75%. But we are the only South Asians whose results aren't completely underwhelming (still, 75% of a rather broad, one-size-fits-all category isn't necessarily exciting). I'm really disappointed that they didn't break the category down in this update. At least some kind of Pakistani-Northwest Indian category and a Peninsular Indian category would be sufficient. And they definitely have the samples for this. I wouldn't be surprised if my 75% South Asian became 75% Pakistani-Northwest Indian. I'm around 75% Sindhi according to Dr. McDonald, the rest being Georgian and Chuvash. And my Georgian and Chuvash percentages match my Middle Eastern and European percentages at 23andMe. The same applies to the other Pashtuns: if their AC is 80% South Asian, Dr. McDonald's program puts them at 80% Sindhi, and if their AC is 48% South Asian, they turn out 48% Sindhi.

RobertN said...

Hi Davidski,

I'm relatively new to this genetics stuff, but just received my results from 23andMe. It has me as 99.2% Euro, .1% East Asian/Native American, and .7% non-assigned. There is also a large 47.2% "non-specific Euro" designation in addition to 49.6% southern Euro, which I would like to get more info on. Would programs like Eurogenes K13 help me break it down further to get a clearer, more vivid picture? How can I go about this? Any help or guidance would be appreciated. Thank you. -Robert

Davidski said...

Yes, upload your file to GEDmatch and run the K13. That should help you get a better idea about your overall ancestry.

But also make sure to try the AC in "speculative" mode, which should fill in some of the non-assigned areas of the genome.

RobertN said...

Thanks for that, Davidski. Tried the speculative and the sub-regional AC, but it's still a bit confusing. It increased my Europeanness to 99.9%, basically getting rid of the .7 that was non-assigned, and also increased my southern Europeanness significantly, yet on the Global Similarity map it seems to situate me closer to the Northern and Eastern Euro groups than the southern Euros. I tried registering with GEDmatch, and it said it would send a registration code to my e-mail, but I never received it. Maybe their system is down? I'll try again soon. Thanks again.

Davidski said...

The Global Similarity maps shouldn't be taken too literally. The best thing to do is to share with people from various ethnic groups to see where they land on the maps, rather than follow the abstract and often not very useful labels assigned by 23andMe to its reference samples.

Helgenes50 said...

How do we differentiate between our genes from less than 500 years ago (23andMe)
and those of our Neolithic or Mesolithic ancestors?
That is beyond me.
At 23andMe, everybody is sure that what we see is recent. As a Norman,
when I see the South European (> 10%) in my AC, for me it's prehistoric. A few percent could be recent, but 10%? I don't believe it.

Davidski said...

The range of 500 years given to the Ancestry Composition can be safely ignored. It's just a way of saying that the Ancestry Composition is better at picking up more recent ancestry from within the past few hundred years. But some of the admixtures it shows, especially those designated as non-specific, surely go back much further than that.

Helgenes50 said...

Thank you for the clarification.

But when I see my 3.1% Scandinavian, it is certainly Viking.
So it's difficult to imagine a recent origin. There has been no immigration from Scandinavia
to Normandy since that period.

The 500 years is just where our ancestors were before the beginning
of this big mixing we are living through today, i.e. before the discovery of America.

Your results are more accurate and more logical.