The ABCs of DNA — IBD vs IBS vs mIBC

Human Karyotype: Paired up chromosomes from a human male.

IBD vs. IBS

When one of the genetic testing services matches you to a potential family member, it does so by finding segments of DNA where the alleles are the same over a minimum distance in both your DNA and your match’s. At least that’s what is supposed to happen. Several things can go wrong when matching segments between people, and here we’re going to discuss the unfortunate incident of the IBS cousin. No, IBS doesn’t stand for any kind of medical condition. The term IBS, Identical by State, refers to any segment that matches another person’s DNA, no matter how that matching occurred. It has two subclassifications: a segment that is not only IBS but also IBD, Identical by Descent, and a non-IBD segment, one that is ubiquitous within a specific population so that you share it by random chance. Since IBD segments are easier to understand, let’s tackle an IBD segment first. After we understand IBS, IBD and non-IBD segments, I’ll discuss mIBC segments, those misidentified by the computer.

IBD

A segment is IBD when two people inherited the matching alleles from a common ancestor. Now, common ancestor is a very open concept. Is every allele that modern people (Homo sapiens sapiens) got from Lucy’s people (Australopithecus afarensis) IBD? It depends on the context. If you’re talking about the differences between Homo erectus and Homo habilis, two of our ancestors, then yes, you might consider the alleles that all Homo species inherited from Lucy’s people to be IBD. But we’re interested in a genealogical timeframe of 500 to possibly slightly more than 1000 years. What we want to know is, “Is Bob your uncle?” So for us, the answer is no: an allele that came down from Lucy’s people to all modern humans does not count as IBD. Instead, we’ll restrict ourselves to ‘in the last 20 generations’.

If you and your sister both got the sequence CGGTATTACCTG from your father, then that segment is IBD because you got it from a common ancestor. If your father got it from his mother, then that segment is IBD with any children or grandchildren of your paternal grandmother who happened to inherit that segment or part of it. Those would be your aunts, uncles and first cousins. Let’s say we could establish that your paternal grandmother got that segment from her mother. Now you know that a matching segment with a predicted second cousin, a great-grandchild of your great-grandmother on your father’s side, is IBD. And so on it goes until you are soon tracing fifth and eighth cousins back in time.

IBS

A sequence is IBS when the segment of DNA is identical at all possible allele positions. It could be that the segment is IBD and inherited from a common ancestor in a genealogical timeframe, but it could also be non-IBD and still IBS. How does that happen? Each allele exists within a population at a certain frequency. That frequency, by definition, can’t be 0 or 100%, but it can be darn close. Let’s look at a specific example. Ten percent of the population are left-handed. If you are right-handed and meet another person who is right-handed, does that mean you are related to them in a genealogical timeframe? Obviously not. In this simplified example, the allele that controls right-handedness is ubiquitous in the population and non-IBD.
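To put a number on the handedness example, here is a minimal sketch. It uses only the 10% figure quoted above and treats handedness as a simple either/or trait, as the example does. The function name `chance_same_variant` is made up for illustration.

```python
# Chance that two unrelated people show the same variant purely by frequency.
# Uses the article's illustrative figure: 10% left-handed, 90% right-handed.

def chance_same_variant(frequencies):
    """Probability that two randomly chosen people carry the same variant."""
    return sum(f * f for f in frequencies)

p_right, p_left = 0.9, 0.1

# Both people right-handed: 0.9 * 0.9
both_right = p_right * p_right

# Both people share a handedness, whichever one it is: 0.9^2 + 0.1^2
same_handedness = chance_same_variant([p_right, p_left])

print(f"both right-handed: {both_right:.2f}")
print(f"same handedness:   {same_handedness:.2f}")
```

Under these assumptions, about four out of five random pairs of strangers "match", which is why a match on a very common variant says nothing about recent shared ancestry.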

In these cases, which normally involve small segments below a threshold of somewhere between 5 and 7 cM, someone could have the same DNA sequence as you, come back as a match on 23andMe and still not be closely related to you. Even if you knew for certain every ancestor on both sides all the way back to the 6th century, you might not be able to identify a common ancestor. That sequence could be very old, and because it provided a very slight advantage to those who had it, it remained intact for thousands of years. But because the advantage was small, those who didn’t have the exact same sequence could also thrive and reproduce, just at a slightly lower rate.

Another way sequences become very common is when a population goes through an evolutionary bottleneck. This is where the population gets very small for whatever reason (disease, famine, or isolation such as when a small group migrates to a new land) and then expands to fill its niche again. When this happens, the assortment of alleles that are available is sharply reduced. In these cases, many people can have similar alleles that are IBS, but their common ancestor would have to be traced all the way back to the bottleneck. When the bottleneck is far enough back that no (or few) genealogical records persist, those segments are considered IBS but not IBD.

Terminology

All IBD segments are also IBS, although some genealogy websites prefer to say that IBS segments are the ones that aren’t IBD because it’s simpler. More correctly, they should say that there are IBS segments between people, and some of them are IBD and some are non-IBD. The failure to use the correct terminology causes issues over and over in genetic genealogy, not just in misunderstanding your matches but also for those creating new tools for the genetic genealogy community. In addition to these two types of inherited segments, there are anomalies caused by comparing unphased DNA data. I call these shadow segments and label them mIBC, or misidentified by computer. We’ll talk about them next.

Complications

If your father and your mother both have the same sequence on one of their chromosome pairs, and you and your sister each inherit one copy of that sequence but from different parents, then that segment is IBS between you and your sister and not IBD. But there is no way to determine that this happened. The more similar or inbred the population your parents come from, the more likely this is to happen. Testing companies strive to test for SNPs that help to differentiate between individuals as well as those linking two people together, but there is bound to still be overlap if your parents are from the same population. Of course, it’s always possible that your parents are distantly related within a genealogical timeframe and that segment came to them both from their great-great-great-great-grandmother!

You get 1 chromosome from each parent. Each chromosome consists of a forward strand of DNA sequences and a mirrored backwards strand.

mIBC

One giant roadblock to determining your ancestral matches with certainty is that your DNA data isn’t phased. That means that when each position is tested, you get not one answer per location for most positions (excluding the X and Y in men) but two. You see, you have two of each chromosome: one from your mother and one from your father. The sequence-detecting machine can only tell whether there is an A, G, C or T at any position, not whether it came from your mother or your father. If you open your results you’ll find a position indicator on one side and two bases, not one, listed next to the position. If you’re homozygous at that position, you’ll have two of the same base. That means you got the same base from mom as you did from dad. But if you got two different bases from your parents, the machine can’t tell which parent each one came from and therefore can’t create a phased set of genetic blueprints to use for comparisons.
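As a rough illustration of what unphased results look like, the sketch below reads a few made-up rows and labels each position as homozygous or heterozygous. It assumes a four-column, tab-separated layout (marker, chromosome, position, two-base genotype). The marker names, positions and helper functions are invented for the example.

```python
# Hypothetical rows: marker id, chromosome, position, genotype.
# The genotype is two bases with no information about which parent gave which.
RAW_ROWS = """\
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tCG
rs0000003\t1\t3000\tTT
rs0000004\t1\t4000\tAC
"""

def classify(genotype):
    """Same base twice is homozygous; two different bases is heterozygous."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

def parse_rows(text):
    """Turn tab-separated rows into (marker, chrom, position, genotype, kind)."""
    rows = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        marker, chrom, pos, genotype = line.split("\t")
        rows.append((marker, chrom, int(pos), genotype, classify(genotype)))
    return rows

rows = parse_rows(RAW_ROWS)
for marker, chrom, pos, genotype, kind in rows:
    print(marker, chrom, pos, genotype, kind)
```

Only the heterozygous rows are a problem for matching: for "CG" there is no way to tell from this file alone whether the C or the G came from your mother.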

The longer the SNP segment, the less likely its SNPs are to be randomly like someone else’s in the database. This is where the SNP threshold comes in. With unphased data, this threshold is needed to try to eliminate matches where the computer has identified a matching sequence that does not in fact exist. Unfortunately, as with all thresholds, you eliminate some real matches while allowing some false matches to slip through. I call these imaginary matching segments “shadow segments”, existing only in the realm of possibility, and refer to them as mIBC matches, or misidentified by computer. Neither of your chromosomes has this sequence.
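The effect of segment length can be sketched with a toy model. The assumptions are mine, not the testing companies': every SNP has two alleles at a frequency `p`, SNPs are independent of each other, and two unphased genotypes count as compatible at a position unless they are opposite homozygotes (for example AA against GG). Real neighbouring SNPs are not independent, so real chance matches are more common than this model suggests.

```python
def chance_compatible(p):
    """Chance two unrelated people are NOT opposite homozygotes at one SNP.

    Assumes a two-allele SNP with allele frequency p and random mating.
    """
    q = 1.0 - p
    # Opposite homozygotes: (AA, BB) or (BB, AA), each with chance p^2 * q^2.
    return 1.0 - 2.0 * (p * p) * (q * q)

def chance_run(p, n_snps):
    """Chance that n_snps independent SNPs in a row are all compatible."""
    return chance_compatible(p) ** n_snps

for n in (10, 100, 500):
    print(n, chance_run(0.5, n))
```

The chance of a purely accidental run drops quickly as the number of SNPs grows, which is the reasoning behind requiring a minimum number of SNPs before a match is reported.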

Here’s an example:
You have two chromosomes and this is the actual sequence on each chromosome.

From Mother:  ACTAGAGTTAAC

From Father:  AGTCCACTATAG

Your match ‘Bob’ has two chromosomes as well and this is their actual sequence:

From Mother:  ACAAGAGTTAAG

From Father:  AGTCGACTATTC

The computer imagines that this matching sequence could possibly exist:

Shadow Seg:  ACTCGAGTTTAC

Based on this mIBC segment, the computer matches you with Bob. But you and Bob aren’t related: you have no MRCA (most recent common ancestor) and haven’t descended from the same people.
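The example can be checked with a short sketch. It uses the four sequences and the shadow segment exactly as given above. The matching rule (two unphased genotypes "match" at a position if they share at least one base) is a simplified assumption about how unphased comparison works, and the function names are made up for illustration.

```python
# Actual sequences from the example above.
you_mother = "ACTAGAGTTAAC"
you_father = "AGTCCACTATAG"
bob_mother = "ACAAGAGTTAAG"
bob_father = "AGTCGACTATTC"
shadow = "ACTCGAGTTTAC"

def unphased(hap_a, hap_b):
    """What the test reports: an unordered set of bases at each position."""
    return [frozenset(pair) for pair in zip(hap_a, hap_b)]

def half_identical(geno_a, geno_b):
    """True if the two people share at least one base at every position."""
    return all(a & b for a, b in zip(geno_a, geno_b))

def real_shared_haplotype(haps_a, haps_b):
    """True only if some actual chromosome of A equals one of B's."""
    return any(x == y for x in haps_a for y in haps_b)

you = unphased(you_mother, you_father)
bob = unphased(bob_mother, bob_father)

# The computer sees a match across the whole segment...
computer_match = half_identical(you, bob)
# ...and the shadow segment is consistent with both unphased genotypes...
shadow_fits = all(base in a and base in b
                  for base, a, b in zip(shadow, you, bob))
# ...yet no real chromosome of yours equals a real chromosome of Bob's.
truly_shared = real_shared_haplotype((you_mother, you_father),
                                     (bob_mother, bob_father))

print(computer_match, shadow_fits, truly_shared)  # True True False
```

The shadow segment is stitched together by switching between your maternal and paternal chromosomes from one position to the next, which is exactly what phased data would rule out.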

These mIBC segments can be differentiated from IBS segments by either phasing or pseudophasing your DNA sequence. Phasing is an expensive method of DNA sequence determination that separates the chromosomes first; pseudophasing is a software method I’ve been proposing that FTDNA and 23andMe develop for several years. So while you can’t tell an IBD segment from a non-IBD segment, you can eliminate an mIBC segment mismatch. I’ll try to make a post soon saying how that can be accomplished by 23andMe or FTDNA today. Some third-party programmers are also trying to create programs to do this, but to accomplish it you need to have the raw DNA from several other family members. If they succeed it will be a great start, but really, it requires 23andMe or FTDNA to do it for everyone’s benefit.
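To show why raw data from family members helps, here is a minimal sketch of one common idea: using both parents' unphased genotypes to work out which base a child got from which parent. This is not the pseudophasing method proposed in this post, which is not described here. The function `phase_site` and the genotypes are hypothetical.

```python
def phase_site(child, mother, father):
    """Return (from_mother, from_father) for one position, or None if unresolved."""
    a, b = child
    if a == b:
        return (a, a)  # homozygous: both parents gave the same base
    # Two possible assignments of the child's bases to the parents.
    a_mom_b_dad = a in mother and b in father
    b_mom_a_dad = b in mother and a in father
    if a_mom_b_dad and not b_mom_a_dad:
        return (a, b)
    if b_mom_a_dad and not a_mom_b_dad:
        return (b, a)
    return None  # ambiguous (or inconsistent) at this position

# Hypothetical unphased genotypes at four positions.
child = ["AG", "CT", "AG", "CC"]
mother = ["AA", "CT", "AG", "CT"]
father = ["GG", "TT", "AG", "CC"]

phased = [phase_site(c, m, f) for c, m, f in zip(child, mother, father)]
print(phased)  # [('A', 'G'), ('C', 'T'), None, ('C', 'C')]
```

Positions where the child and both parents are all heterozygous stay unresolved, which is one reason more relatives, or other methods, are needed to phase a whole chromosome.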

Graphics Credit: K S Rose


One Response to "The ABCs of DNA — IBD vs IBS vs mIBC"

  1. [...] Many genetic genealogists suggest that a one segment match of less than 10 cM will not have a common ancestor in the last 200 years and thus it is hard to find the relationship. Plus matches smaller than 7 cM may not even be IBD (identical by descent) since the smaller the match the more likely that it is not real, but rather IBS (identical by state). Remember you have two copies of each chromosome and the testing cannot say which side each allele came from. For example, if you are AA GG AA CC you will match me when I am AG AG AG CT but there is no guarantee that my A G A C came from a single side (paternal or maternal) so we need a much longer match to have some confidence that we are related. There is a good explanation of this IBS vs IBD concept here: http://dna-footprints.com/203/the-abcs-of-dna-ibd-vs-ibs/ [...]