Science: The Basic DNA:
As mentioned in the opening paragraph, DNA and genetic genealogy are far too complicated to have anything but a basic introduction on a family website. However, you need to be aware of some science and terminology. If you think you know enough about DNA testing, do not care about DNA testing, or have already tested, you can skip this page.
DNA is a chemical found in every cell in all plants and animals. DNA takes the form of long strands. These strands are made up of nucleotides or bases. There are four different nucleotides. Their names are not important, but they are abbreviated “A,” “C,” “G,” and “T.” Nucleotides come in pairs. “A” pairs with “T,” and “C” pairs with “G.” Chemically “A” and “T” cannot pair with “C” or “G.” Because of this pairing, each combination is sometimes called a base pair, which is abbreviated “bp.”
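For readers who like to see a rule written out, the pairing described above can be sketched in a few lines of Python. This is only an illustration: the function name `complement` and the sequence "GATTACA" are made up for the example and do not come from any testing company.

```python
# Pairing rule from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the partner base for every base on one strand."""
    return "".join(PAIRS[base] for base in strand)

strand = "GATTACA"  # made-up example sequence
print(strand)              # GATTACA
print(complement(strand))  # CTAATGT
```

Because each base has exactly one partner, knowing one strand is enough to write down the other.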
The order of the nucleotides or base pairs is a code–the genetic code. This code tells the cell to become a human eye cell or a maple tree root cell. The code also tells certain processes to turn on or turn off–growth, puberty, graying hair, etc. The code also contains information about the evolutionary history of the organism.
In humans there are two types of DNA. There is mitochondrial DNA (mt-DNA), found in the mitochondria that provide energy for our cells. There is also nuclear DNA, which is found in the nucleus of the cell. The long strand of mt-DNA loops back on itself to form a ring. There are over 16,000 base pairs in a human mt-DNA ring. In humans the strands of nuclear DNA are arranged as 23 chromosome pairs. A chromosome is a long strand of DNA of variable length that is paired with a nearly identical strand of DNA, thus a chromosome pair. The 23 pairs of chromosomes are usually identified as chromosomes 1 (the longest) through 22, with the 23rd pair being the sex chromosomes. The 22 numbered pairs are also called autosomes. The sex chromosomes are labelled X and Y. A human female has two X chromosomes, while a male has an X and a Y. In humans the autosomes contain a total of over 3,000 million base pairs.
Sex cells (not sex chromosomes) are a little different. A sex cell has only one of each chromosome, not a pair, and has either an X or a Y, but not both. When the female’s egg is fertilized by the male’s sperm the chromosomes become paired–one half of the pair from the female and one half from the male (i.e., half contributed by each parent). Similarly, an X from the female pairs with either an X from the male (creating a daughter) or a Y from the male (creating a son). Sex cells also have mt-DNA because they need energy and must have mitochondria. The mt-DNA from the male is usually lost during fertilization. Therefore the mt-DNA in our cells is only from our mother, who got it from her mother, who got it from her mother, etc. Similarly, a male receives the Y-DNA from his father (women do not have a Y chromosome), who received it from his father, who received it from his father, etc.
When the sex cells are being prepared within the organism, the nuclear DNA recombines. For each chromosome, sometimes a portion of the organism’s maternal DNA combines with a portion of the organism’s paternal DNA–recombining to form a chromosome unlike, but similar to, the mother’s and father’s chromosomes. This recombination allows for the variety seen in nature and the passing on of advantageous changes or the elimination of bad changes through natural selection. Although similar, two full siblings (same parents) will have different combinations of their parents’ DNA. Thus, recombination also allows for autosomal DNA testing. Even after recombination our chromosomes are still more like those of our parents, siblings, and cousins than those of other individuals in general.
The beginning of commercial DNA testing and genetic genealogy and the Y-DNA test:
In about 2000, a man by the name of Bennett Greenspan wanted to know how closely related he was to other men named Greenspan around the world. He started the company called Family Tree DNA (FTDNA) to provide Y-DNA testing services to himself and others. At that time he was testing portions of the Y chromosome that consisted of segments of repeated sequences. These initial Y-DNA tests counted the number of repetitions in these repeated sequences at specific locations on the Y chromosome, or, in geek-speak, Short Tandem Repeats (STR). Recently some have begun calling the first Y-DNA tests Y-STR tests to distinguish them from a newer type of Y-DNA test. In the original tests, various levels of Y-STR testing were offered. One could ask for 12, 25, 37, 67, or 111 different locations of repetitions, or markers, to be tested. I tested 67 markers in my Y-STR test with FTDNA. In most western cultures Y-DNA tracks with the family surname. FTDNA allows space on their website for Surname Groups. I have created one of these surname groups, but to the best of my knowledge, at this time, I am the only male in the world with the Ragusin surname to have had my Y-DNA tested.
In theory, a father should pass an exact copy of his Y-DNA to his son. In reality, occasionally a random mutation occurs which causes one or more sons to have different Y-DNA (a different number of repeats at a specific location on the chromosome) than their father. The sons of the sons with mutations would continue to have slightly different Y-DNA than their male siblings and cousins. Over time, as humans moved out of Africa and populated the rest of the continents, these differences would increase until there was very little in common. Scientists have classified Y-DNA into haplogroups. These haplogroups are identified by using the upper case letters from A to Z, although not all were used. Further subdivisions are represented by following the initial capital letter with Arabic numerals and lower case letters used alternately. I belong to haplogroup R1a, as should all males with the Ragusin surname.
The haplogroups are actually defined by a series of Single Nucleotide Polymorphisms or SNP’s. This is just more geek-speak for which base (A, C, G, or T) is present at a certain location. The STR’s tested in the original Y-DNA tests could only give an approximation of the actual haplogroup. FTDNA offers a more complete (and more expensive) test of the Y chromosome which they initially called the BIG Y test. This test is so extensive that each male tested should have a unique series of SNP’s. Although unique, the sequence should be similar to others of the same surname. Because of the price and the fact that there would be no one to compare with, I have not taken the BIG Y test. However, I have had several additional SNP’s tested. I have been identified as R-Z282. This is far enough up the sequence to include all Ragusin males as well as some other surnames.
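To make the idea of counting repeats concrete, here is a minimal Python sketch. It assumes the starting position of the repeated stretch is already known. The function name `count_repeats`, the motif "GATA", and both sequences are invented for illustration; this is not how a laboratory actually reads a sample.

```python
def count_repeats(sequence: str, motif: str, start: int) -> int:
    """Count back-to-back copies of motif beginning at index start."""
    if not motif:
        raise ValueError("motif must not be empty")
    count = 0
    position = start
    # Keep stepping forward one motif at a time while the repeat continues.
    while sequence.startswith(motif, position):
        count += 1
        position += len(motif)
    return count

# Made-up marker: three flanking bases, a run of "GATA", then more flanking bases.
man_a = "CCT" + "GATA" * 12 + "TTC"
man_b = "CCT" + "GATA" * 13 + "TTC"

print(count_repeats(man_a, "GATA", 3))  # 12
print(count_repeats(man_b, "GATA", 3))  # 13
```

A Y-STR result is essentially a list of such counts, one per marker, and two men are compared by seeing at how many markers their counts differ.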
Mitochondrial DNA testing:
At about the same time mt-DNA testing was also offered. Not only did this test not use nuclear DNA, it also did not count STR’s at specific markers. Instead, the SNP value was determined at each tested location. Two levels of testing were offered. One tested about 200 locations on either side of the center point of the mt-DNA ring. The other tested all the locations on the ring. I initially took the smaller test and then upgraded and had my whole mt-DNA sequenced. Mt-DNA is categorized in the same manner as Y-DNA, with haplogroups represented by capital letters followed by alternating Arabic numerals and lower case Roman letters. I have been identified as haplogroup W1h1. This is my mother’s mother’s mother’s, etc. mt-DNA and has nothing whatsoever to do with the Ragusin family.
Much of the original work with mt-DNA was done at Cambridge University in England. The first complete mt-DNA sequence was considered the standard and everything after that was compared to the first. Test results were a list of the differences between your mt-DNA and the standard. The standard was called the Cambridge Reference Sequence or CRS. Using this kind of data a scientist named Bryan Sykes argued that almost all people of European descent trace their maternal line to only seven different mt-DNA sequences (i.e. seven different ancestral women). Although later work tried to contradict or expand Sykes’ work, it still shows the power of time and mutations in genetic genealogy.
After enough people had been tested it became possible to estimate or calculate the mt-DNA of the first modern human female who is, believe it or not, generally referred to as “Eve.” The Reconstructed Sapiens Reference Sequence (RSRS) became the new standard–replacing the CRS. Now your mt-DNA results are listed as the differences between your mt-DNA and “Eve’s.” I have 57 differences with Eve, including four locations where Eve did not have anything.
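Reporting a result as "differences from a standard" can be sketched in Python as below. This is a simplified illustration. Both sequences are made up and only ten letters long, the function name `list_differences` is hypothetical, and the sketch handles only swapped letters. It does not handle the extra or missing bases (such as the locations where “Eve” has nothing) that real reports also include.

```python
def list_differences(reference: str, sample: str) -> list:
    """Return each difference as reference base, 1-based position, sample base."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles sequences of equal length")
    return [
        f"{ref}{position}{found}"
        for position, (ref, found) in enumerate(zip(reference, sample), start=1)
        if ref != found
    ]

reference = "ACGTACGTAC"  # made-up stand-in for the reference sequence
sample = "ACGTTCGTAA"     # made-up stand-in for one person's result

print(list_differences(reference, sample))  # ['A5T', 'C10A']
```

Listing only the differences keeps a report short: a ring of over 16,000 base pairs shrinks to a few dozen entries.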
Autosomal DNA testing:
In autosomal DNA (sometimes abbreviated at-DNA, although I prefer to spell out the word autosomal) testing, the allele or SNP value (A, C, G, or T) is determined at approximately 700,000 locations spread throughout the 22 pairs of chromosomes making up our autosomal DNA. Although 700,000 might seem like a really large number, it is actually a very small portion of the total number of base pairs found in the chromosomes (about one fortieth of one percent). In theory, we should have very few differences from our parents and siblings. We should have more differences with our grandparents, aunts and uncles, and first cousins. We should have even more differences with our great grandparents, great aunts and uncles, and second cousins, etc.
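The "one fortieth of one percent" figure can be checked with a little arithmetic, shown here in Python. The total of 3,000 million base pairs is the round figure given earlier on this page, so the result is only approximate.

```python
tested_locations = 700_000
total_base_pairs = 3_000_000_000  # round figure from earlier on this page

percent = tested_locations / total_base_pairs * 100
print(f"{percent:.4f}% of the base pairs are tested")  # 0.0233% of the base pairs are tested
print(f"about 1/{round(1 / percent)} of one percent")   # about 1/43 of one percent
```

So the test reads roughly one base pair in every 4,300, which is close to the one-fortieth-of-a-percent estimate above.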
When the gametes (the egg and the sperm) are being prepared prior to fertilization, some of the chromosomes get jumbled up a little through recombination. When your mother’s egg cell is being prepared, the first chromosome (for example) might combine some of the first chromosome of her father with some of the first chromosome of her mother. This creates a different first chromosome for you that includes portions of the chromosomes from each of your maternal grandparents. A similar recombination might occur on your father’s first chromosome when his sperm is being prepared. Because this occurs every time an egg or sperm is created, two full siblings will have similar but different DNA sequences unless they are identical twins.
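The jumbling described above can be imitated with a toy Python sketch. It makes one big simplifying assumption: each new chromosome gets exactly one switch-over point, whereas real chromosomes may have none or several. The 20-letter "chromosomes" and the function name `recombine` are invented for the example.

```python
import random

def recombine(copy_a: str, copy_b: str, rng: random.Random) -> str:
    """Build one new chromosome from two copies using a single crossover."""
    if len(copy_a) != len(copy_b):
        raise ValueError("both copies must be the same length")
    # Pick where to switch copies, then which copy to start on.
    point = rng.randrange(1, len(copy_a))
    first, second = (copy_a, copy_b) if rng.random() < 0.5 else (copy_b, copy_a)
    return first[:point] + second[point:]

# Each letter marks which grandparent that stretch came from:
# "F" = your mother's father, "M" = your mother's mother.
from_her_father = "F" * 20
from_her_mother = "M" * 20

rng = random.Random()  # unseeded, so every run gives a different result
for child in ("first child ", "second child"):
    print(child, recombine(from_her_father, from_her_mother, rng))
```

Running it a few times shows why full siblings are similar but not identical: each child usually gets a different mix of the same two grandparental copies.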
I have tested myself, a sister and a brother. We are closer to each other than anyone else except our mother (also tested). I also have several first cousins who have tested and they are my closest matches except for my mother and full siblings. Unfortunately, a statistical analysis must be done to determine the likelihood that any similarities or differences occurred because of chance (probability of recombination) or descent (you are actually related to that person). The amount of DNA two people share is reported in centiMorgans (abbreviated cM’s). The more closely related two individuals are, the higher their shared cM value. Regrettably, the probability ranges allow for some overlap. For example, several of my first cousins are identified as potentially (statistically) “first cousin OR great/half uncle or aunt or nephew OR great grandparent or great grandchild.” Traditional genealogical documentation is required to sort out these overlaps for even some of the most closely related individuals.
I have already stated that the DNA test can only determine the SNP value at each of the 700,000 locations. But, in reality, the test determines two values at each location–remember they are chromosome pairs. Unfortunately the test CANNOT determine whether a value came from the chromosome you received from your mother or the one you received from your father. Further complicating the necessary statistical analysis is the fact that, for most people, at about 60% of the tested locations the value from your father and the value from your mother are the same.
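The point about not knowing which parent a value came from can be illustrated with a tiny Python sketch. It assumes the report simply lists the two letters found at a location in alphabetical order. The function name `observed_genotype` is made up, and real company file formats may differ.

```python
def observed_genotype(from_mother: str, from_father: str) -> str:
    """What the test reports: the two letters, with no record of which parent gave which."""
    return "".join(sorted(from_mother + from_father))

# Two different inheritance stories...
print(observed_genotype("A", "G"))  # AG
print(observed_genotype("G", "A"))  # AG
# ...give exactly the same report, so the parent of origin is lost.
```

This is why testing a parent as well is so helpful: it lets you work out which of your two values came from which side of the family.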
Another aspect of autosomal testing is, unfortunately, the ethnicity estimates. Again this process is entirely statistical. As more people are tested the statistics should improve. I find the ethnicity estimates the least useful part of autosomal testing. But how does it work? I have ancestors who lived in the province of Cosenza in southern Italy for hundreds of years. They all married into families who had lived in Cosenza for hundreds of years. Therefore their DNA identifies not only their family, but also that the family was from Cosenza. However, there are problems not related to any issues I have already identified. Look at the following sequence of letters–“thereward.” How should this be read? Should we read “the reward,” or should we read “there ward”? Imagine a sequence of A, C, G, and T that is not constrained by having to be English words. The same sequence of base pairs might be indicative of Cosenza in southern Italy, the Andean region of South America, or Mongolia (as hypothetical examples). In the early days of autosomal testing many people came to me crying that this ethnic group or geographical region did or did not show up in their test results. My usual response was to wait until tomorrow because the statistics will change.