Thursday 24 December 2020

December 2020 Update on BigY Results

In this blog post I want to provide a status report on BigY-700 test results within the Pike DNA Project.  The previous BigY update for our project was done two years ago, in December 2018.

To begin with a quick overview, the BigY-700 test is the most advanced Y-chromosome test offered by FamilyTreeDNA.  It analyses over 700 STR-based markers (including the 111 STR markers that we display on our project's webpage for Test Results) as well as approximately 22 million nucleotides at which SNP-based mutations might exist.  DNA profiles based on STR-based markers serve our project well by distinguishing genetic clusters of Pikes from each other, thereby allowing us to identify who belongs to each of our "Group 1", "Group 2", etc., clusters.  But we would like to do more.  In particular, we would like to rebuild our family trees, putting branches back into their proper place, along with determining how long ago various branches of a common tree separated from one another.  Thankfully, the SNP-based results from BigY tests are now enabling us to do just that.

In order to demonstrate this, I am going to focus this blog post on the BigY results for our project's Group 1 and Group 2 clusters.

 

Group 1

Within our project's Group 1, we now have a total of five BigY results.  Four of them are for descendants of John Pike who was married at Whiteparish, Wiltshire in 1612, and in 1635 settled in Massachusetts.  Of these four, three are for descendants of John's son John, while the fourth descends from John's son Robert.  The fifth member of Group 1 with BigY results traces his ancestry back to a Simon Pike who was married at Stratfield Saye, Hampshire in 1751.

The four results from known descendants of settler John Pike, combined with the previously established knowledge of their positions within the family tree, will help us to get a sense of how often SNP mutations can happen.  That is, we can estimate how many mutations are likely to arise per generation.  

Here is a mini family tree that shows the relationships between the five members of Group 1 with BigY results.  For this tree we are not yet taking into account their DNA results.



Note that at this point we do not know whether Jonathan's ancestor Simon is a descendant of John Pike.  It seems unlikely, given that Simon was in England when he married in 1751, but potentially one of John's grandsons or great-grandsons might have returned to England and had a family there.  We'll obtain a definitive answer once we consider the DNA results though.

Back in 2018 when I produced the previous BigY status update I had to spend a lot of time illustrating SNP-based trees that arose from people's individual results.  I am very happy to say that FamilyTreeDNA now does this on our behalf by maintaining (and regularly updating) a Y-DNA Block Tree.  Below is a portion of the Block Tree that encompasses our Group 1, which FamilyTreeDNA makes available to those who have done the BigY test.  I have edited this a little bit, for instance by adding the names of the people involved.

 

 

To discuss this Block Tree and how to interpret it, let's first home in on Larry, Michael and Alan.  They are all located beneath a blue block that contains a SNP named FT77897.  This SNP is their most recent common SNP, which all three of them share.  But after that their lines split apart.  Also, this FT77897 SNP is not above Roger or Jonathan, so it is a SNP mutation that arose somewhere on the line of descent towards Larry, Michael and Alan after their lineage separated from those of Roger and Jonathan.  By looking back to the mini family tree shown above we can see that this FT77897 SNP must therefore have arisen in one of the five Pike males represented by the squares that are immediately below settler John and leading towards Larry, Michael and Alan.

Above Larry and Michael is a blue block containing the two SNPs named Y88233 and FT78318.  Larry and Michael share them both, but neither was detected in Alan's Y-DNA.  Hence they indicate another branch point in the tree, which we can see corresponds to the mini family tree that we have previously determined from genealogical records.  There is also a more recent split in the tree, where Larry's line separates from Michael's.  This split isn't explicitly represented in the Block Tree just yet.  My understanding is that the branches of the Block Tree are based on SNPs that are shared, and as Larry hasn't yet been found to share any SNPs that are distinct from Michael's (and vice-versa) then for the time being they (i.e., Larry and Michael) are shown together in the Block Tree.  But if/when another person on Larry's branch (or on Michael's) does the BigY test and is found to have a SNP that Larry has but Michael doesn't, then that will prompt an update to the Block Tree so that a new branch gets shown.  Note that the Block Tree does indicate that "private" variants (i.e., SNPs not yet found to be shared with others) are present, so such yet-to-be-seen-as-shared SNPs do exist.

Now look at the blue block of four SNPs that, like an umbrella, covers Larry, Michael, Alan and Roger.  The four SNPs in this block are named YP5461, FT178399, YP5462 and YP5463.  Since these SNPs are shared by each of Larry, Michael, Alan and Roger, they must have arisen among their common ancestors.  The earliest such ancestor is John Pike who moved from Wiltshire to Massachusetts in 1635.  We can therefore deduce that John carried all four of these SNPs.  But whether they arose first with him or first within one of his ancestors is not yet clear.

What is clear though is that Jonathan's BigY results show that he carries none of these four SNPs.  This proves that he does not descend from John.  Jonathan's line must trace back to some ancestor who is also one of John's ancestors, which is to say that the following picture is what the family tree actually looks like, beginning with a common Pike ancestor:

 


Something profound warrants being pointed out here.  We have now been able to link Jonathan's branch of the family tree into its proper relative place on the tree, despite the fact that we can only trace his line back to the 1700s.  It is precisely because of BigY test results that we are able to perform this reconstruction of the family tree and determine its structure.  As more and more people do the BigY test, we will in turn be able to add more branches to where they belong.
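
The deduction that Jonathan's line sits outside John's comes down to comparing which SNPs each tester carries.  Here is a minimal Python sketch of that comparison.  It uses only seven of the SNP names mentioned in this post; the dictionary layout and the helper name shared_by are my own illustration rather than anything produced by FamilyTreeDNA, and each tester's real BigY results contain many more SNPs than are listed here.

```python
# The four SNPs in the block that covers Larry, Michael, Alan and Roger.
john_block = {"YP5461", "FT178399", "YP5462", "YP5463"}

# Illustrative subset of the SNPs each Group 1 tester carries.
carried = {
    "Larry": john_block | {"FT77897", "Y88233", "FT78318"},
    "Michael": john_block | {"FT77897", "Y88233", "FT78318"},
    "Alan": john_block | {"FT77897"},
    "Roger": set(john_block),
    "Jonathan": set(),  # carries none of the seven SNPs used in this sketch
}


def shared_by(*names):
    """Return the SNPs carried by every one of the named testers."""
    return set.intersection(*(carried[name] for name in names))


# SNPs common to all four known descendants of settler John.
print(sorted(shared_by("Larry", "Michael", "Alan", "Roger")))

# SNPs on the line towards Larry, Michael and Alan that Roger lacks.
print(sorted(shared_by("Larry", "Michael", "Alan") - carried["Roger"]))

# Overlap between Jonathan and John's block: an empty set.
print(carried["Jonathan"] & john_block)
```

An empty overlap in the last line is what places Jonathan's branch above John rather than below him.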

Something else that we would like to do is to estimate the age of the Group 1 Pike family, by determining when the first two branches of it split apart.  To do that is a bit tricky, and I hasten to point out that what we get is an estimate for which we cannot reliably quantify the margin of error.

Nevertheless, to try to estimate the time frame of the first branching point in the family tree, recall that John has been found to carry four SNPs that are not in Jonathan's line.  The question now is how long it took for these four SNPs to arise in John's line.  Mutation rates for Y-DNA SNPs are not firm.  That is, mutations don't occur with any assured regularity.  An analogy that I like to use is a row of slot machines at a casino.  Asking how many generations need to pass for four SNP mutations to arise is similar to asking how many slot machines need to be played in order for four of them to yield a jackpot.  Because of the random nature of mutations (and also the random nature of slot machines), there isn't a firm answer.  In our discussion above, we observed that in the first five men leading from John towards Larry, Michael and Alan there is a single SNP mutation (named FT77897), so here it took five generations for one SNP to mutate.  But then in the next three generations continuing towards Larry and Michael two mutations occurred (namely Y88233 and FT78318).  Clearly the rate at which mutations occur is not constant.  But with a combined three SNPs over eight generations, we can roughly estimate that one mutation will happen about every two generations.  I repeat, with emphasis, that this is an estimate.  And in this case it is based on an extremely small data set!  But working with this estimate anyway, given that John has four SNPs that Jonathan lacks, it suggests that their most recent common Pike ancestor would have lived about eight generations prior to John.  And noting that John was born about 1572, if we also estimate that there are 25 years between generations, then that would suggest that Jonathan's branch of the family tree split away from John's around the year 1372, give or take a margin of error that is likely several decades in size.  
Even with such a wide margin of error, it is clearly apparent that our Group 1 Pike family is an old one, for which the Pike surname was adopted many centuries ago.
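
For readers who like to see the arithmetic laid out, here is a small Python sketch of the estimate.  The function names are hypothetical and of my own choosing; the inputs (three SNPs over eight generations, four SNPs separating John from Jonathan, a birth year of about 1572 and 25 years per generation) are the ones used above.  It is a back-of-the-envelope calculation, not a statistical model.

```python
def generations_per_snp(snps, generations):
    """Average number of generations that passed per observed SNP mutation."""
    return generations / snps


def estimated_split_year(reference_birth_year, snp_count, gens_per_snp,
                         years_per_generation=25):
    """Rough year in which two lines separated, counting back from a known birth year."""
    generations_back = snp_count * gens_per_snp
    return reference_birth_year - generations_back * years_per_generation


# Observed on the line from John towards Larry and Michael: 3 SNPs in 8 generations.
observed = generations_per_snp(snps=3, generations=8)
print(round(observed, 2))  # 2.67

# Rounded rate used in this post: one mutation every two generations.
assumed = 2

# John (born about 1572) carries four SNPs that Jonathan's line lacks.
print(estimated_split_year(1572, 4, assumed))  # 1372
```

Passing the unrounded value `observed` instead of `assumed` moves the estimate back to roughly 1305, which gives a feel for how sensitive the result is to the assumed mutation rate.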

One more thing to point out concerning the age of the Group 1 Pike family is that there is a blue block of five SNPs (named YP5465 to YP6209) under which all current BigY results in Group 1 can be found.  The closest non-Pike BigY result is not under this block, which is to say that this non-Pike does not have any of these five SNPs.  He does, however, share each of the five SNPs of the next block up (i.e., the block at the top of the diagram, containing the SNPs named YP4102 to YP5464).  So it is somewhere along the sequence of five SNPs named YP5465 to YP6209 that the Pike surname became fixed for our Group 1 cluster.  With only five SNPs in this sequence, it therefore appears that the split between Jonathan's line and that leading to John and his descendants took place within the first ten generations of the Pike surname (where this value of 10 is estimated based on mutations happening about once in every two generations).


Group 2

Within our project's Group 2, we now have a total of ten BigY results, all from men whose Pike ancestors lived in/near the town of Carbonear in eastern Newfoundland.  The earliest reference that I have yet found that shows a Pike present in Carbonear is dated 1681, although exactly when Pikes took up residence in Carbonear is unknown.  Throughout much of the 17th and 18th centuries it was common for large numbers of people to annually commute from southern England and Ireland and back again to work in the fishery in Newfoundland during the summer months.  Before the early 19th century Newfoundland records of births, etc., are rare.  The result is that when substantive records come into being in the 1800s we find a plethora of Pike families in and around Carbonear, without clear indications of how they might be related to one another.  Trying to sort them out is a lifelong quest of mine, and is one of the reasons why I established the Pike DNA Project in 2004 when I first learned that Y-DNA testing was available through FamilyTreeDNA.

Below is a mini family tree showing how the ten men of Group 2 who have BigY test results fit into six different branches.

 


 

Georgia and her cousin Tom have the deepest lineage that has so far been traced within Group 2.  The details of how their line traces back to Thomas Pike who was married at Poole in Dorset in 1680 are something that I've been working on writing up, but it isn't ready yet (I'll let people know when it is).  The rest of us, however, each become genealogically stuck at/near Carbonear in the 1700s or 1800s.  A few of us do connect with others though.  For instance, Fred is a known fifth cousin of mine, whom I located and for whom I arranged a BigY test.

Rodney, George and Kevin descend from three different sons of a Timothy Pike of Carbonear who was buried there in 1838 at age 76.  As I will explain, establishing that this is so has taken effort, for in some cases the connection to Timothy was not readily known.

Rodney's recent Pike ancestors have lived at Channel-Port aux Basques on the southwest coast of Newfoundland since the early 1840s when a different Timothy Pike of Carbonear settled there.  Voters lists from the 1830s at Carbonear show that there was a Timothy Pike Senior as well as Timothy Pike Junior, suggesting that Timothy junior was a son of Timothy senior. 

George traces his Pike ancestry back to Timothy William Pike who was born at Carbonear in 1838 to parents Thomas Pike and Frances (née Thistle) who married at nearby Harbour Grace in 1824.  Thomas appears in several voters lists from Carbonear as "Thomas of Timothy" and there is a corresponding baptism in 1809 for a child Thomas with parents Timothy and Mary Pike.

Recent generations of Kevin's ancestry have lived at Pinware on the coast of Labrador.  The remote nature of this coastline means that genealogical records are particularly hard to find, because baptisms, etc., could be recorded in registers of any number of travelling clergy.  For instance, the 1872 baptism of Kevin's great grandfather Mark Pike was eventually found (written in French) in the Drouin Collection of records located in Québec.  Mark's father Solomon was born in 1835 at Carbonear, to parents Edward and Ann Pike.  Edward died at Pinware in 1879, having moved there with his wife and children sometime around the 1840s or 1850s.  Because there were several Pikes at Carbonear named Edward, trying to identify which one was this Edward has not been easy.  One key piece of evidence in this case is that there are some entries in voters lists from 1832 to 1844 for "Edward of Timothy" and then an absence from 1847 onwards for such an Edward.  This disappearance of "Edward of Timothy" coincides well with when Edward would have relocated to Pinware.

Below is an image from the 1844 voters list, showing "Thomas of Timothy" and "Edward of Timothy" recorded next to each other.

Source:  1844 Voters List of Conception Bay, GN 43/7 Box 7 at The Rooms (Provincial Archives)

This evidence tying Rodney, George and Kevin back to Timothy senior at Carbonear is less compelling than we would ideally prefer, as more records would normally be relied upon.  But the reality is that we can only work with what is available to us, and in this case there is no baptism record for either Timothy junior or Edward (as is not uncommon for people born at Carbonear prior to 1820).  Thankfully we can supplement the historical records with BigY test results.  A portion of the Block Tree that encompasses our project's Group 2 is shown below.

 


 

Rodney, George and Kevin appear under a blue block with three SNPs (named BY111567, FT95073 and Y140924) indicating that each of these three SNP mutations was found in their BigY test results, but not in the results of the other testers.  So these SNPs arose within their branch of the family tree, after it split away from the rest of Group 2, and before their three individual branches split away from each other.  Also note that on average they each have two "private" variants (i.e., SNP mutations not yet found to be shared with others).  This is consistent with the previous discussion in which it appears that Rodney, George and Kevin descend, respectively, from the sons Timothy, Thomas and Edward of Timothy senior of Carbonear.

I confess that I'm telling this story backwards, having discussed the voters lists before the DNA results.  Chronologically, these events happened the other way around.  That is, it was actually because the BigY results showed that Rodney, George and Kevin are closely related that I was prompted to review a history of part of the Pike family of Carbonear that Gilbert Pike (who descends from "Thomas of Timothy") wrote in 1999.  And it was Gilbert's mention of the voters lists, and especially how they include an "Edward of Timothy" who disappears from Carbonear in the 1840s, that helped me make the connection to Timothy for Kevin and his Pike line.  To point out a profound observation here:  the BigY test results successfully rebuilt the family tree and showed the nature of the relationship between Rodney, George and Kevin without relying on historical records.  It was subsequent to the BigY testing that the historical records were used to corroborate the tree structure and to name the ancestors involved.

Regarding Angus, myself (David) and Fred, the Block Tree produced by FamilyTreeDNA has separated my father Angus and me from Fred.  Specifically, it shows me and my father beneath a blue block of five SNPs, and then above that there is a blue block of two SNPs that we and Fred are all underneath.  Hence the three of us have the two SNPs named BY23384 and BY23385, but Fred lacks the five SNPs (named BY23399 to BY23866) that were found within both my own and my father's BigY results.  It is because my father and I were found to share these five SNPs that FamilyTreeDNA has split us away from Fred when forming the Block Tree.  Since we have already traced our lines back to this split, in this case we see that these five mutations that my father and I share (but which Fred lacks) must have arisen in the span of the five generations from the split down to my father.  So here we have an example in which there is an average of one mutation per generation.  If such a high mutation rate were to persist throughout the Group 2 family tree then that would be a great benefit to us in terms of trying to rebuild our family tree on the basis of BigY test results, since each birth would be marked by a new mutation (but alas, this is probably too much to hope for).

Concerning Tom, Philip, a male cousin of Laverne, and Bob, they are currently collected together without a block of SNPs other than the large one (consisting of the fourteen SNPs named BY24054 to PH1079) that covers all of the BigY testers within Group 2.  This simply indicates that none of these four people have recent SNP mutations that have been found to be shared with others.  It is only when shared SNPs are found that the Block Tree presents people in a separate branch.  So at this point it is looking as though Group 2's BigY results indicate a family tree that has six separate branches, as was depicted at the beginning of our discussion for Group 2:  one branch each for Tom, Philip, Laverne's cousin and Bob, a fifth one for me, my father Angus and Fred, and then a sixth one that includes Rodney, George and Kevin.  

On the basis of the BigY results, these six branches appear to have arisen at about the same time.  If every birth were to give rise to a new mutation (which relies on a high mutation rate) then this would in turn suggest that these six lines correspond to six brothers who are sons of our most recent common ancestor.  I personally doubt that that is our reality though.  Instead, I suspect that the mutation rate is closer to one mutation every two generations, in which case these six branches could have arisen in men who weren't all brothers but who could be a combination of brothers and cousins.  More work, and more BigY test results will be needed to better refine and clarify the reality of our situation.  That said, it is interesting to note that Thomas Pike who married at Poole in 1680 had a son John who is known to have had six sons.

Meanwhile, something else to observe from the BigY test results is the block of 14 SNPs under which the ten BigY testers of Group 2 are found.  Somewhere along this sequence of 14 SNPs is when it appears that our patriline took on the Pike surname.  This can be deduced by observing that next to this block is a lineage that leads down to a non-Pike.  As for how old the Group 2 Pike family is, if we suppose an average mutation rate of two generations per mutation, then this would suggest that we acquired the surname sometime during the span of about 28 generations prior to the six identified Pike branches that appear to branch apart in the mid 1700s.  That's a horribly wide timespan, ranging from about the year 1000 to 1700.  For now that's the best that we can do, because for now our closest non-Pike match really just isn't very close at all.  And also it's partly because we haven't yet found any Group 2 Pikes who aren't from Newfoundland.
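
To show where that timespan comes from, and how much it depends on the assumed mutation rate, here is a rough Python sketch.  The year 1750 is my own stand-in for "the mid 1700s", the 25 years per generation is the same assumption used for Group 1, and the function name is hypothetical.

```python
YEARS_PER_GENERATION = 25  # same assumption as used for Group 1
BRANCHING_YEAR = 1750      # assumed stand-in for "the mid 1700s"
SNPS_IN_BLOCK = 14         # size of the block covering all Group 2 testers


def earliest_surname_year(gens_per_snp):
    """Earliest year the surname could date from if the whole block predates it."""
    generations = SNPS_IN_BLOCK * gens_per_snp
    return BRANCHING_YEAR - generations * YEARS_PER_GENERATION


# Compare a few plausible rates, in generations per mutation.
for gens_per_snp in (1, 2, 3):
    print(gens_per_snp, earliest_surname_year(gens_per_snp))
```

With two generations per mutation this gives 1050, in line with the "about the year 1000" lower bound mentioned above.  The other two rates give 1400 and 700, which shows how strongly the bound depends on that single assumption.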


Genealogical Disclaimer

As with any genealogical research, the reasoning and conclusions that are outlined above are based on what information has been found to date.  New discoveries will be assessed and incorporated as they arise, and may on occasion require adjustment to conclusions.  That's one of the best features of genealogical DNA testing, for each new test result provides new information that can reinforce past conclusions, or sometimes it can focus scrutiny onto something that isn't quite right.


For those who have not yet done the BigY test

Part of my goal in concentrating on Groups 1 and 2 in this blog post was so that I could showcase some examples of how the BigY test and the Block Tree that it produces have the power to rebuild a family tree and to reveal how genealogically disconnected branches fit onto it in relation to one another.  Also, it is through our collective effort that this becomes possible, which is to say that more BigY results will lead to more and better revelations.  So I encourage everybody who has currently only done limited Y-DNA testing to consider upgrading.  Our project also welcomes financial donations from anybody who might want to help sponsor BigY tests for others.