Once, when I was in college, I was sitting around with two friends talking about our future careers.

One said: Someday when I’m a lawyer I’ll be able to offer you guys free legal advice.

The other said: When I’m a dentist I’ll offer you free dental care.

Then they both looked at me and said, “What about you? What can you do for us? How can you hook us up?”

To which I replied, “Would you like to know which of your children are actually yours?”

As it turns out, the first friend never became a lawyer. And the other seems to have dropped off the face of the earth… I don’t know if he ever finished dental school.

But I kept studying genetics, and genetic relatedness analysis is something I’ve used a lot in my work. And I can tell you that resolving a simple case of paternity discrepancy is not that difficult.

But I don’t work on humans, I study fish, and sometimes it can be useful to identify genetic relationships between individuals. In particular, this helps us understand how the fish migrate. If two related individuals are sampled in different locations, at least one of them had to have moved at some point in time.

Yet, some people are skeptical about whether we can detect these relationships at all. With so many fish in the ocean, how can you ever hope to sample enough of them to find kinships… and especially across distances of tens to hundreds of miles?

### Enter the birthday principle

The birthday principle is a trick that math teachers like to use to impress students on the first day of class.

If you have a room full of 30 students, how likely is it that two of them will share the same birthday? Human intuition would suggest that it is not very likely at all. But as with so many other things, here human intuition is wrong.

There may be 365 days in the year, but when you have 30 people the number of chances for a shared birthday is a 30 x 30 half-matrix. That’s 435 pairwise comparisons. With that many chances, a shared birthday is more likely than not: the probability works out to about 71%.
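To check the classroom numbers, here is a minimal Python sketch. It assumes 365 equally likely birthdays and ignores leap years; the function names are mine, chosen for illustration.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of distinct pairs among n individuals (the half-matrix)."""
    return comb(n, 2)


def shared_birthday_probability(n: int, days: int = 365) -> float:
    """Chance that at least two of n people share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    p_all_distinct = 1.0
    for i in range(n):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


print(pair_count(30))                             # 435
print(round(shared_birthday_probability(30), 3))  # 0.706
```

The number of pairs grows roughly with the square of the group size, which is why the chance of a match climbs so quickly.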

This same birthday principle can be applied to finding genetic relationships among fish in the ocean. The pairwise dimension can work in your favor.

Last year I published a paper where we had collected 671 juvenile cod from beaches across southeastern Newfoundland. That’s 224,785 pairwise comparisons, which seems like a good sample, and we found two statistically significant pairs of related juveniles, using just 13 unlinked microsatellite loci. One of those pairs was separated by a coastline distance of about 500 km, and had identical genetic variants across nine of our loci.

One reviewer was critical of this result. He said:

*The average heterozygosity of the loci you used to genotype juveniles is on the order of >80%. Say the parents were heterozygous at 9 out of the 13 loci. Then the chances of them producing two identical genotypes is (1/4)^9, or roughly 3.8*10^{-6}. Considering the high fecundity of Atlantic cod, the chances of recovering the two larvae that have identical genotypes makes this result even more unlikely.*

But the high fecundity of Atlantic cod is exactly what makes this result plausible. Female cod produce about a million eggs per kg of body weight. If we assume that all these eggs are fertilized by one male, and our female weighs 10 kg, we have a 10 million x 10 million half-matrix of relatedness comparisons. That’s just under 50 trillion comparisons… per female.

Given those numbers, we can expect a whopping 189,999,981 sibling pairs that are genetically identical at 9 loci.

Perhaps only about 1 in a million of those offspring will survive long enough to be collected, but that is still over a hundred and eighty pairs of siblings with high relatedness values. Is it so unlikely that we would sample one or two?
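The arithmetic behind these figures can be reproduced with a short Python sketch. It uses only the numbers quoted in the post and the reviewer’s rounded 3.8*10^{-6}. Scaling the pair count directly by the one-in-a-million survival fraction is the post’s own simplification, not a full survival model, and the variable names are mine.

```python
from math import comb

# Figures taken from the post.
eggs_per_kg = 1_000_000
female_weight_kg = 10
sampled_juveniles = 671

offspring = eggs_per_kg * female_weight_kg

# Half-matrix sizes: comparisons among the sample and among one female's eggs.
sample_pairs = comb(sampled_juveniles, 2)
sibling_pairs = comb(offspring, 2)

# Reviewer's chance that two siblings match at 9 heterozygous loci.
p_identical_exact = 0.25 ** 9
p_identical_rounded = 3.8e-6  # rounded value used in the post

identical_pairs = sibling_pairs * p_identical_rounded

# Post's simplification: apply the 1-in-a-million fraction to the pair count.
surviving_pairs = identical_pairs * 1e-6

print(f"{sample_pairs:,}")            # 224,785
print(f"{sibling_pairs:,}")           # 49,999,995,000,000
print(f"{p_identical_exact:.3e}")     # 3.815e-06
print(f"{round(identical_pairs):,}")  # 189,999,981
print(round(surviving_pairs))         # 190
```

Using the unrounded (1/4)^9 instead of 3.8*10^{-6} nudges the sibling-pair count up slightly, to a little under 191 million.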

If the reviewer is correct, we are still looking for a small percentage of the population. And there may be many thousands of spawning females. If we assume a population size of 10,000 females, of 10 kg each, a larval survival rate of 1*10^{-6}, and a 3.8*10^{-6} chance that offspring are identical at nine loci, the odds of sampling one of these pairs among our 224,785 comparisons are about one in 3 million.

But wait.

The odds of two unrelated individuals sharing the same allelic variants are even more remote: (mean allele frequency)^{18}, one factor for each of the two alleles at each of the nine loci. Using the same simplifying assumptions as above, that is about a 1 in 3 billion chance.

So, the chance that these two individuals are related is much better than if they were unrelated… by about three orders of magnitude.
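As a quick check on that comparison, here is a small Python sketch that takes the two rounded odds quoted above at face value and compares them. It does not re-derive either figure.

```python
from math import log10

# Rounded odds quoted in the post, taken as given.
odds_if_siblings = 1 / 3e6   # about one in 3 million
odds_if_unrelated = 1 / 3e9  # about one in 3 billion

# How many times more likely the match is under the sibling scenario.
ratio = odds_if_siblings / odds_if_unrelated

print(round(ratio))         # 1000
print(round(log10(ratio)))  # 3
```

A ratio like this is the basic logic of a likelihood comparison: the observed match is rare either way, but far less rare if the two fish are siblings.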

But neither scenario is very likely, so what is going on here? Were we just lucky to have sampled sibling cod?

I think not.

Undoubtedly there are ecological factors coming into play that tip the odds in favor of finding cod kin. For example, there is the sweepstakes reproduction hypothesis, which states that some females will randomly have more surviving offspring than others. Given reproductive sweepstakes, the percentage of kin among the settling codlings may be higher than expected under simple calculations.
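To see how much a reproductive sweepstakes could matter, here is a rough Python sketch. It compares the chance that two randomly chosen survivors share a mother when every female contributes equally versus when a few females dominate. The 10,000 females come from the calculation above. The split in which 10 females produce half of all survivors is a purely hypothetical number chosen for illustration, not an estimate for cod.

```python
def shared_mother_probability(shares):
    """Chance that two survivors drawn at random share a mother.

    `shares` holds each female's relative contribution to the surviving
    cohort. Assumes a cohort large enough that the two draws are
    effectively independent.
    """
    total = sum(shares)
    return sum((s / total) ** 2 for s in shares)


n_females = 10_000  # population size assumed in the post
n_dominant = 10     # hypothetical number of sweepstakes winners
n_others = n_females - n_dominant

# Every female contributes equally.
equal = [1.0] * n_females

# Hypothetical skew: the winners produce half of all survivors.
skewed = [0.5 / n_dominant] * n_dominant + [0.5 / n_others] * n_others

p_equal = shared_mother_probability(equal)
p_skewed = shared_mother_probability(skewed)

print(f"{p_equal:.6f}")           # 0.000100
print(f"{p_skewed:.6f}")          # 0.025025
print(round(p_skewed / p_equal))  # 250
```

Under this made-up skew, sibling pairs are about 250 times more common among survivors than the equal-contribution calculation would suggest.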

Also there may be maternal effects. According to http://www.fishbase.org, cod can weigh over 90 kg. What if there are a handful of 90 kg females that dominate spawning, laying far more eggs than everyone else? And what if their eggs are larger and healthier than those of their juniors? Maybe they eat the smaller females that try to spawn with their males. You’d expect a skewed ratio of kin then.

Furthermore, maybe some females are better adapted to the local environment than others. The offspring of such females would be disproportionately represented in the population due to their superior pedigree. And those that had the right combinations of adaptive genes would be even more common still.

### Take home message

Detecting kin in populations of marine organisms is not as much of a crapshoot as you might think. Your human intuition is bound to lead you astray. And your calculator alone will not be enough, because it takes more than just number crunching to understand demographic processes in the sea.