0:00

Hello and welcome back to Introduction to Genetics and Evolution.

We've been talking about the Hardy-Weinberg Equilibrium and how,

under certain conditions and only under those conditions, you can calculate or

estimate expected genotype frequencies from observed allele frequencies.

The opposite is always true.

You can always know allele frequencies from genotype frequencies.

But you cannot always infer genotype frequencies from allele frequencies.

Now, in the last example I showed you in the last video, we saw a case where there were

fewer heterozygotes, fewer of the Aa individuals,

observed relative to what was expected from Hardy-Weinberg.

Now why might that be?

Well, this is gonna be the first of many possible deviations from Hardy-Weinberg

that we'll discuss.

And it could be what's often referred to as the Wahlund Effect.

Well, let's look at a little bit of real data.

Â 0:54

Here's some real data from a Navajo population for the MN blood group.

MN is just like big A and little a: there are two alleles, M and N.

There are three possible genotypes:

MM, MN, and NN.

So let's take a look at what we see when we look at the Navajo of this

particular group.

Well, we can do the same tests for Hardy-Weinberg we've done before.

We figure out the total number of individuals, 361 in this case.

Get the genotype frequencies, the true observed genotype frequencies.

From them, say all of these plus half of these, we get the allele frequencies.

So the frequency for big M is 0.917, for big N it's 0.083.

Take this squared, 0.841.

2pq, so two times this times this,

0.152.

0.083 squared is 0.007.

Now we see this population's not absolutely at Hardy-Weinberg,

but it's very close to the Hardy-Weinberg predicted frequencies.
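
As a rough sketch of the arithmetic just described, the Python below computes observed genotype frequencies, allele frequencies, and Hardy-Weinberg expectations from genotype counts. The function name `hardy_weinberg` and the counts (305 MM, 52 MN, 4 NN) are assumptions: the transcript gives only the total of 361 and the resulting frequencies, and these counts were chosen to be consistent with them.

```python
def hardy_weinberg(n_mm, n_mn, n_nn):
    """Return (observed genotype freqs, allele freqs, HW-expected genotype freqs)."""
    total = n_mm + n_mn + n_nn
    observed = (n_mm / total, n_mn / total, n_nn / total)
    # Allele frequency: all of the homozygotes plus half of the heterozygotes.
    p = observed[0] + observed[1] / 2
    q = 1 - p
    expected = (p ** 2, 2 * p * q, q ** 2)
    return observed, (p, q), expected


# Assumed counts, chosen to match the total of 361 quoted in the lecture.
observed, (p, q), expected = hardy_weinberg(305, 52, 4)
print("observed:", [round(x, 3) for x in observed])
print("p(M), q(N):", round(p, 3), round(q, 3))
print("expected:", [round(x, 3) for x in expected])
```

With these assumed counts the allele frequencies and expected genotype frequencies round to the values read out above, and the observed and expected columns sit close to each other.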

So let's look at another population now. Let's look at these Aborigines.

Now again, if we look at the MN blood group in these individuals,

let's follow the same procedure.

I won't go through all the steps, but if you want some practice you can

pause the slide, take this first set of numbers, and go through it yourself.

We come back again to a set of predicted genotype frequencies,

and they're very close to those observed.

0.031 is very close to 0.030.

0.293 versus 0.296. 0.676 versus 0.674.

All of the observed frequencies are within 0.003 of the expected genotype frequencies.

Â 2:59

Well, when we put the two populations together, we get these allele frequencies for M and

N, but we get something that's deviating rather dramatically

from Hardy-Weinberg expectations of the genotype frequencies.

Look at that.

This is dramatically different, this is dramatically different,

and this one as well.

You notice especially, and this is what I want to point out in particular,

we expect almost half the individuals to be heterozygotes.

We observe only about a quarter of individuals as heterozygotes.

So there's a dramatic deficit of heterozygotes in the observed

relative to the expected.

That's like the last example from the last video.
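
To make the pooling step concrete, here is a minimal Python sketch that compares observed and Hardy-Weinberg-expected heterozygote frequencies within each population and in the pooled sample. The genotype counts are assumptions (they are not in the transcript) picked to be roughly consistent with the frequencies quoted in the lecture, and the helper name `het_observed_and_expected` is made up for this illustration.

```python
def het_observed_and_expected(n_mm, n_mn, n_nn):
    """Return (observed, HW-expected) heterozygote frequency for genotype counts."""
    total = n_mm + n_mn + n_nn
    p = (2 * n_mm + n_mn) / (2 * total)  # frequency of allele M
    return n_mn / total, 2 * p * (1 - p)


# Assumed (MM, MN, NN) counts; not taken from the transcript.
navajo = (305, 52, 4)
aborigine = (22, 216, 492)
pooled = tuple(a + b for a, b in zip(navajo, aborigine))

for name, counts in [("Navajo", navajo), ("Aborigine", aborigine), ("pooled", pooled)]:
    obs, exp = het_observed_and_expected(*counts)
    print(f"{name:10s} observed het = {obs:.3f}, expected het = {exp:.3f}")
```

Under these assumed counts, each population on its own is close to expectation, while the pooled sample expects almost half heterozygotes and observes only about a quarter.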

Now why might we see this?

Why do we see this deviation?

We have a Hardy-Weinberg population and another Hardy-Weinberg population.

Why is it that when we put them together, it's not at Hardy-Weinberg?

What assumption have we deviated from?

Well, one big assumption we deviated from, from the list I showed you earlier,

was the assumption of random mating.

The idea is that any two individuals are as likely to breed as

any other two individuals.

Remember from the first video in the series,

the gametes just floating all around.

It's not like that here, because the Navajo lady is not

as likely to breed with an Aborigine as she is with another Navajo.

So imagine that in one population big A is abundant.

Then big A's are gonna be very likely to encounter other big A's.

In the other population, little a's are very abundant.

Little a's are gonna be very likely to encounter other little a's.

But big A's and little a's are very unlikely to encounter each other. That's

why you see this deficit of heterozygotes.

And in this regard the Hardy-Weinberg assumption was violated.

Hardy-Weinberg was not rejected within the Navajo or

within the Aborigines, but

the combined population deviates from it, and this results from nonrandom mating.

And importantly, this will very typically result in having too few heterozygotes.

We expected about half.

We observed about a quarter.

This pattern is referred to as the Wahlund effect.

This is when you sample across populations.

So the individuals within each population may have random mating.

But when you sample across or between populations you get

an under-representation of heterozygotes relative to Hardy-Weinberg.
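
One way to see why pooling pushes in this direction is a short derivation. It assumes, as a simplification, two subpopulations that are each exactly at Hardy-Weinberg, with allele frequencies p_1 and p_2, making up fractions w_1 and w_2 of the pooled sample (w_1 + w_2 = 1). These symbols are introduced here for illustration and are not from the lecture.

```latex
\begin{align*}
\bar{p} &= w_1 p_1 + w_2 p_2 \\
H_{\text{obs}} &= \sum_i w_i \, 2 p_i (1 - p_i) = 2\bar{p} - 2 \sum_i w_i p_i^2 \\
H_{\text{exp}} &= 2 \bar{p} (1 - \bar{p}) = 2\bar{p} - 2\bar{p}^2 \\
H_{\text{exp}} - H_{\text{obs}} &= 2 \Big( \sum_i w_i p_i^2 - \bar{p}^2 \Big) = 2 \operatorname{Var}(p) \ge 0
\end{align*}
```

So under these assumptions the heterozygote deficit is twice the variance in allele frequency among the subpopulations: zero when they share the same allele frequency, and larger the more they differ.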

So this is a way of potentially identifying different populations.

You can measure how much of this deviation you see.

We'll use that, actually,

in a subsequent video for calculations that are referred to as FST.

But let me ask you a different question first.

Why does it matter?

Why does it matter if something's at Hardy-Weinberg or not?

Well, in fact, the first step in genome-wide association studies for

genetic diseases, or any trait, is, or should be, to test for Hardy-Weinberg.

Now why is that?

Well, actually, genome-wide association studies assume Hardy-Weinberg is true, or

assume that you're very, very close to it.

Basically you're assuming that there's linkage disequilibrium,

linkage disequilibrium caused by close proximity between marker alleles and

disease-causing alleles.

Remember that the fundamental purpose of all genetic

mapping is to see an association between genotype and phenotype.

And we're hoping this association is from close proximity or lack of recombination.

So imagine that you see something like this,

where 20% of individuals with AA genotypes have a disease,

and 5% of individuals with aa genotypes have a disease.

Then we're assuming there's an association between the A allele, or

the AA genotype, or the A marker gene more broadly, and the disease.

But just being in different populations also causes linkage disequilibrium.

Let me give you an extreme example to illustrate this point.

Let's imagine that in Population 1, every individual is AA.

Okay?

Â 6:59

The answer is yes, you would say this, just because of the way I laid it out.

Now this is actually a fake LD between the disease and the gene,

because the disease may not be on the same chromosome as the A gene in particular,

and in fact, the disease may not even be genetic.

Let's say, for example, in population one everybody eats a lot, and

in population two everybody has very good weight.

You may see obesity is much more abundant in population one than population two, but

it may have nothing to do with your genotype at the A gene.

So it's really important that you

have true Hardy-Weinberg in doing these genome-wide association studies.

Otherwise the associations you see may have nothing to do with the genotypes

you're observing,

and the disease may in fact not even be genetic at all.

Now the punchline is: if there are allele frequency differences between

populations at a SNP, which is very often true

(let's say the SNP is being used as a marker),

and if disease incidence differences

exist between the two populations you're studying, which again is very often true,

sometimes for genetic reasons, sometimes not, but it may not have anything to

do with the particular marker or anything near that marker you're looking at,

then a genome-wide association study will erroneously

make it seem that a gene near the SNP is causing or contributing to the disease.

Now, if you test for Hardy-Weinberg, then you can avoid this error because you can

identify whether you're looking at one interbreeding population or not.

Your hope is that the population is at Hardy-Weinberg or is very, very,

very close to being at Hardy-Weinberg.

And then if you see an association, you know it's not this weird bias.
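
The bias described here can be illustrated with a small calculation. The sketch below uses entirely hypothetical numbers: two equally sized populations, each at Hardy-Weinberg, with different frequencies of the A allele and different disease risks, where the risk inside each population does not depend on genotype at all. The function name `pooled_risk_by_genotype` and every parameter value are assumptions made for this example.

```python
# Hypothetical populations: disease risk is the SAME for every genotype
# within a population, so the marker has no real effect on the disease.
populations = [
    {"size": 1000, "p_A": 0.9, "risk": 0.20},
    {"size": 1000, "p_A": 0.1, "risk": 0.05},
]


def pooled_risk_by_genotype(pops):
    """Expected disease risk per genotype when the populations are pooled."""
    cases = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    totals = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    for pop in pops:
        p = pop["p_A"]
        q = 1 - p
        # Genotype frequencies within each population follow Hardy-Weinberg.
        genotype_freqs = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
        for genotype, freq in genotype_freqs.items():
            n = pop["size"] * freq
            totals[genotype] += n
            cases[genotype] += n * pop["risk"]
    return {g: cases[g] / totals[g] for g in totals}


risk = pooled_risk_by_genotype(populations)
for genotype in ("AA", "Aa", "aa"):
    print(f"{genotype}: pooled disease risk = {risk[genotype]:.3f}")
```

In the pooled sample the AA individuals show a much higher disease rate than the aa individuals, even though genotype has no effect on disease within either population. The apparent association comes only from mixing the two groups.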

Â 8:31

Now, although it's very important to test for

Hardy-Weinberg, this is often not done.

Here are excerpts from two studies from not too long ago.

This is from the American Journal of Epidemiology, 2006.

The exclusion of studies in which Hardy-Weinberg was violated changed

the conclusions and

changed the statistical significance of gene-disease associations.

That's scary.

Think about it. Millions of dollars go into finding

these gene-disease associations.

We really need to be careful and know that we're doing them right.

Here's something from the European Journal of Human Genetics in 2005.

Testing and reporting for Hardy-Weinberg equilibrium is often neglected, and

deviations are rarely admitted in the published reports.

So this is a really big deal.

There are other issues about interpreting the deviations from Hardy-Weinberg.

Let me show you an example here.

So this is a real example where a Hardy-Weinberg test was done, but

interpreted incorrectly.

This is raw data from a 2000 study of BRCA2 variants.

These are from newborn males from a hospital in the United Kingdom.

Just as a little test here, I want you to do the math for

this and figure out how close this is to Hardy-Weinberg expectations.

Do you see a particular kind of deviation?

So try that out.

Â 9:41

Well, I hope that wasn't too hard. Let me go ahead and show you the answers.

These are the numbers you should have come up with. So these are the true

genotype frequencies.

These are the true allele frequencies.

These are the Hardy-Weinberg expected genotype frequencies.

What we see here is our expected frequency of the heterozygote is 0.4,

our observed was 0.36.

So there is, or there at least seems to be, some slight deviation from

Hardy-Weinberg, and in this direction of too few heterozygotes.

This was, by the way, statistically significant, too.
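
For readers wondering how "statistically significant" might be checked, one common approach is a chi-square goodness-of-fit test with one degree of freedom. The sketch below is an illustration only: the transcript does not say which test the study used, and the counts are hypothetical (not the study's data), chosen so that observed heterozygotes are 36% against roughly 40% expected in an assumed sample of 1000. The function name `hwe_chi_square` is likewise made up here.

```python
import math


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test against Hardy-Weinberg (1 degree of freedom)."""
    total = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * total)
    q = 1 - p
    expected = (total * p * p, total * 2 * p * q, total * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts (NOT the study's data); sample size of 1000 is assumed.
chi2, p_value = hwe_chi_square(540, 360, 100)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}")
```

Whether a given heterozygote deficit comes out significant depends heavily on sample size, so the same 0.36 versus 0.4 gap could be non-significant in a much smaller sample.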

Interestingly, how did the authors interpret this?

The authors of the study interpreted it as meaning that the Aa individuals are less

healthy than AA or aa.

They postulated that maybe there was some disease or

Â 10:26

there was some problem associated with Aa individuals.

In fact, there is a much simpler explanation:

we're looking at newborn males in a hospital in the United Kingdom.

Imagine this is a hospital in a place like London.

It's quite likely that in a place like London,

there's a lot of subdivision of the population.

People of, say,

Indian descent are probably more likely to have kids with others of Indian descent.

People from the Far East may be more likely to have kids with people from

the Far East.

People who are of European descent are probably more likely to have kids with

others of European descent.

Yes, there are cases where people will have kids with people from other ethnic groups.

But with this tendency across the overall population,

which undoubtedly exists, you will get exactly this pattern.

Basically, a simpler explanation than the Aa individuals being less healthy

is that there's a little bit of a Wahlund effect there, but that wasn't considered.

They probably didn't take our pop gen class.

Â 11:28

This is a quote from Hardy's 1940 book.

This is when he was about 62.

His book is called A Mathematician's Apology.

I definitely recommend it.

It's a very interesting read about the elegance and beauty of math.

He said, "I have never done anything useful.

No discovery of mine has made or is likely to make, directly or

indirectly, for good or ill, the least difference to the amenity of the world."

I would like to say very strongly, he was very wrong on this.

This Hardy-Weinberg idea, which he helped bring about and

he helped popularize, really has made a huge impact.

We're continuing to see it now, literally more than a hundred

years after the original publications of the Hardy-Weinberg work.

We still see these applications for things like genome-wide association studies,

the kinds of things he never would have imagined.