After field work ended the question was ‘now what?’ I have a lot of datasets for several different projects, and I needed to figure out which to tackle first. But, despite exciting observations at the end of the telemetry season, I can’t start there. It’s a really messy dataset that’s going to require a lot of tinkering and input from collaborators, and there are plans for more data collection in the spring.
I ultimately decided to dust off a dataset from last year that was collected to determine the genetic structure of brook trout populations across the Loyalsock Creek watershed. But, I hesitated to start here. It’s no secret that I struggle with even the most fundamental concepts in genetics. I have a hard time understanding things I can’t visualize, and I managed to skirt my way around taking genetics classes as an undergraduate. But, over the last few years I have grown to appreciate the questions you can answer about trout conservation through genetic studies. So, through a lot of hand-holding from my friends who study genetics and our collaborators at the U.S. Fish and Wildlife Service, I am very slowly getting there.
One of the downsides of genetics is that the topic can very quickly become technical and stray beyond the average person’s interest and understanding. But, that’s where my ignorance actually comes in handy. I don’t know enough to make the conversation technical! For the next few months I’ll be working on analyzing genetics data, and along the way I’ll try to break down all the concepts and terminology in a way that is both understandable and (hopefully) interesting.
For starters, let me go back. I am studying the population genetic structure of brook trout across the Loyalsock Creek watershed. What does that mean? Basically, I’m trying to see how genetically similar brook trout are from different locations around the watershed. We would expect that populations that are close to one another would be more similar than populations that are farther apart. This is because it’s more common for fish to move to, and then reproduce in, a neighboring stream than to make a long-distance movement to a stream many miles away. This is particularly true for species like brook trout which, compared to other species, don’t move very far. Trout also live in cold headwater streams, and warmer mainstem rivers act as barriers to movement, thereby further limiting exchange of individuals among populations.
There are many ways to measure genetic diversity, but we are doing it by looking at differences in sections of DNA called microsatellites. To explain this concept a little further (mostly to myself...the visuals help), I often show the diagram below. Basically, every tissue in an organism’s body is made up of cells. Floating around inside the nucleus of those cells are chromosomes, and every chromosome contains thousands of genes. A gene is made of DNA, and DNA codes for proteins that ultimately produce all features of an organism. Put another way, genes are like instruction manuals, and DNA the step-by-step instructions for how to assemble, in this case, a trout.
The DNA inside genes is made up of four bases (some of you may remember them: adenine (A), cytosine (C), guanine (G), and thymine (T)), which pair up to form base pairs. While some stretches of DNA code for the building blocks of proteins (for example, the three-base string TCA, which is read as UCA once the DNA is copied into RNA, codes for the amino acid ‘serine’), there are other sections of DNA that are “silent” and serve almost no biological function. One such case is microsatellites, which are short sequences of 2-5 base pairs that repeat many times in a row and do not code for a protein.
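To make “highly repetitive” a little more concrete, here is a small Python sketch that counts how many times a short motif repeats back to back in a stretch of DNA. The sequence is made up for illustration, and the function name is just something I picked; this is not how lab software reads real samples.

```python
import re

def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # (?:motif)+ matches one or more consecutive copies of the motif
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)

# Made-up sequence: some flanking DNA around five copies of a four-base motif
toy_sequence = "ATTCA" + "GCGT" * 5 + "CCATG"
print(longest_repeat_run(toy_sequence, "GCGT"))
```

Two fish that carry different numbers of repeats at the same spot would give different counts here, and that difference is the kind of variation we are after.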
Microsatellites are powerful in population genetics studies for many reasons. First, because they do not code for a specific protein or trait, they are largely unaffected by natural selection. This is important because if we were analyzing coding regions of DNA we wouldn’t know whether the absence of a DNA region was because of genetic isolation or because the environment was selecting against the trait being produced and therefore removing it from the population.
The other reason microsatellites are useful is that they are prone to mutating. Again, because microsatellites do not have a functional purpose, these mutations are not harmful. But, these mutations work to give each population its own unique “signature,” which we can track around the watershed as fish move around.
The location of a microsatellite on a chromosome is referred to as a ‘locus,’ and we analyze 12 different microsatellite loci to make inferences about population genetic diversity. Basically, we look across all loci to see how different individuals are within a population, and compare that to how different individuals are across all populations. Thankfully there are software programs to do this.
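For the curious, the “within versus across” comparison can be sketched in a few lines of Python. This is one common textbook way to do it (an F_ST-style ratio based on expected heterozygosity), not necessarily what the software we use does under the hood. The stream names and allele values are made up, and the sketch assumes one locus and equal sample sizes per stream.

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """Chance that two alleles drawn at random (with replacement) differ: 1 - sum(p_i^2)."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def fst_one_locus(populations):
    """Compare average within-population diversity (H_S) with pooled diversity (H_T)."""
    h_s = sum(expected_heterozygosity(p) for p in populations.values()) / len(populations)
    pooled = [allele for p in populations.values() for allele in p]
    h_t = expected_heterozygosity(pooled)
    # 0 means the populations look identical; values near 1 mean they share little
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Hypothetical alleles at a single locus for two streams (four allele copies each)
toy_populations = {
    "stream_a": [207, 207, 207, 211],
    "stream_b": [211, 211, 211, 207],
}
print(fst_one_locus(toy_populations))
```

The bigger that number, the more of the total variation sits between streams rather than inside them, which is another way of saying the streams are more isolated from each other.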
If you’re curious, below is what this data looks like in real life. Every row on that spreadsheet represents an individual, and the columns represent the ‘genotype,’ or genetic composition, for each microsatellite. In this case, the genotype represents the number of times the base pair sequence repeats. So, for example, the microsatellite locus B52, which is a base pair sequence of GCGT, is repeated 207 times in the first individual on that spreadsheet. You'll also see that there are two numbers for each locus because trout are diploid, meaning they get one copy of the gene from their mother and one copy from their father (just like humans). And, if you quickly glance down the spreadsheet, you see a lot of similarities because all of those fish are from the same population.
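If it helps to see that spreadsheet layout as code, here is a minimal Python sketch of one column pair: each fish gets two numbers at a locus, and from those we can tally how common each version (allele) is and how many fish carry two different versions. The fish IDs and every value other than the 207 mentioned above are made up for illustration.

```python
from collections import Counter

# Made-up genotypes at locus B52: two alleles per fish, one from each parent
genotypes = {
    "fish_01": (207, 207),
    "fish_02": (207, 211),
    "fish_03": (207, 207),
    "fish_04": (211, 215),
}

def allele_frequencies(genotypes):
    """Share of all allele copies in the sample that belong to each allele."""
    counts = Counter(allele for pair in genotypes.values() for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def observed_heterozygosity(genotypes):
    """Fraction of fish whose two alleles at this locus differ."""
    hets = sum(1 for a, b in genotypes.values() if a != b)
    return hets / len(genotypes)

print(allele_frequencies(genotypes))
print(observed_heterozygosity(genotypes))
```

Repeating this for every locus and every site is roughly the raw material that the summary statistics are built from.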
I’ve spent the last few days trying to run some summary statistics to describe the genetic diversity of all 28 sample sites and determine how similar the sites are to one another. I’ll report some of those results in the coming weeks but, for now, don’t worry if all of that seemed confusing. Just remember: we study sections of DNA called microsatellites, and the more similar the microsatellites are, the more genetically similar two fish, or two populations, are.
That’s not so bad, is it?