Exercise 18.1

The Biogeography of Riverine Midges: History and Ecology Shape the Genetics of a Species

(This exercise is based on Werle, S. F. 2005. Populations of a Connecticut River midge structured by geological history and downstream gene flow. Chromosome Research 13: 97–106.)

(Note: The reference above links directly to the article on the journal’s website. In order to access the full text of the article, you may need to be on your institution’s network [or logged in remotely], so that you can use your institution’s access privileges.)


In Chapter 18 you read about two broad categories of biogeography, historical and ecological. The former refers to biogeographical patterns in the distribution of a taxon or taxa that are a result of long-term historical processes such as continental drift or glaciation. The latter term, ecological biogeography, refers to patterns attributable to ongoing ecological processes such as gene flow, competition or resource/habitat availability. This paper presents an example where both historical and ecological biogeographic factors contribute to the distribution of a single species.

The author studied 15 populations of a fly that spends its larval life living in submerged clays in the Connecticut River in the New England region of the United States of America. These clays are exposed only in isolated areas of the river, and are lacking or buried in the river reaches that separate clay exposures. As a result, populations of the fly, a chironomid midge, are semi-isolated from each other and gene flow between populations is reduced. The map below shows the locations of the 15 clay exposure sites that were sampled by collecting midge larvae from the clay.

Sample sites in the Connecticut River

Another important characteristic of these midges is that they are polymorphic for a number of chromosomal mutations. By observing the chromosomes, the author was able to calculate the frequencies of four chromosomal inversions and one insertion or deletion (indel) at each of the sample sites. Figure 1 shows the frequencies of these mutations in each of the populations (sites) sampled, plotted against the distance from the mouth of the river in Long Island Sound (this distance is referred to as river kilometer).

Having these chromosomal mutation frequencies allows the calculation of a genetic distance index for pairwise comparisons of the different sites. These data are shown in Figure 2. As you can see, genetic distance appears to be positively correlated with geographical separation in these midge populations.
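To make the idea of a pairwise genetic distance concrete, here is a minimal sketch of one such index. It assumes that the index plotted in Figure 2 is the DA distance of Nei and colleagues, DA = 1 - (1/r) Σ Σ sqrt(x·y), summed over alleles within each of r loci, and that each chromosomal mutation can be treated as one locus with two states (mutant and standard). The function name and all frequencies below are hypothetical and are not taken from the paper.

```python
import math

def nei_da(freqs_x, freqs_y):
    """Nei's D_A between two populations.

    freqs_x, freqs_y: lists of mutation frequencies, one per locus.
    Assumption: each locus has two states, mutant (p) and standard (1 - p).
    """
    if len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same number of loci")
    total = 0.0
    for p, q in zip(freqs_x, freqs_y):
        # Sum sqrt(x * y) over the two states at this locus.
        total += math.sqrt(p * q) + math.sqrt((1 - p) * (1 - q))
    return 1.0 - total / len(freqs_x)

# Hypothetical frequencies for five mutations at three sites.
site_a = [0.10, 0.50, 0.00, 0.30, 0.20]
site_b = [0.15, 0.45, 0.00, 0.35, 0.25]   # similar to site_a
site_c = [0.80, 0.10, 0.40, 0.30, 0.90]   # very different from site_a

print(round(nei_da(site_a, site_b), 4))
print(round(nei_da(site_a, site_c), 4))
```

Two populations with identical frequencies have a distance of zero, and two populations fixed for different states at every locus have a distance of one, so similar sites score lower than dissimilar ones.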

The next thing we have to consider is the geological history of the river within which these midges live. The Connecticut River is the largest river in New England and its recent geological history is dominated by the presence and subsequent retreat of the Laurentide Ice Sheet. This was an enormous sheet of ice that covered almost all of what is now Canada and large parts of what is now the northern United States during the most recent ice age. Much of the Connecticut River valley was buried under a layer of ice more than a kilometer thick from about 90,000–100,000 years ago until about 20,000 years ago, when that ice began to melt. Water from the melting ice was dammed by glacial moraine deposits, forming two lakes in the valley that existed for thousands of years after the glaciers retreated. These lakes are today known as ancient Lake Hitchcock and Lake Coös. Though they no longer exist, having drained when the ancient moraine dams failed, the clay deposits that exist in the riverbed today were laid down during the lifetimes of these lakes, and provide incontrovertible evidence of their existence.

If you look at Figure 3 you can see the locations of both lakes overlaid on the modern map of the area. Lake Coös was much smaller than Lake Hitchcock and is almost covered by the three northernmost green dots in the figure. Also shown in Figure 3 is the average physical distance and the average genetic distance between the northernmost Lake Hitchcock sample (Orford, NH; abbreviated ORF in the map above) and the three northernmost and the three southernmost samples in the study.

Another calculation that can be made using mutation frequency data such as those collected for this study is the construction of a graphical representation of the relationship between populations. The method that the author used for this is called UPGMA, for Unweighted Pair Group Method with Arithmetic mean. This is a simple algorithm that clusters populations from the bottom up, starting from a matrix of pairwise distances and assuming a constant molecular clock, and produces a rooted tree. UPGMA is not the best method for inferring phylogenies, because the assumption of a constant molecular clock is often violated (see Chapter 2), but it is a good method for building a phenetic tree. A phenetic tree is a tree showing relationships based on phenotype or morphology, which is essentially what chromosomal mutations are, because they are large-scale, observable characteristics of individual larvae. In this special case, the phylogenetic tree matches the phenetic tree because the phenotype being observed exactly matches the genotype.
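The bottom-up clustering described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the author's analysis: the site names and distances are hypothetical, the tree is returned as nested tuples of the form (left, right, height), and no bootstrap support values are computed.

```python
from itertools import combinations

def upgma(labels, dist):
    """Cluster populations by UPGMA.

    labels: list of population names.
    dist: dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a nested tuple (left, right, height); leaves are plain strings.
    """
    clusters = {name: 1 for name in labels}  # cluster -> number of leaves
    d = dict(dist)                           # working copy of the distances
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        # The new node sits at half the distance between the two clusters.
        merged = (a, b, d[frozenset((a, b))] / 2)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to every other cluster is the size-weighted mean,
        # which equals the average over all pairs of original leaves.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

# Hypothetical distances: two northern sites (N1, N2), two southern (S1, S2).
pairs = {
    ("N1", "N2"): 0.02, ("S1", "S2"): 0.04,
    ("N1", "S1"): 0.30, ("N1", "S2"): 0.32,
    ("N2", "S1"): 0.28, ("N2", "S2"): 0.30,
}
dist = {frozenset(k): v for k, v in pairs.items()}
tree = upgma(["N1", "N2", "S1", "S2"], dist)
print(tree)
```

With these made-up distances the two northern sites join first, then the two southern sites, and the two groups meet only at the root. Because every node height is half of an average distance, all leaves end up equally far from the root, which is where the constant molecular clock assumption enters.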

Figure 4 shows a UPGMA tree of the relationships between the 15 populations that were sampled for this study. As you can see, there is strong support for the separation between populations from the two ancient lakes. Another, more subtle result that can be seen in this figure has to do with the nature of the relationships between the southern (Lake Hitchcock) populations. Although the bootstrap support for this isn’t very strong, it appears that the oldest populations are at the northern end and that the populations become progressively younger as you move south. The author proposes that this is an artifact introduced by the unique ecology of the system.

What seems to be happening is this: adult midges are very short-lived and complete their life cycle (mate, lay eggs, and die) very soon after emerging from the water at their respective site on the river. They are very unlikely to move between sites, especially sites separated by any distance. The midges spend most of their one-year lives as larvae, burrowed into the exposed clay deposits at the bottom of the river. In the spring floods that occur every year, the clay deposits, which have been greatly weakened by the holes burrowed by the midge larvae, are washed away by the river current. Huge numbers of larvae are also washed downstream in these floods, and while many are undoubtedly eaten by predators that await this yearly event, some are able to dig back into any clay that they encounter downstream. What this means is that gene flow in this system is directional, with genes moving between populations only in a downstream direction. This creates the artifact in Figure 4 that makes it appear that the oldest populations are in the north. In a phylogenetic tree showing relationships between separate taxa, genes only move “downstream” from ancestors to descendants, and thus the oldest taxa are closest to the root of the tree. The tree in Figure 4 has a similar shape, but for a different reason: not the age of the populations, but the direction of gene flow between them.


Figure 1 Chromosomal mutation frequencies versus river distance from Long Island Sound.


Question 1. Refer to Figure 1 above. At about what river kilometer does the most drastic change in mutation frequencies occur?


Question 2. Is the mutation designated Gindel more common in the north or the south?


Question 3. Is the mutation designated G2-7 ever observed north of river kilometer 425?


Question 4. Is the mutation designated C1-6 more common in the north or the south? About what percent of larvae carry this mutation where it is most common?

Figure 2 Nei’s modified genetic distance (DA) plotted against geographic separation in river-kilometers for all pairwise comparisons between Axarus sp. varvestris populations. Data points are differentiated as “within lake” or “between lake,” depending upon the origin of the clays where larvae were collected.


Question 5. Refer to Figure 2 above. Do all of the data points seem to fit a pattern or are some data points outliers?


Question 6. If you ignore any outliers, what is the pattern seen in Figure 2?


Question 7. Thinking about the outliers in Figure 2, fill in the blank in the following statement: The outlying data points in Figure 2 are anomalous because their __________.

Figure 3 Sampling locations with the locations of the two ancient lakes overlaid in blue. The location of Lake Coös is just visible under the three green dots at the northernmost collection sites. Shown to the right are the average genetic and physical distances between the Orford, NH population and the three northernmost and the three southernmost populations respectively.


Question 8. Refer to Figure 3 above. How does the average genetic distance between the Orford, NH population (ORF) and the three southernmost populations (the south) compare with the genetic distance between ORF and the three northernmost populations (the north)?


Question 9. How does the average physical distance between ORF and the south compare with the physical distance between ORF and the north?


Question 10. Can you see any explanation for the outlying data points in Figure 2?


Question 11. Which figure or figures in this paper relate most to the historical biogeography of the Axarus sp. midges?

Figure 4 UPGMA tree showing the relationships between the 15 sampled populations. The ancient lake locations are separated by color (yellow = Lake Coös, green = Lake Hitchcock). Numbers at the nodes are bootstrap values.


Question 12. Which figure or figures in this paper relate most to the ecological biogeography of the Axarus sp. midges?


When reading questions 13–16, bear in mind that references to the lakes actually pertain to the clay deposits left by the lakes rather than the lakes themselves, which no longer exist.


Figure 5 Modified version of Figure 2.


Question 13. Refer to Figure 5 above. In this figure, linear regression lines with their respective formulae have been added for both data series. Based on these regressions, what is the best estimate that you can propose for the genetic distance (Nei’s DA) between the two lakes?


Question 14. Again based on the regressions, what would be the genetic distance (Nei’s DA) between two hypothetical populations from within either lake if they were separated by 1000 kilometers?


Question 15. What if those two hypothetical populations were each from a different lake? What would the genetic distance (Nei’s DA) be if they were separated by 1000 kilometers?


Question 16. Based on the evidence presented in this exercise, do you think that midges lived in the ancient lakes when these lakes existed? Can you think of other possible explanations for the patterns in these data?