(This exercise is based on Siljic, M. et al. 2017. Forensic application of phylogenetic analyses – Exploration of suspected HIV-1 transmission case. Forensic Science International: Genetics 27: 100–105.)
(Note: The reference above links directly to the article on the journal’s website. In order to access the full text of the article, you may need to be on your institution’s network [or logged in remotely], so that you can use your institution’s access privileges.)
Phylogenetic analysis is useful for more than inferring the histories of living organisms or tracking the spread of diseases. It can also be used in forensic science, the application of science to the investigation of crimes and other legal matters. More and more individuals have been convicted or exonerated of crimes based on the phylogenetic analysis of DNA.
Several forensic applications of phylogenetic analysis of DNA sequences involve human immunodeficiency virus-1 (HIV-1), the primary virus that causes AIDS. It is typically transmitted through the exchange of bodily fluids, including through sexual contact. HIV-1 evolves very rapidly, and substantial evolutionary change can be observed within a single human individual.
In a case from Belgrade, Serbia, three individuals were HIV positive. Subjects 1 and 2 were, respectively, a man and a woman who had been married for more than 15 years. Subject 3 was a woman who was a long-term sexual partner of Subject 1. Subject 1 brought a lawsuit against Subject 3 for knowingly infecting him and not disclosing that she had HIV. Can phylogenetic analysis shed light on what happened?
Researchers at the University of Belgrade School of Medicine collected samples from the three subjects as well as other individuals who were known to be infected, from both the local area and from outside the region. From the viral samples, they sequenced the pol and the env genes, which encode the polymerase and viral envelope proteins respectively. They then performed maximum likelihood and other phylogenetic analyses on the sequences.
Figure 1 shows the maximum likelihood analysis of the pol sequences. The query sequences (in dark blue) are those from the three subjects. The local control sequences (in light blue) are sequences from infected individuals in the local area. The background controls (in black) are from infected individuals from all over the world.
Question 1. Is the group of the queried sequences monophyletic, paraphyletic, or polyphyletic? Explain.
Question 2. Based on your answer to Question 1, what inference can be drawn about how the three subjects were infected?
Question 3. What can be inferred about the Serbian infected population from the observed phylogenetic patterns?
Figure 2 shows the maximum likelihood analysis of the env sequences. The color coding that was used in Figure 1 is used here.
Question 4. Do the env sequences show a similar pattern to the pol sequences?
Figure 3 shows a maximum likelihood phylogeny of the pol sequences from the three subjects (see Figure key) and local control sequences. At the bottom is a timeline.
Question 5. Do any of the individual subjects have sequences that form a monophyletic group? Explain.
Question 6. Which individual has the most phylogenetically diverse sequences?
Question 7. What inference about the transmission of HIV-1 among the subjects is most likely? Why?
Question 8. Based on the timeline at the bottom, what inference can we make about when Subject 1 was infected? Explain.
Question 9. What assumption are we making in Question 8 regarding the dating of the infection of Subject 1?
Question 10. Recall that Subject 1 brought a lawsuit against Subject 3 for knowingly infecting him. What type of result would have supported that claim?
Question 11. The researchers used maximum likelihood methods in their analysis. Why do you think they would have used maximum likelihood over parsimony analysis?
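(Optional aside: The idea of a monophyletic group, used in Questions 1 and 5, can be made concrete with a few lines of code. The sketch below is only an illustration and is not the method used by the researchers. It assumes a rooted tree written as nested Python tuples, and the tree and all sequence names in it are hypothetical; they are not taken from the figures. A set of sequences is treated as monophyletic if some clade contains exactly those sequences and no others.)

```python
# A toy rooted tree written as nested tuples; each string is a tip (sequence).
# The tree and all tip names are hypothetical and are NOT from the article.
toy_tree = ((("q1", "q2"), "local_a"), (("q3", "local_b"), "background_x"))


def leaves(node):
    """Return the set of tip names found under a node."""
    if isinstance(node, str):
        return {node}
    tips = set()
    for child in node:
        tips |= leaves(child)
    return tips


def is_monophyletic(node, group):
    """Return True if some clade under `node` contains exactly the tips in `group`."""
    group = set(group)
    if leaves(node) == group:
        return True
    if isinstance(node, str):
        return False
    # Only a child clade that contains the whole group can hold a matching clade.
    return any(
        is_monophyletic(child, group)
        for child in node
        if group <= leaves(child)
    )


print(is_monophyletic(toy_tree, {"q1", "q2"}))
print(is_monophyletic(toy_tree, {"q1", "q2", "q3"}))
```

In this toy tree, q1 and q2 share a clade that contains nothing else, so the first check succeeds. The smallest clade that contains q1, q2, and q3 also contains other tips, so the second check fails. The same reasoning can be applied by eye to the trees in the figures.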