Multispecies coalescent-based phylogenetic accuracy as a function of molecular marker information content using South African Stoebe.
DNr: SNIC 2020/13-72
Project Type: SNIC Small Compute
Principal Investigator: Zaynab Shaik <zshaik@sun.ac.za>
Affiliation: Göteborgs universitet
Duration: 2020-08-30 – 2021-09-01
Classification: 10610
Keywords:

Abstract

This PhD research addresses the practical aspects of species tree inference and species delimitation using multispecies coalescent (MSC)-based methods. These methods use multilocus molecular data to jointly estimate the phylogenetic relationships among species and the entities that together form idealised Wright-Fisher species. With the advent of massively parallel sequencing technologies, genome-wide molecular data sets are becoming increasingly affordable, even for non-model organisms. The MSC-based method DISSECT/STACEY (Jones et al., 2014; Jones, 2017) scales to large molecular data sets (hundreds to thousands of loci) by replacing the standard birth-death model with a novel “birth-death-collapse” model, which circumvents the need for reversible-jump Markov chain Monte Carlo (cf. BP&P; Yang, 2015; Rannala & Yang, 2017; Flouris et al., 2019). Even so, our work on Seriphium plumosum showed that DISSECT/STACEY takes weeks on multiple processing cores to reach adequate convergence. We plan to serially sub-sample a genomic data set of 524 loci for the Seriphium plumosum species complex, a member of the South African Stoebe clade, to assess the extent to which molecular information content improves phylogenetic accuracy (posterior node support) and species delimitation. This research will (i) provide information about biases in species estimation as a function of information content, with more ramified delimitation schemes expected when more informative molecular data are used; (ii) identify the threshold beyond which additional molecular markers (and the additional computational resources they require) yield only marginal improvements in phylogenetic accuracy (the point of “diminishing returns”); and (iii) inform the selection of hybridisation probes from the Asteraceae Conserved Orthologous locus probe Set (COS II; Mandel et al., 2014) for phylogenetic reconstruction of the entire Cape paper daisy Stoebe clade (~60 species), which forms part of my PhD work.
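
The serial sub-sampling step described above could be set up along the following lines. This is a minimal sketch and not the project's actual pipeline: the locus names, the subset sizes, the nested random design, and the use of variable-site counts as a proxy for information content are all assumptions made for illustration. Only the total of 524 loci comes from the abstract.

```python
import random


def variable_sites(alignment):
    """Count alignment columns with more than one distinct A/C/G/T state.

    `alignment` maps sample names to equal-length aligned sequences.
    Gaps and ambiguity codes are ignored. Using this count as a proxy for
    information content is an assumption for illustration only.
    """
    count = 0
    for column in zip(*alignment.values()):
        states = {base.upper() for base in column} & set("ACGT")
        if len(states) > 1:
            count += 1
    return count


def serial_subsamples(locus_ids, sizes, seed=0):
    """Return nested random subsets of loci, keyed by subset size.

    The loci are shuffled once, so every smaller subset is contained in
    every larger one. The nested design is an assumption, not a detail
    given in the abstract.
    """
    if any(size < 1 or size > len(locus_ids) for size in sizes):
        raise ValueError("each subset size must be between 1 and the number of loci")
    rng = random.Random(seed)
    order = list(locus_ids)
    rng.shuffle(order)
    return {size: order[:size] for size in sorted(sizes)}


# Hypothetical locus names; only the total of 524 loci comes from the abstract.
loci = [f"locus_{i:03d}" for i in range(1, 525)]
subsets = serial_subsamples(loci, sizes=[16, 32, 64, 128, 256, 524])

for size, subset in subsets.items():
    print(size, subset[:3])
```

Each subset would then be analysed separately, and posterior node support compared across subset sizes to look for the point beyond which adding loci brings little improvement.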