Allele phasing has minimal impact on phylogenetic reconstruction from targeted nuclear gene sequences in a case study of Artocarpus

Kates HR, Johnson MG, Gardner EM, Zerega NJC, and Wickett N. American Journal of Botany 105(3): 404-416 (2018).


Premise of the Study

Untapped information about allele diversity within populations and individuals (i.e., heterozygosity) could improve phylogenetic resolution and accuracy. Many phylogenetic reconstructions ignore heterozygosity because it is difficult to assemble allele sequences and combine allele data across unlinked loci, and it is unclear how reconstruction methods accommodate variable sequences. We review the common methods of including heterozygosity in phylogenetic studies and present a novel method for assembling allele sequences from target-enriched Illumina sequencing libraries.


Methods

We performed supermatrix phylogeny reconstruction and species tree estimation of Artocarpus based on three methods of accounting for heterozygous sequences: a consensus method based on de novo sequence assembly, the use of ambiguity characters, and a novel method for incorporating read information to phase alleles. We characterize the extent to which highly heterozygous sequences impeded phylogeny reconstruction and determine whether the use of allele sequences improves phylogenetic resolution or decreases topological uncertainty.
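
To make the difference between ambiguity characters and phased alleles concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes two already-phased, aligned allele sequences for one locus and collapses them into a single sequence with IUPAC ambiguity codes at heterozygous sites. The function names `ambiguity_sequence` and `heterozygous_sites` and the example sequences are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: assumes two aligned, equal-length allele
# sequences for a single locus. Names and data are hypothetical.

# IUPAC ambiguity codes for unordered pairs of nucleotides.
IUPAC_PAIRS = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def ambiguity_sequence(allele_a: str, allele_b: str) -> str:
    """Collapse two phased alleles into one ambiguity-coded sequence."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to the same length")
    coded = []
    for a, b in zip(allele_a.upper(), allele_b.upper()):
        if a == b:
            coded.append(a)  # homozygous site: keep the shared base
        else:
            # Heterozygous site: use the two-base IUPAC code; any pair
            # not in the table (e.g. base vs. gap) falls back to "N".
            coded.append(IUPAC_PAIRS.get(frozenset((a, b)), "N"))
    return "".join(coded)


def heterozygous_sites(allele_a: str, allele_b: str) -> int:
    """Count positions at which the two alleles differ."""
    return sum(a != b for a, b in zip(allele_a.upper(), allele_b.upper()))


allele_1 = "ACGTA"
allele_2 = "ACATC"
coded = ambiguity_sequence(allele_1, allele_2)
n_het = heterozygous_sites(allele_1, allele_2)
```

The ambiguity-coded sequence records which sites vary but discards which variants co-occur on the same allele; retaining both phased sequences keeps that linkage information, which is what the phasing approach is meant to preserve.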

Key Results

We show here that it is possible to infer phased alleles from target-enriched Illumina libraries. We find that highly heterozygous sequences do not contribute disproportionately to poor phylogenetic resolution and that the use of allele sequences for phylogeny reconstruction does not have a clear effect on phylogenetic resolution or topological consistency.


Conclusions

We provide a framework for inferring phased alleles from target enrichment data and for assessing the contribution of allelic diversity to phylogenetic reconstruction. In our data set, the impact of allele phasing on phylogeny is minimal compared to the impact of using phylogenetic reconstruction methods that account for gene tree incongruence.