Mathematical & Computational Biology

Cancer phylogenetic tree inference at scale from 1000s of single cell genomes

10.24072/pcjournal.292 - Peer Community Journal, Volume 3 (2023), article no. e63.

A new generation of scalable single-cell whole genome sequencing (scWGS) methods allows the evolutionary dynamics of cancer cell populations to be measured at unprecedented resolution. Phylogenetic reconstruction is central to identifying sub-populations and distinguishing the mutational processes that gave rise to them. Existing phylogenetic tree-building models do not scale to the tens of thousands of high-resolution genomes achievable with current scWGS methods. We constructed a phylogenetic model and associated Bayesian inference procedure, sitka, specifically for scWGS data. The method is based on a novel phylogenetic encoding of copy number (CN) data, the sitka transformation, that simplifies the site dependencies induced by rearrangements while still forming a sound foundation for phylogenetic inference. The sitka transformation allows us to design novel scalable Markov chain Monte Carlo (MCMC) algorithms. Moreover, we introduce a novel point mutation calling method that incorporates the CN data and the underlying phylogenetic tree to overcome the low per-cell coverage of scWGS. We demonstrate our method on three single-cell datasets, including a novel PDX series, and analyse the topological properties of the inferred trees. Sitka is freely available at
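To make the idea of a phylogenetic encoding of CN data concrete, here is a minimal sketch of one plausible encoding. It assumes, as a simplification not stated in the abstract, that each marker records whether a cell's copy number changes between two adjacent genomic bins. The function name `changepoint_markers` and the toy matrix are hypothetical and are not taken from the sitka software.

```python
import numpy as np


def changepoint_markers(cn):
    """Hypothetical helper: encode copy-number profiles as binary markers.

    cn: (cells x bins) integer copy-number matrix, with bins ordered
        along a single chromosome.
    Returns a (cells x (bins - 1)) 0/1 matrix whose entry [c, j] is 1
    when cell c's copy number differs between bin j and bin j + 1.
    """
    cn = np.asarray(cn)
    # A non-zero difference between neighbouring bins marks a change point.
    return (np.diff(cn, axis=1) != 0).astype(int)


# Toy data (made up for illustration): 3 cells, 5 bins.
cn = np.array([
    [2, 2, 3, 3, 2],  # gain spanning bins 2-3
    [2, 2, 2, 2, 2],  # no copy-number change
    [2, 4, 4, 4, 2],  # gain spanning bins 1-3
])
markers = changepoint_markers(cn)
print(markers)
```

In this toy example the first and third cells share the change point between the last two bins, which is the kind of shared binary character a tree-building method could exploit. The actual transformation and its error model are defined in the article itself.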

DOI: 10.24072/pcjournal.292
Keywords: Phylogenetics, Cancer evolution, Bayesian statistics, MCMC, Copy number evolution, PDX, Triple negative breast cancer
Salehi, Sohrab 1; Dorri, Fatemeh 2; Chern, Kevin 3; Kabeer, Farhia 4; Rusk, Nicole 1; Funnell, Tyler 1; Williams, Marc J. 1; Lai, Daniel 4, 5; Andronescu, Mirela 4, 5; Campbell, Kieran R. 6, 7, 8; McPherson, Andrew 1; Aparicio, Samuel 4, 5; Roth, Andrew 2, 4, 5; Shah, Sohrab P. 1; Bouchard-Côté, Alexandre 3

1 Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, USA
2 Department of Computer Science, University of British Columbia, Canada
3 Department of Statistics, University of British Columbia, Canada
4 Department of Pathology and Laboratory Medicine, University of British Columbia, Canada
5 Department of Molecular Oncology, BC Cancer Research Centre, Canada
6 Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Canada
7 Department of Molecular Genetics, University of Toronto, Canada
8 Department of Statistical Sciences, University of Toronto, Canada
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
Salehi, Sohrab; Dorri, Fatemeh; Chern, Kevin; Kabeer, Farhia; Rusk, Nicole; Funnell, Tyler; Williams, Marc J.; Lai, Daniel; Andronescu, Mirela; Campbell, Kieran R.; McPherson, Andrew; Aparicio, Samuel; Roth, Andrew; Shah, Sohrab P.; Bouchard-Côté, Alexandre. Cancer phylogenetic tree inference at scale from 1000s of single cell genomes. Peer Community Journal, Volume 3 (2023), article no. e63. doi: 10.24072/pcjournal.292.

Peer reviewed and recommended by PCI: 10.24072/pci.mcb.100112

Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

