Cellular DNA barcoding has become a popular approach for studying the heterogeneity of cell populations and for identifying clones that respond differentially to various stimuli. However, the statistical characteristics of DNA barcode read count data in clone-tracing experiments are strongly affected by sampling bottlenecks, which arise, for instance, when the cells are treated with cytotoxic drugs. As a result, traditional methods for identifying differentially represented sequencing tags fail to control false discoveries in DNA barcoding experiments.
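The bottleneck effect described above can be illustrated with a small simulation. The sketch below is not the DEBRA algorithm and does not use the study's data. It is a toy model built on assumptions: every clone starts with the same number of cells, and each cell survives independently with a fixed probability. The function names, clone counts and survival probabilities are all hypothetical. Two replicates receive exactly the same treatment, so any difference between them is pure sampling noise.

```python
import math
import random
import statistics


def surviving_cells(rng, cells, survival_prob):
    """Draw the number of cells that pass the bottleneck (a binomial draw)."""
    return sum(rng.random() < survival_prob for _ in range(cells))


def null_log2_ratio_spread(survival_prob, n_clones=500, cells_per_clone=200, seed=0):
    """Standard deviation of per-clone log2 ratios between two identical replicates.

    No clone responds differently here, so the spread reflects sampling noise only.
    All parameter values are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_clones):
        a = surviving_cells(rng, cells_per_clone, survival_prob)
        b = surviving_cells(rng, cells_per_clone, survival_prob)
        # A pseudo-count of 1 avoids division by zero when a clone is lost entirely.
        ratios.append(math.log2((a + 1) / (b + 1)))
    return statistics.stdev(ratios)


# Mild selection: half of the cells survive. Strong selection: 2% survive.
mild = null_log2_ratio_spread(survival_prob=0.5)
strong = null_log2_ratio_spread(survival_prob=0.02)
print(f"log2-ratio spread, mild bottleneck:   {mild:.2f}")
print(f"log2-ratio spread, strong bottleneck: {strong:.2f}")
```

In this toy model the replicate-to-replicate spread is much larger under the strong bottleneck, even though no clone truly differs. A method that expects the narrower spread would therefore tend to report such clones as differentially represented.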
In a recent study published in Molecular Systems Biology (EMBO Press), the researchers used mixtures of DNA-barcoded cell pools to generate a realistic benchmark read count dataset that models a range of outcomes of clone-tracing experiments. By accounting for the statistical properties intrinsic to DNA barcode read count data, they implemented a novel algorithm named DEBRA. DEBRA achieves a significantly lower false positive rate than RNA-seq data analysis algorithms, especially when detecting differentially responding clones in experiments with strong selection pressure.
Figure: The DEBRA approach to multi-dimensional phenotypic profiling of clonal lineages in heterogeneous samples.
“Our DEBRA algorithm enables reliable detection of differentially responding clones also in applications that lead to a narrow sampling bottleneck; for instance, high doses of a drug treatment, cell sorting for rare subpopulations, or xenotransplantation”, says FIMM-EMBL PhD student Yevhen Akimov, the lead author of the study.
Building on this reliable statistical methodology, the researchers further illustrated how multi-dimensional phenotypic profiling makes it possible to deconvolute phenotypically distinct clonal subpopulations within a single cancer cell line. This approach will further extend the applications of cellular DNA barcoding for inferring clonal subpopulations in heterogeneous samples, such as those originating from cancer patients. The mixture control dataset also provides a foundation for benchmarking new algorithms for clone-tracing experiments.
“This is a great example of how combining statistical analysis with next-generation experimental assays leads to high-resolution insights into clonal dynamics and new biological discoveries”, comments FIMM group leader Tero Aittokallio, the corresponding author of the study.
DEBRA (DESeq-based Barcode Representation Analysis) is available on GitHub.
Original publication: Akimov Y, Bulanova D, Timonen S, Wennerberg K, Aittokallio T. Improved detection of differentially represented DNA barcodes for high-throughput clonal phenomics. Molecular Systems Biology, DOI: 10.15252/msb.20199195
The DEBRA tool was developed as a collaboration between researchers at FIMM and BRIC (Biotech Research and Innovation Centre, University of Copenhagen; group of Prof. Krister Wennerberg). The next-generation sequencing data for the benchmarking studies were generated at the FIMM Sequencing Unit (Pekka Ellonen) and at the DNA Sequencing and Genomics Laboratory (BIDGEN) on the Viikki campus (Lars Paulin). This work is part of the HERCULES project under the European Union’s Horizon 2020 research and innovation programme (grant 667403).
Further information:
Yevhen Akimov, FIMM-EMBL PhD student
Institute for Molecular Medicine Finland FIMM, HiLIFE, University of Helsinki
E-mail: yevhen.akimov@helsinki.fi
Professor Tero Aittokallio
Institute for Molecular Medicine Finland FIMM, HiLIFE, University of Helsinki
E-mail: tero.aittokallio@helsinki.fi