A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies
Authors
Yu, Tianxiong
Huang, Xiao
Dou, Shengqian
Tang, Xiaolu
Luo, Shiqi
Theurkauf, William E.
Lu, Jian
Weng, Zhiping
UMass Chan Affiliations
Program in Molecular Medicine
Program in Bioinformatics and Integrative Biology
Document Type
Journal Article
Publication Date
2021-01-28
Keywords
Computational Methods
Genomics
Bioinformatics
Computational Biology
Genetics and Genomics
Nucleic Acids, Nucleotides, and Nucleosides
Abstract
Transposons are genomic parasites, and their new insertions can cause instability and spur the evolution of their host genomes. Rapid accumulation of short-read whole-genome sequencing data provides a great opportunity for studying new transposon insertions and their impacts on the host genome. Although many algorithms are available for detecting transposon insertions, the task remains challenging and existing tools are not designed for identifying de novo insertions. Here, we present a new benchmark fly dataset based on PacBio long-read sequencing and a new method, TEMP2, for detecting germline insertions and measuring de novo 'singleton' insertion frequencies in eukaryotic genomes. TEMP2 achieves high sensitivity and precision for detecting germline insertions when compared with existing tools using both simulated data in fly and experimental data in fly and human. Furthermore, TEMP2 can accurately assess the frequencies of de novo transposon insertions even with high levels of chimeric reads in simulated datasets; such chimeric reads often occur during the construction of short-read sequencing libraries. By applying TEMP2 to published data on hybrid dysgenic flies inflicted by de-repressed P-elements, we confirmed the continuous new insertions of P-elements in dysgenic offspring before they regain piRNAs for P-element repression. TEMP2 is freely available on GitHub: https://github.com/weng-lab/TEMP2.
Source
Yu T, Huang X, Dou S, Tang X, Luo S, Theurkauf WE, Lu J, Weng Z. A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies. Nucleic Acids Res. 2021 Jan 28:gkab010. doi: 10.1093/nar/gkab010. Epub ahead of print. PMID: 33511407.
DOI
10.1093/nar/gkab010
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/29729
PubMed ID
33511407
Rights
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
Distribution License
http://creativecommons.org/licenses/by/4.0/