The Mycobacterium tuberculosis transposon sequencing database (MtbTnDB): a large-scale guide to genetic conditional essentiality [preprint]
Authors
Jinich, Adrian
Zaveri, Anisha
DeJesus, Michael A.
Flores-Bautista, Emanuel
Smith, Clare M.
Sassetti, Christopher M.
Rock, Jeremy M.
Ehrt, Sabine
Schnappinger, Dirk
Ioerger, Thomas R.
Rhee, Kyu
UMass Chan Affiliations
Department of Microbiology and Physiological Systems
Document Type
Preprint
Publication Date
2021-03-06
Keywords
Microbiology
Mycobacterium tuberculosis
Mtb transposon sequencing database
gene essentiality
Bacteria
Computational Biology
Databases and Information Systems
Microbiology
Abstract
Characterization of gene essentiality across different conditions is a useful approach for predicting gene function. Transposon sequencing (TnSeq) is a powerful means of generating genome-wide profiles of essentiality and has been used extensively in Mycobacterium tuberculosis (Mtb) genetic research. Over the past two decades, dozens of TnSeq screens have been published, yielding valuable insights into the biology of Mtb in vitro, inside macrophages, and in model host organisms. However, these Mtb TnSeq profiles are distributed across dozens of research papers within supplementary materials, which makes querying them cumbersome and assembling a complete and consistent synthesis of existing data challenging. Here, we address this problem by building a central repository of publicly available TnSeq screens performed in M. tuberculosis, which we call the Mtb transposon sequencing database (MtbTnDB). The MtbTnDB encompasses 64 published and unpublished TnSeq screens, and is standardized, open-access, and allows users easy access to data, visualizations, and functional predictions through an interactive web-app (www.mtbtndb.app). We also present evidence that (i) genes in the same genomic neighborhood tend to have similar TnSeq profiles, and (ii) clusters of genes with similar TnSeq profiles tend to be enriched for genes belonging to the same functional categories. Finally, we test and evaluate machine learning models trained on TnSeq profiles to guide functional annotation of orphan genes in Mtb. In addition to facilitating the exploration of conditional genetic essentiality in this important human pathogen via a centralized TnSeq data repository, the MtbTnDB will enable hypothesis generation and the extraction of meaningful patterns by facilitating the comparison of datasets across conditions. This will provide a basis for insights into the functional organization of Mtb genes as well as gene function prediction.
Source
bioRxiv 2021.03.05.434127; doi: https://doi.org/10.1101/2021.03.05.434127
DOI
10.1101/2021.03.05.434127
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/29713
Notes
This article is a preprint. Preprints are preliminary reports of work that have not been certified by peer review.
Rights
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Distribution License
http://creativecommons.org/licenses/by-nc-nd/4.0/

