Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs
Authors
Chen, Jenny
Shishkin, Alexander A.
Zhu, Xiaopeng
Kadri, Sabah
Maza, Itay
Guttman, Mitchell
Hanna, Jacob H.
Regev, Aviv
Garber, Manuel
UMass Chan Affiliations
Program in Molecular Medicine
Program in Bioinformatics and Integrative Biology
Document Type
Journal Article
Publication Date
2016-02-02
Keywords
Long non-coding RNAs
Evolution
Comparative genomics
Molecular evolution
Annotation
LincRNA
RNA-seq
Transcriptome
UMCCTS funding
Biochemistry, Biophysics, and Structural Biology
Bioinformatics
Computational Biology
Genomics
Integrative Biology
Population Biology
Systems Biology
Abstract
BACKGROUND: Recent advances in transcriptome sequencing have enabled the discovery of thousands of long non-coding RNAs (lncRNAs) across many species. Though several lncRNAs have been shown to play important roles in diverse biological processes, the functions and mechanisms of most lncRNAs remain unknown. Two significant obstacles lie between transcriptome sequencing and functional characterization of lncRNAs: identifying truly non-coding genes from de novo reconstructed transcriptomes, and prioritizing the hundreds of resulting putative lncRNAs for downstream experimental interrogation.
RESULTS: We present slncky, a lncRNA discovery tool that produces a high-quality set of lncRNAs from RNA-sequencing data and further uses evolutionary constraint to prioritize lncRNAs that are likely to be functionally important. Our automated filtering pipeline is comparable to manual curation efforts and more sensitive than previously published computational approaches. Furthermore, we developed a sensitive alignment pipeline for aligning lncRNA loci and propose new evolutionary metrics relevant for analyzing sequence and transcript evolution. Our analysis reveals that evolutionary selection acts in several distinct patterns, and uncovers two notable classes of intergenic lncRNAs: one showing strong purifying selection on RNA sequence and another where constraint is restricted to the regulation but not the sequence of the transcript.
CONCLUSION: Our results highlight that lncRNAs are not a homogeneous class of molecules but rather a mixture of multiple functional classes with distinct biological mechanisms and/or roles. Our novel comparative methods for lncRNAs reveal 233 constrained lncRNAs out of tens of thousands of currently annotated transcripts, which we make available through the slncky Evolution Browser.
Source
Genome Biol. 2016 Feb 2;17(1):19. doi: 10.1186/s13059-016-0880-9.
DOI
10.1186/s13059-016-0880-9
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/25943
PubMed ID
26838501
Rights
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Distribution License
http://creativecommons.org/licenses/by/4.0/