Application of a Naïve Bayes Classifier to Assign Polyadenylation Sites from 3' End Deep Sequencing Data: A Dissertation
Authors
Sheppard, Sarah E.
Faculty Advisor
Nathan Lawson, PhD
Academic Program
MD/PhD
UMass Chan Affiliations
Molecular, Cell and Cancer Biology Department
Document Type
Doctoral Dissertation
Publication Date
2013-04-29
Keywords
Dissertations, UMMS
Bayes Theorem
Algorithms
Polyadenylation
RNA 3' Polyadenylation Signals
High-Throughput Nucleotide Sequencing
Bioinformatics
Computational Biology
Abstract
Cleavage and polyadenylation of a precursor mRNA is important for transcription termination, mRNA stability, and regulation of gene expression. This process is directed by a multitude of protein factors and cis elements in the pre-mRNA sequence surrounding the cleavage and polyadenylation site. Importantly, the location of the cleavage and polyadenylation site helps define the 3’ untranslated region of a transcript, which is important for regulation by microRNAs and RNA binding proteins. Additionally, these sites have generally been poorly annotated. To identify 3’ ends, many techniques utilize an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Previously, simple heuristic filters relying on the number of adenines in the genomic sequence downstream of a putative polyadenylation site have been used to remove these sites of internal priming. However, these simple filters may not remove all sites of internal priming and may also exclude true polyadenylation sites. Therefore, I developed a naïve Bayes classifier to identify putative sites from oligo-dT primed 3’ end deep sequencing as true or false/internally primed. Notably, this algorithm uses a combination of sequence elements to distinguish between true and false sites. Finally, the resulting algorithm is highly accurate in multiple model systems and facilitates identification of novel polyadenylation sites.
DOI
10.13028/M20K68
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/32005
Rights
Copyright is held by the author, with all rights reserved.
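Illustrative sketch

The abstract contrasts a simple adenine-count filter with a naïve Bayes classifier that combines several sequence elements. The Python sketch below illustrates that contrast only; it is not the dissertation's implementation. The window size, the adenine threshold, the two binary features (an upstream AATAAA hexamer and an A-rich downstream window), the class and function names, and every sequence in the toy training set are assumptions made up for illustration.

```python
import math

def downstream_a_count(downstream, window=10):
    """Count adenines in the first `window` genomic bases after a putative site."""
    return downstream[:window].upper().count("A")

def heuristic_is_internal_priming(downstream, window=10, max_a=6):
    """Simple filter: flag a site when the downstream window is A-rich.
    The window and threshold are illustrative, not values from the dissertation."""
    return downstream_a_count(downstream, window) >= max_a

def features(upstream, downstream):
    """Hypothetical binary sequence features (assumed, not the author's feature set)."""
    return {
        "has_AATAAA_upstream": "AATAAA" in upstream.upper(),
        "a_rich_downstream": heuristic_is_internal_priming(downstream),
    }

class BernoulliNaiveBayes:
    """Minimal naive Bayes over binary features with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.names = sorted(rows[0])
        self.total = len(labels)
        self.class_count = {c: labels.count(c) for c in self.classes}
        self.true_count = {c: {n: 0 for n in self.names} for c in self.classes}
        for row, label in zip(rows, labels):
            for n in self.names:
                if row[n]:
                    self.true_count[label][n] += 1
        return self

    def log_posteriors(self, row):
        """Unnormalized log posterior for each class: log prior + sum of log likelihoods."""
        scores = {}
        for c in self.classes:
            score = math.log(self.class_count[c] / self.total)
            for n in self.names:
                p = (self.true_count[c][n] + self.alpha) / (self.class_count[c] + 2 * self.alpha)
                score += math.log(p if row[n] else 1.0 - p)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posteriors(row)
        return max(scores, key=scores.get)

# Toy training set: (upstream, downstream, label). All sequences are invented.
training = [
    ("GCTAATAAAGTCTGCA", "TGTGTTCTGG", "true"),
    ("CCAATAAATGCTTCAG", "GTTTGTCTCA", "true"),
    ("TTAATAAACCGTGTAC", "AAAAGAAATG", "true"),
    ("GCGTCCGTTGCAGGTC", "AAAAAAAAGA", "false"),
    ("CTGGTCCGAGTTCCGT", "AAAGAAAAAA", "false"),
    ("GGCTTCGGAGCCTGTC", "AAAAAAAAAA", "false"),
]
rows = [features(up, down) for up, down, _ in training]
labels = [label for _, _, label in training]
model = BernoulliNaiveBayes().fit(rows, labels)

# A hypothetical site with an upstream signal and an A-rich downstream window.
query_up, query_down = "ACGAATAAAGGCTCTG", "AAAGAAAACA"
filter_call = heuristic_is_internal_priming(query_down)
classifier_call = model.predict(features(query_up, query_down))
print("heuristic flags internal priming:", filter_call)
print("naive Bayes label:", classifier_call)
```

On this toy data the adenine-count filter discards the query site, while the classifier labels it as a true site because the upstream hexamer outweighs the A-rich downstream window. That is the kind of case the abstract says simple filters can get wrong; a real classifier would need a curated training set and a richer feature set.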