Publication

Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier

Sheppard, Sarah E.
Lawson, Nathan D.
Zhu, Lihua Julie
Document Type
Journal Article
Publication Date
2013-10-15
Abstract

MOTIVATION: 3' end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3' ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic filters have been applied in these cases, they typically result in a high proportion of both false-positive and -negative classifications. Therefore, there is a need to develop improved algorithms to better identify mis-priming events in oligo-dT primed sequences.

RESULTS: By analyzing sequence features flanking 3' ends derived from oligo-dT-based sequencing, we developed a naive Bayes classifier to classify them as true or false/internally primed. The resulting algorithm is highly accurate, outperforms previous heuristic filters and facilitates identification of novel polyadenylation sites.
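As an illustration of the general technique named above, the following is a minimal sketch of a naive Bayes classifier over flanking-sequence features. It is not the authors' implementation: the two features (an upstream AATAAA/ATTAAA hexamer and an A-rich downstream window), the window length, the threshold, the toy training sequences, and every function and class name here are assumptions chosen only to make the idea concrete.

```python
import math
from collections import defaultdict

# Assumed polyadenylation signal hexamers; the paper's actual feature set is not given here.
POLY_A_SIGNALS = ("AATAAA", "ATTAAA")


def extract_features(upstream, downstream):
    """Map flanking sequence to two binary features (illustrative choices only)."""
    a_count = downstream[:10].upper().count("A")
    return {
        "has_signal": any(s in upstream.upper() for s in POLY_A_SIGNALS),
        "downstream_a_rich": a_count >= 6,  # hypothetical threshold
    }


class NaiveBayes:
    """Naive Bayes over binary features with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # (label, name, value) -> count

    def fit(self, samples, labels):
        for feats, label in zip(samples, labels):
            self.class_counts[label] += 1
            for name, value in feats.items():
                self.feature_counts[(label, name, value)] += 1
        return self

    def log_posteriors(self, feats):
        """Unnormalized log posterior per class: log prior + sum of log likelihoods."""
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for name, value in feats.items():
                count = self.feature_counts.get((label, name, value), 0)
                # Each feature is binary, so the smoothing denominator adds 2 * alpha.
                score += math.log((count + self.alpha) / (n + 2 * self.alpha))
            scores[label] = score
        return scores

    def predict(self, feats):
        scores = self.log_posteriors(feats)
        return max(scores, key=scores.get)


# Toy, made-up training examples: (upstream, downstream genomic sequence, label).
TOY_TRAINING = [
    ("GCTGAATAAAGTCTTGACTG", "GTCTGTTGCT", "true"),
    ("TTGCATTAAATGCCTGTACC", "TGTGTCTTTG", "true"),
    ("GCTGCCGAGGTCCTGACTGG", "AAAAAAAGAA", "internal"),
    ("CCTGGAGCTGACCTTGCAGG", "AAAAGAAAAA", "internal"),
]

MODEL = NaiveBayes().fit(
    [extract_features(up, down) for up, down, _ in TOY_TRAINING],
    [label for _, _, label in TOY_TRAINING],
)


def classify_site(upstream, downstream, model=MODEL):
    """Label a candidate 3' end as 'true' or 'internal' (internally primed)."""
    return model.predict(extract_features(upstream, downstream))


if __name__ == "__main__":
    print(classify_site("AGCTTGAATAAACCTGTGTC", "TGTCTGCCTT"))
    print(classify_site("GGCTGCCTGGAGTCCTGACC", "AAAAAAAAGA"))
```

Unlike a single hard threshold, a classifier of this kind combines evidence from several features probabilistically, so a candidate site followed by an A-rich stretch is not automatically discarded if other features support it. A real application would train on a large set of annotated 3' ends rather than the four toy sequences used here.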

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Source

Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier. Bioinformatics. 2013 Oct 15;29(20):2564-71. doi:10.1093/bioinformatics/btt446.

DOI
10.1093/bioinformatics/btt446
PubMed ID
23962617
Notes

Erratum published to correct corresponding author details: Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier. Bioinformatics. 2014 Feb 15;30(4):596. doi:10.1093/bioinformatics/btt714.
