dc.contributor.authorTsuji, Junko
dc.contributor.authorWeng, Zhiping
dc.date2022-08-11T08:07:59.000
dc.date.accessioned2022-08-23T15:38:32Z
dc.date.available2022-08-23T15:38:32Z
dc.date.issued2016-10-01
dc.date.submitted2017-01-27
dc.identifier.citationPLoS One. 2016 Oct 13;11(10):e0164228. doi: 10.1371/journal.pone.0164228. eCollection 2016. Link to article on publisher's site: http://dx.doi.org/10.1371/journal.pone.0164228
dc.identifier.issn1932-6203 (Linking)
dc.identifier.doi10.1371/journal.pone.0164228
dc.identifier.pmid27736901
dc.identifier.urihttp://hdl.handle.net/20.500.14038/25955
dc.description.abstractWith the rapid accumulation of publicly available small RNA sequencing datasets, third-party meta-analysis across many datasets is becoming increasingly powerful. Although removing the 3′ adapter is an essential step in small RNA sequencing analysis, the adapter sequence is not always available in the metadata, and it can be erroneous even when it is available. In this study, we developed DNApi, a lightweight Python software package that predicts the 3′ adapter sequence de novo and provides the user with cleansed small RNA sequences ready for downstream analysis. Tested on 539 publicly available small RNA libraries accompanied by 3′ adapter sequences in their metadata, DNApi shows near-perfect accuracy (98.5%) with fast runtime (~2.85 seconds per library) and efficient memory usage (~43 MB on average). In addition to 3′ adapter prediction, it is also important to classify whether the input small RNA libraries were already processed, i.e. whether the 3′ adapters were already removed. Given another batch of 192 publicly available processed libraries, DNApi correctly judged all of them to be "ready-to-map" small RNA sequences. DNApi is compatible with Python 2 and 3, and is available at https://github.com/jnktsj/DNApi. The 731 small RNA libraries used for DNApi evaluation were from human tissues and were carefully and manually collected. This study also provides readers with the curated datasets, which can be integrated into their own studies.
dc.language.isoen_US
dc.relationLink to Article in PubMed: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=pubmed&cmd=Retrieve&list_uids=27736901&dopt=Abstract
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subjectsmall RNA sequencing datasets
dc.subject3′ adapter sequence
dc.subjectdownstream analysis
dc.subjectBioinformatics
dc.subjectComputational Biology
dc.subjectGenomics
dc.subjectIntegrative Biology
dc.titleDNApi: A De Novo Adapter Prediction Algorithm for Small RNA Sequencing Data
dc.typeJournal Article
dc.source.journaltitlePloS one
dc.source.volume11
dc.source.issue10
dc.identifier.legacyfulltexthttps://escholarship.umassmed.edu/cgi/viewcontent.cgi?article=1103&context=bioinformatics_pubs&unstamped=1
dc.identifier.legacycoverpagehttps://escholarship.umassmed.edu/bioinformatics_pubs/96
dc.identifier.contextkey9590768
refterms.dateFOA2022-08-23T15:38:32Z
dc.identifier.submissionpathbioinformatics_pubs/96
dc.contributor.departmentDepartment of Biochemistry and Molecular Pharmacology
dc.contributor.departmentProgram in Bioinformatics and Integrative Biology
dc.source.pagese0164228
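
The abstract describes two tasks: predicting a 3′ adapter de novo, and recognizing libraries whose adapters were already removed. The sketch below is a toy illustration of that general idea only. It is not DNApi's algorithm or API. The function names, the k-mer size, the 0.5 threshold, and the example adapter string are all assumptions made for illustration. The idea is that an adapter is shared by most reads while the inserts vary, so the most widely shared k-mer with the earliest average position is a plausible adapter start.

```python
from collections import defaultdict


def guess_adapter_seed(reads, k=8, min_fraction=0.5):
    """Return the k-mer most likely to start the 3' adapter, or None.

    Hypothetical helper: counts each k-mer once per read and records
    where it first occurs. None means no k-mer is shared widely
    enough, i.e. the reads look already trimmed.
    """
    count = defaultdict(int)       # number of reads containing the k-mer
    offset_sum = defaultdict(int)  # sum of first-occurrence positions
    for read in reads:
        seen = set()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer not in seen:
                seen.add(kmer)
                count[kmer] += 1
                offset_sum[kmer] += i
    if not count:
        return None
    # Prefer the most widely shared k-mer; break ties by earliest mean position.
    best = max(count, key=lambda km: (count[km], -offset_sum[km] / count[km]))
    if count[best] / len(reads) < min_fraction:
        return None
    return best


def trim_at_seed(read, seed):
    """Cut the read at the first occurrence of the seed, if present."""
    pos = read.find(seed)
    return read if pos == -1 else read[:pos]


# Toy data: three small RNA inserts followed by an assumed adapter string.
inserts = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TAGCTTATCAGACTGATGTTGA",
    "TGGAGTGTGACAATGGTGTTTG",
]
adapter = "TGGAATTCTCGGGTGCCAAGG"  # illustrative assumption
raw_reads = [ins + adapter for ins in inserts]

seed = guess_adapter_seed(raw_reads)
trimmed = [trim_at_seed(r, seed) for r in raw_reads]
already_processed = guess_adapter_seed(inserts) is None
```

On real libraries this heuristic would be confused by highly abundant small RNAs, sequencing errors, and reads truncated inside the adapter, which is why a dedicated tool such as the one described in this record is preferable.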


Files in this item

Name: journal.pone.0164228.PDF
Size: 2.347Mb
Format: PDF

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by/4.0/