GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases
Authors
Zhu, Lihua Julie
Lawrence, Michael
Gupta, Ankit
Pages, Herve
Kucukural, Alper
Garber, Manuel
Wolfe, Scot A.
UMass Chan Affiliations
Department of Biochemistry and Molecular Pharmacology
Program in Bioinformatics and Integrative Biology
Program in Molecular Medicine
Department of Molecular, Cell and Cancer Biology
Document Type
Journal Article
Publication Date
2017-05-15
Keywords
Bioconductor
CRISPR
GUIDE-seq
Genome editing
Off-targets analysis
Biochemistry, Biophysics, and Structural Biology
Bioinformatics
Computational Biology
Genomics
Abstract
BACKGROUND: Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed.
RESULTS: Here, we describe an open-source, open-development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to nuclease platforms that differ in the length and complexity of their guide and PAM recognition sequences or in their DNA cleavage position. They also enable users to customize sequence aggregation criteria and to vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs the mapped GUIDE-seq read count as well as the cleavage score from a user-specified off-target cleavage score prediction algorithm, permitting the identification of genomic sequences with unexpected cleavage activity.
CONCLUSION: The GUIDEseq package enables analysis of GUIDE-seq data from various nuclease platforms for any species with a defined genomic sequence. This software package has been used successfully to analyze several GUIDE-seq datasets. The software, source code and documentation are freely available at http://www.bioconductor.org/packages/release/bioc/html/GUIDEseq.html.
Source
BMC Genomics. 2017 May 15;18(1):379. doi: 10.1186/s12864-017-3746-y.
DOI
10.1186/s12864-017-3746-y
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/25818
PubMed ID
28506212
Rights
Copyright © The Author(s). 2017.
Distribution License
http://creativecommons.org/licenses/by/4.0/