
dc.contributor.advisor: Zhiping Weng, PhD
dc.contributor.author: Purcaro, Michael J.
dc.date: 2022-08-11T08:08:46.000
dc.date.accessioned: 2022-08-23T16:07:48Z
dc.date.available: 2022-08-23T16:07:48Z
dc.date.issued: 2017-12-12
dc.date.submitted: 2017-12-29
dc.identifier.doi: 10.13028/M23T1Q
dc.identifier.uri: http://hdl.handle.net/20.500.14038/32321
dc.description.abstract: The goal of the Encyclopedia of DNA Elements (ENCODE) project has been to characterize all the functional elements of the human genome. These elements include expressed transcripts and genomic regions that are bound by transcription factors (TFs), occupied by nucleosomes, occupied by nucleosomes with modified histones, or hypersensitive to DNase I cleavage. Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is an experimental technique for detecting TF binding in living cells, and the genomic regions bound by TFs are called ChIP-seq peaks. ENCODE has performed and compiled results from tens of thousands of experiments, including ChIP-seq, DNase, RNA-seq and Hi-C. These efforts have culminated in two web-based resources from our lab, Factorbook and SCREEN, for the exploration of epigenomic data for both human and mouse. Factorbook is a peak-centric resource presenting data such as motif enrichment and histone modification profiles for transcription factor binding sites computed from ENCODE ChIP-seq data. SCREEN provides an encyclopedia of ~2 million regulatory elements, including promoters and enhancers, identified using ENCODE ChIP-seq and DNase data, with an extensive UI for searching and visualization. While we have successfully used the thousands of available ENCODE ChIP-seq experiments to build the Encyclopedia and visualizers, we have also struggled with the practical and theoretical inability to assay every possible experiment on every possible biosample under every conceivable biological scenario. We have used machine learning techniques to predict TF binding sites and enhancer locations, and we demonstrate that machine learning is critical for deciphering functional regions of the genome.
dc.language.iso: en_US
dc.rights: Licensed under a Creative Commons license
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject: ENCODE
dc.subject: enhancer
dc.subject: regulatory element
dc.subject: genome
dc.subject: epigenome
dc.subject: DNase
dc.subject: ChIP-seq
dc.subject: Big Data
dc.subject: visualization
dc.subject: Computational Biology
dc.subject: Genomics
dc.subject: Integrative Biology
dc.title: Analysis, Visualization, and Machine Learning of Epigenomic Data
dc.type: Doctoral Dissertation
dc.identifier.legacyfulltext: https://escholarship.umassmed.edu/cgi/viewcontent.cgi?article=1944&context=gsbs_diss&unstamped=1
dc.identifier.legacycoverpage: https://escholarship.umassmed.edu/gsbs_diss/938
dc.legacy.embargo: 2018-06-29T00:00:00-07:00
dc.identifier.contextkey: 11305040
refterms.dateFOA: 2022-08-24T03:33:22Z
dc.identifier.submissionpath: gsbs_diss/938
dc.contributor.department: Program in Bioinformatics and Integrative Biology
dc.identifier.orcid: 0000-0002-4735-4215


Files in this item

Name: Purcaro__Thesis__final_reduced ...
Size: 6.243Mb
Format: PDF


Except where otherwise noted, this item's license is described as Licensed under a Creative Commons license