Defining a Registry of Candidate Regulatory Elements to Interpret Disease Associated Genetic Variation
Name                                  Size      Format
1-Supplemental_Table_S9__eQTLs.xlsx   211.5Kb   Microsoft Excel 2007
10-Supplemental_Table_1___Gene ...    71.75Kb   Microsoft Excel 2007
2-Supplemental_Table_2___PROVE ...    77.63Kb   Microsoft Excel 2007
3-Supplemental_Table_3___Overl ...    159.6Kb   Microsoft Excel 2007
4-Supplemental_Table_S4___H3K2 ...    216.3Kb   Microsoft Excel 2007
5-Supplemental_Table_S6___DNas ...    216.3Kb   Microsoft Excel 2007
6-Supplemental_Table_S7___H3K2 ...    140.1Kb   Microsoft Excel 2007
7-Supplemental_Table_S10___eQT ...    85.48Kb   Microsoft Excel 2007
8-Supplemental_Table_S11___Clo ...    156.3Kb   Microsoft Excel 2007
9-Supplemental_Table_S12__100k ...    270.0Kb   Microsoft Excel 2007
Authors
Moore, Jill E.
Faculty Advisor
Zhiping Weng, PhD
Academic Program
Bioinformatics and Computational Biology
UMass Chan Affiliations
Program in Bioinformatics and Integrative Biology
Document Type
Doctoral Dissertation
Publication Date
2017-10-10
Keywords
ENCODE
enhancer
regulatory element
genome
epigenome
DNase
ChIP-seq
target gene
schizophrenia
bipolar disorder
major depressive disorder
Biology
Computational Biology
Genomics
Abstract
Over the last decade, there has been a great effort to annotate noncoding regions of the genome, particularly those that regulate gene expression. These regulatory elements contain binding sites for transcription factors (TFs), which interact with one another and with the transcriptional machinery to initiate, enhance, or repress gene expression. The Encyclopedia of DNA Elements (ENCODE) consortium has generated thousands of epigenomic datasets, such as DNase-seq and ChIP-seq experiments, with the goal of defining such regions. By integrating these assays, we developed the Registry of candidate Regulatory Elements (cREs), a collection of putative regulatory regions across human and mouse. In total, we identified over 1.3M human and 400k mouse cREs, each annotated with cell type-specific signatures (e.g. promoter-like, enhancer-like) in over 400 human and 100 mouse biosamples. We then demonstrated the biological utility of these regions by analyzing cell type enrichments for genetic variants reported by genome-wide association studies (GWAS). To search and visualize these cREs, we developed the online database SCREEN (search candidate regulatory elements by ENCODE). After defining cREs, we next sought to determine their potential gene targets. To compare target gene prediction methods, we developed a comprehensive benchmark of enhancer-gene links by curating ChIA-PET, Hi-C, and eQTL datasets. We then used this benchmark to evaluate unsupervised linking approaches such as the correlation of epigenomic signal. We determined that these methods have low overall performance and do not outperform simply selecting the closest gene. We then developed a supervised Random Forest model, which had notably better performance than the unsupervised methods. We demonstrated that this model can be applied across cell types and can be used to predict target genes for GWAS-associated variants. We then used the Registry of cREs to annotate variants associated with psychiatric disorders.
We found that these "psych SNPs" are enriched in cREs active in brain tissue and likely target genes involved in neural development pathways. We also demonstrated that psych SNPs overlap binding sites for TFs involved in neural and immune pathways. Finally, by identifying psych SNPs with allele imbalance in chromatin accessibility, we highlighted specific cases in which psych SNPs alter TF binding motifs and thereby disrupt TF binding. Overall, we demonstrated that our collection of putative regulatory regions, the Registry of cREs, can be used to understand the potential biological function of noncoding variation and to develop hypotheses for future testing.
DOI
10.13028/M2NX0B
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/32309
Rights
Licensed under a Creative Commons license
Distribution License
http://creativecommons.org/licenses/by/4.0/