The Role of Non-Coding Regulatory Elements in Complex Traits and Immune-Mediated Disease
Authors
Pratt, Henry
Faculty Advisor
Zhiping Weng
Academic Program
Bioinformatics and Computational Biology
UMass Chan Affiliations
Program in Bioinformatics and Integrative Biology
Document Type
Doctoral Dissertation
Publication Date
2022-09-12
Keywords
immune-mediated disease
Abstract
The completion of the Human Genome Project ushered in the age of genome-wide association studies (GWAS), which have associated thousands of single-nucleotide polymorphisms (SNPs) and other sequence variants with complex traits and diseases. Despite this success, progress in bridging these associations to pathophysiologic understanding and new therapeutic interventions has been limited. This is largely because 90% of GWAS-identified variants are non-coding: they do not alter the structure or function of proteins. Unraveling the impacts of non-coding sequence variants is one of the most significant unsolved problems in biology. Non-coding GWAS variants are enriched within cis-regulatory elements (CREs), DNA sequences that modulate the expression, rather than the function, of target genes. These include promoters, which are immediately adjacent to the genes they regulate; enhancers, which increase the expression of distant genes; silencers, which reduce the expression of distant genes; and insulators, which divide chromatin into domains to regulate interactions between other CREs. The function of CREs is modulated in part by transcription factors (TFs), DNA-binding proteins that recognize and bind short characteristic DNA sequences called motifs. TFs and CREs are tissue- and cell type-specific, frequently regulating gene expression in only a few of the thousands of distinct cell and tissue types that make up the human body. Here we present work leveraging deep sequencing data and evolutionary conservation to build comprehensive atlases of cis-regulatory elements and transcription factor binding sites in the human genome, along with work architecting visualization platforms to make these atlases more accessible to, and impactful for, the scientific community. We then illustrate a key role for the sites in our atlases, particularly those evolutionarily constrained throughout the mammalian lineage, in complex traits and diseases.
We conclude by presenting two case studies utilizing these datasets: one to better understand the role of non-coding variants in primary sclerosing cholangitis, a rare immune-mediated liver disease, and a second to understand the sequence features underlying strong insulator elements in the human genome.
DOI
10.13028/e7hv-m694
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/51137
Rights
Copyright © 2022 Pratt.
Distribution License
All Rights Reserved