The Role of Non-Coding Regulatory Elements in Complex Traits and Immune-Mediated Disease
Faculty Advisor: Zhiping Weng
Academic Program: Bioinformatics and Computational Biology
UMass Chan Affiliations: Program in Bioinformatics and Integrative Biology
Document Type: Doctoral Dissertation
Abstract: The completion of the Human Genome Project ushered in the age of genome-wide association studies (GWAS), which have associated thousands of single nucleotide polymorphisms (SNPs) and other sequence variants with complex traits and diseases. Despite this success, progress in bridging these associations to pathophysiologic understanding and new therapeutic interventions has been limited. In large part, this is because 90% of GWAS-identified variants are non-coding: they do not impact the structure or function of proteins. Unraveling the impacts of non-coding sequence variants is one of the most significant unsolved problems in biology. Non-coding GWAS variants are enriched within cis-regulatory elements (CREs), sequences of DNA that modulate the expression, rather than the function, of target genes. These include promoters, which are immediately adjacent to the gene they regulate; enhancers, which increase expression of distant genes; silencers, which reduce the expression of distant genes; and insulators, which divide chromatin into domains to regulate interactions between other CREs. The function of CREs is modulated in part by transcription factors (TFs), DNA-binding proteins that recognize and bind short characteristic DNA sequences called motifs. TFs and CREs are tissue- and cell type-specific, frequently regulating gene expression in only a few of the thousands of distinct cell and tissue types comprising the human body. Here we present work leveraging deep sequencing data and evolutionary conservation to build comprehensive atlases of cis-regulatory elements and transcription factor binding sites in the human genome, along with work architecting visualization platforms to make these atlases more accessible to, and impactful for, the scientific community. We then illustrate a key role for the sites in our atlases, particularly those evolutionarily constrained throughout the mammalian lineage, in complex traits and diseases.
We conclude by presenting two case studies utilizing these datasets: one to better understand the role of non-coding variants in primary sclerosing cholangitis, a rare immune-mediated liver disease, and a second to understand the sequence features underlying strong insulator elements in the human genome.
Permanent Link to this Item: http://hdl.handle.net/20.500.14038/51137
Rights: Copyright © 2022 Pratt.
Distribution License: All Rights Reserved