An encyclopedia of enhancer-gene regulatory interactions in the human genome [preprint]
Authors
Gschwind, Andreas R
Mualim, Kristy S
Karbalayghareh, Alireza
Sheth, Maya U
Dey, Kushal K
Jagoda, Evelyn
Nurtdinov, Ramil N
Xi, Wang
Tan, Anthony S
Jones, Hank
Ma, X Rosa
Yao, David
Nasser, Joseph
Avsec, Žiga
James, Benjamin T
Shamim, Muhammad S
Durand, Neva C
Rao, Suhas S P
Mahajan, Ragini
Doughty, Benjamin R
Andreeva, Kalina
Ulirsch, Jacob C
Fan, Kaili
Perez, Elizabeth M
Nguyen, Tri C
Kelley, David R
Finucane, Hilary K
Moore, Jill E
Weng, Zhiping
Kellis, Manolis
Bassik, Michael C
Price, Alkes L
Beer, Michael A
Guigó, Roderic
Stamatoyannopoulos, John A
Lieberman Aiden, Erez
Greenleaf, William J
Leslie, Christina S
Steinmetz, Lars M
Kundaje, Anshul
Engreitz, Jesse M
UMass Chan Affiliations
Program in Bioinformatics and Integrative Biology
Document Type
Preprint
Publication Date
2023-11-13
Abstract
Identifying transcriptional enhancers and their target genes is essential for understanding gene regulation and the impact of human genetic variation on disease [1-6]. Here we create and evaluate a resource of >13 million enhancer-gene regulatory interactions across 352 cell types and tissues, by integrating predictive models, measurements of chromatin state and 3D contacts, and large-scale genetic perturbations generated by the ENCODE Consortium [7]. We first create a systematic benchmarking pipeline to compare predictive models, assembling a dataset of 10,411 element-gene pairs measured in CRISPR perturbation experiments, >30,000 fine-mapped eQTLs, and 569 fine-mapped GWAS variants linked to a likely causal gene. Using this framework, we develop a new predictive model, ENCODE-rE2G, that achieves state-of-the-art performance across multiple prediction tasks, demonstrating a strategy involving iterative perturbations and supervised machine learning to build increasingly accurate predictive models of enhancer regulation. Using the ENCODE-rE2G model, we build an encyclopedia of enhancer-gene regulatory interactions in the human genome, which reveals global properties of enhancer networks, identifies differences in the functions of genes that have more or less complex regulatory landscapes, and improves analyses to link noncoding variants to target genes and cell types for common, complex diseases. By interpreting the model, we find evidence that, beyond enhancer activity and 3D enhancer-promoter contacts, additional features guide enhancer-promoter communication, including promoter class and enhancer-enhancer synergy. Altogether, these genome-wide maps of enhancer-gene regulatory interactions, benchmarking software, predictive models, and insights about enhancer function provide a valuable resource for future studies of gene regulation and human genetics.
Source
Gschwind AR, Mualim KS, Karbalayghareh A, Sheth MU, Dey KK, Jagoda E, Nurtdinov RN, Xi W, Tan AS, Jones H, Ma XR, Yao D, Nasser J, Avsec Ž, James BT, Shamim MS, Durand NC, Rao SSP, Mahajan R, Doughty BR, Andreeva K, Ulirsch JC, Fan K, Perez EM, Nguyen TC, Kelley DR, Finucane HK, Moore JE, Weng Z, Kellis M, Bassik MC, Price AL, Beer MA, Guigó R, Stamatoyannopoulos JA, Lieberman Aiden E, Greenleaf WJ, Leslie CS, Steinmetz LM, Kundaje A, Engreitz JM. An encyclopedia of enhancer-gene regulatory interactions in the human genome. bioRxiv [Preprint]. 2023 Nov 13:2023.11.09.563812. doi: 10.1101/2023.11.09.563812. PMID: 38014075; PMCID: PMC10680627.
DOI
10.1101/2023.11.09.563812
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/52868
PubMed ID
38014075
Notes
This article is a preprint. Preprints are preliminary reports of work that have not been certified by peer review.
Rights
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license. Attribution-NoDerivatives 4.0 International
Distribution License
http://creativecommons.org/licenses/by-nd/4.0/