• Orgo-Seq integrates single-cell and bulk transcriptomic data to identify cell type specific-driver genes associated with autism spectrum disorder

      Lim, Elaine T.; Chan, Yingleong; Dawes, Pepper; Erdin, Serkan; Reichert, Julia M.; Burns, Mannix J.; Church, George M. (2022-06-10)
      Cerebral organoids can be used to gain insights into cell type-specific processes perturbed by genetic variants associated with neuropsychiatric disorders. However, robust and scalable phenotyping of organoids remains challenging. Here, we perform RNA sequencing on 71 samples comprising 1,420 cerebral organoids from 25 donors, and describe a framework (Orgo-Seq) to integrate bulk RNA and single-cell RNA sequence data. We apply Orgo-Seq to 16p11.2 deletions and 15q11-13 duplications, two loci associated with autism spectrum disorder, to identify immature neurons and intermediate progenitor cells as critical cell types for 16p11.2 deletions. We further apply Orgo-Seq to identify cell type-specific driver genes. Our work presents a quantitative phenotyping framework to integrate multi-transcriptomic datasets for the identification of cell types and cell type-specific co-expressed driver genes associated with neuropsychiatric disorders.
    • Single cell transcriptomics reveals dysregulated cellular and molecular networks in a fragile X syndrome model

      Donnard, Elisa; Shu, Huan; Garber, Manuel (2022-06-08)
      Despite advances in understanding the pathophysiology of Fragile X syndrome (FXS), its molecular basis is still poorly understood. Whole brain tissue expression profiles have proved surprisingly uninformative; therefore, we applied single cell RNA sequencing to profile an FMRP-deficient mouse model with higher resolution. We found that the absence of FMRP results in highly cell type-specific gene expression changes that are strongest among specific neuronal types, where FMRP-bound mRNAs were prominently downregulated. Metabolic pathways including translation and respiration are significantly upregulated across most cell types with the notable exception of excitatory neurons. These effects point to a potential difference in the activity of mTOR pathways, and together with other dysregulated pathways, suggest an excitatory-inhibitory imbalance in the Fmr1-knockout cortex that is exacerbated by astrocytes. Our data demonstrate that FMRP loss affects abundance of key cellular communication genes that potentially affect neuronal synapses and provide a resource for interrogating the biological basis of this disorder.
    • Risk Prediction Score for Pediatric Patients with Suspected Ebola Virus Disease

      Genisca, Alicia E.; Chu, Tzu-Chun; Huang, Lawrence; Gainey, Monique; Adeniji, Moyinoluwa; Mbong, Eta N.; Kennedy, Stephen B.; Laghari, Razia; Nganga, Fiston; Muhayangabo, Rigo F.; et al. (2022-06-01)
      Rapid diagnostic tools for children with Ebola virus disease (EVD) are needed to expedite isolation and treatment. To evaluate a predictive diagnostic tool, we examined retrospective data (2014-2015) from the International Medical Corps Ebola Treatment Centers in West Africa. We incorporated statistically derived candidate predictors into a 7-point Pediatric Ebola Risk Score. Evidence of bleeding or having known or no known Ebola contacts was positively associated with an EVD diagnosis, whereas abdominal pain was negatively associated. Model discrimination using area under the curve (AUC) was 0.87, which outperforms the World Health Organization criteria (AUC 0.56). External validation, performed by using data from International Medical Corps Ebola Treatment Centers in the Democratic Republic of the Congo during 2018-2019, showed an AUC of 0.70. External validation showed that discrimination achieved by using World Health Organization criteria was similar; however, the Pediatric Ebola Risk Score is simpler to use.
    • Ancestry-inclusive dog genomics challenges popular breed stereotypes

      Morrill, Kathleen; Li, Xue; McClure, Jesse; Logan, Brittney; Gao, Mingshi; Dong, Yinan; Carmichael, Elena; White, Michelle E.; Weng, Zhiping; Colubri, Andres; et al. (2022-04-29)
      Behavioral genetics in dogs has focused on modern breeds, which are isolated subgroups with distinctive physical and, purportedly, behavioral characteristics. We interrogated breed stereotypes by surveying owners of 18,385 purebred and mixed-breed dogs and genotyping 2,155 dogs. Most behavioral traits are heritable [heritability (h²) > 25%], and admixture patterns in mixed-breed dogs reveal breed propensities. Breed explains just 9% of behavioral variation in individuals. Genome-wide association analyses identify 11 loci that are significantly associated with behavior, and characteristic breed behaviors exhibit genetic complexity. Behavioral loci are not unusually differentiated in breeds, but breed propensities align, albeit weakly, with ancestral function. We propose that behaviors perceived as characteristic of modern breeds derive from thousands of years of polygenic adaptation that predates breed formation, with modern breeds distinguished primarily by aesthetic traits.
    • MafB, WDR77, and β-catenin interact with each other and have similar genome association profiles

      He, Lizhi; Gao, Mingshi; Pratt, Henry E.; Weng, Zhiping; Struhl, Kevin (2022-04-28)
      MafB (a bZIP transcription factor), β-catenin (the ultimate target of the Wnt signal transduction pathway that acts as a transcriptional co-activator of LEF/TCF proteins), and WDR77 (a transcriptional co-activator of multiple hormone receptors) are important for breast cellular transformation. Unexpectedly, these proteins interact directly with each other, and they have similar genomic binding profiles. Furthermore, while some of these common target sites coincide with those bound by LEF/TCF, the majority are located just downstream of transcription initiation sites at a position near paused RNA polymerase (Pol II) and the +1 nucleosome. Occupancy levels of these factors at these promoter-proximal sites are strongly correlated with the level of paused Pol II and transcriptional activity.
    • AAV-delivered suppressor tRNA overcomes a nonsense mutation in mice

      Wang, Jiaming; Zhang, Yue; Mendonca, Craig A.; Yukselen, Onur; Muneeruddin, Khaja; Ren, Lingzhi; Liang, Jialing; Zhou, Chen; Xie, Jun; Li, Jia; et al. (2022-03-23)
      Gene therapy is a potentially curative medicine for many currently untreatable diseases, and recombinant adeno-associated virus (rAAV) is the most successful gene delivery vehicle for in vivo applications(1-3). However, rAAV-based gene therapy suffers from several limitations, such as constrained DNA cargo size and toxicities caused by non-physiological expression of a transgene(4-6). Here we show that rAAV delivery of a suppressor tRNA (rAAV.sup-tRNA) safely and efficiently rescued a genetic disease in a mouse model carrying a nonsense mutation, and effects lasted for more than 6 months after a single treatment. Mechanistically, this was achieved through a synergistic effect of premature stop codon readthrough and inhibition of nonsense-mediated mRNA decay. rAAV.sup-tRNA had a limited effect on global readthrough at normal stop codons and did not perturb endogenous tRNA homeostasis, as determined by ribosome profiling and tRNA sequencing, respectively. By optimizing the AAV capsid and the route of administration, therapeutic efficacy in various target tissues was achieved, including liver, heart, skeletal muscle and brain. This study demonstrates the feasibility of developing a toolbox of AAV-delivered nonsense suppressor tRNAs operating on premature termination codons (AAV-NoSTOP) to rescue pathogenic nonsense mutations and restore gene function under endogenous regulation. As nonsense mutations account for 11% of pathogenic mutations, AAV-NoSTOP can benefit a large number of patients. AAV-NoSTOP obviates the need to deliver a full-length protein-coding gene that may exceed the rAAV packaging limit, elicit adverse immune responses or cause transgene-related toxicities. It therefore represents a valuable addition to gene therapeutics.
    • The Antarctic Weddell seal genome reveals evidence of selection on cardiovascular phenotype and lipid handling

      Noh, Hyun Ji; Turner-Maier, Jason; Schulberg, S. Anne; Fitzgerald, Michael L.; Johnson, Jeremy; Allen, Kaitlin N.; Huckstadt, Luis A.; Batten, Annabelle J.; Alfoldi, Jessica; Costa, Daniel P.; et al. (2022-02-17)
      The Weddell seal (Leptonychotes weddellii) thrives in its extreme Antarctic environment. We generated the Weddell seal genome assembly and a high-quality annotation to investigate genome-wide evolutionary pressures that underlie its phenotype and to study genes implicated in hypoxia tolerance and a lipid-based metabolism. Genome-wide analyses included gene family expansion/contraction, positive selection, and diverged sequence (acceleration) compared to other placental mammals, identifying selection in coding and non-coding sequence in five pathways that may shape cardiovascular phenotype. Lipid metabolism as well as hypoxia genes contained more accelerated regions in the Weddell seal compared to genomic background. The most significant genes were SUMO2 and EP300; both regulate hypoxia-inducible factor signaling. Liver expression of the four genes with the strongest acceleration signals differs between Weddell seals and a terrestrial mammal, sheep. We also report a high-density lipoprotein-like particle in Weddell seal serum not present in other mammals, including the shallow-diving harbor seal.
    • Darwinian genomics and diversity in the tree of life

      Stephan, Taylorlyn; Karlsson, Elinor K. (2022-01-25)
      Genomics encompasses the entire tree of life, both extinct and extant, and the evolutionary processes that shape this diversity. To date, genomic research has focused on humans, a small number of agricultural species, and established laboratory models. Fewer than 18,000 of approximately 2,000,000 eukaryotic species (<1%) have a representative genome sequence in GenBank, and only a fraction of these have ancillary information on genome structure, genetic variation, gene expression, epigenetic modifications, and population diversity. This imbalance reflects a perception that human studies are paramount in disease research. Yet understanding how genomes work, and how genetic variation shapes phenotypes, requires a broad view that embraces the vast diversity of life. We have the technology to collect massive and exquisitely detailed datasets about the world, but expertise is siloed into distinct fields. A new approach, integrating comparative genomics with cell and evolutionary biology, ecology, archaeology, anthropology, and conservation biology, is essential for understanding and protecting ourselves and our world. Here, we describe the potential for scientific discovery when comparative genomics works in close collaboration with a broad range of fields, as well as the technical, scientific, and social constraints that must be addressed.
    • Why sequence all eukaryotes

      Blaxter, Mark; Karlsson, Elinor K. (2022-01-18)
      Life on Earth has evolved from initial simplicity to the astounding complexity we experience today. Bacteria and archaea have largely excelled in metabolic diversification, but eukaryotes additionally display abundant morphological innovation. How have these innovations come about and what constraints are there on the origins of novelty and the continuing maintenance of biodiversity on Earth? The history of life and the code for the working parts of cells and systems are written in the genome. The Earth BioGenome Project has proposed that the genomes of all extant, named eukaryotes (about 2 million species) should be sequenced to high quality to produce a digital library of life on Earth, beginning with strategic phylogenetic, ecological, and high-impact priorities. Here we discuss why we should sequence all eukaryotic species, not just a representative few scattered across the many branches of the tree of life. We suggest that many questions of evolutionary and ecological significance will only be addressable when whole-genome data representing divergences at all of the branchings in the tree of life or all species in natural ecosystems are available. We envisage that a genomic tree of life will foster understanding of the ongoing processes of speciation, adaptation, and organismal dependencies within entire ecosystems. These explorations will resolve long-standing problems in phylogenetics, evolution, ecology, conservation, agriculture, bioindustry, and medicine.
    • The Earth BioGenome Project 2020: Starting the clock

      Lewin, Harris A.; Karlsson, Elinor K. (2022-01-18)
      November 2020 marked 2 y since the launch of the Earth BioGenome Project (EBP), which aims to sequence all known eukaryotic species in a 10-y timeframe. Since then, significant progress has been made across all aspects of the EBP roadmap, as outlined in the 2018 article describing the project’s goals, strategies, and challenges (1). The launch phase has ended and the clock has started on reaching the EBP’s major milestones. This Special Feature explores the many facets of the EBP, including a review of progress, a description of major scientific goals, exemplar projects, ethical, legal, and social issues, and applications of biodiversity genomics. In this Introduction, we summarize the current status of the EBP, held virtually October 5 to 9, 2020, including recent updates through February 2021. References to the nine Perspective articles included in this Special Feature are cited to guide the reader toward deeper understanding of the goals and challenges facing the EBP.
    • Integration of high-resolution promoter profiling assays reveals novel, cell type-specific transcription start sites across 115 human cell and tissue types

      Moore, Jill E.; Zhang, Xiao-Ou; Elhajjajy, Shaimae I.; Fan, Kaili; Pratt, Henry E.; Reese, Fairlie; Mortazavi, Ali; Weng, Zhiping (2021-12-23)
      Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated nor do they contain information on cell type-specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks are primarily proximal to GENCODE-annotated TSSs and are concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3' ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations are supported by epigenomic and other transcriptomic data sets. To show the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association study (GWAS) catalog and identified new candidate GWAS genes. Overall, our work shows the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.
    • Somatic piRNAs and Transposons are Differentially Expressed Coincident with Skeletal Muscle Atrophy and Programmed Cell Death

      Tsuji, Junko; Thomson, Travis; Brown, Christine; Ghosh, Subhanita; Theurkauf, William E.; Weng, Zhiping; Schwartz, Lawrence M. (2021-12-22)
      PIWI-interacting RNAs (piRNAs) are small single-stranded RNAs that can repress transposon expression via epigenetic silencing and transcript degradation. They have been identified predominantly in the ovary and testis, where they serve essential roles in transposon silencing in order to protect the integrity of the genome in the germline. The potential expression of piRNAs in somatic cells has been controversial. In the present study we demonstrate the expression of piRNAs derived from both genic and transposon RNAs in the intersegmental muscles (ISMs) from the tobacco hawkmoth Manduca sexta. These piRNAs are abundantly expressed, approximately 27 nt long, map antisense to transposons, are oxidation resistant, exhibit a 5' uridine bias, and amplify via the canonical ping-pong pathway. An RNA-seq analysis demonstrated that 19 piRNA pathway genes are expressed in the ISMs and are developmentally regulated. The abundance of piRNAs does not change when the muscles initiate developmentally-regulated atrophy, but piRNAs are repressed coincident with the commitment of the muscles to undergo programmed cell death at the end of metamorphosis. This change in piRNA expression is correlated with the repression of several retrotransposons and the induction of specific DNA transposons. The developmentally-regulated changes in the expression of piRNAs, piRNA pathway genes, and transposons are all regulated by 20-hydroxyecdysone, the steroid hormone that controls the timing of ISM death. Taken together, these data provide compelling evidence for the existence of piRNAs in somatic tissues and suggest that they may play roles in developmental processes such as programmed cell death.
    • How to make a rodent giant: Genomic basis and tradeoffs of gigantism in the capybara, the world's largest rodent

      Herrera-Alvarez, Santiago; Karlsson, Elinor K.; Ryder, Oliver A.; Lindblad-Toh, Kerstin; Crawford, Andrew J. (2020-11-10)
      Gigantism results when one lineage within a clade evolves extremely large body size relative to its small-bodied ancestors, a common phenomenon in animals. Theory predicts that the evolution of giants should be constrained by two tradeoffs. First, because body size is negatively correlated with population size, purifying selection is expected to be less efficient in species of large body size, leading to increased mutational load. Second, gigantism is achieved through generating a higher number of cells along with higher rates of cell proliferation, thus increasing the likelihood of cancer. To explore the genetic basis of gigantism in rodents and uncover genomic signatures of gigantism-related tradeoffs, we assembled a draft genome of the capybara (Hydrochoerus hydrochaeris), the world's largest living rodent. We found that the genome-wide ratio of non-synonymous to synonymous mutations (omega) is elevated in the capybara relative to other rodents, likely caused by a generation-time effect and consistent with a nearly-neutral model of molecular evolution. A genome-wide scan for adaptive protein evolution in the capybara highlighted several genes controlling post-natal bone growth regulation and musculoskeletal development, which are relevant to anatomical and developmental modifications for an increase in overall body size. Capybara-specific gene-family expansions included a putative novel anticancer adaptation that involves T cell-mediated tumor suppression, offering a potential resolution to the increased cancer risk in this lineage. Our comparative genomic results uncovered the signature of an intragenomic conflict where the evolution of gigantism in the capybara involved selection on genes and pathways that are directly linked to cancer.
    • HIV-1-induced cytokines deplete homeostatic innate lymphoid cells and expand TCF7-dependent memory NK cells

      Wang, Yetao; Lifshitz, Lawrence M.; McCauley, Sean M.; Vangala, Pranitha; Kim, Kyusik; Derr, Alan G.; Jaiswal, Smita; Kucukural, Alper; McDonel, Patrick; Greenough, Thomas C.; et al. (2020-03-01)
      Human immunodeficiency virus 1 (HIV-1) infection is associated with heightened inflammation and excess risk of cardiovascular disease, cancer and other complications. These pathologies persist despite antiretroviral therapy. In two independent cohorts, we found that innate lymphoid cells (ILCs) were depleted in the blood and gut of people with HIV-1, even with effective antiretroviral therapy. ILC depletion was associated with neutrophil infiltration of the gut lamina propria, type 1 interferon activation, increased microbial translocation and natural killer (NK) cell skewing towards an inflammatory state, with chromatin structure and phenotype typical of WNT transcription factor TCF7-dependent memory T cells. Cytokines that are elevated during acute HIV-1 infection reproduced the ILC and NK cell abnormalities ex vivo. These results show that inflammatory cytokines associated with HIV-1 infection irreversibly disrupt ILCs. This results in loss of gut epithelial integrity, microbial translocation and memory NK cells with heightened inflammatory potential, and explains the chronic inflammation in people with HIV-1.
    • Ribosomes guide pachytene piRNA formation on long intergenic piRNA precursors

      Sun, Yu H.; Zhu, Jiang; Xie, Li Huitong; Li, Ziwei; Meduri, Rajyalakshmi; Zhu, Xiaopeng; Song, Chi; Chen, Chen; Ricci, Emiliano P.; Weng, Zhiping; et al. (2020-02-03)
      PIWI-interacting RNAs (piRNAs) are a class of small non-coding RNAs essential for fertility. In adult mouse testes, most piRNAs are derived from long single-stranded RNAs lacking annotated open reading frames (ORFs). The mechanisms underlying how piRNA sequences are defined during the cleavages of piRNA precursors remain elusive. Here, we show that 80S ribosomes translate the 5'-proximal short ORFs (uORFs) of piRNA precursors. The MOV10L1/Armitage RNA helicase then facilitates the translocation of ribosomes into the uORF downstream regions (UDRs). The ribosome-bound UDRs are targeted by piRNA processing machinery, with the processed ribosome-protected regions becoming piRNAs. The dual modes of interaction between ribosomes and piRNA precursors underlie the distinct piRNA biogenesis requirements at uORFs and UDRs. Ribosomes also mediate piRNA processing in roosters and green lizards, implying that this mechanism is evolutionarily conserved in amniotes. Our results uncover a function for ribosomes on non-coding regions of RNAs and reveal the mechanisms underlying how piRNAs are defined.
    • The History of Farm Foxes Undermines the Animal Domestication Syndrome

      Lord, Kathryn A.; Larson, Greger; Coppinger, Raymond P.; Karlsson, Elinor K. (2020-02-01)
      The Russian Farm-Fox Experiment is the best known experimental study in animal domestication. By subjecting a population of foxes to selection for tameness alone, Dimitry Belyaev generated foxes that possessed a suite of characteristics that mimicked those found across domesticated species. This 'domestication syndrome' has been a central focus of research into the biological pathways modified during domestication. Here, we chart the origins of Belyaev's foxes in eastern Canada and critically assess the appearance of domestication syndrome traits across animal domesticates. Our results suggest that both the conclusions of the Farm-Fox Experiment and the ubiquity of domestication syndrome have been overstated. To understand the process of domestication requires a more comprehensive approach focused on essential adaptations to human-modified environments.
    • Performance of ZDOCK and IRAD in CAPRI rounds 39-45

      Vreven, Thom; Vangaveti, Sweta; Borrman, Tyler M.; Gaines, Jennifer C.; Weng, Zhiping (2020-01-29)
      We report docking performance on the six targets of Critical Assessment of PRedicted Interactions (CAPRI) rounds 39-45 that involved heteromeric protein-protein interactions and had the solved structures released since the rounds were held. Our general strategy involved protein-protein docking using ZDOCK, reranking using IRAD, and structural refinement using Rosetta. In addition, we made extensive use of experimental data to guide our docking runs. All the experimental information at the amino-acid level proved correct. However, for two targets, we also used protein-complex structures as templates for modeling interfaces. These resulted in incorrect predictions, presumably due to the low sequence identity between the targets and templates. Although the number of targets is small, the performance described here compared somewhat less favorably with our previous CAPRI reports, which may be due to the CAPRI targets being increasingly challenging.
    • Evolutionarily conserved pachytene piRNA loci are highly divergent among modern humans

      Ozata, Deniz M.; Yu, Tianxiong; Mou, Haiwei; Colpan, Cansu; Cecchini, Katharine; Kaymaz, Yasin; Fan, Kaili; Kucukural, Alper; Weng, Zhiping; Zamore, Phillip D. (2020-01-04)
      In the fetal mouse testis, PIWI-interacting RNAs (piRNAs) guide PIWI proteins to silence transposons but, after birth, most post-pubertal pachytene piRNAs map to the genome uniquely and are thought to regulate genes required for male fertility. In the human male, the developmental classes, precise genomic origins and transcriptional regulation of postnatal piRNAs remain undefined. Here, we demarcate the genes and transcripts that produce postnatal piRNAs in human juvenile and adult testes. As in the mouse, human A-MYB drives transcription of both pachytene piRNA precursor transcripts and messenger RNAs encoding piRNA biogenesis factors. Although human piRNA genes are syntenic to those in other placental mammals, their sequences are poorly conserved. In fact, pachytene piRNA loci are rapidly diverging even among modern humans. Our findings suggest that, during mammalian evolution, pachytene piRNA genes are under few selective constraints. We speculate that pachytene piRNA diversity may provide a hitherto unrecognized driver of reproductive isolation.
    • DolphinNext: A Graphical User Interface for Distributed Data Processing of High Throughput Genomics

      Kucukural, Alper (2019-12-30)
      The emergence of new biomedical technologies, such as next-generation sequencing (NGS), which produces vast amounts of genomic data every day, is driving a big data revolution in biology. The dramatic increase in the volume and production rate of genomic data has made data analysis the new bottleneck for scientific discovery. Naturally, the need for highly parallel data processing frameworks is greater than ever. It is also important for these frameworks to have certain design characteristics such as flexibility, portability, and reproducibility. Processing of sequencing data usually involves many different programs, each of which performs a specific step in the overall pipeline. Flexibility ensures that the pipelines can support a variety of use cases or data types without the need to modify existing pipelines or create new ones. Portability gives users the freedom to choose computational resources as they see fit. Reproducibility across computing environments, which warrants credibility of the results, is a particularly important feature in the face of the sheer volume of data and complexity of the pipelines. Several platforms offer graphical user interfaces for designing and executing complex pipelines (e.g. Galaxy, GenePattern, GeneProf). Unfortunately, none of these platforms supports parallelism or portability across computing environments. To address these and additional shortcomings discussed in this paper, we have created DolphinNext, an easy-to-use graphical user interface for creating and deploying complex workflows for parallel processing of high throughput genomic data. DolphinNext relies on Nextflow, a framework that enables scalable and reproducible workflows using software containers. The central idea behind the creation of DolphinNext is to facilitate building and deployment of complex pipelines using a graphically-enabled modular approach.
    • Solenodon genome reveals convergent evolution of venom in eulipotyphlan mammals

      Casewell, Nicholas R.; Karlsson, Elinor K.; Petras, Daniel; Card, Daren C.; Suranse, Vivek; Mychajliw, Alexis M.; Richards, David; Koludarov, Ivan; Albulescu, Laura-Oana; Slagboom, Julien; et al. (2019-12-17)
      Venom systems are key adaptations that have evolved throughout the tree of life and typically facilitate predation or defense. Despite venoms being model systems for studying a variety of evolutionary and physiological processes, many taxonomic groups remain understudied, including venomous mammals. Within the order Eulipotyphla, multiple shrew species and solenodons have oral venom systems. Despite morphological variation of their delivery systems, it remains unclear whether venom represents the ancestral state in this group or is the result of multiple independent origins. We investigated the origin and evolution of venom in eulipotyphlans by characterizing the venom system of the endangered Hispaniolan solenodon (Solenodon paradoxus). We constructed a genome to underpin proteomic identifications of solenodon venom toxins, before undertaking evolutionary analyses of those constituents, and functional assessments of the secreted venom. Our findings show that solenodon venom consists of multiple paralogous kallikrein 1 (KLK1) serine proteases, which cause hypotensive effects in vivo, and seem likely to have evolved to facilitate vertebrate prey capture. Comparative analyses provide convincing evidence that the oral venom systems of solenodons and shrews have evolved convergently, with the 4 independent origins of venom in eulipotyphlans outnumbering all other venom origins in mammals. We find that KLK1s have been independently coopted into the venom of shrews and solenodons following their divergence during the late Cretaceous, suggesting that evolutionary constraints may be acting on these genes. Consequently, our findings represent a striking example of convergent molecular evolution and demonstrate that distinct structural backgrounds can yield equivalent functions.