Systematic Analysis of Duplications and Deletions in the Malaria Parasite P. falciparum: A Dissertation
Authors: DeConti, Derrick K.
Faculty Advisor: Jeffrey Bailey, MD, PhD
Academic Program: Interdisciplinary Graduate Program
UMass Chan Affiliations: Medicine
Document Type: Doctoral Dissertation
Segmental Duplications, Genomic
DNA Copy Number Variations
Genomic Segmental Duplications
Abstract: Duplications and deletions are a major source of genomic variation. Duplications, specifically, have a significant impact on gene genesis and dosage, and the malaria parasite P. falciparum has developed resistance to a growing number of anti-malarial drugs via gene duplication. It also contains highly duplicated families of antigenically variable allelic genes. While specific genes and families have been studied, a comprehensive analysis of duplications and deletions within the reference genome and population has not been performed. We analyzed the extent of segmental duplications (SDs) in the P. falciparum reference genome, primarily by whole-genome self-alignment. We discovered that while 5% of the genome was identified as SD, the distribution within the genome was clustered, with the vast majority localized to the subtelomeres. Within the SDs, we found an overrepresentation of genes encoding antigenically diverse proteins exposed to the extracellular membrane, specifically the var, rifin, and stevor gene families. To examine variation of duplications and deletions within the parasite populations, we designed a novel computational methodology to identify copy number variants (CNVs) from high-throughput sequencing, using a read-depth-based approach refined with discordant read pairs. After validating the program against in vitro lab cultures, we analyzed isolates from Senegal as an initial test on clinical isolates. We then expanded our search to a global sample of 610 strains from Africa and Southeast Asia, identifying 68 CNV regions. Geographically, genic CNVs were found on average in less than 10% of the population, indicating that CNVs are rare. However, CNVs at high frequency were almost exclusively duplications associated with known drug-resistance CNVs. We also identified a novel biallelic duplication of the crt gene, containing both the chloroquine-resistant and chloroquine-sensitive alleles. The synthesis of our SD and CNV analyses indicates a CNV-conservative P. falciparum genome except where drug and human immune pressure select for gene duplication.
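To make the idea of read-depth-based CNV detection concrete, the following is a minimal sketch and not the dissertation's program. It assumes per-window mean read depths have already been computed for one chromosome, compares each window to the chromosome median, and reports runs of consecutive windows that look duplicated or deleted. The function name `call_cnv_windows`, the ratio thresholds (1.5 and 0.5), and the minimum run length are all illustrative assumptions. The refinement with discordant read pairs described in the abstract is not shown.

```python
from statistics import median


def call_cnv_windows(depths, dup_ratio=1.5, del_ratio=0.5, min_windows=2):
    """Flag runs of windows whose depth, relative to the median, suggests a CNV.

    depths: per-window mean read depth along one chromosome (hypothetical input).
    Returns a list of (start_index, end_index_exclusive, label) tuples,
    where label is "dup" or "del". Thresholds are illustrative assumptions.
    """
    baseline = median(depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    # Classify each window by its depth ratio against the baseline.
    labels = []
    for depth in depths:
        ratio = depth / baseline
        if ratio >= dup_ratio:
            labels.append("dup")
        elif ratio <= del_ratio:
            labels.append("del")
        else:
            labels.append(None)

    # Merge consecutive windows with the same label into candidate regions.
    calls = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None and i - start >= min_windows:
                calls.append((start, i, labels[start]))
            start = i
    return calls


if __name__ == "__main__":
    example = [10, 10, 10, 20, 21, 10, 10, 2, 1, 10]
    print(call_cnv_windows(example))
```

Requiring at least two adjacent windows is one simple way to suppress single-window noise. A real caller would also need to correct for sequencing biases such as GC content, which this sketch ignores.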
Permanent Link to this Item: http://hdl.handle.net/20.500.14038/32244
Rights: Copyright is held by the author, with all rights reserved.
Showing items related by title, author, creator and subject.
Duplication of U3 sequences in the long terminal repeat of mink cell focus-inducing viruses generates redundancies of transcription factor binding sites important for the induction of thymomas
DiFronzo, Nancy L.; Frieder, Marisa; Loiler, Scott A.; Pham, Quynh N.; Holland, Christie A. (2003-02-14)
The ability of mink cell focus-inducing (MCF) viruses to induce thymomas is determined, in part, by transcriptional enhancers in the U3 region of their long terminal repeats (LTRs). To elucidate sequence motifs important for enhancer function in vivo, we injected newborn mice with MCF 1dr (supF), a weakly pathogenic, molecularly tagged (supF) MCF virus containing only one copy of a sequence that is present as two copies (known as the directly repeated [DR] sequence) in the U3 region of MCF 247 and analyzed LTRs from supF-tagged proviruses in two resulting thymomas. Tagged proviruses integrated upstream and in the reverse transcriptional orientation relative to c-myc provided the focus of our studies. These proviruses are thought to contribute to thymoma induction by enhancer-mediated deregulation of c-myc expression. The U3 region in a tagged LTR in one thymoma was cloned and sequenced. Relative to MCF 1dr (supF), the cloned U3 region contained an insertion of 140 bp derived predominantly from the DR sequence of the injected virus. The inserted sequence contains predicted binding sites for transcription factors known to regulate the U3 regions of various murine leukemia viruses. Similar constellations of binding sites were duplicated in two proviral LTRs integrated upstream from c-myc in a second thymoma. We replaced the U3 sequences in an infectious molecular clone of MCF 247 with the cloned proviral U3 sequences from the first thymoma and generated an infectious chimeric virus, MCF ProEn. When injected into neonatal AKR mice, MCF ProEn was more pathogenic than the parental virus, MCF 1dr (supF), as evidenced by the more rapid onset and higher incidence of thymomas. Molecular analyses of the resultant thymomas indicated that the U3 region of MCF ProEn was genetically stable. These data suggest that the arrangement and/or redundancy of transcription factor binding sites generated by specific U3 sequence duplications are important to the biological events mediated by MCF proviruses integrated near c-myc that contribute to transformation.
Genome-wide analysis of copy number variants in attention deficit hyperactivity disorder: the role of rare variants and duplications at 15q13.3
Williams, Nigel M.; Franke, Barbara; Mick, Eric; Anney, Richard J.; Freitag, Christine M.; Gill, Michael; Thapar, Anita; O'Donovan, Michael C.; Owen, Michael J.; Holmans, Peter; et al. (2012-02-01)
OBJECTIVE: Attention deficit hyperactivity disorder (ADHD) is a common, highly heritable psychiatric disorder. Because of its multifactorial etiology, however, identifying the genes involved has been difficult. The authors followed up on recent findings suggesting that rare copy number variants (CNVs) may be important for ADHD etiology. METHOD: The authors performed a genome-wide analysis of large, rare CNVs (100 kb in size, which segregated into 912 independent loci. Overall, the rate of rare CNVs >100 kb was 1.15 times higher in ADHD case subjects relative to comparison subjects, with duplications spanning known genes showing a 1.2-fold enrichment. In accordance with a previous study, rare CNVs >500 kb showed the greatest enrichment (1.28-fold). CNVs identified in ADHD case subjects were significantly enriched for loci implicated in autism and in schizophrenia. Duplications spanning the CHRNA7 gene at chromosome 15q13.3 were associated with ADHD in single-locus analysis. This finding was consistently replicated in an additional 2,242 ADHD case subjects and 8,552 comparison subjects from four independent cohorts from the United Kingdom, the United States, and Canada. Presence of the duplication at 15q13.3 appeared to be associated with comorbid conduct disorder. CONCLUSIONS: These findings support the enrichment of large, rare CNVs in ADHD and implicate duplications at 15q13.3 as a novel risk factor for ADHD. With a frequency of 0.6% in the populations investigated and a relatively large effect size (odds ratio=2.22, 95% confidence interval=1.5-3.6), this locus could be an important contributor to ADHD etiology.
Chromosomal duplications and cointegrates generated by the bacteriophage lambda Red system in Escherichia coli K-12
Poteete, Anthony R.; Fenton, Anita C.; Nadkarni, Ashwini (2004-12-15)
BACKGROUND: An Escherichia coli strain in which RecBCD has been genetically replaced by the bacteriophage lambda Red system engages in efficient recombination between its chromosome and linear double-stranded DNA species sharing sequences with the chromosome. Previous studies of this experimental system have focused on a gene replacement-type event, in which a 3.5 kbp dsDNA consisting of the cat gene and flanking lac operon sequences recombines with the E. coli chromosome to generate a chloramphenicol-resistant Lac- recombinant. The dsDNA was delivered into the cell as part of the chromosome of a non-replicating lambda vector, from which it was released by the action of a restriction endonuclease in the infected cell. This study characterizes the genetic requirements and outcomes of a variety of additional Red-promoted homologous recombination events producing Lac+ recombinants. RESULTS: A number of observations concerning recombination events between the chromosome and linear DNAs were made: (1) Formation of Lac+ and Lac- recombinants depended upon the same recombination functions. (2) High multiplicity and high chromosome copy number favored Lac+ recombinant formation. (3) The Lac+ recombinants were unstable, segregating Lac- progeny. (4) A tetracycline-resistance marker in a site of the phage chromosome distant from cat was not frequently co-inherited with cat. (5) Recombination between phage sequences in the linear DNA and cryptic prophages in the chromosome was responsible for most of the observed Lac+ recombinants. In addition, observations were made concerning recombination events between the chromosome and circular DNAs: (6) Formation of recombinants depended upon both RecA and, to a lesser extent, Red. (7) The linked tetracycline-resistance marker was frequently co-inherited in this case. CONCLUSIONS: The Lac+ recombinants arise from events in which homologous recombination between the incoming linear DNA and both lac and cryptic prophage sequences in the chromosome generates a partial duplication of the bacterial chromosome. When the incoming DNA species is circular rather than linear, cointegrates are the most frequent type of recombinant.