Systematic Analysis of Duplications and Deletions in the Malaria Parasite P. falciparum: A Dissertation
Authors
DeConti, Derrick K.
Faculty Advisor
Jeffrey Bailey, MD, PhD
Academic Program
Interdisciplinary Graduate Program
UMass Chan Affiliations
Medicine
Document Type
Doctoral Dissertation
Publication Date
2015-04-15
Keywords
Dissertations, UMMS
Plasmodium falciparum
Gene Duplication
Segmental Duplications, Genomic
DNA Copy Number Variations
Bioinformatics
Computational Biology
Genetics
Genomics
Abstract
Duplications and deletions are a major source of genomic variation. Duplications, specifically, have a significant impact on gene genesis and dosage, and the malaria parasite P. falciparum has developed resistance to a growing number of anti-malarial drugs via gene duplication. It also contains highly duplicated families of antigenically variable allelic genes. While specific genes and families have been studied, a comprehensive analysis of duplications and deletions within the reference genome and population has not been performed. We analyzed the extent of segmental duplications (SDs) in the reference genome for P. falciparum, primarily by a whole-genome self-alignment. We discovered that while 5% of the genome was identified as SD, the distribution within the genome was partition clustered, with the vast majority localized to the subtelomeres. Within the SDs, we found an overrepresentation of genes encoding antigenically diverse proteins exposed to the extracellular membrane, specifically the var, rifin, and stevor gene families. To examine variation of duplications and deletions within the parasite populations, we designed a novel computational methodology to identify copy number variants (CNVs) from high-throughput sequencing, using a read-depth-based approach refined with discordant read pairs. After validating the program against in vitro lab cultures, we analyzed isolates from Senegal as an initial test on clinical isolates. We then expanded our search to a global sample of 610 strains from Africa and South East Asia, identifying 68 CNV regions. Geographically, genic CNVs were found on average in less than 10% of the population, indicating that CNVs are rare. However, CNVs at high frequency were almost exclusively duplications associated with known drug-resistance CNVs. We also identified a novel biallelic duplication of the crt gene, containing both the chloroquine-resistant and chloroquine-sensitive alleles.
The synthesis of our SD and CNV analysis indicates a CNV conservative P. falciparum genome except where drug and human immune pressure select for gene duplication.
DOI
10.13028/M2601C
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/32244
Rights
Copyright is held by the author, with all rights reserved.
Related items
Showing items related by title, author, creator and subject.
-
The Importance of the Centrosomal Localization Sequence of Cyclin E for Promoting Centrosome Duplication: A Dissertation
Nordberg, Joshua J. (2011-05-24)
This thesis comprises three separate studies that investigate the consequences of supernumerary centrosomes, the effect of centrosome loss, and a control mechanism for regulating CDK2/cyclin E activity in centrosome duplication. The centrosome is the major microtubule-organizing center of the cell. When the cell enters mitosis, it is of critical importance that the cell has exactly two centrosomes in order to properly segregate the chromosomes to two daughter cells. Supernumerary centrosomes are a problem for the cell in that they increase the incidence of chromosomal instability. Aberrant centrosome numbers are seen in a number of cancers, and there has been a proposed connection between the loss of function of p53 and multiple centrosomes. We investigated the consequences of multiple centrosomes in p53-null mouse embryonic fibroblasts (MEFs) to determine how cells with multiple centrosomes can continue to propagate and become cancer. We found that even in the face of extra centrosomes, p53-null MEFs are able to divide in a bipolar fashion by bundling extra centrosomes into two spindle poles. The centrosome has also been proposed to play a role in cell cycle control. We followed up on a previous study, which had suggested that centrosome loss causes a G1 arrest. We found that cells did not arrest in G1 due to centrosome removal as previously reported, but instead the arrest was dependent on additional stressors, namely the incident light used for our long-term live-cell observations. Our study showed that centrosome loss is a detectable stress that, in conjunction with additional stresses, can contribute to cell cycle arrest. It is known that CDK2/cyclin E activity is required to promote centrosome duplication. But with the discovery of a centrosomal localization sequence (CLS) in cyclin E, we wanted to know if centrosome duplication required a specific sub-cellular localization of CDK2 kinase activity. We found that centrosome duplication in Xenopus extract was dependent on CLS-mediated centrosomal localization of cyclin E, in complex with CDK2. Our results point to a mechanism for regulating centrosome duplication in the face of high cytoplasmic CDK2/cyclin E kinase activity.
-
Multiplying madly: deacetylases take charge of centrosome duplication and amplification
Hung, Hui-Fang; Hehnly, Heidi; Doxsey, Stephen J (2012-12-01)
Comment on: Ling H, et al. Cell Cycle 2012; 11:3779–91; PMID:23022877; http://dx.doi.org/10.4161/cc.21985
-
Elimination of PCR duplicates in RNA-seq and small RNA-seq using unique molecular identifiers [preprint]
Fu, Yu; Beane, Timothy J.; Zamore, Phillip D.; Weng, Zhiping (2018-01-22)
RNA-seq and small RNA-seq are powerful, quantitative tools to study gene regulation and function. Common high-throughput sequencing methods rely on polymerase chain reaction (PCR) to expand the starting material, but not every molecule amplifies equally, causing some to be overrepresented. Unique molecular identifiers (UMIs) can be used to distinguish undesirable PCR duplicates derived from a single molecule and identical but biologically meaningful reads from different molecules. We have incorporated UMIs into RNA-seq and small RNA-seq protocols and developed tools to analyze the resulting data. Our UMIs contain stretches of random nucleotides whose lengths sufficiently capture diverse molecule species in both RNA-seq and small RNA-seq libraries generated from mouse testis. Our approach yields high-quality data while allowing unique tagging of all molecules in high-depth libraries. Using simulated and real datasets, we demonstrate that our methods increase the reproducibility of RNA-seq and small RNA-seq data. Notably, we find that the amount of starting material and sequencing depth, but not the number of PCR cycles, determine PCR duplicate frequency. Finally, we show that computational removal of PCR duplicates based only on their mapping coordinates introduces substantial bias into data analysis.