Progressive alignment with Cactus: a multiple-genome aligner for the thousand-genome era [preprint]
Armstrong, Joel ; Karlsson, Elinor K ; Paten, Benedict
Abstract
Cactus, a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequence. We describe progressive extensions to Cactus that enable reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We show that Cactus is capable of scaling to hundreds of genomes and beyond by describing results from an alignment of over 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment yet created. Further, we show improvements in orthology resolution leading to downstream improvements in annotation.
Source
bioRxiv 730531; doi: https://doi.org/10.1101/730531.
Notes
Full author list omitted for brevity. For the full list of authors, see paper.