A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections
Name: A Snakemake Toolkit for the Batch ...
Size: 3.532Mb
Format: PDF
Description: Published version
Issue date
2024-10-28
Subject Terms
assembly
bioinformatics
DNA barcode
genome skim
iBOL
museomics
phylogenetics
snakemake
Abstract
Low coverage ‘genome‐skims’ are often used to assemble organelle genomes and ribosomal gene sequences for cost‐effective phylogenetic and barcoding studies. Natural history collections hold invaluable biological information, yet poor preservation resulting in degraded DNA often hinders polymerase chain reaction‐based analyses. However, it is possible to generate libraries and sequence the short fragments typical of degraded DNA to generate genome‐skims from museum collections. Here we introduce a snakemake toolkit comprised of three pipelines skim2mito, skim2rrna and gene2phylo, designed to unlock the genomic potential of historical museum specimens using genome skimming. Specifically, skim2mito and skim2rrna perform the batch assembly, annotation and phylogenetic analysis of mitochondrial genomes and nuclear ribosomal genes, respectively, from low‐coverage genome skims. The third pipeline gene2phylo takes a set of gene alignments and performs phylogenetic analysis of individual genes, partitioned analysis of concatenated alignments and a phylogenetic analysis based on gene trees. We benchmark our pipelines with simulated data, followed by testing with a novel genome skimming dataset from both recent and historical solariellid gastropod samples. We show that the toolkit can recover mitochondrial and ribosomal genes from poorly preserved museum specimens of the gastropod family Solariellidae, and the phylogenetic analysis is consistent with our current understanding of taxonomic relationships.
The generation of bioinformatic pipelines that facilitate processing large quantities of sequence data from the vast repository of specimens held in natural history museum collections will greatly aid species discovery and exploration of biodiversity over time, ultimately aiding conservation efforts in the face of a changing planet.
Citation
White, O.W., Hall, A., Price, B.W., Williams, S.T. and Clark, M.D. (2025), A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections. Mol Ecol Resour, 25: e14036. https://doi.org/10.1111/1755-0998.14036
Publisher
Wiley
Journal
Molecular Ecology Resources
Type
Journal Article
Item Description
Copyright © 2024 The Author(s). Molecular Ecology Resources published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. The attached file is the published version of the article.
NHM Repository
ISSN
1755-098X
EISSN
1755-0998
DOI
10.1111/1755-0998.14036