Rashid, U.; Wu, C.; Shiller, J.; Smith, K.; Crowhurst, R.; Davy, M.; Chen, T.-H.; Carvajal, I.; Bailey, S.; Thomson, S.; Deng, C. AssemblyQC: A NextFlow Pipeline for Reproducible Reporting of Assembly Quality. Preprints 2024, 2024060518. https://doi.org/10.20944/preprints202406.0518.v1
APA Style
Rashid, U., Wu, C., Shiller, J., Smith, K., Crowhurst, R., Davy, M., Chen, T. H., Carvajal, I., Bailey, S., Thomson, S., & Deng, C. (2024). AssemblyQC: A NextFlow Pipeline for Reproducible Reporting of Assembly Quality. Preprints. https://doi.org/10.20944/preprints202406.0518.v1
Chicago/Turabian Style
Rashid, U., Susan Thomson and Cecilia Deng. 2024. "AssemblyQC: A NextFlow Pipeline for Reproducible Reporting of Assembly Quality." Preprints. https://doi.org/10.20944/preprints202406.0518.v1
Abstract
Summary
Genome assembly projects have grown exponentially due to breakthroughs in sequencing technologies and assembly algorithms. Evaluating the quality of genome assemblies is critical to ensure the reliability of downstream analysis and interpretation. To fulfil this task, we have developed the AssemblyQC pipeline, which performs file-format validation, contaminant checking, contiguity measurement, gene- and repeat-space completeness quantification, telomere inspection, taxonomic assignment, synteny alignment, scaffold examination through Hi-C contact-map visualisation, and assessments of completeness, consensus quality and phasing through k-mer analysis. It produces a comprehensive HTML report with method descriptions, tables, and visualisations.
Availability and Implementation
The pipeline uses Nextflow for workflow orchestration and adheres to the best practices established by the nf-core community. It offers a reproducible, scalable, and portable method to assess the quality of genome assemblies. The code is available on GitHub: https://github.com/Plant-Food-Research-Open/assemblyqc
Supplementary Information
Pipeline usage documentation, parameter descriptions and example outputs are available on GitHub: https://github.com/Plant-Food-Research-Open/assemblyqc/tree/main/docs. A preview report is also hosted online: https://plant-food-research-open.github.io/assemblyqc
Keywords
Genome; Quality assessment; Nextflow
Subject
Biology and Life Sciences, Other
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.