Contamination of fungal genomes of Onygenaceae (Phylum Ascomycota) in public databases: incidence, detection, and impact
- Author: mycolabadmin
- 11/19/2025
Summary
Scientists found that many fungal genome sequences stored in public databases contain unwanted bacterial DNA that can interfere with research results. They developed a method to identify and remove this contamination using related high-quality fungal genomes as a reference. After four contaminated genomes were cleaned, assembly quality improved and contamination dropped from 5-12% to below 3%, showing that careful screening is essential for reliable genetic research.
Background
Genomic datasets in public databases often contain foreign or erroneous nucleotide sequences that compromise genome analyses. Few studies have addressed contamination specifically in fungal genomes, despite its potential to reduce the accuracy and reliability of results.
Objective
To assess the presence of contaminant sequences in publicly available reference genomes of fungi from the family Onygenaceae, improve their quality through decontamination, and demonstrate the potential impact of using contaminated data on downstream analyses.
Results
Four genomes showed contamination levels between 5% and 12%, primarily of bacterial origin. After filtering, contamination dropped below 3% and assembly quality metrics improved. Functional annotation revealed a reduction in bacteria-associated protein families, while phylogenetic analyses confirmed the effectiveness of the decontamination strategy.
Conclusion
Publicly available Onygenaceae fungal genomes contain significant contamination that can bias downstream analyses. The use of high-quality reference genomes as a custom database effectively filters contaminants, emphasizing the critical importance of rigorous quality control measures for reliable genomic data.
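The filtering idea can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each contig has already been compared against a custom database of high-quality fungal reference genomes, and that contamination is reported as the length-weighted share of contigs with no match. The `Contig` class, the `matches_reference` flag, and both function names are hypothetical, and the contig lengths are toy values rather than data from the study.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contig:
    name: str
    length: int               # contig length in base pairs
    matches_reference: bool   # hypothetical flag: contig aligned to the custom fungal reference database


def contamination_percent(contigs: List[Contig]) -> float:
    """Length-weighted percentage of the assembly with no match to the reference database."""
    total = sum(c.length for c in contigs)
    if total == 0:
        return 0.0
    foreign = sum(c.length for c in contigs if not c.matches_reference)
    return 100.0 * foreign / total


def decontaminate(contigs: List[Contig]) -> List[Contig]:
    """Keep only contigs that matched the reference database (assumed filtering rule)."""
    return [c for c in contigs if c.matches_reference]


# Toy assembly with made-up lengths, for illustration only.
assembly = [
    Contig("contig_1", 900_000, True),
    Contig("contig_2", 60_000, False),
    Contig("contig_3", 40_000, False),
]

before = contamination_percent(assembly)
cleaned = decontaminate(assembly)
after = contamination_percent(cleaned)
print(f"before: {before:.1f}%  after: {after:.1f}%  contigs kept: {len(cleaned)}")
```

In practice a contig without a reference match is not necessarily a contaminant, since it may be genuine lineage-specific sequence, so a real workflow would likely confirm the taxonomic origin of flagged contigs before discarding them.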
- Published in: BMC Genomics
- Study Type: Genome Quality Assessment Study
- Source: 10.1186/s12864-025-12223-3, PMID: 41257550