Abstract
The Pseudomonas syringae species complex (PSSC) is a diverse group of plant pathogens with a collective host range encompassing almost every food crop grown today. Because the complex threatens global food security, rapid detection and characterization of epidemic and emerging pathogenic lineages are essential. However, phylogenetic identification is often complicated by an unresolved and ever-changing taxonomy, which makes it difficult to use available databases in practice and to train classifiers properly. Consequently, although amplicon sequencing is a common method for routine identification of PSSC isolates, there is no efficient method for accurate classification based on these data. Here we present a suite of five Naïve Bayes classifiers for PCR primer sets widely used for PSSC identification, trained on in silico amplicon data from 2,161 published PSSC genomes using the life identification number (LIN) hierarchical clustering algorithm in place of traditional Linnaean taxonomy. Additionally, we include a dataset for translating classification results back into traditional taxonomic nomenclature (i.e., species, phylogroup, pathovar) and for predicting virulence factor repertoires.
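To illustrate the general idea of Naïve Bayes classification of amplicon sequences, the following is a minimal sketch rather than the authors' implementation. It assumes a multinomial model over overlapping k-mers with Laplace smoothing. The class name `KmerNaiveBayes`, the toy sequences, and the labels `group_A` and `group_B` (hypothetical stand-ins for LIN-derived groups) are all invented for illustration and do not come from the published classifiers or dataset.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k=4):
    """Return the overlapping k-mers of a DNA sequence."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class KmerNaiveBayes:
    """Toy multinomial Naive Bayes over k-mer counts (illustrative only)."""

    def __init__(self, k=4, alpha=1.0):
        self.k = k                      # k-mer length (assumed value)
        self.alpha = alpha              # Laplace smoothing constant
        self.kmer_counts = defaultdict(Counter)
        self.class_counts = Counter()
        self.vocab = set()

    def fit(self, sequences, labels):
        # Count how often each k-mer occurs in each labeled group.
        for seq, label in zip(sequences, labels):
            self.class_counts[label] += 1
            for km in kmers(seq, self.k):
                self.kmer_counts[label][km] += 1
                self.vocab.add(km)
        return self

    def log_scores(self, seq):
        # log P(class) + sum over k-mers of log P(k-mer | class).
        n_train = sum(self.class_counts.values())
        v = len(self.vocab)
        scores = {}
        for label, n_label in self.class_counts.items():
            total = sum(self.kmer_counts[label].values())
            score = math.log(n_label / n_train)
            for km in kmers(seq, self.k):
                if km not in self.vocab:
                    continue  # ignore k-mers never seen in training
                count = self.kmer_counts[label][km]
                score += math.log((count + self.alpha) / (total + self.alpha * v))
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_scores(seq)
        return max(scores, key=scores.get)


# Hypothetical toy training data; real amplicons are far longer.
train_seqs = [
    "ACGTACGTACGTACGT",
    "ACGTACGAACGTACGT",
    "TTGGCCAATTGGCCAA",
    "TTGGCCTATTGGCCAA",
]
train_labels = ["group_A", "group_A", "group_B", "group_B"]

clf = KmerNaiveBayes(k=4).fit(train_seqs, train_labels)
print(clf.predict("ACGTACGTACGA"))
print(clf.predict("TTGGCCAATTGG"))
```

In a hierarchical labeling scheme, a predicted group label would then be looked up in a separate table to recover conventional names such as species, phylogroup, or pathovar; that lookup is not shown here.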
| Original language | English (US) |
|---|---|
| Article number | 178 |
| Journal | Scientific Data |
| Volume | 11 |
| Issue number | 1 |
| DOIs | |
| State | Published - Dec 2024 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 2 Zero Hunger
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Information Systems
- Education
- Computer Science Applications
- Statistics, Probability and Uncertainty
- Library and Information Sciences
Fingerprint
Dive into the research topics of 'Naïve Bayes Classifiers and accompanying dataset for Pseudomonas syringae isolate characterization'. Together they form a unique fingerprint.