TY - JOUR
T1 - PlantTribes
T2 - A gene and gene family resource for comparative genomics in plants
AU - Wall, P. Kerr
AU - Leebens-Mack, Jim
AU - Müller, Kai F.
AU - Field, Dawn
AU - Altman, Naomi S.
AU - dePamphilis, Claude W.
N1 - Funding Information:
The authors thank our faculty, postdoctoral and student colleagues in the Floral Genome Project, the Ancestral Angiosperm Genome Project and the poplar and papaya genome projects for their enthusiastic support and use of PlantTribes through its initial stages of development. We would like to thank Hong Ma, John Carlson and Victor Albert for invaluable discussions of the biological implications of PlantTribes. Finally, we thank Josh Marion, Tony Orenga, Severn Everett, Kevin Beckmann, Anthony Carroll and Erik Wolcott for their assistance in the development of portions of the PlantTribes database and web interface. This work was funded by the National Science Foundation (DEB 0115684 to C.W.D. and J.L.-M., DEB 0638595 to C.W.D. and J.L.-M., and DBI-0501890 to C.W.D.). K.F.M. was supported by a scholarship from the Deutsche Telekom Stiftung. Funding to pay the Open Access publication charges for this article was provided by NSF DEB 0638595.
PY - 2008/1
Y1 - 2008/1
N2 - The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575-1584)] to classify all of these species' protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study.
UR - http://www.scopus.com/inward/record.url?scp=38549087500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38549087500&partnerID=8YFLogxK
U2 - 10.1093/nar/gkm972
DO - 10.1093/nar/gkm972
M3 - Article
C2 - 18073194
AN - SCOPUS:38549087500
SN - 0305-1048
VL - 36
SP - D970-D976
JO - Nucleic acids research
JF - Nucleic acids research
IS - SUPPL. 1
ER -