Myb domain proteins contain a conserved DNA-binding domain composed of one to four conserved repeat motifs. In animals, Myb proteins are encoded by a small gene family and commonly contain three repeat motifs (R1R2R3), whereas plant Myb proteins are encoded by a very large and diverse gene family in which a motif containing two repeats (R2R3) is the most common. In contrast to the conservation in the Myb domain, other regions of Myb proteins are highly variable. To explore the evolutionary origin of Myb genes, we cloned and sequenced Myb domains from maize and sorghum and conducted a comprehensive phylogenetic analysis of Myb genes. The results indicate that the origins of individual Myb repeats are strikingly distinct, and that the R2 repeat has evolved more slowly than the R1 and R3 repeats. However, it is not clear which repeat is the most ancient. The evidence also suggests that R2R3 and R1R2R3 Myb genes co-existed in eukaryotes before the divergence of plants and animals. Based on our results, we propose that R1R2R3 Myb genes were derived from R2R3 Myb genes by gain of the R1 repeat through an ancient intragenic duplication; this gain model is more parsimonious than the previous proposal that R2R3 Myb genes were derived from R1R2R3 Myb genes by loss of the R1 repeat. A separate group of diverse non-typical Myb proteins exhibits a polyphyletic origin and a complex evolutionary pattern. Finally, we identify a small group of ancient Myb paralogs that predate the amplification of current Myb genes. Together, these results support a new model for the ordered evolution of the Myb gene family.