TY - JOUR
T1 - Computer vision cracks the leaf code
AU - Wilf, Peter
AU - Zhang, Shengping
AU - Chikkerur, Sharat
AU - Little, Stefan A.
AU - Wing, Scott L.
AU - Serre, Thomas
N1 - Funding Information:
ACKNOWLEDGMENTS. We thank A. Young and J. Kissell for image preparations, the anonymous reviewers for helpful comments, Y. Guo for software, A. Rozo for book scanning, and D. Erwin for assistance with the Axelrod collection. We acknowledge financial support from the David and Lucile Packard Foundation (to P.W.); National Science Foundation Early Career Award IIS-1252951, Defense Advanced Research Projects Agency Young Investigator Award N66001-14-1-4037, Office of Naval Research Grant N000141110743, and the Brown Center for Computation and Visualization (to T.S.); and National Natural Science Foundation of China Grant 61300111 and Key Program Grant 61133003 (to S.Z.).
PY - 2016/3/22
Y1 - 2016/3/22
N2 - Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany. Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. Previous computer vision approaches have primarily focused on leaf identification at the species level. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here, we tested whether a computer vision algorithm could use a database of 7,597 leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The images are of cleared leaves, specimens that are chemically bleached, then stained to reveal venation. Machine learning was used to learn a codebook of visual elements representing leaf shape and venation patterns. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance. Of direct botanical interest, the responses of diagnostic features can be visualized on leaf images as heat maps, which are likely to prompt recognition and evolutionary interpretation of a wealth of novel morphological characters. With assistance from computer vision, leaves are poised to make numerous new contributions to systematic and paleobotanical studies.
AB - Understanding the extremely variable, complex shape and venation characters of angiosperm leaves is one of the most challenging problems in botany. Machine learning offers opportunities to analyze large numbers of specimens, to discover novel leaf features of angiosperm clades that may have phylogenetic significance, and to use those characters to classify unknowns. Previous computer vision approaches have primarily focused on leaf identification at the species level. It remains an open question whether learning and classification are possible among major evolutionary groups such as families and orders, which usually contain hundreds to thousands of species each and exhibit many times the foliar variation of individual species. Here, we tested whether a computer vision algorithm could use a database of 7,597 leaf images from 2,001 genera to learn features of botanical families and orders, then classify novel images. The images are of cleared leaves, specimens that are chemically bleached, then stained to reveal venation. Machine learning was used to learn a codebook of visual elements representing leaf shape and venation patterns. The resulting automated system learned to classify images into families and orders with a success rate many times greater than chance. Of direct botanical interest, the responses of diagnostic features can be visualized on leaf images as heat maps, which are likely to prompt recognition and evolutionary interpretation of a wealth of novel morphological characters. With assistance from computer vision, leaves are poised to make numerous new contributions to systematic and paleobotanical studies.
UR - http://www.scopus.com/inward/record.url?scp=84962310740&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962310740&partnerID=8YFLogxK
U2 - 10.1073/pnas.1524473113
DO - 10.1073/pnas.1524473113
M3 - Article
C2 - 26951664
AN - SCOPUS:84962310740
SN - 0027-8424
VL - 113
SP - 3305
EP - 3310
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 12
ER -