TY - CONF
T1 - Branch detection with apple trees trained in fruiting wall architecture using stereo vision and Regions-Convolutional Neural Network (R-CNN)
AU - Zhang, Jing
AU - He, Long
AU - Karkee, Manoj
AU - Zhang, Qin
AU - Zhang, Xin
AU - Gao, Zongmei
N1 - Funding Information:
This research was supported in part by USDA Hatch and Multistate Project Funds (Accession Nos. 1005756 and 1001246), a USDA National Institute for Food and Agriculture competitive grant (Accession No. 1005200), Beijing Municipal Science and Technology Commission Project (Grant No. D161100003216002), and the Washington State University (WSU) Agricultural Research Center. The China Scholarship Council (CSC) sponsored Jing Zhang in conducting collaborative PhD dissertation research at the WSU Center for Precision and Automated Agricultural Systems (CPAAS). Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the USDA and Washington State University.
PY - 2017
Y1 - 2017
N2 - Due to the rising cost and decreasing availability of labor, manual picking is becoming increasingly challenging for apple and other tree fruit growers. A targeted shake-and-catch apple harvesting machine is under development at Washington State University to address this challenge. This machine is showing promise for harvesting some varieties of apples. However, the performance and productivity of such a harvesting system could be greatly increased if the shaking process could be automated. The first step towards automated shaking is the detection of branches and the localization of shaking points according to the position of branches in the real world. A branch detection method was developed in this work for apple trees trained in fruiting wall architecture using a stereo vision system and a Regions-Convolutional Neural Network (R-CNN). A stereo vision camera was used to acquire RGB images, depth images, and index images in a natural orchard environment. The R-CNN was composed of an improved AlexNet, which was trained to detect the apple tree branches. In this study, a fusion detection method called Depth & Index (D&I) was proposed to fuse the detection results of branches from both depth images and index images. The results showed that the average recall and Average Accuracy (AA) from the D&I method were 70.5% and 63.3%, respectively, when the R-CNN confidence of the depth image was 50.0%. However, under the same conditions, the average recall and AA were only 62.7% and 59.2%, respectively, using the depth images alone. Furthermore, the D&I method also performed better in terms of the morphology fitting of apple tree branches. This study showed great potential for using both depth and index images to detect and fit apple tree branches in real time.
AB - Due to the rising cost and decreasing availability of labor, manual picking is becoming increasingly challenging for apple and other tree fruit growers. A targeted shake-and-catch apple harvesting machine is under development at Washington State University to address this challenge. This machine is showing promise for harvesting some varieties of apples. However, the performance and productivity of such a harvesting system could be greatly increased if the shaking process could be automated. The first step towards automated shaking is the detection of branches and the localization of shaking points according to the position of branches in the real world. A branch detection method was developed in this work for apple trees trained in fruiting wall architecture using a stereo vision system and a Regions-Convolutional Neural Network (R-CNN). A stereo vision camera was used to acquire RGB images, depth images, and index images in a natural orchard environment. The R-CNN was composed of an improved AlexNet, which was trained to detect the apple tree branches. In this study, a fusion detection method called Depth & Index (D&I) was proposed to fuse the detection results of branches from both depth images and index images. The results showed that the average recall and Average Accuracy (AA) from the D&I method were 70.5% and 63.3%, respectively, when the R-CNN confidence of the depth image was 50.0%. However, under the same conditions, the average recall and AA were only 62.7% and 59.2%, respectively, using the depth images alone. Furthermore, the D&I method also performed better in terms of the morphology fitting of apple tree branches. This study showed great potential for using both depth and index images to detect and fit apple tree branches in real time.
UR - http://www.scopus.com/inward/record.url?scp=85035327628&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85035327628&partnerID=8YFLogxK
U2 - 10.13031/aim.201700427
DO - 10.13031/aim.201700427
M3 - Paper
AN - SCOPUS:85035327628
T2 - 2017 ASABE Annual International Meeting
Y2 - 16 July 2017 through 19 July 2017
ER -