TY - GEN
T1 - Preliminary Results on Distribution Shift Performance of Deep Networks for Synthetic Aperture Sonar Classification
AU - Gerg, Isaac D.
AU - Monga, Vishal
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - We demonstrate how easily deep networks are fooled by distribution shifts induced by sonar-relevant transforms, imaging errors common to synthetic aperture sonar (SAS), and unseen target/background combinations. Furthermore, the images that fool the networks remain trivial for human operators to interpret. We posit that this disconnect between human and machine performance is an open area of research and that reconciling it will improve human-machine trust. Our goal with this work is to begin discerning where deep network performance (specifically, that of convolutional neural networks (CNNs)) deteriorates and how their perception model differs from that of humans. Specifically, we show that network performance varies widely across contemporary architectures and training schemes on: (1) images derived from a set of sonar-relevant transformations, which we call semantically stable; (2) imagery perturbed with quadratic phase error (common to SAS); and (3) a synthetic target dataset created by injecting real targets into unseen real backgrounds. Finally, we delineate the relationship between spatial frequency and network performance and find that many networks rely almost exclusively on low-frequency content to make their predictions. These results may help illuminate why changes to a sonar system or simulation sometimes necessitate complete network retraining to accommodate the 'new' data, a time-consuming process. Consequently, we hope this work stimulates future research bridging the gap between human and machine perception in automated SAS image interpretation.
UR - http://www.scopus.com/inward/record.url?scp=85145774686&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145774686&partnerID=8YFLogxK
DO - 10.1109/OCEANS47191.2022.9977362
M3 - Conference contribution
AN - SCOPUS:85145774686
T3 - Oceans Conference Record (IEEE)
BT - OCEANS 2022 Hampton Roads
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 OCEANS Hampton Roads, OCEANS 2022
Y2 - 17 October 2022 through 20 October 2022
ER -