TY - JOUR
T1 - Multiobjective Reinforcement Learning for Cognitive Satellite Communications Using Deep Neural Network Ensembles
AU - Ferreira, Paulo Victor Rodrigues
AU - Paffenroth, Randy
AU - Wyglinski, Alexander M.
AU - Hackett, Timothy M.
AU - Bilen, Sven G.
AU - Reinhart, Richard C.
AU - Mortensen, Dale J.
N1 - Publisher Copyright: © 1983-2012 IEEE.
PY - 2018/5
Y1 - 2018/5
AB - Future spacecraft communication subsystems will potentially benefit from software-defined radios controlled by artificial intelligence algorithms. In this paper, we propose a novel radio resource allocation algorithm leveraging multiobjective reinforcement learning and artificial neural network ensembles able to manage available resources and conflicting mission-based goals. The uncertainty in the performance of thousands of possible radio parameter combinations and the dynamic behavior of the radio channel over time produce a continuous multidimensional state-action space, which requires a fixed-size-memory continuous state-action mapping instead of the traditional discrete mapping. In addition, actions need to be decoupled from states in order to allow for online learning, performance monitoring, and resource allocation prediction. The proposed approach leverages the authors' previous research on constraining decisions predicted to have poor performance through 'virtual environment exploration.' The simulation results show the performance for different communication mission profiles, and accuracy benchmarks are provided for future research reference. The proposed approach constitutes part of the core cognitive engine proof-of-concept delivered to the NASA John H. Glenn Research Center's SCaN Testbed radios onboard the International Space Station.
UR - http://www.scopus.com/inward/record.url?scp=85046431536&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046431536&partnerID=8YFLogxK
U2 - 10.1109/JSAC.2018.2832820
DO - 10.1109/JSAC.2018.2832820
M3 - Article
AN - SCOPUS:85046431536
SN - 0733-8716
VL - 36
SP - 1030
EP - 1041
JO - IEEE Journal on Selected Areas in Communications
JF - IEEE Journal on Selected Areas in Communications
IS - 5
M1 - 8353861
ER -