TY - GEN
T1 - EOTD
T2 - 18th IEEE International Conference on Data Mining, ICDM 2018
AU - Xiao, Houping
AU - Wang, Fei
AU - Ma, Fenglong
AU - Gao, Jing
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/27
Y1 - 2018/12/27
N2 - A tensor (i.e., an N-mode array) is a natural representation for multidimensional data. Tucker Decomposition (TD) is one of the most popular methods, and a series of batch TD algorithms have been extensively studied and widely applied in signal/image processing, bioinformatics, etc. However, in many applications, the large-scale tensor is dynamically evolving at all modes, which poses significant challenges for existing approaches to track the TD for such dynamic tensors. In this paper, we propose an efficient Online Tucker Decomposition (eOTD) approach to track the TD of dynamic tensors with an arbitrary number of modes. We first propose corollaries on the multiplication of block tensor matrix. Based on this corollary, eOTD allows us 1) to update the projection matrices using those projection matrices from the previous timestamp and the auxiliary matrices from the current timestamp, and 2) to update the core tensor by a sum of tensors that are obtained by multiplying smaller tensors with matrices. The auxiliary matrices are obtained by solving a series of least square regression tasks, not by performing Singular Value Decompositions (SVD). This overcomes the bottleneck in computation and storage caused by computing SVDs on largescale data. A Modified Gram-Schmidt (MGS) process is further applied to orthonormalize the projection matrices. Theoretically, the output of the eOTD framework is guaranteed to be lowrank. We further prove that the MGS process will not increase Tucker decomposition error. Empirically, we demonstrate that the proposed eOTD achieves comparable accuracy with a significant speedup on both synthetic and real data, where the speedup can be more than 1,500 times on large-scale data.
AB - A tensor (i.e., an N-mode array) is a natural representation for multidimensional data. Tucker Decomposition (TD) is one of the most popular methods, and a series of batch TD algorithms have been extensively studied and widely applied in signal/image processing, bioinformatics, etc. However, in many applications, the large-scale tensor is dynamically evolving at all modes, which poses significant challenges for existing approaches to track the TD for such dynamic tensors. In this paper, we propose an efficient Online Tucker Decomposition (eOTD) approach to track the TD of dynamic tensors with an arbitrary number of modes. We first propose corollaries on the multiplication of block tensor matrix. Based on this corollary, eOTD allows us 1) to update the projection matrices using those projection matrices from the previous timestamp and the auxiliary matrices from the current timestamp, and 2) to update the core tensor by a sum of tensors that are obtained by multiplying smaller tensors with matrices. The auxiliary matrices are obtained by solving a series of least square regression tasks, not by performing Singular Value Decompositions (SVD). This overcomes the bottleneck in computation and storage caused by computing SVDs on largescale data. A Modified Gram-Schmidt (MGS) process is further applied to orthonormalize the projection matrices. Theoretically, the output of the eOTD framework is guaranteed to be lowrank. We further prove that the MGS process will not increase Tucker decomposition error. Empirically, we demonstrate that the proposed eOTD achieves comparable accuracy with a significant speedup on both synthetic and real data, where the speedup can be more than 1,500 times on large-scale data.
UR - http://www.scopus.com/inward/record.url?scp=85061398625&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85061398625&partnerID=8YFLogxK
U2 - 10.1109/ICDM.2018.00180
DO - 10.1109/ICDM.2018.00180
M3 - Conference contribution
AN - SCOPUS:85061398625
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1326
EP - 1331
BT - 2018 IEEE International Conference on Data Mining, ICDM 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 November 2018 through 20 November 2018
ER -