TY - GEN
T1 - Cuckoo feature hashing
T2 - 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
AU - Gao, Jinyang
AU - Ooi, Beng Chin
AU - Shen, Yanyan
AU - Lee, Wang-Chien
N1 - Funding Information:
This work is supported by the National Research Foundation, Prime Minister's Office, Singapore under CRP Award No. NRF-CRP8-2011-08. Yanyan Shen is supported in part by NSFC (No. 61602297). Wang-Chien Lee is supported in part by the National Science Foundation under Grant No. IIS-1717084.
Publisher Copyright:
© 2018 International Joint Conferences on Artificial Intelligence. All rights reserved.
PY - 2018
Y1 - 2018
N2 - Feature hashing is widely used to process large-scale sparse features when learning predictive models. Collisions inherently occur in the hashing process and hurt model performance. In this paper, we develop a new feature hashing scheme called Cuckoo Feature Hashing (CCFH), which treats feature hashing as a problem of dynamic weight sharing during model training. By leveraging a set of indicators to dynamically decide the weight of each feature based on alternative hash locations, CCFH effectively prevents collisions between features that are important to the model, i.e. predictive features, and thus avoids model performance degradation. Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH can achieve the same level of performance using only 15%-25% of the parameters required by conventional feature hashing.
UR - http://www.scopus.com/inward/record.url?scp=85055715599&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055715599&partnerID=8YFLogxK
U2 - 10.24963/ijcai.2018/295
DO - 10.24963/ijcai.2018/295
M3 - Conference contribution
AN - SCOPUS:85055715599
T3 - IJCAI International Joint Conference on Artificial Intelligence
SP - 2135
EP - 2141
BT - Proceedings of the 27th International Joint Conference on Artificial Intelligence, IJCAI 2018
A2 - Lang, Jerome
PB - International Joint Conferences on Artificial Intelligence
Y2 - 13 July 2018 through 19 July 2018
ER -