TY - CHAP
T1 - Privacy implications of database ranking
AU - Rahman, Farhadur
AU - Liu, Weimo
AU - Thirumuruganathan, Saravanan
AU - Zhang, Nan
AU - Das, Gautam
N1 - Publisher Copyright:
© 2015 VLDB.
PY - 2015
Y1 - 2015
N2 - In recent years, there has been much research in the adoption of Ranked Retrieval model (in addition to the Boolean retrieval model) in structured databases, especially those in a client-server environment (e.g., web databases). With this model, a search query returns top-k tuples according to not just exact matches of selection conditions, but a suitable ranking function. While much research has gone into the design of ranking functions and the efficient processing of top-k queries, this paper studies a novel problem on the privacy implications of database ranking. The motivation is a novel yet serious privacy leakage we found on real-world web databases which is caused by the ranking function design. Many such databases feature private attributes - e.g., a social network allows users to specify certain attributes as only visible to him/herself, but not to others. While these websites generally respect the privacy settings by not directly displaying private attribute values in search query answers, many of them nevertheless take into account such private attributes in the ranking function design. The conventional belief might be that tuple ranks alone are not enough to reveal the private attribute values. Our investigation, however, shows that this is not the case in reality. To address the problem, we introduce a taxonomy of the problem space with two dimensions, (1) the type of query interface and (2) the capability of adversaries. For each subspace, we develop a novel technique which either guarantees the successful inference of private attributes, or does so for a significant portion of realworld tuples. We demonstrate the effectiveness and efficiency of our techniques through theoretical analysis, extensive experiments over real-world datasets, as well as successful online attacks over websites with tens to hundreds of millions of users - e.g., Amazon Goodreads and Renren.com.
AB - In recent years, there has been much research in the adoption of Ranked Retrieval model (in addition to the Boolean retrieval model) in structured databases, especially those in a client-server environment (e.g., web databases). With this model, a search query returns top-k tuples according to not just exact matches of selection conditions, but a suitable ranking function. While much research has gone into the design of ranking functions and the efficient processing of top-k queries, this paper studies a novel problem on the privacy implications of database ranking. The motivation is a novel yet serious privacy leakage we found on real-world web databases which is caused by the ranking function design. Many such databases feature private attributes - e.g., a social network allows users to specify certain attributes as only visible to him/herself, but not to others. While these websites generally respect the privacy settings by not directly displaying private attribute values in search query answers, many of them nevertheless take into account such private attributes in the ranking function design. The conventional belief might be that tuple ranks alone are not enough to reveal the private attribute values. Our investigation, however, shows that this is not the case in reality. To address the problem, we introduce a taxonomy of the problem space with two dimensions, (1) the type of query interface and (2) the capability of adversaries. For each subspace, we develop a novel technique which either guarantees the successful inference of private attributes, or does so for a significant portion of realworld tuples. We demonstrate the effectiveness and efficiency of our techniques through theoretical analysis, extensive experiments over real-world datasets, as well as successful online attacks over websites with tens to hundreds of millions of users - e.g., Amazon Goodreads and Renren.com.
UR - http://www.scopus.com/inward/record.url?scp=84953931993&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84953931993&partnerID=8YFLogxK
U2 - 10.14778/2794367.2794379
DO - 10.14778/2794367.2794379
M3 - Chapter
AN - SCOPUS:84953931993
T3 - Proceedings of the VLDB Endowment
SP - 1106
EP - 1117
BT - Proceedings of the VLDB Endowment
PB - Association for Computing Machinery
T2 - 3rd Workshop on Spatio-Temporal Database Management, STDBM 2006, Co-located with the 32nd International Conference on Very Large Data Bases, VLDB 2006
Y2 - 11 September 2006 through 11 September 2006
ER -