TY - GEN
T1 - A machine learning approach for detecting third-party trackers on the web
AU - Wu, Qianru
AU - Liu, Qixu
AU - Zhang, Yuqing
AU - Liu, Peng
AU - Wen, Guanxing
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
AB - Nowadays, privacy violation caused by third-party tracking has become a serious problem, and yet the most effective method to defend against third-party tracking is based on blacklists. Such a method depends heavily on the quality of the blacklist database, whose records need to be updated frequently. However, most records are curated manually and are very difficult to maintain. To efficiently generate blacklists, we propose a system with high accuracy, named DMTrackerDetector, to detect third-party trackers automatically. Existing methods to detect online tracking have two shortcomings. First, they treat first-party tracking and third-party tracking the same. Second, they always focus on a certain way of tracking and can only detect a limited set of trackers. Since anti-tracking technology based on blacklists depends heavily on the coverage of the blacklist database, these methods cannot generate high-quality blacklists. To solve these problems, we first use structural hole theory to preserve first-party trackers, and then detect only third-party trackers using supervised machine learning, exploiting the fact that trackers and non-trackers always call different JavaScript APIs for different purposes. The results show that 97.8% of the third-party trackers in our test set can be correctly detected. The blacklist generated by our system not only covers almost all records in the Ghostery list (one of the most popular anti-tracking tools), but also detects 35 unrevealed trackers.
UR - http://www.scopus.com/inward/record.url?scp=84990050965&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84990050965&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-45744-4_12
DO - 10.1007/978-3-319-45744-4_12
M3 - Conference contribution
AN - SCOPUS:84990050965
SN - 9783319457437
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 238
EP - 258
BT - Computer Security - 21st European Symposium on Research in Computer Security, ESORICS 2016, Proceedings
A2 - Katsikas, Sokratis
A2 - Meadows, Catherine
A2 - Askoxylakis, Ioannis
A2 - Ioannidis, Sotiris
PB - Springer Verlag
T2 - 21st European Symposium on Research in Computer Security, ESORICS 2016
Y2 - 26 September 2016 through 30 September 2016
ER -