TY - JOUR
T1 - PrivOnto: A semantic framework for the analysis of privacy policies
AU - Oltramari, Alessandro
AU - Piraviperumal, Dhivya
AU - Schaub, Florian
AU - Wilson, Shomir
AU - Cherivirala, Sushain
AU - Norton, Thomas B.
AU - Russell, N. Cameron
AU - Story, Peter
AU - Reidenberg, Joel
AU - Sadeh, Norman
PY - 2018
Y1 - 2018
AB - Privacy policies are intended to inform users about the collection and use of their data by websites, mobile apps and other services or appliances they interact with. This also includes informing users about any choices they might have regarding such data practices. However, few users read these often long privacy policies, and those who do have difficulty understanding them because they are written in convoluted and ambiguous language. A promising approach to help overcome this situation revolves around semi-automatically annotating policies, using combinations of semantic technologies, machine learning and natural language processing to analyze them. In this article, we introduce PrivOnto, a semantic framework to represent annotated privacy policies with an ontology developed in collaboration with privacy experts. PrivOnto has been applied to a corpus of over 23,000 annotated data practices, extracted from a dataset of 115 privacy policies. We designed a collection of 57 SPARQL queries to extract information from the PrivOnto knowledge base, with the dual objective of (1) answering privacy questions users often have and (2) supporting researchers and regulators in the analysis of privacy policies at scale. We present respective findings, after examining the process of developing PrivOnto. Finally, we outline future research and open challenges in using semantic technologies for privacy policy analysis.
UR - http://www.mendeley.com/research/privonto-semantic-framework-analysis-privacy-policies
U2 - 10.3233/SW-170283
DO - 10.3233/SW-170283
M3 - Article
C2 - 20583754
SN - 1570-0844
VL - 9
SP - 185
EP - 203
JO - Semantic Web
JF - Semantic Web
IS - 2
ER -