TY - GEN
T1 - Automatic extraction of opt-out choices from privacy policies
AU - Sathyendra, Kanthashree Mysore
AU - Schaub, Florian
AU - Wilson, Shomir
AU - Sadeh, Norman
N1 - Publisher Copyright: Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2016
Y1 - 2016
AB - Online "notice and choice" is an essential concept in the US FTC's Fair Information Practice Principles. Privacy laws based on these principles include requirements for providing notice about data practices and allowing individuals to exercise control over those practices. Internet users need control over privacy, but their options are hidden in long privacy policies which are cumbersome to read and understand. In this paper, we describe several approaches to automatically extract choice instances from privacy policy documents using natural language processing and machine learning techniques. We define a choice instance as a statement in a privacy policy that indicates the user has discretion over the collection, use, sharing, or retention of their data. We describe supervised machine learning approaches for automatically extracting instances containing opt-out hyperlinks and evaluate the proposed methods using the OPP-115 Corpus, a dataset of annotated privacy policies. Extracting information about privacy choices and controls enables the development of concise and usable interfaces to help Internet users better understand the choices offered by online services. The focus of this paper, however, is to describe such methods to automatically extract useful opt-out hyperlinks from privacy policies.
UR - http://www.scopus.com/inward/record.url?scp=85025816348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85025816348&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85025816348
T3 - AAAI Fall Symposium - Technical Report
SP - 270
EP - 275
BT - FS-16-01
PB - AI Access Foundation
T2 - 2016 AAAI Fall Symposium
Y2 - 17 November 2016 through 19 November 2016
ER -