Personal profile
Research interests
[Note: This profile is incomplete, especially with regard to my publications. See http://shomir.net for many more.]
My research brings together natural language processing (NLP), privacy, and artificial intelligence.
I am interested in solving problems to enable computers to do meaningful work with large volumes of natural language text. My lab develops new methods for NLP and applies them to a variety of domains, including privacy, online social networks, web science, and digital libraries. I am particularly interested in breaking down technology's "walls of text", i.e., situations where a human user or decision-maker is expected to consume a large quantity of text to take action while lacking sufficient resources (time, expertise) to properly understand what they have been given. I have applied this paradigm to privacy policies, scholarly manuscripts, documents from the world wide web, and historical texts, and I am always interested in new domains to work with.
Personal profile
I am an Assistant Professor in the College of Information Sciences and Technology at Penn State, where I lead the Human Language Technologies Lab. I am also a Faculty Affiliate of Penn State's Institute for CyberScience and a member of the Social Data Analytics graduate faculty.
From 2016 until 2018 I was an Assistant Professor in the EECS Department at the University of Cincinnati. Prior to that I was a postdoc and a lecturer in Carnegie Mellon University's School of Computer Science and an NSF International Research Fellow in the University of Edinburgh's School of Informatics. I received my PhD in Computer Science from the University of Maryland in 2011.
Education/Academic qualification
Computer Science, PhD, University of Maryland
Award Date: May 1 2011
Computer Science, M.S., University of Maryland
Award Date: May 1 2008
Computer Science, B.S., Virginia Tech
Award Date: May 1 2005
Mathematics, B.S., Virginia Tech
Award Date: May 1 2005
Philosophy, B.A., Virginia Tech
Award Date: May 1 2005
Researcher Defined Keywords
- natural language processing
- computational linguistics
- privacy
- artificial intelligence
-
CAREER: Large-Scale Exploration and Interpretation of Consumer-Oriented Legal Documents
Wilson, S. (PI)
8/1/23 → 7/31/28
Project: Research project
-
SaTC: CORE: Small: Toward Privacy Equity through Contextual Understanding of Self-Disclosure
Wilson, S. (CoPI) & Rajtmajer, S. (PI)
6/1/23 → 5/31/26
Project: Research project
-
Collaborative Research: SaTC: CORE: Medium: A Large-Scale, Longitudinal Resource to Advance Technical and Legal Understanding of Textual Privacy Information
Wilson, S. (PI) & Giles, C. L. (CoPI)
7/1/21 → 6/30/24
Project: Research project
-
SaTC: CORE: Medium: Collaborative: Automatically Answering People's Privacy Questions
Wilson, S. (PI)
7/15/19 → 12/31/23
Project: Research project
-
IRFP: Metalanguage Identification for Interactive Language Technologies
Wilson, S. (PI)
7/1/13 → 2/28/15
Project: Research project
-
Incorporating Taxonomic Reasoning and Regulatory Knowledge into Automated Privacy Question Answering
Ravichander, A., Yang, I., Chen, R., Wilson, S., Norton, T. & Sadeh, N., 2025, Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. Barhamgi, M., Wang, H. & Wang, X. (eds.). Springer Science and Business Media Deutschland GmbH, p. 444-460 17 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 15436 LNCS).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
-
Automated Detection and Analysis of Data Practices Using A Real-World Corpus
Srinath, M., Venkit, P., Badillo, M., Schaub, F., Giles, C. L. & Wilson, S., 2024, 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Proceedings of the Conference. Ku, L.-W., Martins, A. & Srikumar, V. (eds.). Association for Computational Linguistics (ACL), p. 4567-4574 8 p. (Proceedings of the Annual Meeting of the Association for Computational Linguistics).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
-
Creation and Analysis of an International Corpus of Privacy Laws
Gupta, S., Gopi, G., Balaji, H., Poplavska, E., O'Toole, N., Arora, S., Norton, T., Sadeh, N. & Wilson, S., 2024, 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings. Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S. & Xue, N. (eds.). European Language Resources Association (ELRA), p. 4092-4105 14 p. (2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
-
Documenting the Unwritten Curriculum of Student Research
Wilson, S., 2024, TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop. Al-azzawi, S., Biester, L., Kovacs, G., Marasovic, A., Mathur, L., Mieskes, M. & Weissweiler, L. (eds.). Association for Computational Linguistics (ACL), p. 1-3 3 p. (TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution
-
Race and Privacy in Broadcast Police Communications
Narayanan Venkit, P., Graziul, C., Goodman, M. A., Kenny, S. N. & Wilson, S., Nov 8 2024, In: Proceedings of the ACM on Human-Computer Interaction. 8, CSCW2, 382.
Research output: Contribution to journal › Article › peer-review
Open Access