Research activity per year, 2007 to 2025

Personal profile

Research interests

[Note: This profile is incomplete, especially with regard to my publications. See http://shomir.net for many more.]

My research brings together natural language processing (NLP), privacy, and artificial intelligence.

I am interested in solving problems to enable computers to do meaningful work with large volumes of natural language text. My lab develops new methods for NLP and applies them to a variety of domains, including privacy, online social networks, web science, and digital libraries. I am particularly interested in breaking down technology's "walls of text", i.e., situations where a human user or decision-maker is expected to consume a large quantity of text to take action while lacking sufficient resources (time, expertise) to properly understand what they have been given. I have applied this paradigm to privacy policies, scholarly manuscripts, documents from the World Wide Web, and historical texts, and I am always interested in new domains to work with.

Personal profile

I am an Assistant Professor in the College of Information Sciences and Technology at Penn State, where I lead the Human Language Technologies Lab. I am also a Faculty Affiliate of Penn State's Institute for CyberScience and a member of the Social Data Analytics graduate faculty.

From 2016 until 2018 I was an Assistant Professor in the EECS Department at the University of Cincinnati. Prior to that I was a postdoc and a lecturer in Carnegie Mellon University's School of Computer Science and an NSF International Research Fellow in the University of Edinburgh's School of Informatics. I received my PhD in Computer Science from the University of Maryland in 2011.

Expertise related to UN Sustainable Development Goals

In 2015, UN member states agreed to 17 global Sustainable Development Goals (SDGs) to end poverty, protect the planet and ensure prosperity for all. This person’s work contributes towards the following SDG(s):

  • SDG 3 - Good Health and Well-being

Education/Academic qualification

Computer Science, PhD, University of Maryland

Award Date: May 1 2011

Computer Science, M.S., University of Maryland

Award Date: May 1 2008

Computer Science, B.S., Virginia Tech

Award Date: May 1 2005

Mathematics, B.S., Virginia Tech

Award Date: May 1 2005

Philosophy, B.A., Virginia Tech

Award Date: May 1 2005

Researcher Defined Keywords

  • natural language processing
  • computational linguistics
  • privacy
  • artificial intelligence

  • Incorporating Taxonomic Reasoning and Regulatory Knowledge into Automated Privacy Question Answering

    Ravichander, A., Yang, I., Chen, R., Wilson, S., Norton, T. & Sadeh, N., 2025, Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. Barhamgi, M., Wang, H. & Wang, X. (eds.). Springer Science and Business Media Deutschland GmbH, p. 444-460 17 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 15436 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  • Automated Detection and Analysis of Data Practices Using A Real-World Corpus

    Srinath, M., Venkit, P., Badillo, M., Schaub, F., Giles, C. L. & Wilson, S., 2024, 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Proceedings of the Conference. Ku, L.-W., Martins, A. & Srikumar, V. (eds.). Association for Computational Linguistics (ACL), p. 4567-4574 8 p. (Proceedings of the Annual Meeting of the Association for Computational Linguistics).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  • Creation and Analysis of an International Corpus of Privacy Laws

    Gupta, S., Gopi, G., Balaji, H., Poplavska, E., O'Toole, N., Arora, S., Norton, T., Sadeh, N. & Wilson, S., 2024, 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings. Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S. & Xue, N. (eds.). European Language Resources Association (ELRA), p. 4092-4105 14 p. (2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  • Documenting the Unwritten Curriculum of Student Research

    Wilson, S., 2024, TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop. Al-azzawi, S., Biester, L., Kovacs, G., Marasovic, A., Mathur, L., Mieskes, M. & Weissweiler, L. (eds.). Association for Computational Linguistics (ACL), p. 1-3 3 p. (TeachNLP 2024 - 6th Workshop on Teaching NLP, Proceedings of the Workshop).

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

  • Race and Privacy in Broadcast Police Communications

    Narayanan Venkit, P., Graziul, C., Goodman, M. A., Kenny, S. N. & Wilson, S., Nov 8 2024, In: Proceedings of the ACM on Human-Computer Interaction. 8, CSCW2, 382.

    Research output: Contribution to journal › Article › peer-review

    Open Access