A POMDP Dialogue Policy with 3-way Grounding and Adaptive Sensing for Learning through Communication

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Agents to assist with rescue, surgery, and similar activities could collaborate better with humans if they could learn new strategic behaviors through communication. We introduce a novel POMDP dialogue policy for learning from people. The policy has 3-way grounding of language in the shared physical context, the dialogue context, and persistent knowledge. It can learn distinct but related games, and can continue learning across dialogues for complex games. A novel sensing component supports adaptation to information-sharing differences across people. The single policy performs better than oracle policies customized to specific games and information behavior.

Original language: English (US)
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: EMNLP 2022
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Publisher: Association for Computational Linguistics (ACL)
Pages: 6796-6809
Number of pages: 14
ISBN (Electronic): 9781959429432
DOIs
State: Published - 2022
Event: 2022 Findings of the Association for Computational Linguistics: EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: Dec 7 2022 to Dec 11 2022

Publication series

Name: Findings of the Association for Computational Linguistics: EMNLP 2022

Conference

Conference: 2022 Findings of the Association for Computational Linguistics: EMNLP 2022
Country/Territory: United Arab Emirates
City: Hybrid, Abu Dhabi
Period: 12/7/22 to 12/11/22

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
