GenPADS: Reinforcing politeness in an end-to-end dialogue system

Kshitij Mishra; Mauajama Firdaus; Asif Ekbal

doi:10.1371/journal.pone.0278323

PLoS One. 2023 Jan 6;18(1):e0278323. doi: 10.1371/journal.pone.0278323. eCollection 2023.

Authors

Kshitij Mishra¹, Mauajama Firdaus¹, Asif Ekbal¹

Affiliation

¹ Department of Computer Science and Engineering, Indian Institute of Technology Patna, Bihta, Bihar, India.

Abstract

In a task-oriented dialogue setting, a user's mood and demands can change during an ongoing dialogue, which may lead to a non-informative conversation or to conversation drop-off. To rectify such scenarios, a conversational agent should be able to learn the user's behaviour online and form informative, empathetic and interactive responses. To incorporate these three aspects, we propose a novel end-to-end dialogue system, GenPADS. First, we build and train two models, viz. a politeness classifier to extract the politeness information present in the user's and agent's utterances, and a generation model (G) to generate varying but semantically correct responses. We then incorporate both of these models in a reinforcement learning (RL) setting, using two different politeness-oriented reward algorithms to adapt and generate polite responses. To train our politeness classifier, we annotate the recently released Taskmaster dataset with four fine-grained classes depicting politeness and impoliteness. Further, to train our generator model, we prepare a GenDD dataset using the same Taskmaster dataset. Lastly, we train GenPADS and perform automatic and human evaluation by building seven different user simulators. Detailed analysis reveals that GenPADS performs better than the two considered baselines, viz. a transformer-based seq2seq generator model for the user's and agent's utterances and a retrieval-based politeness adaptive dialogue system (PADS).
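The general idea of coupling a politeness classifier to a response policy through an RL reward can be illustrated with a toy sketch. This is not the GenPADS implementation: the abstract does not specify the two reward algorithms, the class names, or the model architectures, so everything below is an assumption made for illustration. The label names and reward values in `CLASS_REWARD`, the keyword heuristic in `classify_politeness`, and the fixed `CANDIDATES` list are all hypothetical stand-ins for the trained classifier and the generator G. To keep the example deterministic, the policy is updated with the exact expected policy gradient over the candidate set rather than with sampled responses.

```python
import math

# Hypothetical four-way label set and reward values. The abstract only says
# there are four fine-grained politeness/impoliteness classes.
CLASS_REWARD = {
    "polite": 1.0,
    "mildly_polite": 0.5,
    "mildly_impolite": -0.5,
    "impolite": -1.0,
}

# Hypothetical candidate responses standing in for generator output.
CANDIDATES = [
    "Thank you for waiting, I would be happy to book that table.",
    "Your table is booked.",
    "Obviously the table is booked.",
    "Whatever, obviously it is booked.",
]


def classify_politeness(utterance: str) -> str:
    """Toy keyword heuristic standing in for a trained politeness classifier."""
    text = utterance.lower()
    polite = sum(cue in text for cue in ("please", "thank", "happy to"))
    rude = sum(cue in text for cue in ("whatever", "obviously"))
    score = polite - rude  # thresholds below are arbitrary
    if score >= 1:
        return "polite"
    if score == 0:
        return "mildly_polite"
    if score == -1:
        return "mildly_impolite"
    return "impolite"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_policy(candidates, steps=200, lr=0.5):
    """Raise the probability of responses that earn higher politeness reward.

    Uses the exact expected policy gradient for a softmax policy:
    d/d logit_i of E[reward] = p_i * (r_i - E[reward]).
    """
    rewards = [CLASS_REWARD[classify_politeness(c)] for c in candidates]
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        logits = [
            logit + lr * p * (r - expected)
            for logit, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


probs = train_policy(CANDIDATES)
best_response = CANDIDATES[max(range(len(probs)), key=probs.__getitem__)]
```

In this sketch the policy starts uniform and shifts probability mass toward the candidate the stub classifier rates most polite. A real system would replace the fixed candidate list with sampled generator outputs and would typically combine politeness with task-success or semantic-coherence terms in the reward.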

Copyright: © 2023 Mishra et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

MeSH terms

Adaptation, Physiological
Algorithms*
Communication*
Humans
Learning
Reinforcement, Psychology

Grants and funding

The author(s) received no specific funding for this work.