Guideline Bias in Wizard-of-Oz Dialogues
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
NLP models struggle with generalization due to sampling and annotator bias. This paper focuses on a different kind of bias that has received very little attention: guideline bias, i.e., the bias introduced by how our annotator guidelines are formulated. We examine two recently introduced dialogue datasets, CCPE-M and Taskmaster-1, both collected by trained assistants in a Wizard-of-Oz set-up. For CCPE-M, we show how a simple lexical bias for the word "like" in the guidelines biases the data collection. This bias, in effect, leads to poor performance on data without this bias: a preference elicitation architecture based on BERT suffers a 5.3% absolute drop in performance when "like" is replaced with a synonymous phrase, and a 13.2% drop in performance when evaluated on out-of-sample data. For Taskmaster-1, we show how the order in which instructions are presented biases the data collection.
Original language | English |
---|---|
Title of host publication | BPPF 2021 - 1st Workshop on Benchmarking: Past, Present and Future, Proceedings |
Editors | Kenneth Church, Mark Liberman, Valia Kordoni |
Publisher | Association for Computational Linguistics |
Publication date | 2021 |
Pages | 8-14 |
ISBN (Electronic) | 9781954085589 |
Publication status | Published - 2021 |
Event | 1st Workshop on Benchmarking: Past, Present and Future, BPPF 2021 - Virtual, Bangkok, Thailand. Duration: 5 Aug 2021 → 6 Aug 2021 |
Conference
Conference | 1st Workshop on Benchmarking: Past, Present and Future, BPPF 2021 |
---|---|
Country | Thailand |
City | Virtual, Bangkok |
Period | 05/08/2021 → 06/08/2021 |
Bibliographical note
Publisher Copyright:
©2021 Association for Computational Linguistics
ID: 291812390