Evaluation of Summarization Systems across Gender, Age, and Race
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Summarization systems are ultimately evaluated by human annotators and raters. Usually, annotators and raters do not reflect the demographics of end users, but are recruited through student populations or crowdsourcing platforms with skewed demographics. For two different evaluation scenarios – evaluation against gold summaries and system output ratings – we show that summary evaluation is sensitive to protected attributes. This can severely bias system development and evaluation, leading us to build models that cater for some groups rather than others.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the Third Workshop on New Frontiers in Summarization |
| Publisher | Association for Computational Linguistics |
| Publication date | 2021 |
| Pages | 51–56 |
| DOIs | |
| Publication status | Published - 2021 |
| Event | 3rd Workshop on New Frontiers in Summarization, Online, 10 Nov 2021 → 10 Nov 2021 |
Conference

| Conference | 3rd Workshop on New Frontiers in Summarization |
|---|---|
| City | Online |
| Period | 10/11/2021 → 10/11/2021 |