WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset

Research output: Working paper > Preprint > Research

Standard

WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset. / Nunes Ribeiro, Tiago Filipe; Brandl, Stephanie; Søgaard, Anders; Hollenstein, Nora.

arXiv.org, 2023.

Harvard

Nunes Ribeiro, TF, Brandl, S, Søgaard, A & Hollenstein, N 2023 'WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset' arXiv.org. <https://arxiv.org/abs/2303.17876>

APA

Nunes Ribeiro, T. F., Brandl, S., Søgaard, A., & Hollenstein, N. (2023). WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset. arXiv.org. https://arxiv.org/abs/2303.17876

Vancouver

Nunes Ribeiro TF, Brandl S, Søgaard A, Hollenstein N. WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset. arXiv.org. 2023.

Author

Nunes Ribeiro, Tiago Filipe; Brandl, Stephanie; Søgaard, Anders; Hollenstein, Nora. / WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset. arXiv.org, 2023.

Bibtex

@techreport{3c36ea27d2034bf2b3d6dfeec12f882b,
title = "WebQAmGaze: A Multilingual Webcam Eye-Tracking-While-Reading Dataset",
abstract = "We create WebQAmGaze, a multilingual low-cost eye-tracking-while-reading dataset, designed to support the development of fair and transparent NLP models. WebQAmGaze includes webcam eye-tracking data from 332 participants naturally reading English, Spanish, and German texts. Each participant performs two reading tasks composed of five texts, a normal reading and an information-seeking task. After preprocessing the data, we find that fixations on relevant spans seem to indicate correctness when answering the comprehension questions. Additionally, we perform a comparative analysis of the data collected to high-quality eye-tracking data. The results show a moderate correlation between the features obtained with the webcam-ET compared to those of a commercial ET device. We believe this data can advance webcam-based reading studies and open a way to cheaper and more accessible data collection. WebQAmGaze is useful to learn about the cognitive processes behind question answering (QA) and to apply these insights to computational models of language understanding.",
author = "{Nunes Ribeiro}, {Tiago Filipe} and Stephanie Brandl and Anders S{\o}gaard and Nora Hollenstein",
year = "2023",
language = "English",
publisher = "arXiv.org",
type = "WorkingPaper",
institution = "arXiv.org",

}

RIS

TY - UNPB

T1 - WebQAmGaze

T2 - A Multilingual Webcam Eye-Tracking-While-Reading Dataset

AU - Nunes Ribeiro, Tiago Filipe

AU - Brandl, Stephanie

AU - Søgaard, Anders

AU - Hollenstein, Nora

PY - 2023

Y1 - 2023

N2 - We create WebQAmGaze, a multilingual low-cost eye-tracking-while-reading dataset, designed to support the development of fair and transparent NLP models. WebQAmGaze includes webcam eye-tracking data from 332 participants naturally reading English, Spanish, and German texts. Each participant performs two reading tasks composed of five texts, a normal reading and an information-seeking task. After preprocessing the data, we find that fixations on relevant spans seem to indicate correctness when answering the comprehension questions. Additionally, we perform a comparative analysis of the data collected to high-quality eye-tracking data. The results show a moderate correlation between the features obtained with the webcam-ET compared to those of a commercial ET device. We believe this data can advance webcam-based reading studies and open a way to cheaper and more accessible data collection. WebQAmGaze is useful to learn about the cognitive processes behind question answering (QA) and to apply these insights to computational models of language understanding.

AB - We create WebQAmGaze, a multilingual low-cost eye-tracking-while-reading dataset, designed to support the development of fair and transparent NLP models. WebQAmGaze includes webcam eye-tracking data from 332 participants naturally reading English, Spanish, and German texts. Each participant performs two reading tasks composed of five texts, a normal reading and an information-seeking task. After preprocessing the data, we find that fixations on relevant spans seem to indicate correctness when answering the comprehension questions. Additionally, we perform a comparative analysis of the data collected to high-quality eye-tracking data. The results show a moderate correlation between the features obtained with the webcam-ET compared to those of a commercial ET device. We believe this data can advance webcam-based reading studies and open a way to cheaper and more accessible data collection. WebQAmGaze is useful to learn about the cognitive processes behind question answering (QA) and to apply these insights to computational models of language understanding.

M3 - Preprint

BT - WebQAmGaze

PB - arXiv.org

ER -

ID: 381729178