Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard

Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? / Eberle, Oliver; Brandl, Stephanie; Pilot, Jonas; Søgaard, Anders.

ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). ed. / Smaranda Muresan; Preslav Nakov; Aline Villavicencio. Association for Computational Linguistics (ACL), 2022. p. 4295-4309 (Proceedings of the Annual Meeting of the Association for Computational Linguistics, Vol. 1).

Harvard

Eberle, O, Brandl, S, Pilot, J & Søgaard, A 2022, Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? in S Muresan, P Nakov & A Villavicencio (eds), ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). Association for Computational Linguistics (ACL), Proceedings of the Annual Meeting of the Association for Computational Linguistics, vol. 1, pp. 4295-4309, 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022, Dublin, Ireland, 22/05/2022. https://doi.org/10.18653/v1/2022.acl-long.296

APA

Eberle, O., Brandl, S., Pilot, J., & Søgaard, A. (2022). Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers) (pp. 4295-4309). Association for Computational Linguistics (ACL). (Proceedings of the Annual Meeting of the Association for Computational Linguistics, Vol. 1). https://doi.org/10.18653/v1/2022.acl-long.296

Vancouver

Eberle O, Brandl S, Pilot J, Søgaard A. Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze? In Muresan S, Nakov P, Villavicencio A, editors, ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). Association for Computational Linguistics (ACL). 2022. p. 4295-4309. (Proceedings of the Annual Meeting of the Association for Computational Linguistics, Vol. 1). https://doi.org/10.18653/v1/2022.acl-long.296

Author

Eberle, Oliver ; Brandl, Stephanie ; Pilot, Jonas ; Søgaard, Anders. / Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?. ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers). editor / Smaranda Muresan ; Preslav Nakov ; Aline Villavicencio. Association for Computational Linguistics (ACL), 2022. pp. 4295-4309 (Proceedings of the Annual Meeting of the Association for Computational Linguistics, Vol. 1).

Bibtex

@inproceedings{f723a3fbe66d4f87beb27aee5d96eb94,
title = "Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?",
abstract = "Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pretrained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.",
author = "Oliver Eberle and Stephanie Brandl and Jonas Pilot and Anders S{\o}gaard",
note = "Publisher Copyright: {\textcopyright} 2022 Association for Computational Linguistics.; 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 ; Conference date: 22-05-2022 Through 27-05-2022",
year = "2022",
doi = "10.18653/v1/2022.acl-long.296",
language = "English",
series = "Proceedings of the Annual Meeting of the Association for Computational Linguistics",
pages = "4295--4309",
editor = "Smaranda Muresan and Preslav Nakov and Aline Villavicencio",
booktitle = "ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)",
publisher = "Association for Computational Linguistics (ACL)",
address = "United States",

}

RIS

TY - GEN

T1 - Do Transformer Models Show Similar Attention Patterns to Task-Specific Human Gaze?

AU - Eberle, Oliver

AU - Brandl, Stephanie

AU - Pilot, Jonas

AU - Søgaard, Anders

N1 - Publisher Copyright: © 2022 Association for Computational Linguistics.

PY - 2022

Y1 - 2022

N2 - Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pretrained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.

AB - Learned self-attention functions in state-of-the-art NLP models often correlate with human attention. We investigate whether self-attention in large-scale pre-trained language models is as predictive of human eye fixation patterns during task-reading as classical cognitive models of human attention. We compare attention functions across two task-specific reading datasets for sentiment analysis and relation extraction. We find the predictiveness of large-scale pretrained self-attention for human attention depends on 'what is in the tail', e.g., the syntactic nature of rare contexts. Further, we observe that task-specific fine-tuning does not increase the correlation with human task-specific reading. Through an input reduction experiment we give complementary insights on the sparsity and fidelity trade-off, showing that lower-entropy attention vectors are more faithful.

UR - http://www.scopus.com/inward/record.url?scp=85138337294&partnerID=8YFLogxK

U2 - 10.18653/v1/2022.acl-long.296

DO - 10.18653/v1/2022.acl-long.296

M3 - Article in proceedings

AN - SCOPUS:85138337294

T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics

SP - 4295

EP - 4309

BT - ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)

A2 - Muresan, Smaranda

A2 - Nakov, Preslav

A2 - Villavicencio, Aline

PB - Association for Computational Linguistics (ACL)

T2 - 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022

Y2 - 22 May 2022 through 27 May 2022

ER -
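
Note

The abstract above describes two measurable quantities: the correlation between model self-attention and human gaze fixations, and the entropy of attention vectors (lower entropy meaning sparser attention). As a reading aid, the following is a minimal Python sketch of that kind of comparison, not the authors' code; the function names, example attention weights, and fixation durations are hypothetical, and token alignment between model and gaze data is assumed to be handled elsewhere.

# Minimal sketch (assumption: attention and gaze are already aligned per token).
import numpy as np
from scipy.stats import spearmanr

def attention_gaze_correlation(attention, gaze_durations):
    """Spearman rank correlation between per-token attention and gaze durations."""
    rho, _ = spearmanr(attention, gaze_durations)
    return rho

def attention_entropy(attention, eps=1e-12):
    """Shannon entropy of an attention vector; lower values indicate sparser attention."""
    p = np.asarray(attention, dtype=float)
    p = p / p.sum()                      # normalize to a probability distribution
    return float(-np.sum(p * np.log(p + eps)))

# Hypothetical values for a 5-token sentence.
attention = [0.05, 0.40, 0.10, 0.35, 0.10]   # model attention per token
gaze = [120.0, 480.0, 150.0, 390.0, 140.0]   # total fixation time per token (ms)

print(attention_gaze_correlation(attention, gaze))  # rank correlation with gaze
print(attention_entropy(attention))                 # entropy of the attention vector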
