Understanding models understanding language

Department of Computer Science

Research output: Contribution to journal › Journal article › Research › peer-review

Standard

Understanding models understanding language. / Søgaard, Anders.

In: Synthese, Vol. 200, No. 6, 443, 2022, p. 1-16.

Harvard

Søgaard, A 2022, 'Understanding models understanding language', Synthese, vol. 200, no. 6, 443, pp. 1-16. https://doi.org/10.1007/s11229-022-03931-4

APA

Søgaard, A. (2022). Understanding models understanding language. Synthese, 200(6), Article 443, 1-16. https://doi.org/10.1007/s11229-022-03931-4

Vancouver

Søgaard A. Understanding models understanding language. Synthese. 2022;200(6):1-16. 443. https://doi.org/10.1007/s11229-022-03931-4

Author

Søgaard, Anders. / Understanding models understanding language. In: Synthese. 2022 ; Vol. 200, No. 6. pp. 1-16.

Bibtex

@article{a17b9f58a8dd462f99c80eed2fae8d52,

title = "Understanding models understanding language",

abstract = "Landgrebe and Smith (Synthese 198(March):2061–2081, 2021) present an unflattering diagnosis of recent advances in what they call language-centric artificial intelligence—perhaps more widely known as natural language processing: The models that are currently employed do not have sufficient expressivity, will not generalize, and are fundamentally unable to induce linguistic semantics, they say. The diagnosis is mainly derived from an analysis of the widely used Transformer architecture. Here I address a number of misunderstandings in their analysis, and present what I take to be a more adequate analysis of the ability of Transformer models to learn natural language semantics. To avoid confusion, I distinguish between inferential and referential semantics. Landgrebe and Smith (2021){\textquoteright}s analysis of the Transformer architecture{\textquoteright}s expressivity and generalization concerns inferential semantics. This part of their diagnosis is shown to rely on misunderstandings of technical properties of Transformers. Landgrebe and Smith (2021) also claim that referential semantics is unobtainable for Transformer models. In response, I present a non-technical discussion of techniques for grounding Transformer models, giving them referential semantics, even in the absence of supervision. I also present a simple thought experiment to highlight the mechanisms that would lead to referential semantics, and discuss in what sense models that are grounded in this way, can be said to understand language. Finally, I discuss the approach Landgrebe and Smith (2021) advocate for, namely manual specification of formal grammars that associate linguistic expressions with logical form.",

keywords = "Artificial intelligence, Language, Mind",

author = "Anders S{\o}gaard",

note = "Publisher Copyright: {\textcopyright} 2022, The Author(s).",

year = "2022",

doi = "10.1007/s11229-022-03931-4",

language = "English",

volume = "200",

pages = "1--16",

journal = "Synthese",

issn = "0039-7857",

publisher = "Springer",

number = "6",

}

RIS

TY - JOUR

T1 - Understanding models understanding language

AU - Søgaard, Anders

PY - 2022

Y1 - 2022

N2 - Landgrebe and Smith (Synthese 198(March):2061–2081, 2021) present an unflattering diagnosis of recent advances in what they call language-centric artificial intelligence—perhaps more widely known as natural language processing: The models that are currently employed do not have sufficient expressivity, will not generalize, and are fundamentally unable to induce linguistic semantics, they say. The diagnosis is mainly derived from an analysis of the widely used Transformer architecture. Here I address a number of misunderstandings in their analysis, and present what I take to be a more adequate analysis of the ability of Transformer models to learn natural language semantics. To avoid confusion, I distinguish between inferential and referential semantics. Landgrebe and Smith (2021)’s analysis of the Transformer architecture’s expressivity and generalization concerns inferential semantics. This part of their diagnosis is shown to rely on misunderstandings of technical properties of Transformers. Landgrebe and Smith (2021) also claim that referential semantics is unobtainable for Transformer models. In response, I present a non-technical discussion of techniques for grounding Transformer models, giving them referential semantics, even in the absence of supervision. I also present a simple thought experiment to highlight the mechanisms that would lead to referential semantics, and discuss in what sense models that are grounded in this way, can be said to understand language. Finally, I discuss the approach Landgrebe and Smith (2021) advocate for, namely manual specification of formal grammars that associate linguistic expressions with logical form.

AB - Landgrebe and Smith (Synthese 198(March):2061–2081, 2021) present an unflattering diagnosis of recent advances in what they call language-centric artificial intelligence—perhaps more widely known as natural language processing: The models that are currently employed do not have sufficient expressivity, will not generalize, and are fundamentally unable to induce linguistic semantics, they say. The diagnosis is mainly derived from an analysis of the widely used Transformer architecture. Here I address a number of misunderstandings in their analysis, and present what I take to be a more adequate analysis of the ability of Transformer models to learn natural language semantics. To avoid confusion, I distinguish between inferential and referential semantics. Landgrebe and Smith (2021)’s analysis of the Transformer architecture’s expressivity and generalization concerns inferential semantics. This part of their diagnosis is shown to rely on misunderstandings of technical properties of Transformers. Landgrebe and Smith (2021) also claim that referential semantics is unobtainable for Transformer models. In response, I present a non-technical discussion of techniques for grounding Transformer models, giving them referential semantics, even in the absence of supervision. I also present a simple thought experiment to highlight the mechanisms that would lead to referential semantics, and discuss in what sense models that are grounded in this way, can be said to understand language. Finally, I discuss the approach Landgrebe and Smith (2021) advocate for, namely manual specification of formal grammars that associate linguistic expressions with logical form.

KW - Artificial intelligence

KW - Language

KW - Mind

UR - http://www.scopus.com/inward/record.url?scp=85140639413&partnerID=8YFLogxK

U2 - 10.1007/s11229-022-03931-4

DO - 10.1007/s11229-022-03931-4

M3 - Journal article

AN - SCOPUS:85140639413

VL - 200

SP - 1

EP - 16

JO - Synthese

JF - Synthese

SN - 0039-7857

IS - 6

M1 - 443

ER -