Parsing as pretraining
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Parsing as pretraining. / Vilares, David; Strzyz, Michalina; Søgaard, Anders; Gómez-Rodríguez, Carlos.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020). AAAI Press, 2020. p. 9114-9121.
RIS
TY - GEN
T1 - Parsing as pretraining
AU - Vilares, David
AU - Strzyz, Michalina
AU - Søgaard, Anders
AU - Gómez-Rodríguez, Carlos
PY - 2020
Y1 - 2020
N2 - Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and performs full parsing (on English) relying only on pretraining architectures, with no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This is used to: (i) see how far we can reach on syntax modelling with just pretrained encoders, and (ii) shed some light on the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1-score and LAS, and analyze in depth the differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%).
AB - Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this problem and performs full parsing (on English) relying only on pretraining architectures, with no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This is used to: (i) see how far we can reach on syntax modelling with just pretrained encoders, and (ii) shed some light on the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1-score and LAS, and analyze in depth the differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%).
U2 - 10.1609/aaai.v34i05.6446
DO - 10.1609/aaai.v34i05.6446
M3 - Article in proceedings
SP - 9114
EP - 9121
BT - Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020)
PB - AAAI Press
T2 - Thirty-Fourth AAAI Conference on Artificial Intelligence
Y2 - 7 February 2020 through 12 February 2020
ER -
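
The abstract describes a concrete probing architecture: word vectors from a pretrained encoder, frozen during training, are fed through a single feed-forward layer that predicts tags encoding a linearized tree. Below is a minimal sketch of that setup, assuming PyTorch and the Hugging Face transformers library; the encoder name, label-set size, and class name are illustrative assumptions, not the authors' code.

# Minimal sketch (not the authors' implementation) of parsing as
# sequence tagging over a frozen pretrained encoder. The model name
# "bert-base-cased" and num_labels=500 are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ParsingAsTaggingProbe(nn.Module):
    def __init__(self, encoder_name="bert-base-cased", num_labels=500):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        # Freeze the pretraining network: only the linear head is
        # trained, so accuracy reflects the syntax already present
        # in the word vectors themselves.
        for p in self.encoder.parameters():
            p.requires_grad = False
        # Single feed-forward layer mapping each vector to a label
        # that encodes part of a linearized tree.
        self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        return self.head(hidden)  # (batch, seq_len, num_labels) tag logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = ParsingAsTaggingProbe()
batch = tokenizer(["The cat sat on the mat ."], return_tensors="pt")
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # one label distribution per subword position

The sketch omits two steps the paper's setup needs: aligning one label to each word (e.g., to its first subword) and mapping the predicted label sequence back into a constituent or dependency tree via the chosen tree linearization.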