Shared Task System Description: Frustratingly Hard Compositionality Prediction
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Shared Task System Description: Frustratingly Hard Compositionality Prediction. / Johannsen, Anders Trærup; Martinez Alonso, Hector; Rishøj, Christian; Søgaard, Anders.
Proceedings of the Workshop on Distributional Semantics and Compositionality (DiSCo'2011). Portland, Oregon : Association for Computational Linguistics, 2011. p. 29-32.
RIS
TY - GEN
T1 - Shared Task System Description: Frustratingly Hard Compositionality Prediction
AU - Johannsen, Anders Trærup
AU - Martinez Alonso, Hector
AU - Rishøj, Christian
AU - Søgaard, Anders
PY - 2011/6
Y1 - 2011/6
N2 - We considered a wide range of features for the DiSCo 2011 shared task about compositionality prediction for word pairs, including COALS-based endocentricity scores, compositionality scores based on distributional clusters, statistics about wordnet-induced paraphrases, hyphenation, and the likelihood of long translation equivalents in other languages. Many of the features we considered correlated significantly with human compositionality scores, but in support vector regression experiments we obtained the best results using only COALS-based endocentricity scores. Our system was nevertheless the best performing system in the shared task, and average error reductions over a simple baseline in cross-validation were 13.7% for English and 50.1% for German.
AB - We considered a wide range of features for the DiSCo 2011 shared task about compositionality prediction for word pairs, including COALS-based endocentricity scores, compositionality scores based on distributional clusters, statistics about wordnet-induced paraphrases, hyphenation, and the likelihood of long translation equivalents in other languages. Many of the features we considered correlated significantly with human compositionality scores, but in support vector regression experiments we obtained the best results using only COALS-based endocentricity scores. Our system was nevertheless the best performing system in the shared task, and average error reductions over a simple baseline in cross-validation were 13.7% for English and 50.1% for German.
M3 - Article in proceedings
SN - 9781937284022
SP - 29
EP - 32
BT - Proceedings of the Workshop on Distributional Semantics and Compositionality (DiSCo'2011)
PB - Association for Computational Linguistics
CY - Portland, Oregon
ER -
ID: 34349517