What a Creole Wants, What a Creole Needs

Research output: Chapter in Book/Report/Conference proceedingArticle in proceedingsResearchpeer-review

Documents

In recent years, the natural language processing (NLP) community has given increased attention to the disparity of efforts directed towards high-resource languages over low-resource ones. Efforts to remedy this delta often begin with translations of existing English datasets into other languages. However, this approach ignores that different language communities have different needs. We consider a group of low-resource languages, Creole languages. Creoles are both largely absent from the NLP literature, and also often ignored by society at large due to stigma, despite these languages having sizable and vibrant communities. We demonstrate, through conversations with Creole experts and surveys of Creole-speaking communities, how the things needed from language technology can change dramatically from one language to another, even when the languages are considered to be very similar to each other, as with Creoles. We discuss the prominent themes arising from these conversations, and ultimately demonstrate that useful language technology cannot be built without involving the relevant community.

Original languageEnglish
Title of host publicationProceedings of the Thirteenth Language Resources and Evaluation Conference
EditorsNicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Jan Odijk, Stelios Piperidis
PublisherEuropean Language Resources Association (ELRA)
Publication date2022
Pages6439-6449
ISBN (Electronic)9791095546726
Publication statusPublished - 2022
Event13th International Conference on Language Resources and Evaluation Conference, LREC 2022 - Marseille, France
Duration: 20 Jun 202225 Jun 2022

Conference

Conference13th International Conference on Language Resources and Evaluation Conference, LREC 2022
LandFrance
ByMarseille
Periode20/06/202225/06/2022
Sponsor3M, Emvista, et al., Google, SADILAR, Vocapia

Bibliographical note

Publisher Copyright:
© European Language Resources Association (ELRA), licensed under CC-BY-NC-4.0.

    Research areas

  • Creole, low-resource languages, natural language processing

ID: 341490918