Related Papers
Crowdsourcing Complex Associations Among Words by Means of a Game
This paper discusses a new approach to creating semantic resources consisting of complex associations among words that can be used for evaluating the content of word embeddings as well as in various language-learning scenarios. We briefly introduce Codenames, an existing party board game, and a way of recording the word associations suggested by human players. Advanced word embedding models are then compared on the collected data, and it is demonstrated that they often fail in cases of complex word associations that go beyond simple contextual interchangeability. We conclude with an initial evaluation of the automatic guessing of associated words based on clues provided by human players, and a discussion of further extensions of the system towards wide language coverage and explanations of word associations in the language-learning context.
Collaboration in the Production of a Massively Multilingual Lexicon
Martin Benjamin
This paper discusses the multiple approaches to collaboration that the Kamusi Project is employing in the creation of a massively multilingual lexical resource. The project's data structure enables the inclusion of large amounts of rich data within each sense-specific entry, with transitive concept-based links across languages. Data collection involves mining existing data sets, language experts using an online editing system, crowdsourcing, and games with a purpose. The paper discusses the benefits and drawbacks of each of these elements, and the steps the project is taking to account for those. Special attention is paid to guiding crowd members with targeted questions that produce results in a specific format. Collaboration is seen as an essential method for generating large amounts of linguistic data, as well as for validating the data so it can be considered trustworthy.
Natural Language Engineering
Gamified crowdsourcing for idiom corpora construction
2022 •
Johanna Monti
Learning idiomatic expressions is seen as one of the most challenging stages in second-language learning because of their unpredictable meaning. A similar situation holds for their identification within natural language processing applications such as machine translation and parsing. The lack of high-quality usage samples exacerbates this challenge not only for humans but also for artificial intelligence systems. This article introduces a gamified crowdsourcing approach for collecting language learning materials for idiomatic expressions; a messaging bot is designed as an asynchronous multiplayer game for native speakers who compete with each other while providing idiomatic and nonidiomatic usage examples and rating other players’ entries. As opposed to classical crowd-processing annotation efforts in the field, for the first time in the literature, a crowd-creating & crowd-rating approach is implemented and tested for idiom corpora construction. The approach is language-independent...
Programming games of word association
2010 •
Dilyana Budakova
This paper presents an example of implementing language games for collecting associative relations between words in three languages, with the purpose of aiding foreign-language self-training in each of the languages. It discusses possible later uses of the collected data for developments in the fields of AI and linguistics. Results from student questionnaires are reported, concerning both the students' preferences for the types of games to be used in language training and the effectiveness of the games of association in their self-training activities.
Building Specialized Multilingual Lexical Graphs Using Community Resources
2009 •
Mohammad Daoud
We describe methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as the main source; our approach therefore depends both on acquiring contributions from volunteers (explicit approach) and on analyzing users’ behaviors to extract interesting patterns and facts (implicit approach). As a generic repository that can handle the collected multilingual terminological data, we describe the concept of dedicated Multilingual Preterminological Graphs (MPGs) and some automatic approaches for constructing them by analyzing the behavior of online community users. A Multilingual Preterminological Graph is a special lexical resource that contains a massive number of terms related to a special domain. We call it preterminological because it is a raw material that can be used to build a standardized terminological repository. Building such a graph is difficult using traditional approache...
Games in linguistics
2019 •
Julie Hunter
In this paper we set out three consequences of a game-theoretic model for conversation, Message Exchange (ME) Games (Asher et al., 2016), which we think are of linguistic interest. We develop a notion of conversational success, explain subjectivity and bias in interpretation using concepts from epistemic game theory, and characterize the strategic usefulness of so-called expressions of "not at issue" content using ME games.
ACM Transactions on Interactive Intelligent Systems (TiiS)
Phrase detectives: Utilizing collective intelligence for internet-scale language resource creation
2013 •
Livio Robaldo
Collecting and Evaluating Lexical Polarity with A Game With a Purpose
2015 •
Mathieu Lafourcade
Sentiment analysis of a text requires, among other things, a polarity lexical resource. We designed LikeIt, a GWAP (Game With A Purpose) that allows players to attribute a positive, negative or neutral value to a term, and thus to obtain a resulting polarity for most of the terms of the freely available lexical network of the JeuxDeMots project. We present a quantitative analysis of the data obtained through our approach, together with the comparison method we developed to validate them qualitatively.
PARSEME – PARSing and Multiword Expressions within a European multilingual network
2015 •
Manfred Sailer
The aim of this paper is to present PARSEME, a COST Action devoted to the issue of Multiword Expressions in parsing and in linguistic resources (corpora, lexicons). This is a “meta-paper” intended to be the main citation point for any future work referring to PARSEME: it does not describe in detail any single result of the Action, but rather summarises its multifarious activities and provides links to such results (both completed and in progress).
LP2002, Urayasu, Japan
Towards Multiword and Multilingual Lexicons: Between Theory and Practice
2002 •
Quochi Valeria