Digital Library

Title:      BUILDING A CORPUS TO CATEGORIZE ARABIC SHORT TEXT USING GAMES WITH A PURPOSE
Author(s):      Abdennadher Slim, Ayman Heba, Sabty Caroline, Salem Reem, Tarhony Nada, Zohny Sara
ISBN:      978-989-8533-24-1
Editors:      Pedro Isaías and Bebo White
Year:      2014
Edition:      Single
Keywords:      Games with A Purpose, Arabic short text, Topic Categorization, NLP, Crowdsourcing, Human Computation
Type:      Full Paper
First Page:      59
Last Page:      65
Language:      English
Paper Abstract:      Text categorization, also known as text classification or topic detection, is the task of automatically sorting documents into a predefined set of categories. It is considered one of the most important fields in Natural Language Processing, especially for the Arabic language, where few attempts have been made to construct a corpus that could be used to train classification algorithms for Arabic short text. All the work investigated focuses on document classification, whether for Arabic or other languages. Moreover, the small amount of work directed towards short text has focused on sentiment analysis and neglected categorization. In this paper, a new approach is presented to construct a corpus for short Arabic text classification using a Game with a Purpose (GWAP). "Eih elMawdoo3?" ("What is the Topic?") is a multiplayer GWAP that aims to categorize short, unstructured Arabic text and to collect various keywords that will help in constructing a strong, cheap, and expandable corpus for short text classification in the Arabic language.
   
