DCU-UVT: Word-level language classification with code-mixed data

Barman, Utsab; Wagner, Joachim; Chrupała, Grzegorz; Foster, Jennifer

Barman, Utsab, Wagner, Joachim ORCID: 0000-0002-8290-3849, Chrupała, Grzegorz ORCID: 0000-0001-9498-6912 and Foster, Jennifer ORCID: 0000-0002-7789-4853 (2014) DCU-UVT: Word-level language classification with code-mixed data. In: First Workshop on Computational Approaches to Code Switching, 25 Oct 2014, Doha, Qatar.

Abstract

This paper describes the DCU-UVT team’s participation in the Language Identification in Code-Switched Data shared task in the Workshop on Computational Approaches to Code Switching. Word-level classification experiments were carried out using a simple dictionary-based method, linear kernel support vector machines (SVMs) with and without contextual clues, and a k-nearest neighbour approach. Based on these experiments, we select our SVM-based system with contextual clues as our final system and present results for the Nepali-English and Spanish-English datasets.

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Workshop
Refereed:	Yes
Uncontrolled Keywords:	code-switching; language identification; user-generated content; Nepali-English; Spanish-English
Subjects:	Computer Science > Artificial intelligence; Computer Science > Computational linguistics
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
Published in:	Proceedings of the First Workshop on Computational Approaches to Code Switching. Association for Computational Linguistics (ACL).
Publisher:	Association for Computational Linguistics (ACL)
Official URL:	https://doi.org/10.3115/v1/W14-3915
Copyright Information:	© 2014 The Association for Computational Linguistics
Funders:	Science Foundation Ireland (Grant 12/CE/I2267)
ID Code:	20713
Deposited On:	26 Apr 2023 13:35 by Joachim Wagner. Last Modified 26 Apr 2023 13:35

Documents

Full text available as: PDF (173kB)


DORAS | DCU Research Repository
