DCU-SEManiacs at SemEval-2016 task 1: synthetic paragram embeddings for semantic textual similarity

Hokamp, Chris; Arora, Piyush

Hokamp, Chris ORCID: 0000-0002-7850-9398 and Arora, Piyush ORCID: 0000-0002-4261-2860 (2016) DCU-SEManiacs at SemEval-2016 task 1: synthetic paragram embeddings for semantic textual similarity. In: 10th International Workshop on Semantic Evaluation (SemEval-2016), 16-17 June 2016, San Diego, CA, USA.

Abstract

We experiment with learning word representations designed to be combined into sentence-level semantic representations. Our objective function does not directly use the supervised scores provided with the training data; instead, we opt for a simpler objective which encourages similar phrases to be close together in the embedding space. This simple objective lets us start with high-quality embeddings trained using the Paraphrase Database (PPDB) (Wieting et al., 2015; Ganitkevitch et al., 2013), and then tune these embeddings using the official STS task training data, as well as synthetic paraphrases for each test dataset, obtained by pivoting through machine translation. Our submissions include runs which compare phrases only by their similarity in the embedding space, using the similarity score directly to produce predictions, as well as a run which combines vector similarity with a suite of features we investigated for our 2015 SemEval submission. For the crosslingual task, we simply translate the Spanish sentences to English and use the same system we designed for the monolingual task.
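
As a rough illustration of comparing two sentences "in the embedding space", the sketch below combines word vectors into a sentence vector and scores a sentence pair by cosine similarity. It is a minimal sketch under stated assumptions, not the authors' system: averaging as the composition function, cosine as the similarity measure, whitespace tokenisation, and the toy three-dimensional vectors in `TOY_EMBEDDINGS` are all assumptions made for the example and are not taken from the abstract.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
# Real paragram embeddings are trained on PPDB and have far more dimensions.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "sleeps": [0.1, 0.9, 0.0],
    "naps": [0.2, 0.8, 0.1],
    "stock": [0.0, 0.1, 0.9],
    "fell": [0.1, 0.0, 0.8],
}


def sentence_vector(tokens, embeddings):
    """Average the vectors of all known tokens (assumed composition function)."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        raise ValueError("no token in the sentence has an embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sentence_similarity(sent_a, sent_b, embeddings=TOY_EMBEDDINGS):
    """Score a sentence pair directly from their averaged embeddings."""
    vec_a = sentence_vector(sent_a.lower().split(), embeddings)
    vec_b = sentence_vector(sent_b.lower().split(), embeddings)
    return cosine_similarity(vec_a, vec_b)


if __name__ == "__main__":
    print(sentence_similarity("cat sleeps", "kitten naps"))
    print(sentence_similarity("cat sleeps", "stock fell"))
```

With these toy vectors the paraphrase-like pair scores higher than the unrelated pair, which is the behaviour the objective described in the abstract is meant to encourage.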

Metadata

Item Type:	Conference or Workshop Item (Poster)
Event Type:	Workshop
Refereed:	Yes
Subjects:	UNSPECIFIED
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT; Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
Published in:	Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). Association for Computational Linguistics.
Publisher:	Association for Computational Linguistics
Official URL:	https://doi.org/10.18653/v1/S16-1100
Copyright Information:	© 2016 ACL
Use License:	This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:	EXPERT (EU Marie Curie ITN No. 317471), Science Foundation Ireland (SFI) as a part of the ADAPT Centre at Dublin City University (Grant No: 12/CE/I2267)
ID Code:	22800
Deposited On:	23 Nov 2018 14:27 by Piyush Arora. Last Modified 31 Jan 2019 12:20

Documents

Full text available as: PDF (190kB)
