DORAS | DCU Research Repository

Towards language-agnostic alignment of product titles and descriptions: a neural approach

Stein, Daniel, Shterionov, Dimitar (ORCID: 0000-0001-6300-797X) and Way, Andy (ORCID: 0000-0001-5736-5930) (2019) Towards language-agnostic alignment of product titles and descriptions: a neural approach. In: 2019 World Wide Web Conference, 13-17 May 2019, San Francisco, USA. ISBN 978-1-4503-6675-5

Abstract
The quality of e-commerce services largely depends on the accessibility of product content as well as its completeness and correctness. Nowadays, many sellers target cross-country and cross-lingual markets via active or passive cross-border trade, fostering the desire for seamless user experiences. While machine translation (MT) is very helpful for crossing language barriers, automatically matching existing items for sale (e.g. the smartphone in front of me) to the same product (all smartphones of the same brand/type/colour/condition) can be challenging, especially because the seller’s description can often be erroneous or incomplete. We refer to this task as item alignment in multilingual e-commerce catalogues. To facilitate this task, we develop a pipeline of tools for item classification based on cross-lingual text similarity, exploiting recurrent neural networks (RNNs) with and without pre-trained word embeddings. Furthermore, we combine our language-agnostic RNN classifiers with an in-domain MT system to further reduce the linguistic and stylistic differences between the investigated data, aiming to boost our performance. The quality of the methods as well as their training speed are compared on an in-domain data set for English–German products.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Institutes and Centres > ADAPT
Published in: WWW '19 Companion Proceedings of The 2019 World Wide Web Conference. Association for Computing Machinery (ACM). ISBN 978-1-4503-6675-5
Publisher: Association for Computing Machinery (ACM)
Official URL: http://dx.doi.org/10.1145/3308560.3316602
Copyright Information: © 2019 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC-BY 4.0 License.
Funders: SFI
ID Code: 23867
Deposited On: 21 Oct 2019 13:44 by Andrew Way. Last Modified 21 Oct 2019 13:44
Documents

Full text available as: www19companion-136.pdf (PDF, 398kB)