The good, the bad and their kins: identifying questions with negative scores in StackOverflow

Arora, Piyush; Ganguly, Debasis; Jones, Gareth J.F.

Arora, Piyush ORCID: 0000-0002-4261-2860, Ganguly, Debasis ORCID: 0000-0003-0050-7138 and Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 (2015) The good, the bad and their kins: identifying questions with negative scores in StackOverflow. In: International Conference on Advances in Social Networks Analysis and Mining (ASONAM '15), 25-28 Aug 2015, Paris, France. ISBN 978-1-4503-3854-7

Abstract

A rapid increase in the number of questions posted on community question answering (CQA) forums is creating a need for automated methods of question quality moderation to improve the effectiveness of such forums in terms of response time and quality. Such automated approaches should aim to classify questions as good or bad for a particular forum as soon as they are posted, based on the guidelines and quality standards defined by the forum. Thus, if a question meets the standard of the forum it is classified as good; otherwise it is classified as bad. In this paper, we propose a method to address this problem of question classification by retrieving similar questions previously asked in the same forum, and then using the text from these similar questions to predict the quality of the current question. We empirically validate our proposed approach on a dataset from StackOverflow, a massive CQA forum for programmers, comprising about 8M questions. Using this additional text retrieved from similar questions, we are able to improve the question quality prediction accuracy by about 2.8% and the recall of negatively scored questions by about 4.2%. This 4.2% improvement in recall would help in automatically flagging questions as bad (unsuitable) for the forum and would speed up the moderation process, thus saving time and human effort.
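
The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the raw term-frequency cosine similarity, and the function names `retrieve_similar` and `augment` are all assumptions made for illustration. The abstract does not specify the retrieval model or the classifier, so the sketch stops at producing the augmented text that a classifier of the reader's choice could then consume.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude, assumed tokenizer)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, archive, k=2):
    """Return up to k previously asked questions most similar to the query.

    Questions with zero similarity are dropped, so fewer than k may be returned.
    """
    q_vec = Counter(tokenize(query))
    scored = [(cosine(q_vec, Counter(tokenize(doc))), doc) for doc in archive]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def augment(query, archive, k=2):
    """Append the text of the retrieved neighbours to the new question."""
    return " ".join([query] + retrieve_similar(query, archive, k))
```

Calling `augment(new_question, archive)` yields the new question followed by the text of its nearest archived neighbours; that combined string stands in for the "additional text" the abstract refers to.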

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Subjects:	UNSPECIFIED
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT; Research Institutes and Centres > Centre for Next Generation Localisation (CNGL)
Published in:	Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ACM. ISBN 978-1-4503-3854-7
Publisher:	ACM
Official URL:	http://dx.doi.org/10.1145/2808797.2809318
Copyright Information:	© 2015 ACM
Funders:	Science Foundation Ireland (SFI) as part of the CNGL Centre for Global Intelligent Content at DCU (Grant No: 12/CE/I2267)
ID Code:	22799
Deposited On:	23 Nov 2018 14:10 by Piyush Arora. Last Modified 31 Jan 2019 12:20

Documents

Full text available as: PDF (315kB)
