Classifying racist texts using a support vector machine
Greevy, Edel and Smeaton, Alan F. (2004) Classifying racist texts using a support vector machine. In: SIGIR 2004 - the 27th Annual International ACM SIGIR Conference, 25-29 July 2004, Sheffield, UK.
In this poster we present an overview of the techniques we used to develop and evaluate a text categorisation system that automatically classifies racist texts. Unlike some other text classification tasks, detecting racism is difficult because the presence of indicator words alone is not enough to identify a text as racist. Support Vector Machines (SVMs) are used to categorise web pages automatically according to whether or not they are racist. A term can be defined in different ways, and in this poster we look at three representations of a web page within an SVM: bag-of-words, bigrams and part-of-speech tags.
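The three term representations named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the tokeniser, the function names, and the `TOY_POS` lookup table are all assumptions made for illustration, and a real system would use a trained part-of-speech tagger rather than a dictionary. Each function turns a page's tokens into term counts, which could then be mapped onto a fixed vocabulary to give the feature vector an SVM expects.

```python
from collections import Counter

# Hypothetical toy tag lookup (an assumption for illustration only);
# a real system would use a trained part-of-speech tagger.
TOY_POS = {"the": "DET", "a": "DET", "page": "NOUN", "text": "NOUN",
           "is": "VERB", "contains": "VERB", "short": "ADJ"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return [t for t in stripped if t]


def bag_of_words(tokens):
    """Each distinct word is a term; word order is discarded."""
    return Counter(tokens)


def bigrams(tokens):
    """Each pair of adjacent words is a term, keeping local word order."""
    return Counter(zip(tokens, tokens[1:]))


def pos_tags(tokens, lookup=TOY_POS):
    """Each part-of-speech tag is a term; unknown words map to 'UNK'."""
    return Counter(lookup.get(t, "UNK") for t in tokens)


def to_vector(counts, vocabulary):
    """Map term counts onto a fixed, ordered vocabulary."""
    return [counts.get(term, 0) for term in vocabulary]


tokens = tokenize("The page contains a short text.")
bow = bag_of_words(tokens)
big = bigrams(tokens)
pos = pos_tags(tokens)
vector = to_vector(pos, ["DET", "NOUN", "VERB", "ADJ", "UNK"])
```

The bigram and part-of-speech representations keep some information that bag-of-words discards (local word order and grammatical role respectively), which is one plausible reason to compare them when indicator words alone are not enough.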