An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results

Enrique Amigo, Julio Gonzalo, Stefano Mizzaro, Jorge Carrillo-de-Albornoz


Abstract
In Ordinal Classification tasks, items have to be assigned to classes that have a relative ordering, such as “positive”, “neutral”, “negative” in sentiment analysis. Remarkably, the most popular evaluation metrics for ordinal classification tasks either ignore relevant information (for instance, precision/recall on each of the classes ignores their relative ordering) or assume additional information (for instance, Mean Average Error assumes absolute distances between classes). In this paper we propose a new metric for Ordinal Classification, Closeness Evaluation Measure, that is rooted on Measurement Theory and Information Theory. Our theoretical analysis and experimental results over both synthetic data and data from NLP shared tasks indicate that the proposed metric captures quality aspects from different traditional tasks simultaneously. In addition, it generalizes some popular classification (nominal scale) and error minimization (interval scale) metrics, depending on the measurement scale in which it is instantiated.
Anthology ID:
2020.acl-main.363
Volume:
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2020
Address:
Online
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3938–3949
Language:
URL:
https://www.aclweb.org/anthology/2020.acl-main.363
DOI:
10.18653/v1/2020.acl-main.363
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/2020.acl-main.363.pdf
Video:
 http://slideslive.com/38928965