Feature Selection as Causal Inference: Experiments with Text Classification

Michael J. Paul


Abstract
This paper proposes a matching technique for learning causal associations between word features and class labels in document classification. The goal is to identify more meaningful and generalizable features than with only correlational approaches. Experiments with sentiment classification show that the proposed method identifies interpretable word associations with sentiment and improves classification performance in a majority of cases. The proposed feature selection method is particularly effective when applied to out-of-domain data.
Anthology ID:
K17-1018
Volume:
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)
Month:
August
Year:
2017
Address:
Vancouver, Canada
Venue:
CoNLL
SIG:
SIGNLL
Publisher:
Association for Computational Linguistics
Note:
Pages:
163–172
Language:
URL:
https://www.aclweb.org/anthology/K17-1018
DOI:
10.18653/v1/K17-1018
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/K17-1018.pdf