Gavin Abercrombie


2020

pdf bib
ParlVote: A Corpus for Sentiment Analysis of Political Debates
Gavin Abercrombie | Riza Batista-Navarro
Proceedings of the 12th Language Resources and Evaluation Conference

Debate transcripts from the UK Parliament contain information about the positions taken by politicians towards important topics, but are difficult for people to process manually. While sentiment analysis of debate speeches could facilitate understanding of the speakers’ stated opinions, datasets currently available for this task are small when compared to the benchmark corpora in other domains. We present ParlVote, a new, larger corpus of parliamentary debate speeches for use in the evaluation of sentiment analysis systems for the political domain. We also perform a number of initial experiments on this dataset, testing a variety of approaches to the classification of sentiment polarity in debate speeches. These include a linear classifier as well as a neural network trained using a transformer word embedding model (BERT), and fine-tuned on the parliamentary speeches. We find that in many scenarios, a linear classifier trained on a bag-of-words text representation achieves the best results. However, with the largest dataset, the transformer-based model combined with a neural classifier provides the best performance. We suggest that further experimentation with classification models and observations of the debate content and structure are required, and that there remains much room for improvement in parliamentary sentiment analysis.

2019

pdf bib
Policy Preference Detection in Parliamentary Debate Motions
Gavin Abercrombie | Federico Nanni | Riza Batista-Navarro | Simone Paolo Ponzetto
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Debate motions (proposals) tabled in the UK Parliament contain information about the stated policy preferences of the Members of Parliament who propose them, and are key to the analysis of all subsequent speeches given in response to them. We attempt to automatically label debate motions with codes from a pre-existing coding scheme developed by political scientists for the annotation and analysis of political parties’ manifestos. We develop annotation guidelines for the task of applying these codes to debate motions at two levels of granularity and produce a dataset of manually labelled examples. We evaluate the annotation process and the reliability and utility of the labelling scheme, finding that inter-annotator agreement is comparable with that of other studies conducted on manifesto data. Moreover, we test a variety of ways of automatically labelling motions with the codes, ranging from similarity matching to neural classification methods, and evaluate them against the gold standard labels. From these experiments, we note that established supervised baselines are not always able to improve over simple lexical heuristics. At the same time, we detect a clear and evident benefit when employing BERT, a state-of-the-art deep language representation model, even in classification scenarios with over 30 different labels and limited amounts of training data.

pdf bib
Semantic Change in the Language of UK Parliamentary Debates
Gavin Abercrombie | Riza Batista-Navarro
Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change

We investigate changes in the meanings of words used in the UK Parliament across two different epochs. We use word embeddings to explore changes in the distribution of words of interest and uncover words that appear to have undergone semantic transformation in the intervening period, and explore different ways of obtaining target words for this purpose. We find that semantic changes are generally in line with those found in other corpora, and little evidence that parliamentary language is more static than general English. It also seems that words with senses that have been recorded in the dictionary as having fallen into disuse do not undergo semantic changes in this domain.

2018

pdf bib
‘Aye’ or ‘No’? Speech-level Sentiment Analysis of Hansard UK Parliamentary Debate Transcripts
Gavin Abercrombie | Riza Batista-Navarro
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

pdf bib
Identifying Opinion-Topics and Polarity of Parliamentary Debate Motions
Gavin Abercrombie | Riza Theresa Batista-Navarro
Proceedings of the 9th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Analysis of the topics mentioned and opinions expressed in parliamentary debate motions–or proposals–is difficult for human readers, but necessary for understanding and automatic processing of the content of the subsequent speeches. We present a dataset of debate motions with pre-existing ‘policy’ labels, and investigate the utility of these labels for simultaneous topic and opinion polarity analysis. For topic detection, we apply one-versus-the-rest supervised topic classification, finding that good performance is achieved in predicting the policy topics, and that textual features derived from the debate titles associated with the motions are particularly indicative of motion topic. We then examine whether the output could also be used to determine the positions taken by proposers towards the different policies by investigating how well humans agree in interpreting the opinion polarities of the motions. Finding very high levels of agreement, we conclude that the policies used can be reliable labels for use in these tasks, and that successful topic detection can therefore provide opinion analysis of the motions ‘for free’.

2016

pdf bib
A Rule-based Shallow-transfer Machine Translation System for Scots and English
Gavin Abercrombie
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

An open-source rule-based machine translation system is developed for Scots, a low-resourced minor language closely related to English and spoken in Scotland and Ireland. By concentrating on translation for assimilation (gist comprehension) from Scots to English, it is proposed that the development of dictionaries designed to be used with in the Apertium platform will be sufficient to produce translations that improve non-Scots speakers understanding of the language. Mono- and bilingual Scots dictionaries are constructed using lexical items gathered from a variety of resources across several domains. Although the primary goal of this project is translation for gisting, the system is evaluated for both assimilation and dissemination (publication-ready translations). A variety of evaluation methods are used, including a cloze test undertaken by human volunteers. While evaluation results are comparable to, and in some cases superior to, those of other language pairs within the Apertium platform, room for improvement is identified in several areas of the system.

pdf bib
Putting Sarcasm Detection into Context: The Effects of Class Imbalance and Manual Labelling on Supervised Machine Classification of Twitter Conversations
Gavin Abercrombie | Dirk Hovy
Proceedings of the ACL 2016 Student Research Workshop