THU_NGN at SemEval-2019 Task 12: Toponym Detection and Disambiguation on Scientific Papers

Tao Qi, Suyu Ge, Chuhan Wu, Yubo Chen, Yongfeng Huang


Abstract
First name: Tao Last name: Qi Email: taoqi.qt@gmail.com Affiliation: Department of Electronic Engineering, Tsinghua University First name: Suyu Last name: Ge Email: gesy17@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Chuhan Last name: Wu Email: wuch15@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Yubo Last name: Chen Email: chen-yb18@mails.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University First name: Yongfeng Last name: Huang Email: yfhuang@mail.tsinghua.edu.cn Affiliation: Department of Electronic Engineering, Tsinghua University Toponym resolution is an important and challenging task in the neural language processing field, and has wide applications such as emergency response and social media geographical event analysis. Toponym resolution can be roughly divided into two independent steps, i.e., toponym detection and toponym disambiguation. In order to facilitate the study on toponym resolution, the SemEval 2019 task 12 is proposed, which contains three subtasks, i.e., toponym detection, toponym disambiguation and toponym resolution. In this paper, we introduce our system that participated in the SemEval 2019 task 12. For toponym detection, in our approach we use TagLM as the basic model, and explore the use of various features in this task, such as word embeddings extracted from pre-trained language models, POS tags and lexical features extracted from dictionaries. For toponym disambiguation, we propose a heuristics rule-based method using toponym frequency and population. Our systems achieved 83.03% strict macro F1, 74.50 strict micro F1, 85.92 overlap macro F1 and 78.47 overlap micro F1 in toponym detection subtask.
Anthology ID:
S19-2229
Volume:
Proceedings of the 13th International Workshop on Semantic Evaluation
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota, USA
Venue:
*SEMEVAL
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Note:
Pages:
1302–1307
Language:
URL:
https://www.aclweb.org/anthology/S19-2229
DOI:
10.18653/v1/S19-2229
Bib Export formats:
BibTeX MODS XML EndNote
PDF:
http://aclanthology.lst.uni-saarland.de/S19-2229.pdf