Writing Code for NLP Research

Matt Gardner, Mark Neumann, Joel Grus, Nicholas Lourie


Abstract
Doing modern NLP research requires writing code. Good code enables fast prototyping, easy debugging, controlled experiments, and accessible visualizations that help researchers understand what a model is doing. Bad code leads to research that is at best hard to reproduce and extend, and at worst simply incorrect. Indeed, there is a growing recognition of the importance of having good tools to assist good research in our field, as the upcoming workshop on open source software for NLP demonstrates. This tutorial aims to share best practices for writing code for NLP research, drawing on the instructors' experience designing the recently released AllenNLP toolkit, a PyTorch-based library for deep learning NLP research. We will explain how a library with the right abstractions and components enables better code and better science, using models implemented in AllenNLP as examples. Participants will learn how to write research code in a way that facilitates good science and easy experimentation, regardless of what framework they use.
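One of the abstractions the tutorial discusses is making components swappable by name, so that experiments are driven by configuration rather than code edits. The following is a toy Python sketch of that registry pattern; the class and method names (`Registrable`, `by_name`, `BagOfWordsModel`) are illustrative assumptions, not AllenNLP's actual API.

```python
class Registrable:
    """Base class whose subclasses can be registered and later looked up by name.

    This is a minimal sketch of the registry pattern, not a real library API.
    """
    _registry: dict = {}

    @classmethod
    def register(cls, name: str):
        def decorator(subclass):
            # Keep a separate name -> class mapping per base class.
            cls._registry.setdefault(cls, {})[name] = subclass
            return subclass
        return decorator

    @classmethod
    def by_name(cls, name: str):
        return cls._registry[cls][name]


class Model(Registrable):
    """Abstract model component; concrete models register themselves below."""


@Model.register("bag_of_words")
class BagOfWordsModel(Model):
    def __init__(self, vocab_size: int):
        self.vocab_size = vocab_size


# A configuration file can now select the implementation by its string name,
# so swapping models requires no code changes:
model_class = Model.by_name("bag_of_words")
model = model_class(vocab_size=10000)
```

The design choice here is that each component type (models, dataset readers, etc.) owns its own namespace of registered names, which keeps configuration files declarative and experiments easy to vary.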
Anthology ID:
D18-3003
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
URL:
https://www.aclweb.org/anthology/D18-3003