Benjamin Börschinger

Also published as: Benjamin Boerschinger


2020

What Question Answering can Learn from Trivia Nerds
Jordan Boyd-Graber | Benjamin Börschinger
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In addition to the traditional task of machines answering questions, question answering (QA) research creates interesting, challenging questions that help systems learn how to answer questions and reveal which systems are best. We argue that creating a QA dataset—and the ubiquitous leaderboard that goes with it—closely resembles running a trivia tournament: you write questions, have agents (either humans or machines) answer the questions, and declare a winner. However, the research community has ignored the hard-learned lessons from decades of the trivia community creating vibrant, fair, and effective question answering competitions. After detailing problems with existing QA datasets, we outline the key lessons—removing ambiguity, discriminating skill, and adjudicating disputes—that can transfer to QA research, and how they might be implemented.

2015

A Computationally Efficient Algorithm for Learning Topical Collocation Models
Zhendong Zhao | Lan Du | Benjamin Börschinger | John K Pate | Massimiliano Ciaramita | Mark Steedman | Mark Johnson
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Exploring the Role of Stress in Bayesian Word Segmentation using Adaptor Grammars
Benjamin Börschinger | Mark Johnson
Transactions of the Association for Computational Linguistics, Volume 2

Stress has long been established as a major cue in word segmentation for English infants. We show that enabling a current state-of-the-art Bayesian word segmentation model to take advantage of stress cues noticeably improves its performance. We find that the improvements range from 10% down to 4%, depending on both the use of phonotactic cues and, to a lesser extent, the amount of evidence available to the learner. We also find that, particularly early on, stress cues are much more useful for our model than phonotactic cues by themselves, consistent with the finding that children seem to use stress cues before they use phonotactic cues. Finally, we study how the model's knowledge about stress patterns evolves over time. We not only find that our model correctly acquires the most frequent patterns relatively quickly, but also that the Unique Stress Constraint at the heart of a previously proposed model does not need to be built in: it can be acquired jointly with word segmentation.
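
As a toy illustration of the Unique Stress Constraint mentioned above, here is a minimal Python sketch under its standard formulation (a word carries at most one primary stress, so a word boundary must fall between any two adjacent stressed syllables). It is illustrative only, not the paper's Adaptor Grammar model.

```python
# Toy illustration of the Unique Stress Constraint (USC), assuming the
# standard formulation: each word has at most one primary stress, so two
# adjacent stressed syllables must be separated by a word boundary.
# Illustrative sketch only; not the paper's Adaptor Grammar model.

def usc_boundaries(stresses):
    """Given per-syllable stress flags (True = primary stress), return the
    positions where the USC forces a word boundary; position i means a
    boundary between syllables i-1 and i."""
    return {i for i in range(1, len(stresses)) if stresses[i - 1] and stresses[i]}

# "BAby DRINKS MILK" as syllables: BA(+) by(-) DRINKS(+) MILK(+)
print(usc_boundaries([True, False, True, True]))  # {3}: between DRINKS and MILK
```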

Unsupervised Word Segmentation in Context
Gabriel Synnaeve | Isabelle Dautriche | Benjamin Börschinger | Mark Johnson | Emmanuel Dupoux
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Why is English so easy to segment?
Abdellah Fourtassi | Benjamin Börschinger | Mark Johnson | Emmanuel Dupoux
Proceedings of the Fourth Annual Workshop on Cognitive Modeling and Computational Linguistics (CMCL)

A joint model of word segmentation and phonological variation for English word-final /t/-deletion
Benjamin Börschinger | Mark Johnson | Katherine Demuth
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

2012

Using Rejuvenation to Improve Particle Filtering for Bayesian Word Segmentation
Benjamin Börschinger | Mark Johnson
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Studying the Effect of Input Size for Bayesian Word Segmentation on the Providence Corpus
Benjamin Börschinger | Katherine Demuth | Mark Johnson
Proceedings of COLING 2012

2011

Reducing Grounded Learning Tasks To Grammatical Inference
Benjamin Börschinger | Bevan K. Jones | Mark Johnson
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

A Particle Filter algorithm for Bayesian Wordsegmentation
Benjamin Börschinger | Mark Johnson
Proceedings of the Australasian Language Technology Association Workshop 2011

Collocations in Multilingual Natural Language Generation: Lexical Functions meet Lexical Functional Grammar
François Lareau | Mark Dras | Benjamin Börschinger | Robert Dale
Proceedings of the Australasian Language Technology Association Workshop 2011

2010

WikiNet: A Very Large Scale Multi-Lingual Concept Network
Vivi Nastase | Michael Strube | Benjamin Boerschinger | Caecilia Zirn | Anas Elghafari
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes a multi-lingual, large-scale concept network obtained automatically by mining concepts and relations from a variety of knowledge sources in Wikipedia. Concepts and their lexicalizations are extracted from Wikipedia pages, in particular from article titles, hyperlinks, disambiguation pages and cross-language links. Relations are extracted from the category and page network, from category names, from infoboxes and from the body of the articles. The resulting network has two main components: (i) a central, language-independent index of concepts, which keeps track of the concepts' lexicalizations both within a language and across languages, and separates the linguistic expressions of concepts from the relations in which they are involved (concepts themselves are represented as numeric IDs); (ii) a large network built from the extracted relations, represented as relations between concepts (more specifically, between their numeric IDs). The various stages of building the network were evaluated separately, and the results indicate a resource of good quality.
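
To make the two-component design concrete, here is a minimal, hypothetical Python sketch of the structure the abstract describes: a language-independent concept index (numeric IDs with per-language lexicalizations) plus a relation network over IDs. All names are illustrative assumptions, not WikiNet's actual data format.

```python
from collections import defaultdict

class ConceptNetwork:
    """Hypothetical sketch of the two-component design: (i) a central,
    language-independent concept index, (ii) relations stored between
    numeric concept IDs only."""

    def __init__(self):
        self._next_id = 0
        # concept ID -> language -> set of surface forms (lexicalizations)
        self.lexicalizations = defaultdict(lambda: defaultdict(set))
        # (source ID, relation label, target ID) triples
        self.relations = set()

    def new_concept(self):
        """Allocate a fresh language-independent concept ID."""
        cid = self._next_id
        self._next_id += 1
        return cid

    def add_lexicalization(self, cid, lang, form):
        """Record that `form` expresses concept `cid` in language `lang`."""
        self.lexicalizations[cid][lang].add(form)

    def relate(self, source, label, target):
        """Relations mention concepts only by ID, never by surface form."""
        self.relations.add((source, label, target))

# Toy usage: one concept lexicalized in two languages, one relation.
net = ConceptNetwork()
germany, europe = net.new_concept(), net.new_concept()
net.add_lexicalization(germany, "en", "Germany")
net.add_lexicalization(germany, "de", "Deutschland")
net.add_lexicalization(europe, "en", "Europe")
net.relate(germany, "part_of", europe)
print(net.lexicalizations[germany]["de"])  # {'Deutschland'}
```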