Alexis Palmer


2020

A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization
Graham Neubig | Shruti Rijhwani | Alexis Palmer | Jordan MacKenzie | Hilaria Cruz | Xinjian Li | Matthew Lee | Aditi Chaudhary | Luke Gessler | Steven Abney | Shirley Anugrah Hayati | Antonios Anastasopoulos | Olga Zamaraeva | Emily Prud’hommeaux | Jennette Child | Sara Child | Rebecca Knowles | Sarah Moeller | Jeffrey Micher | Yiyuan Li | Sydney Zink | Mengzhou Xia | Roshan S Sharma | Patrick Littell
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

Despite recent advances in natural language processing and other language technology, the application of such technology to language documentation and conservation has been limited. In August 2019, a workshop was held at Carnegie Mellon University in Pittsburgh, PA, USA to attempt to bring together language community members, documentary linguists, and technologists to discuss how to bridge this gap and create prototypes of novel and practical language revitalization technologies. The workshop focused on developing technologies to aid language documentation and revitalization in four areas: 1) spoken language (speech transcription, phone to orthography decoding, text-to-speech and text-speech forced alignment), 2) dictionary extraction and management, 3) search tools for corpora, and 4) social media (language learning bots and social media analysis). This paper reports the results of this workshop, including issues discussed, and various conceived and implemented technologies for nine languages: Arapaho, Cayuga, Inuktitut, Irish Gaelic, Kidaw’ida, Kwak’wala, Ojibwe, San Juan Quiahije Chatino, and Seneca.

Predicting the Focus of Negation: Model and Error Analysis
Md Mosharaf Hossain | Kathleen Hamilton | Alexis Palmer | Eduardo Blanco
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

The focus of a negation is the set of tokens intended to be negated, and a key component for revealing affirmative alternatives to negated utterances. In this paper, we experiment with neural networks to predict the focus of negation. Our main novelty is leveraging a scope detector to introduce the scope of negation as an additional input to the network. Experimental results show that doing so obtains the best results to date. Additionally, we perform a detailed error analysis providing insights into the main error categories, and analyze errors depending on whether the model takes into account scope and context information.

Proceedings of the Fourteenth Workshop on Semantic Evaluation
Aurelie Herbelot | Xiaodan Zhu | Alexis Palmer | Nathan Schneider | Jonathan May | Ekaterina Shutova
Proceedings of the Fourteenth Workshop on Semantic Evaluation

UNTLing at SemEval-2020 Task 11: Detection of Propaganda Techniques in English News Articles
Maia Petee | Alexis Palmer
Proceedings of the Fourteenth Workshop on Semantic Evaluation

Our system for the PropEval task explores the ability of semantic features to detect and label propagandistic rhetorical techniques in English news articles. For Subtask 2, labeling identified propagandistic fragments with one of fourteen technique labels, our system attains a micro-averaged F1 of 0.40; in this paper, we take a detailed look at the fourteen labels and how well our semantically-focused model detects each of them. We also propose strategies to fill the gaps.

UNT Linguistics at SemEval-2020 Task 12: Linear SVC with Pre-trained Word Embeddings as Document Vectors and Targeted Linguistic Features
Jared Fromknecht | Alexis Palmer
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper outlines our approach to Tasks A & B for the English Language track of SemEval-2020 Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. We use a Linear SVM with document vectors computed from pre-trained word embeddings, and we explore the effectiveness of lexical, part of speech, dependency, and named entity (NE) features. We manually annotate a subset of the training data, which we use for error analysis and to tune a threshold for mapping training confidence values to labels. While document vectors are consistently the most informative features for both tasks, testing on the development set suggests that dependency features are an effective addition for Task A, and NE features for Task B.

WikiPossessions: Possession Timeline Generation as an Evaluation Benchmark for Machine Reading Comprehension of Long Texts
Dhivya Chinnappa | Alexis Palmer | Eduardo Blanco
Proceedings of the 12th Language Resources and Evaluation Conference

This paper presents WikiPossessions, a new benchmark corpus for the task of temporally-oriented possession (TOP), or tracking objects as they change hands over time. We annotate Wikipedia articles for 90 different well-known artifacts (paintings, diamonds, and archaeological artifacts), producing 799 artifact-possessor relations with associated attributes. For each article, we also produce a full possession timeline. The full version of the task combines straightforward entity-relation extraction with complex temporal reasoning, as well as verification of textual support for the relevant types of knowledge. Specifically, to complete the full TOP task for a given article, a system must do the following: a) identify possessors; b) anchor possessors to times/events; c) identify temporal relations between each temporal anchor and the possession relation it corresponds to; d) assign certainty scores to each possessor and each temporal relation; and e) assemble individual possession events into a global possession timeline. In addition to the corpus, we release evaluation scripts and a baseline model for the task.

It’s not a Non-Issue: Negation as a Source of Error in Machine Translation
Md Mosharaf Hossain | Antonios Anastasopoulos | Eduardo Blanco | Alexis Palmer
Findings of the Association for Computational Linguistics: EMNLP 2020

As machine translation (MT) systems progress at a rapid pace, questions of their adequacy linger. In this study we focus on negation, a universal, core property of human language that significantly affects the semantics of an utterance. We investigate whether translating negation is an issue for modern MT systems using 17 translation directions as a test bed. Through thorough analysis, we find that indeed the presence of negation can significantly impact downstream quality, in some cases resulting in quality reductions of more than 60%. We also provide a linguistically motivated analysis that directly explains the majority of our findings. We release our annotations and code to replicate our analysis here: https://github.com/mosharafhossain/negation-mt.

2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts
Preslav Nakov | Alexis Palmer
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

Sigmorphon 2019 Task 2 system description paper: Morphological analysis in context for many languages, with supervision from only a few
Brad Aiken | Jared Kelly | Alexis Palmer | Suleyman Olcay Polat | Taraka Rama | Rodney Nielsen
Proceedings of the 16th Workshop on Computational Research in Phonetics, Phonology, and Morphology

This paper presents the UNT HiLT+Ling system for the Sigmorphon 2019 shared Task 2: Morphological Analysis and Lemmatization in Context. Our core approach focuses on the morphological tagging task; part-of-speech tagging and lemmatization are treated as secondary tasks. Given the highly multilingual nature of the task, we propose an approach which makes minimal use of the supplied training data, in order to be extensible to languages without labeled training data for the morphological inflection task. Specifically, we use a parallel Bible corpus to align contextual embeddings at the verse level. The aligned verses are used to build cross-language translation matrices, which in turn are used to map between embedding spaces for the various languages. Finally, we use sets of inflected forms, primarily from a high-resource language, to induce vector representations for individual UniMorph tags. Morphological analysis is performed by matching vector representations to embeddings for individual tokens. While our system results are dramatically below the average system submitted for the shared task evaluation campaign, our method is (we suspect) unique in its minimal reliance on labeled training data.

Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)
Antti Arppe | Jeff Good | Mans Hulden | Jordan Lachler | Alexis Palmer | Lane Schwartz | Miikka Silfverberg
Proceedings of the 3rd Workshop on the Use of Computational Methods in the Study of Endangered Languages Volume 1 (Papers)

A Corpus of Negations and their Underlying Positive Interpretations
Zahra Sarabi | Erin Killian | Eduardo Blanco | Alexis Palmer
Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*SEM 2019)

Negation often conveys implicit positive meaning. In this paper, we present a corpus of negations and their underlying positive interpretations. We work with negations from Simple Wikipedia, automatically generate potential positive interpretations, and then collect manual annotations that effectively rewrite the negation in positive terms. This procedure yields positive interpretations for approximately 77% of negations, and the final corpus includes over 5,700 negations and over 5,900 positive interpretations. We also present baseline results using seq2seq neural models.

2018

Determining Event Durations: Models and Error Analysis
Alakananda Vempala | Eduardo Blanco | Alexis Palmer
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

This paper presents models to predict event durations. We introduce aspectual features that capture deeper linguistic information than previous work, and experiment with neural networks. Our analysis shows that tense, aspect and temporal structure of the clause provide useful clues, and that an LSTM ensemble captures relevant context around the event.

2017

Classifying Semantic Clause Types: Modeling Context and Genre Characteristics with Recurrent Neural Networks and Attention
Maria Becker | Michael Staniek | Vivi Nastase | Alexis Palmer | Anette Frank
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

Detecting aspectual properties of clauses in the form of situation entity types has been shown to depend on a combination of syntactic-semantic and contextual features. We explore this task in a deep-learning framework, where tuned word representations capture lexical, syntactic and semantic features. We introduce an attention mechanism that pinpoints relevant context not only for the current instance, but also for the larger context. Apart from implicitly capturing task relevant features, the advantage of our neural model is that it avoids the need to reproduce linguistic features for other languages and is thus more easily transferable. We present experiments for English and German that achieve competitive performance. We present a novel take on modeling and exploiting genre information and showcase the adaptation of our system from one language to another.

Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages
Antti Arppe | Jeff Good | Mans Hulden | Jordan Lachler | Alexis Palmer | Lane Schwartz
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Illegal is not a Noun: Linguistic Form for Detection of Pejorative Nominalizations
Alexis Palmer | Melissa Robinson | Kristy K. Phillips
Proceedings of the First Workshop on Abusive Language Online

This paper focuses on a particular type of abusive language, targeting expressions in which typically neutral adjectives take on pejorative meaning when used as nouns - compare ‘gay people’ to ‘the gays’. We first collect and analyze a corpus of hand-curated, expert-annotated pejorative nominalizations for four target adjectives: female, gay, illegal, and poor. We then collect a second corpus of automatically-extracted and POS-tagged, crowd-annotated tweets. For both corpora, we find support for the hypothesis that some adjectives, when nominalized, take on negative meaning. The targeted constructions are non-standard yet widely-used, and part-of-speech taggers mistag some nominal forms as adjectives. We implement a tool called NomCatcher to correct these mistaggings, and find that the same tool is effective for identifying new adjectives subject to transformation via nominalization into abusive language.

Modeling Communicative Purpose with Functional Style: Corpus and Features for German Genre and Register Analysis
Thomas Haider | Alexis Palmer
Proceedings of the Workshop on Stylistic Variation

While there is wide acknowledgement in NLP of the utility of document characterization by genre, it is quite difficult to determine a definitive set of features or even a comprehensive list of genres. This paper addresses both issues. First, with prototype semantics, we develop a hierarchical taxonomy of discourse functions. We implement the taxonomy by developing a new text genre corpus of contemporary German to perform a text based comparative register analysis. Second, we extract a host of style features, both deep and shallow, aiming beyond linguistically motivated features at situational correlates in texts. The feature sets are used for supervised text genre classification, on which our models achieve high accuracy. The combination of the corpus typology and feature sets allows us to characterize types of communicative purpose in a comparative setup, by qualitative interpretation of style feature loadings of a regularized discriminant analysis. Finally, to determine the dependence of genre on topics (which are arguably the distinguishing factor of sub-genre), we compare and combine our style models with Latent Dirichlet Allocation features across different corpus settings with unstable topics.

2016

Modal Sense Classification At Large: Paraphrase-Driven Sense Projection, Semantically Enriched Classification Models and Cross-Genre Evaluations
Ana Marasović | Mengfei Zhou | Alexis Palmer | Anette Frank
Linguistic Issues in Language Technology, Volume 14, 2016 - Modality: Logic, Semantics, Annotation, and Machine Learning

Modal verbs have different interpretations depending on their context. Their sense categories – epistemic, deontic and dynamic – provide important dimensions of meaning for the interpretation of discourse. Previous work on modal sense classification achieved relatively high performance using shallow lexical and syntactic features drawn from small-size annotated corpora. Due to the restricted empirical basis, it is difficult to assess the particular difficulties of modal sense classification and the generalization capacity of the proposed models. In this work we create large-scale, high-quality annotated corpora for modal sense classification using an automatic paraphrase-driven projection approach. Using the acquired corpora, we investigate the modal sense classification task from different perspectives.

Situation entity types: automatic classification of clause-level aspect
Annemarie Friedrich | Alexis Palmer | Manfred Pinkal
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Investigating Active Learning for Short-Answer Scoring
Andrea Horbach | Alexis Palmer
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Predicting the Direction of Derivation in English Conversion
Max Kisselew | Laura Rimell | Alexis Palmer | Sebastian Padó
Proceedings of the 14th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology

Argumentative texts and clause types
Maria Becker | Alexis Palmer | Anette Frank
Proceedings of the Third Workshop on Argument Mining (ArgMining2016)

2015

Obtaining a Better Understanding of Distributional Models of German Derivational Morphology
Max Kisselew | Sebastian Padó | Alexis Palmer | Jan Šnajder
Proceedings of the 11th International Conference on Computational Semantics

Annotating genericity: a survey, a scheme, and a corpus
Annemarie Friedrich | Alexis Palmer | Melissa Peate Sørensen | Manfred Pinkal
Proceedings of The 9th Linguistic Annotation Workshop

Using Shallow Syntactic Features to Measure Influences of L1 and Proficiency Level in EFL Writings
Andrea Horbach | Jonathan Poitz | Alexis Palmer
Proceedings of the fourth workshop on NLP for computer-assisted language learning

Linking discourse modes and situation entity types in a cross-linguistic corpus study
Kleio-Isidora Mavridou | Annemarie Friedrich | Melissa Peate Sørensen | Alexis Palmer | Manfred Pinkal
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

Semantically Enriched Models for Modal Sense Classification
Mengfei Zhou | Anette Frank | Annemarie Friedrich | Alexis Palmer
Proceedings of the First Workshop on Linking Computational Models of Lexical, Sentential and Discourse-level Semantics

2014

Automatic prediction of aspectual class of verbs in context
Annemarie Friedrich | Alexis Palmer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

lex4all: A language-independent tool for building and evaluating pronunciation lexicons for small-vocabulary speech recognition
Anjana Vakil | Max Paulus | Alexis Palmer | Michaela Regneri
Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations

SeedLing: Building and Using a Seed corpus for the Human Language Project
Guy Emerson | Liling Tan | Susanne Fertmann | Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Short-Term Projects, Long-Term Benefits: Four Student NLP Projects for Low-Resource Languages
Alexis Palmer | Michaela Regneri
Proceedings of the 2014 Workshop on the Use of Computational Methods in the Study of Endangered Languages

Paraphrase Detection for Short Answer Scoring
Nikolina Koleva | Andrea Horbach | Alexis Palmer | Simon Ostermann | Manfred Pinkal
Proceedings of the third workshop on NLP for computer-assisted language learning

Situation Entity Annotation
Annemarie Friedrich | Alexis Palmer
Proceedings of LAW VIII - The 8th Linguistic Annotation Workshop

LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization
Annemarie Friedrich | Marina Valeeva | Alexis Palmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

We present LQVSumm, a corpus of about 2000 automatically created extractive multi-document summaries from the TAC 2011 shared task on Guided Summarization, which we annotated with several types of linguistic quality violations. Examples for such violations include pronouns that lack antecedents or ungrammatical clauses. We give details on the annotation scheme and show that inter-annotator agreement is good given the open-ended nature of the task. The annotated summaries have previously been scored for Readability on a numeric scale by human annotators in the context of the TAC challenge; we show that the number of instances of violations of linguistic quality of a summary correlates with these intuitively assigned numeric scores. On a system-level, the average number of violations marked in a system’s summaries achieves higher correlation with the Readability scores than current supervised state-of-the-art methods for assigning a single readability score to a summary. It is our hope that our corpus facilitates the development of methods that not only judge the linguistic quality of automatically generated summaries as a whole, but which also allow for detecting, labeling, and fixing particular violations in a text.

Finding a Tradeoff between Accuracy and Rater’s Workload in Grading Clustered Short Answers
Andrea Horbach | Alexis Palmer | Magdalena Wolska
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we investigate the potential of answer clustering for semi-automatic scoring of short answer questions for German as a foreign language. We use surface features like word and character n-grams to cluster answers to listening comprehension exercises per question and simulate having human graders only label one answer per cluster and then propagating this label to all other members of the cluster. We investigate various ways to select this single item to be labeled and find that choosing the item closest to the centroid of a cluster leads to improved (simulated) grading accuracy over random item selection. Averaged over all questions, we can reduce a teacher’s workload to labeling only 40% of all different answers for a question, while still maintaining a grading accuracy of more than 85%.

2013

Using the text to evaluate short answers for reading comprehension exercises
Andrea Horbach | Alexis Palmer | Manfred Pinkal
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity

2012

Visualising Typological Relationships: Plotting WALS with Heat Maps
Richard Littauer | Rory Turnbull | Alexis Palmer
Proceedings of the EACL 2012 Joint Workshop of LINGVIS & UNCLH

2011

Robust Semantic Analysis for Unseen Data in FrameNet
Alexis Palmer | Afra Alishahi | Caroline Sporleder
Proceedings of the International Conference Recent Advances in Natural Language Processing 2011

Enhancing Active Learning for Semantic Role Labeling via Compressed Dependency Trees
Chenhua Chen | Alexis Palmer | Caroline Sporleder
Proceedings of 5th International Joint Conference on Natural Language Processing

2010

Bringing Active Learning to Life
Ines Rehbein | Josef Ruppenhofer | Alexis Palmer
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Evaluating FrameNet-style semantic parsing: the role of coverage gaps in FrameNet
Alexis Palmer | Caroline Sporleder
Coling 2010: Posters

2009

How well does active learning actually work? Time-based evaluation of cost-reduction strategies for language documentation.
Jason Baldridge | Alexis Palmer
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Evaluating Automation Strategies in Language Documentation
Alexis Palmer | Taesun Moon | Jason Baldridge
Proceedings of the NAACL HLT 2009 Workshop on Active Learning for Natural Language Processing

2007

IGT-XML: An XML Format for Interlinearized Glossed Text
Alexis Palmer | Katrin Erk
Proceedings of the Linguistic Annotation Workshop

A Sequencing Model for Situation Entity Classification
Alexis Palmer | Elias Ponvert | Jason Baldridge | Carlota Smith
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2004

Utilization of Multiple Language Resources for Robust Grammar-Based Tense and Aspect Classification
Alexis Palmer | Jonas Kuhn | Carlota Smith
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

This paper reports on an ongoing project that uses varied language resources and advanced NLP tools for a linguistic classification task in discourse semantics. The system we present is designed to assign a "situation entity" class label to each predicator in English text. The project goal is to achieve the best-possible identification of situation entities in naturally-occurring written texts by implementing a robust system that will deal with real corpus material, rather than just with constructed textbook examples of discourse. In this paper we focus on the combination of multiple information sources, which we see as being vital for a robust classification system. We use a deep syntactic grammar of English to identify morphological, syntactic, and discourse clues, and we use various lexical databases for fine-grained semantic properties of the predicators. Experiments performed to date show that enhancing the output of the grammar with information from lexical resources improves recall but lowers precision in the situation entity classification task.