Marjan Ghazvininejad


2020

pdf bib
Simple and Effective Retrieve-Edit-Rerank Text Generation
Nabil Hossain | Marjan Ghazvininejad | Luke Zettlemoyer
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Retrieve-and-edit seq2seq methods typically retrieve an output from the training set and learn a model to edit it to produce the final output. We propose to extend this framework with a simple and effective post-generation ranking approach. Our framework (i) retrieves several potentially relevant outputs for each input, (ii) edits each candidate independently, and (iii) re-ranks the edited candidates to select the final output. We use a standard editing model with simple task-specific re-ranking approaches, and we show empirically that this approach outperforms existing, significantly more complex methodologies. Experiments on two machine translation (MT) datasets show new state-of-art results. We also achieve near state-of-art performance on the Gigaword summarization dataset, where our analyses show that there is significant room for performance improvement with better candidate output selection in future work.

pdf bib
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
Mike Lewis | Yinhan Liu | Naman Goyal | Marjan Ghazvininejad | Abdelrahman Mohamed | Omer Levy | Veselin Stoyanov | Luke Zettlemoyer
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We present BART, a denoising autoencoder for pretraining sequence-to-sequence models. BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. It uses a standard Tranformer-based neural machine translation architecture which, despite its simplicity, can be seen as generalizing BERT (due to the bidirectional encoder), GPT (with the left-to-right decoder), and other recent pretraining schemes. We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token. BART is particularly effective when fine tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 3.5 ROUGE. BART also provides a 1.1 BLEU increase over a back-translation system for machine translation, with only target language pretraining. We also replicate other pretraining schemes within the BART framework, to understand their effect on end-task performance.

2019

pdf bib
Mask-Predict: Parallel Decoding of Conditional Masked Language Models
Marjan Ghazvininejad | Omer Levy | Yinhan Liu | Luke Zettlemoyer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Most machine translation systems generate text autoregressively from left to right. We, instead, use a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation. This approach allows for efficient iterative decoding, where we first predict all of the target words non-autoregressively, and then repeatedly mask out and regenerate the subset of words that the model is least confident about. By applying this strategy for a constant number of iterations, our model improves state-of-the-art performance levels for non-autoregressive and parallel decoding translation models by over 4 BLEU on average. It is also able to reach within about 1 BLEU point of a typical left-to-right transformer model, while decoding significantly faster.

pdf bib
Training on Synthetic Noise Improves Robustness to Natural Noise in Machine Translation
Vladimir Karpukhin | Omer Levy | Jacob Eisenstein | Marjan Ghazvininejad
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Contemporary machine translation systems achieve greater coverage by applying subword models such as BPE and character-level CNNs, but these methods are highly sensitive to orthographical variations such as spelling mistakes. We show how training on a mild amount of random synthetic noise can dramatically improve robustness to these variations, without diminishing performance on clean text. We focus on translation performance on natural typos, and show that robustness to such noise can be achieved using a balanced diet of simple synthetic noises at training time, without access to the natural noise data or distribution.

pdf bib
Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation
Nima Pourdamghani | Nada Aldarrab | Marjan Ghazvininejad | Kevin Knight | Jonathan May
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation. In this work we explore this intuition by breaking translation into a two step process: generating a rough gloss by means of a dictionary and then ‘translating’ the resulting pseudo-translation, or ‘Translationese’ into a fully fluent translation. We build our Translationese decoder once from a mish-mash of parallel data that has the target language in common and then can build dictionaries on demand using unsupervised techniques, resulting in rapidly generated unsupervised neural MT systems for many source languages. We apply this process to 14 test languages, obtaining better or comparable translation results on high-resource languages than previously published unsupervised MT studies, and obtaining good quality results for low-resource languages that have never been used in an unsupervised MT scenario.

pdf bib
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation
Antoine Bosselut | Asli Celikyilmaz | Marjan Ghazvininejad | Srinivasan Iyer | Urvashi Khandelwal | Hannah Rashkin | Thomas Wolf
Proceedings of the Workshop on Methods for Optimizing and Evaluating Neural Language Generation

2018

pdf bib
Neural Poetry Translation
Marjan Ghazvininejad | Yejin Choi | Kevin Knight
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We present the first neural poetry translation system. Unlike previous works that often fail to produce any translation for fixed rhyme and rhythm patterns, our system always translates a source text to an English poem. Human evaluation of the translations ranks the quality as acceptable 78.2% of the time.

pdf bib
Using Word Vectors to Improve Word Alignments for Low Resource Machine Translation
Nima Pourdamghani | Marjan Ghazvininejad | Kevin Knight
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We present a method for improving word alignments using word similarities. This method is based on encouraging common alignment links between semantically similar words. We use word vectors trained on monolingual data to estimate similarity. Our experiments on translating fifteen languages into English show consistent BLEU score improvements across the languages.

pdf bib
Towards Controllable Story Generation
Nanyun Peng | Marjan Ghazvininejad | Jonathan May | Kevin Knight
Proceedings of the First Workshop on Storytelling

We present a general framework of analyzing existing story corpora to generate controllable and creative new stories. The proposed framework needs little manual annotation to achieve controllable story generation. It creates a new interface for humans to interact with computers to generate personalized stories. We apply the framework to build recurrent neural network (RNN)-based generation models to control story ending valence and storyline. Experiments show that our methods successfully achieve the control and enhance the coherence of stories through introducing storylines. with additional control factors, the generation model gets lower perplexity, and yields more coherent stories that are faithful to the control factors according to human evaluation.

2017

pdf bib
Hafez: an Interactive Poetry Generation System
Marjan Ghazvininejad | Xing Shi | Jay Priyadarshi | Kevin Knight
Proceedings of ACL 2017, System Demonstrations

2016

pdf bib
Generating Topical Poetry
Marjan Ghazvininejad | Xing Shi | Yejin Choi | Kevin Knight
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

pdf bib
How Much Information Does a Human Translator Add to the Original?
Barret Zoph | Marjan Ghazvininejad | Kevin Knight
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

pdf bib
How to Memorize a Random 60-Bit String
Marjan Ghazvininejad | Kevin Knight
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies