Cicero dos Santos

Also published as: Cícero Nogueira dos Santos, Cicero Nogueira dos Santos, Cícero dos Santos


Learning Implicit Text Generation via Feature Matching
Inkit Padhi | Pierre Dognin | Ke Bai | Cícero Nogueira dos Santos | Vijil Chenthamarakshan | Youssef Mroueh | Payel Das
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Generative feature matching network (GFMN) is an approach for training state-of-the-art implicit generative models for images by performing moment matching on features from pre-trained neural networks. In this paper, we present new GFMN formulations that are effective for sequential data. Our experimental results show the effectiveness of the proposed method, SeqGFMN, for three distinct generation tasks in English: unconditional text generation, class-conditional text generation, and unsupervised text style transfer. SeqGFMN is stable to train and outperforms various adversarial approaches for text generation and text style transfer.

Margin-aware Unsupervised Domain Adaptation for Cross-lingual Text Labeling
Dejiao Zhang | Ramesh Nallapati | Henghui Zhu | Feng Nan | Cicero Nogueira dos Santos | Kathleen McKeown | Bing Xiang
Findings of the Association for Computational Linguistics: EMNLP 2020

Unsupervised domain adaptation addresses the problem of leveraging labeled data in a source domain to learn a well-performing model in a target domain where labels are unavailable. In this paper, we improve upon a recent theoretical work (Zhang et al., 2019b) and adopt the Margin Disparity Discrepancy (MDD) unsupervised domain adaptation algorithm to solve cross-lingual text labeling problems. Experiments on cross-lingual document classification and NER demonstrate that the proposed domain adaptation approach advances the state-of-the-art results by a large margin. Specifically, we improve MDD by efficiently optimizing the margin loss on the source domain via Virtual Adversarial Training (VAT). This bridges the gap between theory and the loss function used in the original work (Zhang et al., 2019b), and thereby significantly boosts the performance. Our numerical results also indicate that VAT can remarkably improve the generalization performance of both domains for various domain adaptation approaches.

Augmented Natural Language for Generative Sequence Labeling
Ben Athiwaratkun | Cicero Nogueira dos Santos | Jason Krone | Bing Xiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot learning, low resource, and high resource tasks. We demonstrate these advantages on popular named entity recognition, slot labeling, and intent classification benchmarks. We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot (75.0% to 90.9%) and 1-shot (70.4% to 81.0%) state-of-the-art results. Furthermore, our model generates large improvements (46.27% to 63.83%) in low resource slot labeling over a BERT baseline by incorporating label semantics. We also maintain competitive results on high resource tasks, performing within two points of the state-of-the-art on all tasks and setting a new state-of-the-art on the SNIPS dataset.

Beyond [CLS] through Ranking by Generation
Cicero Nogueira dos Santos | Xiaofei Ma | Ramesh Nallapati | Zhiheng Huang | Bing Xiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document’s language model, were very successful in various IR tasks in the past. However, with the advent of modern deep neural networks, attention has shifted to discriminative ranking functions that model the semantic similarity of documents and queries instead. Recently, deep generative models such as GPT2 and BART have been shown to be excellent text generators, but their effectiveness as rankers has not been demonstrated yet. In this work, we revisit the generative framework for information retrieval and show that our generative approaches are as effective as state-of-the-art semantic similarity-based discriminative models for the answer selection task. Additionally, we demonstrate the effectiveness of unlikelihood losses for IR.

End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems
Siamak Shakeri | Cicero Nogueira dos Santos | Henghui Zhu | Patrick Ng | Feng Nan | Zhiguo Wang | Ramesh Nallapati | Bing Xiang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We propose an end-to-end approach for synthetic QA data generation. Our model comprises a single transformer-based encoder-decoder network that is trained end-to-end to generate both answers and questions. In a nutshell, we feed a passage to the encoder and ask the decoder to generate a question and an answer token-by-token. The likelihood produced in the generation process is used as a filtering score, which avoids the need for a separate filtering model. Our generator is trained by fine-tuning a pretrained LM using maximum likelihood estimation. The experimental results indicate significant improvements in the domain adaptation of QA models, outperforming current state-of-the-art methods.

DualTKB: A Dual Learning Bridge between Text and Knowledge Base
Pierre Dognin | Igor Melnyk | Inkit Padhi | Cicero Nogueira dos Santos | Payel Das
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

In this work, we present a dual learning approach for unsupervised text-to-path and path-to-text transfers in Commonsense Knowledge Bases (KBs). We investigate the impact of weak supervision by creating a weakly supervised dataset and show that even a slight amount of supervision can significantly improve the model performance and enable better-quality transfers. We examine different model architectures and evaluation metrics, proposing a novel Commonsense KB completion metric tailored for generative models. Extensive experimental results show that the proposed method compares very favorably to the existing baselines. This approach is a viable step towards a more advanced system for automatic KB construction/expansion and the reverse operation of KB conversion to coherent textual descriptions.


Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering
Rui Zhang | Cícero Nogueira dos Santos | Michihiro Yasunaga | Bing Xiang | Dragomir Radev
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity. The state-of-the-art end-to-end neural coreference model considers all text spans in a document as potential mentions and learns to link an antecedent for each possible mention. In this paper, we propose to improve the end-to-end coreference resolution system by (1) using a biaffine attention model to get antecedent scores for each possible mention, and (2) jointly optimizing the mention detection accuracy and mention clustering accuracy given the mention cluster labels. Our model achieves state-of-the-art performance on the CoNLL-2012 shared task English test set.

Fighting Offensive Language on Social Media with Unsupervised Text Style Transfer
Cicero Nogueira dos Santos | Igor Melnyk | Inkit Padhi
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We introduce a new approach to tackle the problem of offensive language in online social media. Our approach uses unsupervised text style transfer to translate offensive sentences into non-offensive ones. We propose a new method for training encoder-decoders using non-parallel data that combines a collaborative classifier, attention and the cycle consistency loss. Experimental results on data from Twitter and Reddit show that our method outperforms a state-of-the-art text style transfer system in two out of three quantitative metrics and produces reliable non-offensive transferred sentences.


Improved Neural Relation Detection for Knowledge Base Question Answering
Mo Yu | Wenpeng Yin | Kazi Saidul Hasan | Cicero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Relation detection is a core component of many NLP applications including Knowledge Base Question Answering (KBQA). In this paper, we propose a hierarchical recurrent neural network enhanced by residual learning which detects KB relations given an input question. Our method uses deep residual bidirectional LSTMs to compare questions and relation names via different levels of abstraction. Additionally, we propose a simple KBQA system that integrates entity linking and our proposed relation detector to make the two components enhance each other. Our experimental results show that our approach not only achieves outstanding relation detection performance, but more importantly, it helps our KBQA system achieve state-of-the-art accuracy for both single-relation (SimpleQuestions) and multi-relation (WebQSP) QA benchmarks.


Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
Ramesh Nallapati | Bowen Zhou | Cicero dos Santos | Çağlar Gülçehre | Bing Xiang
Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning

Improved Representation Learning for Question Answer Matching
Ming Tan | Cicero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)


Detecting Semantically Equivalent Questions in Online User Forums
Dasha Bogdanova | Cícero dos Santos | Luciano Barbosa | Bianca Zadrozny
Proceedings of the Nineteenth Conference on Computational Natural Language Learning

Boosting Named Entity Recognition with Neural Character Embeddings
Cícero dos Santos | Victor Guimarães
Proceedings of the Fifth Named Entity Workshop

Classifying Relations by Ranking with Convolutional Neural Networks
Cícero dos Santos | Bing Xiang | Bowen Zhou
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Learning Hybrid Representations to Retrieve Semantically Equivalent Questions
Cícero dos Santos | Luciano Barbosa | Dasha Bogdanova | Bianca Zadrozny
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)


Think Positive: Towards Twitter Sentiment Analysis from Scratch
Cícero dos Santos
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

Latent Trees for Coreference Resolution
Eraldo Rezende Fernandes | Cícero Nogueira dos Santos | Ruy Luiz Milidiú
Computational Linguistics, Volume 40, Issue 4 - December 2014

Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts
Cícero dos Santos | Maíra Gatti
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers


Latent Structure Perceptron with Feature Induction for Unrestricted Coreference Resolution
Eraldo Fernandes | Cícero dos Santos | Ruy Milidiú
Joint Conference on EMNLP and CoNLL - Shared Task


Rule and Tree Ensembles for Unrestricted Coreference Resolution
Cicero Nogueira dos Santos | Davi Lopes Carvalho
Proceedings of the Fifteenth Conference on Computational Natural Language Learning: Shared Task


Phrase Chunking Using Entropy Guided Transformation Learning
Ruy Luiz Milidiú | Cícero Nogueira dos Santos | Julio C. Duarte
Proceedings of ACL-08: HLT