Muhao Chen


2020

pdf bib
What Are You Trying to Do? Semantic Typing of Event Processes
Muhao Chen | Hongming Zhang | Haoyu Wang | Dan Roth
Proceedings of the 24th Conference on Computational Natural Language Learning

This paper studies a new cognitively motivated semantic typing task,multi-axis event process typing, that, given anevent process, attempts to infer free-form typelabels describing (i) the type of action made bythe process and (ii) the type of object the pro-cess seeks to affect. This task is inspired bycomputational and cognitive studies of eventunderstanding, which suggest that understand-ing processes of events is often directed by rec-ognizing the goals, plans or intentions of theprotagonist(s). We develop a large dataset con-taining over 60k event processes, featuring ul-tra fine-grained typing on both the action andobject type axes with very large (10ˆ3∼10ˆ4)label vocabularies. We then propose a hybridlearning framework,P2GT, which addressesthe challenging typing problem with indirectsupervision from glosses1and a joint learning-to-rank framework. As our experiments indi-cate,P2GTsupports identifying the intent ofprocesses, as well as the fine semantic type ofthe affected object. It also demonstrates the ca-pability of handling few-shot cases, and stronggeneralizability on out-of-domain processes.

pdf bib
Multilingual Knowledge Graph Completion via Ensemble Knowledge Transfer
Xuelu Chen | Muhao Chen | Changjun Fan | Ankith Uppunda | Yizhou Sun | Carlo Zaniolo
Findings of the Association for Computational Linguistics: EMNLP 2020

Predicting missing facts in a knowledge graph(KG) is a crucial task in knowledge base construction and reasoning, and it has been the subject of much research in recent works us-ing KG embeddings. While existing KG embedding approaches mainly learn and predict facts within a single KG, a more plausible solution would benefit from the knowledge in multiple language-specific KGs, considering that different KGs have their own strengths and limitations on data quality and coverage. This is quite challenging since the transfer of knowledge among multiple independently maintained KGs is often hindered by the insufficiency of alignment information and inconsistency of described facts. In this paper, we propose kens, a novel framework for embedding learning and ensemble knowledge transfer across a number of language-specific KGs.KEnS embeds all KGs in a shared embedding space, where the association of entities is captured based on self-learning. Then, KEnS performs ensemble inference to com-bine prediction results from multiple language-specific embeddings, for which multiple en-semble techniques are investigated. Experiments on the basis of five real-world language-specific KGs show that, by effectively identifying and leveraging complementary knowledge, KEnS consistently improves state-of-the-art methods on KG completion.

pdf bib
Joint Constrained Learning for Event-Event Relation Extraction
Haoyu Wang | Muhao Chen | Hongming Zhang | Dan Roth
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Understanding natural language involves recognizing how multiple event mentions structurally and temporally interact with each other. In this process, one can induce event complexes that organize multi-granular events with temporal order and membership relations interweaving among them. Due to the lack of jointly labeled data for these relational phenomena and the restriction on the structures they articulate, we propose a joint constrained learning framework for modeling event-event relations. Specifically, the framework enforces logical constraints within and across multiple temporal and subevent relations of events by converting these constraints into differentiable learning objectives. We show that our joint constrained learning approach effectively compensates for the lack of jointly labeled data, and outperforms SOTA methods on benchmarks for both temporal relation extraction and event hierarchy construction, replacing a commonly used but more expensive global inference process. We also present a promising case study to show the effectiveness of our approach to inducing event complexes on an external corpus.

pdf bib
Analogous Process Structure Induction for Sub-event Sequence Prediction
Hongming Zhang | Muhao Chen | Haoyu Wang | Yangqiu Song | Dan Roth
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Computational and cognitive studies of event understanding suggest that identifying, comprehending, and predicting events depend on having structured representations of a sequence of events and on conceptualizing (abstracting) its components into (soft) event categories. Thus, knowledge about a known process such as “buying a car” can be used in the context of a new but analogous process such as “buying a house”. Nevertheless, most event understanding work in NLP is still at the ground level and does not consider abstraction. In this paper, we propose an Analogous Process Structure Induction (APSI) framework, which leverages analogies among processes and conceptualization of sub-event instances to predict the whole sub-event sequence of previously unseen open-domain processes. As our experiments and analysis indicate, APSI supports the generation of meaningful sub-event sequences for unseen processes and can help predict missing events.

pdf bib
Knowledge Association with Hyperbolic Knowledge Graph Embeddings
Zequn Sun | Muhao Chen | Wei Hu | Chengming Wang | Jian Dai | Wei Zhang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Capturing associations for knowledge graphs (KGs) through entity alignment, entity type inference and other related tasks benefits NLP applications with comprehensive knowledge representations. Recent related methods built on Euclidean embeddings are challenged by the hierarchical structures and different scales of KGs. They also depend on high embedding dimensions to realize enough expressiveness. Differently, we explore with low-dimensional hyperbolic embeddings for knowledge association. We propose a hyperbolic relational graph neural network for KG embedding and capture knowledge associations with a hyperbolic transformation. Extensive experiments on entity alignment and type inference demonstrate the effectiveness and efficiency of our method.

2019

pdf bib
Retrofitting Contextualized Word Embeddings with Paraphrases
Weijia Shi | Muhao Chen | Pei Zhou | Kai-Wei Chang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Contextualized word embeddings, such as ELMo, provide meaningful representations for words and their contexts. They have been shown to have a great impact on downstream applications. However, we observe that the contextualized embeddings of a word might change drastically when its contexts are paraphrased. As these embeddings are over-sensitive to the context, the downstream model may make different predictions when the input sentence is paraphrased. To address this issue, we propose a post-processing approach to retrofit the embedding with paraphrases. Our method learns an orthogonal transformation on the input space of the contextualized word embedding model, which seeks to minimize the variance of word representations on paraphrased contexts. Experiments show that the proposed method significantly improves ELMo on various sentence classification and inference tasks.

pdf bib
Examining Gender Bias in Languages with Grammatical Gender
Pei Zhou | Weijia Shi | Jieyu Zhao | Kuan-Hao Huang | Muhao Chen | Ryan Cotterell | Kai-Wei Chang
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recent studies have shown that word embeddings exhibit gender bias inherited from the training corpora. However, most studies to date have focused on quantifying and mitigating such bias only in English. These analyses cannot be directly extended to languages that exhibit morphological agreement on gender, such as Spanish and French. In this paper, we propose new metrics for evaluating gender bias in word embeddings of these languages and further demonstrate evidence of gender bias in bilingual embeddings which align these languages with English. Finally, we extend an existing approach to mitigate gender bias in word embedding of these languages under both monolingual and bilingual settings. Experiments on modified Word Embedding Association Test, word similarity, word translation, and word pair translation tasks show that the proposed approaches can effectively reduce the gender bias while preserving the utility of the original embeddings.

pdf bib
Learning to Represent Bilingual Dictionaries
Muhao Chen | Yingtao Tian | Haochen Chen | Kai-Wei Chang | Steven Skiena | Carlo Zaniolo
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Bilingual word embeddings have been widely used to capture the correspondence of lexical semantics in different human languages. However, the cross-lingual correspondence between sentences and words is less studied, despite that this correspondence can significantly benefit many applications such as crosslingual semantic search and textual inference. To bridge this gap, we propose a neural embedding model that leverages bilingual dictionaries. The proposed model is trained to map the lexical definitions to the cross-lingual target words, for which we explore with different sentence encoding techniques. To enhance the learning process on limited resources, our model adopts several critical learning strategies, including multi-task learning on different bridges of languages, and joint learning of the dictionary model with a bilingual word embedding model. We conduct experiments on two new tasks. In the cross-lingual reverse dictionary retrieval task, we demonstrate that our model is capable of comprehending bilingual concepts based on descriptions, and the proposed learning strategies are effective. In the bilingual paraphrase identification task, we show that our model effectively associates sentences in different languages via a shared embedding space, and outperforms existing approaches in identifying bilingual paraphrases.

pdf bib
Learning Bilingual Word Embeddings Using Lexical Definitions
Weijia Shi | Muhao Chen | Yingtao Tian | Kai-Wei Chang
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Bilingual word embeddings, which represent lexicons of different languages in a shared embedding space, are essential for supporting semantic and knowledge transfers in a variety of cross-lingual NLP tasks. Existing approaches to training bilingual word embeddings require either large collections of pre-defined seed lexicons that are expensive to obtain, or parallel sentences that comprise coarse and noisy alignment. In contrast, we propose BiLex that leverages publicly available lexical definitions for bilingual word embedding learning. Without the need of predefined seed lexicons, BiLex comprises a novel word pairing strategy to automatically identify and propagate the precise fine-grain word alignment from lexical definitions. We evaluate BiLex in word-level and sentence-level translation tasks, which seek to find the cross-lingual counterparts of words and sentences respectively. BiLex significantly outperforms previous embedding methods on both tasks.