Hui Liu


2020

Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
Bei Li | Hui Liu | Ziyang Wang | Yufan Jiang | Tong Xiao | Jingbo Zhu | Tongran Liu | Changliang Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

In encoder-decoder neural models, multiple encoders are in general used to represent the contextual information in addition to the individual sentence. In this paper, we investigate multi-encoder approaches in document-level neural machine translation (NMT). Surprisingly, we find that the context encoder does not only encode the surrounding sentences but also behaves as a noise generator. This makes us rethink the real benefits of multi-encoder in context-aware translation - some of the improvements come from robust training. We compare several methods that introduce noise and/or well-tuned dropout setup into the training of these encoders. Experimental results show that noisy training plays an important role in multi-encoder-based NMT, especially when the training data is small. Also, we establish a new state-of-the-art on IWSLT Fr-En task by careful use of noise generation and dropout methods.

Jointly Learning to Align and Summarize for Neural Cross-Lingual Summarization
Yue Cao | Hui Liu | Xiaojun Wan
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Cross-lingual summarization is the task of generating a summary in one language given a text in a different language. Previous works on cross-lingual summarization mainly focus on using pipeline methods or training an end-to-end model using the translated parallel data. However, it is a big challenge for the model to directly learn cross-lingual summarization as it requires learning to understand different languages and learning how to summarize at the same time. In this paper, we propose to ease the cross-lingual summarization training by jointly learning to align and summarize. We design relevant loss functions to train this framework and propose several methods to enhance the isomorphism and cross-lingual transfer between languages. Experimental results show that our model can outperform competitive models in most cases. In addition, we show that our model even has the ability to generate cross-lingual summaries without access to any cross-lingual corpus.

Does Gender Matter? Towards Fairness in Dialogue Systems
Haochen Liu | Jamell Dacon | Wenqi Fan | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 28th International Conference on Computational Linguistics

Recently there are increasing concerns about the fairness of Artificial Intelligence (AI) in real-world applications such as computer vision and recommendations. For example, recognition algorithms in computer vision are unfair to black people such as poorly detecting their faces and inappropriately identifying them as “gorillas”. As one crucial application of AI, dialogue systems have been extensively applied in our society. They are usually built with real human conversational data; thus they could inherit some fairness issues which are held in the real world. However, the fairness of dialogue systems has not been well investigated. In this paper, we perform a pioneering study about the fairness issues in dialogue systems. In particular, we construct a benchmark dataset and propose quantitative measures to understand fairness in dialogue models. Our studies demonstrate that popular dialogue models show significant prejudice towards different genders and races. Besides, to mitigate the bias in dialogue systems, we propose two simple but effective debiasing methods. Experiments show that our methods can reduce the bias in dialogue systems significantly. The dataset and the implementation are released to foster fairness research in dialogue systems.

Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning
Haochen Liu | Wentao Wang | Yiqi Wang | Hui Liu | Zitao Liu | Jiliang Tang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Dialogue systems play an increasingly important role in various aspects of our daily life. It is evident from recent research that dialogue systems trained on human conversation data are biased. In particular, they can produce responses that reflect people’s gender prejudice. Many debiasing methods have been developed for various NLP tasks, such as word embedding. However, they are not directly applicable to dialogue systems because they are likely to force dialogue models to generate similar responses for different genders. This greatly degrades the diversity of the generated responses and immensely hurts the performance of the dialogue models. In this paper, we propose a novel adversarial learning framework Debiased-Chat to train dialogue models free from gender bias while keeping their performance. Extensive experiments on two real-world conversation datasets show that our framework significantly reduces gender bias in dialogue models while maintaining the response quality.

Shallow-to-Deep Training for Neural Machine Translation
Bei Li | Ziyang Wang | Hui Liu | Yufan Jiang | Quan Du | Tong Xiao | Huizhen Wang | Jingbo Zhu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Deep encoders have been proven to be effective in improving neural machine translation (NMT) systems, but training an extremely deep encoder is time-consuming. Moreover, why deep models help NMT is an open question. In this paper, we investigate the behavior of a well-tuned deep Transformer system. We find that stacking layers is helpful in improving the representation ability of NMT models and adjacent layers perform similarly. This inspires us to develop a shallow-to-deep training method that learns deep models by stacking shallow models. In this way, we successfully train a Transformer system with a 54-layer encoder. Experimental results on WMT’16 English-German and WMT’14 English-French translation tasks show that it is 1.4× faster than training from scratch, and achieves BLEU scores of 30.33 and 43.29 on the two tasks. The code is publicly available at https://github.com/libeineu/SDT-Training.

2019

Towards Explainable NLP: A Generative Explanation Framework for Text Classification
Hui Liu | Qingyu Yin | William Yang Wang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Building explainable systems is a critical problem in the field of Natural Language Processing (NLP), since most machine learning models provide no explanations for the predictions. Existing approaches for explainable machine learning systems tend to focus on interpreting the outputs or the connections between inputs and outputs. However, the fine-grained information (e.g. textual explanations for the labels) is often ignored, and the systems do not explicitly generate the human-readable explanations. To solve this problem, we propose a novel generative explanation framework that learns to make classification decisions and generate fine-grained explanations at the same time. More specifically, we introduce the explainable factor and the minimum risk training approach that learn to generate more reasonable explanations. We construct two new datasets that contain summaries, rating scores, and fine-grained reasons. We conduct experiments on both datasets, comparing with several strong neural network baseline systems. Experimental results show that our method surpasses all baselines on both datasets, and is able to generate concise explanations at the same time.

The NiuTrans Machine Translation Systems for WMT19
Bei Li | Yinqiao Li | Chen Xu | Ye Lin | Jiqiang Liu | Hui Liu | Ziyang Wang | Yuhao Zhang | Nuo Xu | Zeyang Wang | Kai Feng | Hexuan Chen | Tengbo Liu | Yanyang Li | Qiang Wang | Tong Xiao | Jingbo Zhu
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper describes the NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.

INS: An Interactive Chinese News Synthesis System
Hui Liu | Wentao Qin | Xiaojun Wan
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations)

Nowadays, we are surrounded by more and more online news articles. Tens or hundreds of news articles need to be read if we wish to explore a hot news event or topic. So it is of vital importance to automatically synthesize a batch of news articles related to the event or topic into a new synthesis article (or overview article) for reader’s convenience. It is so challenging to make news synthesis fully automatic that there is no successful solution by now. In this paper, we put forward a novel Interactive News Synthesis system (i.e. INS), which can help generate news overview articles automatically or by interacting with users. More importantly, INS can serve as a tool for editors to help them finish their jobs. In our experiments, INS performs well on both topic representation and synthesis article generation. A user study also demonstrates the usefulness and users’ satisfaction with the INS tool. A demo video is available at https://youtu.be/7ItteKW3GEk.

2007

Semantic Labeling of Compound Nominalization in Chinese
Jinglei Zhao | Hui Liu | Ruzhan Lu
Proceedings of the Workshop on A Broader Perspective on Multiword Expressions

2006

A Weakly Supervised Learning Approach for Spoken Language Understanding
Wei-Lin Wu | Ru-Zhan Lu | Jian-Yong Duan | Hui Liu | Feng Gao | Yu-Quan Chen
Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing

2004

An Enhanced Model for Chinese Word Segmentation and Part-of-Speech Tagging
Feng Jiang | Hui Liu | Yuquan Chen | Ruzhan Lu
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing