Minghui Qiu


2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining
Chengyu Wang | Minghui Qiu | Jun Huang | Xiaofeng He
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Pre-trained neural language models bring significant improvements to various NLP tasks by fine-tuning the models on task-specific training sets. During fine-tuning, the parameters are initialized from pre-trained models directly, which ignores how the learning process of similar NLP tasks in different domains is correlated and mutually reinforced. In this paper, we propose an effective learning procedure named Meta Fine-Tuning (MFT), serving as a meta-learner to solve a group of similar NLP tasks for neural language models. Instead of simply multi-task training over all the datasets, MFT only learns from typical instances of various domains to acquire highly transferable knowledge. It further encourages the language model to encode domain-invariant representations by optimizing a series of novel domain corruption loss functions. After MFT, the model can be fine-tuned for each domain with better parameter initializations and higher generalization ability. We implement MFT upon BERT to solve several multi-domain text mining tasks. Experimental results confirm the effectiveness of MFT and its usefulness for few-shot learning.

2018

Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce
Minghui Qiu | Liu Yang | Feng Ji | Wei Zhou | Jun Huang | Haiqing Chen | Bruce Croft | Wei Lin
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Building multi-turn information-seeking conversation systems is an important and challenging research topic. Although several advanced neural text matching models have been proposed for this task, they are generally not efficient for industrial applications. Furthermore, they rely on a large amount of labeled data, which may not be available in real-world applications. To alleviate these problems, we study transfer learning for multi-turn information-seeking conversations in this paper. We first propose an efficient and effective multi-turn conversation model based on convolutional neural networks. After that, we extend our model to adapt the knowledge learned from a resource-rich domain to enhance the performance. Finally, we deployed our model in an industrial chatbot called AliMe Assist and observed a significant improvement over the existing online model.

2017

AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine
Minghui Qiu | Feng-Lin Li | Siyu Wang | Xing Gao | Yan Chen | Weipeng Zhao | Haiqing Chen | Jun Huang | Wei Chu
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

We propose AliMe Chat, an open-domain chatbot engine that integrates the joint results of Information Retrieval (IR) and Sequence to Sequence (Seq2Seq) based generation models. AliMe Chat uses an attentive Seq2Seq based rerank model to optimize the joint results. Extensive experiments show our engine outperforms both IR and generation based models. We launch AliMe Chat for a real-world industrial application and observe better results than another public chatbot.

Aspect Extraction from Product Reviews Using Category Hierarchy Information
Yinfei Yang | Cen Chen | Minghui Qiu | Forrest Bao
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Aspect extraction abstracts the common properties of objects from corpora discussing them, such as reviews of products. Recent work on aspect extraction leverages the hierarchical relationship between products and their categories. However, such efforts focus on the aspects of child categories but ignore those from parent categories. Hence, we propose an LDA-based generative topic model inducing the two-layer categorical information (CAT-LDA), to balance the aspects of both a parent category and its child categories. Our hypothesis is that child categories inherit aspects from parent categories, controlled by the hierarchy between them. Experimental results on 5 categories of Amazon.com products show that both the common aspects of a parent category and the individual aspects of its sub-categories can be extracted and align well with common sense. We further evaluate the manually extracted aspects of 16 products, resulting in an average hit rate of 79.10%.

2015

Semantic Analysis and Helpfulness Prediction of Text for Online Product Reviews
Yinfei Yang | Yaowei Yan | Minghui Qiu | Forrest Bao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Generating Supplementary Travel Guides from Social Media
Liu Yang | Jing Jiang | Lifu Huang | Minghui Qiu | Lizi Liao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Learning Topics and Positions from Debatepedia
Swapna Gottipati | Minghui Qiu | Yanchuan Sim | Jing Jiang | Noah A. Smith
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization
Minghui Qiu | Liu Yang | Jing Jiang
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

A Latent Variable Model for Viewpoint Discovery from Threaded Forum Posts
Minghui Qiu | Jing Jiang
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies