Jian Sun


Learning Low-Resource End-To-End Goal-Oriented Dialog for Fast and Reliable System Deployment
Yinpei Dai | Hangyu Li | Chengguang Tang | Yongbin Li | Jian Sun | Xiaodan Zhu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Existing end-to-end dialog systems perform less effectively when data is scarce. To obtain an acceptable success in real-life online services with only a handful of training examples, both fast adaptability and reliable performance are highly desirable for dialog systems. In this paper, we propose the Meta-Dialog System (MDS), which combines the advantages of both meta-learning approaches and human-machine collaboration. We evaluate our methods on a new extended-bAbI dataset and a transformed MultiWOZ dataset for low-resource goal-oriented dialog learning. Experimental results show that MDS significantly outperforms non-meta-learning baselines and can achieve more than 90% per-turn accuracies with only 10 dialogs on the extended-bAbI dataset.

Dynamic Memory Induction Networks for Few-Shot Text Classification
Ruiying Geng | Binhua Li | Yongbin Li | Jian Sun | Xiaodan Zhu
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper proposes Dynamic Memory Induction Networks (DMIN) for few-shot text classification. The model develops a dynamic routing mechanism over static memory, enabling it to better adapt to unseen classes, a critical capability for few-shot classification. The model also expands the induction process with supervised learning weights and query information to enhance the generalization ability of meta-learning. The proposed model significantly advances the state of the art, with a 2~4% improvement on the miniRCV1 and ODIC datasets. Detailed analysis is further performed to show how the proposed network achieves the new performance.

Dual Attention Network for Cross-lingual Entity Alignment
Jian Sun | Yu Zhou | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Cross-lingual entity alignment is an essential part of building a knowledge graph, which can help integrate knowledge among knowledge graphs in different languages. In real KGs, there exists an imbalance among the information in the same hierarchy of corresponding entities, which results in the heterogeneity of neighborhood structure, making this task challenging. To tackle this problem, we propose a dual attention network for cross-lingual entity alignment (DAEA). Specifically, our dual attention consists of relation-aware graph attention and hierarchical attention. The relation-aware graph attention aims at selectively aggregating multi-hierarchy neighborhood information to alleviate the difference of heterogeneity among counterpart entities. The hierarchical attention adaptively aggregates the low-hierarchy and the high-hierarchy information, which helps balance the neighborhood information of counterpart entities and distinguish non-counterpart entities with similar structures. Finally, we treat cross-lingual entity alignment as a process of link prediction. Experimental results on three real-world cross-lingual entity alignment datasets have shown the effectiveness of DAEA.


Induction Networks for Few-Shot Text Classification
Ruiying Geng | Binhua Li | Yongbin Li | Xiaodan Zhu | Ping Jian | Jian Sun
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Text classification tends to struggle when data is deficient or when it needs to adapt to unseen classes. In such challenging scenarios, recent studies have used meta-learning to simulate the few-shot task, in which new queries are compared to a small support set at the sample-wise level. However, this sample-wise comparison may be severely disturbed by the various expressions in the same class. Therefore, we should be able to learn a general representation of each class in the support set and then compare it to new queries. In this paper, we propose a novel Induction Network to learn such a generalized class-wise representation, by innovatively leveraging the dynamic routing algorithm in meta-learning. In this way, we find the model is able to induce and generalize better. We evaluate the proposed model on a well-studied sentiment classification dataset (English) and a real-world dialogue intent classification dataset (Chinese). Experimental results show that on both datasets, the proposed model significantly outperforms the existing state-of-the-art approaches, proving the effectiveness of class-wise generalization in few-shot text classification.

Improving Cross-Domain Chinese Word Segmentation with Word Embeddings
Yuxiao Ye | Weikang Li | Yue Zhang | Likun Qiu | Jian Sun
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Cross-domain Chinese Word Segmentation (CWS) remains a challenge despite recent progress in neural-based CWS. The limited amount of annotated data in the target domain has been the key obstacle to satisfactory performance. In this paper, we propose a semi-supervised word-based approach to improving cross-domain CWS given a baseline segmenter. In particular, our model only deploys word embeddings trained on raw text in the target domain, discarding complex hand-crafted features and domain-specific dictionaries. Innovative subsampling and negative sampling methods are proposed to derive word embeddings optimized for CWS. We conduct experiments on five datasets in special domains, covering novels, medicine, and patents. Results show that our model clearly improves cross-domain CWS, especially in the segmentation of domain-specific noun entities. The word F-measure increases by over 3.0% on four datasets, outperforming state-of-the-art semi-supervised and unsupervised cross-domain CWS approaches by a large margin. We make our data and code available on GitHub.


Learning Visually-Grounded Semantics from Contrastive Adversarial Samples
Haoyue Shi | Jiayuan Mao | Tete Xiao | Yuning Jiang | Jian Sun
Proceedings of the 27th International Conference on Computational Linguistics

We study the problem of grounding distributional representations of texts on the visual domain, namely visual-semantic embeddings (VSE for short). Beginning with an insightful adversarial attack on VSE embeddings, we show the limitation of current frameworks and image-text datasets (e.g., MS-COCO) both quantitatively and qualitatively. The large gap between the number of possible constitutions of real-world semantics and the size of parallel data, to a large extent, hinders the model from establishing a strong link between textual semantics and visual concepts. We alleviate this problem by augmenting the MS-COCO image captioning datasets with textual contrastive adversarial samples. These samples are synthesized using human language priors and the WordNet knowledge base, and force the model to ground learned embeddings to concrete concepts within the image. This simple but powerful technique brings a noticeable improvement over the baselines on a diverse set of downstream tasks, in addition to defending against known-type adversarial attacks. Code is available at https://github.com/ExplorerFreda/VSE-C.


A Class-based Language Model Approach to Chinese Named Entity Identification
Jian Sun | Ming Zhou | Jianfeng Gao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 8, Number 2, August 2003


Chinese Named Entity Identification Using Class-based Language Model
Jian Sun | Jianfeng Gao | Lei Zhang | Ming Zhou | Changning Huang
COLING 2002: The 19th International Conference on Computational Linguistics