Lu Xiang


2020

pdf bib
Knowledge Graph Enhanced Neural Machine Translation via Multi-task Learning on Sub-entity Granularity
Yang Zhao | Lu Xiang | Junnan Zhu | Jiajun Zhang | Yu Zhou | Chengqing Zong
Proceedings of the 28th International Conference on Computational Linguistics

Previous studies combining knowledge graph (KG) with neural machine translation (NMT) have two problems: i) Knowledge under-utilization: they only focus on the entities that appear in both KG and training sentence pairs, making much knowledge in KG unable to be fully utilized. ii) Granularity mismatch: the current KG methods utilize the entity as the basic granularity, while NMT utilizes the sub-word as the granularity, making the KG different to be utilized in NMT. To alleviate above problems, we propose a multi-task learning method on sub-entity granularity. Specifically, we first split the entities in KG and sentence pairs into sub-entity granularity by using joint BPE. Then we utilize the multi-task learning to combine the machine translation task and knowledge reasoning task. The extensive experiments on various translation tasks have demonstrated that our method significantly outperforms the baseline models in both translation quality and handling the entities.

pdf bib
A Knowledge-driven Generative Model for Multi-implication Chinese Medical Procedure Entity Normalization
Jinghui Yan | Yining Wang | Lu Xiang | Yu Zhou | Chengqing Zong
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Medical entity normalization, which links medical mentions in the text to entities in knowledge bases, is an important research topic in medical natural language processing. In this paper, we focus on Chinese medical procedure entity normalization. However, nonstandard Chinese expressions and combined procedures present challenges in our problem. The existing strategies relying on the discriminative model are poorly to cope with normalizing combined procedure mentions. We propose a sequence generative framework to directly generate all the corresponding medical procedure entities. we adopt two strategies: category-based constraint decoding and category-based model refining to avoid unrealistic results. The method is capable of linking entities when a mention contains multiple procedure concepts and our comprehensive experiments demonstrate that the proposed model can achieve remarkable improvements over existing baselines, particularly significant in the case of multi-implication Chinese medical procedures.

2014

pdf bib
Word Segmenter for Chinese Micro-blogging Text Segmentation – Report for CIPS-SIGHAN’2014 Bakeoff
Lu Xiang | Xiaoqing Li | Yu Zhou
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing