Nuo Xu


2020

pdf bib
Distinguish Confusing Law Articles for Legal Judgment Prediction
Nuo Xu | Pinghui Wang | Long Chen | Li Pan | Xiaoyan Wang | Junzhou Zhao
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Legal Judgement Prediction (LJP) is the task of automatically predicting a law case’s judgment results given a text describing the case’s facts, which has great prospects in judicial assistance systems and handy services for the public. In practice, confusing charges are often presented, because law cases applicable to similar law articles are easily misjudged. To address this issue, existing work relies heavily on domain experts, which hinders its application in different law systems. In this paper, we present an end-to-end model, LADAN, to solve the task of LJP. To distinguish confusing charges, we propose a novel graph neural network, GDL, to automatically learn subtle differences between confusing law articles, and also design a novel attention mechanism that fully exploits the learned differences to attentively extract effective discriminative features from fact descriptions. Experiments conducted on real-world datasets demonstrate the superiority of our LADAN.

pdf bib
Learning Architectures from an Extended Search Space for Language Modeling
Yinqiao Li | Chi Hu | Yuhao Zhang | Nuo Xu | Yufan Jiang | Tong Xiao | Jingbo Zhu | Tongran Liu | Changliang Li
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

Neural architecture search (NAS) has advanced significantly in recent years but most NAS systems restrict search to learning architectures of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learn both intra-cell and inter-cell architectures (call it ESS). For a better search result, we design a joint learning method to perform intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it outperforms a strong baseline significantly on the PTB and WikiText data, with a new state-of-the-art on PTB. Moreover, the learned architectures show good transferability to other systems. E.g., they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.

pdf bib
A Novel Joint Framework for Multiple Chinese Events Extraction
Nuo Xu | Haihua Xie | Dongyan Zhao
Proceedings of the 19th Chinese National Conference on Computational Linguistics

Event extraction is an essential yet challenging task in information extraction. Previous approaches have paid little attention to the problem of roles overlap which is a common phenomenon in practice. To solve this problem, this paper defines event relation triple to explicitly represent relations among triggers, arguments and roles which are incorporated into the model to learn their inter-dependencies. The task of argument extraction is converted to event relation triple extraction. A novel joint framework for multiple Chinese event extraction is proposed which jointly performs predictions for event triggers and arguments based on shared feature representations from pre-trained language model. Experimental comparison with state-of-the-art baselines on ACE 2005 dataset shows the superiority of the proposed method in both trigger classification and argument classification.

2019

pdf bib
The NiuTrans Machine Translation Systems for WMT19
Bei Li | Yinqiao Li | Chen Xu | Ye Lin | Jiqiang Liu | Hui Liu | Ziyang Wang | Yuhao Zhang | Nuo Xu | Zeyang Wang | Kai Feng | Hexuan Chen | Tengbo Liu | Yanyang Li | Qiang Wang | Tong Xiao | Jingbo Zhu
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

This paper described NiuTrans neural machine translation systems for the WMT 2019 news translation tasks. We participated in 13 translation directions, including 11 supervised tasks, namely EN↔{ZH, DE, RU, KK, LT}, GU→EN and the unsupervised DE↔CS sub-track. Our systems were built on Deep Transformer and several back-translation methods. Iterative knowledge distillation and ensemble+reranking were also employed to obtain stronger models. Our unsupervised submissions were based on NMT enhanced by SMT. As a result, we achieved the highest BLEU scores in {KK↔EN, GU→EN} directions, ranking 2nd in {RU→EN, DE↔CS} and 3rd in {ZH→EN, LT→EN, EN→RU, EN↔DE} among all constrained submissions.