Jie Hao


pdf bib
XLP at SemEval-2020 Task 9: Cross-lingual Models with Focal Loss for Sentiment Analysis of Code-Mixing Language
Yili Ma | Liang Zhao | Jie Hao
Proceedings of the Fourteenth Workshop on Semantic Evaluation

In this paper, we present an approach for sentiment analysis in code-mixed language on twitter defined in SemEval-2020 Task 9. Our team (referred as LiangZhao) employ different multilingual models with weighted loss focused on complexity of code-mixing in sentence, in which the best model achieved f1-score of 0.806 and ranked 1st of subtask- Sentimix Spanglish. The performance of method is analyzed and each component of our architecture is demonstrated.

pdf bib
OPPO’s Machine Translation System for the IWSLT 2020 Open Domain Translation Task
Qian Zhang | Xiaopu Li | Dawei Dang | Tingxun Shi | Di Ai | Zhengshan Xue | Jie Hao
Proceedings of the 17th International Conference on Spoken Language Translation

In this paper, we demonstrate our machine translation system applied for the Chinese-Japanese bidirectional translation task (aka. open domain translation task) for the IWSLT 2020. Our model is based on Transformer (Vaswani et al., 2017), with the help of many popular, widely proved effective data preprocessing and augmentation methods. Experiments show that these methods can improve the baseline model steadily and significantly.


pdf bib
Multi-Granularity Self-Attention for Neural Machine Translation
Jie Hao | Xing Wang | Shuming Shi | Jinfeng Zhang | Zhaopeng Tu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Current state-of-the-art neural machine translation (NMT) uses a deep multi-head self-attention network with no explicit phrase information. However, prior work on statistical machine translation has shown that extending the basic translation unit from words to phrases has produced substantial improvements, suggesting the possibility of improving NMT performance from explicit modeling of phrases. In this work, we present multi-granularity self-attention (Mg-Sa): a neural network that combines multi-head self-attention and phrase modeling. Specifically, we train several attention heads to attend to phrases in either n-gram or syntactic formalisms. Moreover, we exploit interactions among phrases to enhance the strength of structure modeling – a commonly-cited weakness of self-attention. Experimental results on WMT14 English-to-German and NIST Chinese-to-English translation tasks show the proposed approach consistently improves performance. Targeted linguistic analysis reveal that Mg-Sa indeed captures useful phrase information at various levels of granularities.

pdf bib
Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons
Jie Hao | Xing Wang | Shuming Shi | Jinfeng Zhang | Zhaopeng Tu
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Recent studies have shown that a hybrid of self-attention networks (SANs) and recurrent neural networks RNNs outperforms both individual architectures, while not much is known about why the hybrid models work. With the belief that modeling hierarchical structure is an essential complementary between SANs and RNNs, we propose to further enhance the strength of hybrid models with an advanced variant of RNNs – Ordered Neurons LSTM (ON-LSTM), which introduces a syntax-oriented inductive bias to perform tree-like composition. Experimental results on the benchmark machine translation task show that the proposed approach outperforms both individual architectures and a standard hybrid model. Further analyses on targeted linguistic evaluation and logical inference tasks demonstrate that the proposed approach indeed benefits from a better modeling of hierarchical structure.

pdf bib
Modeling Recurrence for Transformer
Jie Hao | Xing Wang | Baosong Yang | Longyue Wang | Jinfeng Zhang | Zhaopeng Tu
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Recently, the Transformer model that is based solely on attention mechanisms, has advanced the state-of-the-art on various machine translation tasks. However, recent studies reveal that the lack of recurrence modeling hinders its further improvement of translation capacity. In response to this problem, we propose to directly model recurrence for Transformer with an additional recurrence encoder. In addition to the standard recurrent neural network, we introduce a novel attentive recurrent network to leverage the strengths of both attention models and recurrent networks. Experimental results on the widely-used WMT14 English⇒German and WMT17 Chinese⇒English translation tasks demonstrate the effectiveness of the proposed approach. Our studies also reveal that the proposed model benefits from a short-cut that bridges the source and target sequences with a single recurrent layer, which outperforms its deep counterpart.


pdf bib
Well-Formed Dependency to String translation with BTG Grammar
Xiaoqing Li | Kun Wang | Dakun Zhang | Jie Hao
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation


pdf bib
Query Lattice for Translation Retrieval
Meiping Dong | Yong Cheng | Yang Liu | Jia Xu | Maosong Sun | Tatsuya Izuha | Jie Hao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers