Mosha Chen


pdf bib
Predicting Clinical Trial Results by Implicit Evidence Integration
Qiao Jin | Chuanqi Tan | Mosha Chen | Xiaozhong Liu | Songfang Huang
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Clinical trials provide essential guidance for practicing Evidence-Based Medicine, though often accompanying with unendurable costs and risks. To optimize the design of clinical trials, we introduce a novel Clinical Trial Result Prediction (CTRP) task. In the CTRP framework, a model takes a PICO-formatted clinical trial proposal with its background as input and predicts the result, i.e. how the Intervention group compares with the Comparison group in terms of the measured Outcome in the studied Population. While structured clinical evidence is prohibitively expensive for manual collection, we exploit large-scale unstructured sentences from medical literature that implicitly contain PICOs and results as evidence. Specifically, we pre-train a model to predict the disentangled results from such implicit evidence and fine-tune the model with limited data on the downstream datasets. Experiments on the benchmark Evidence Integration dataset show that the proposed model outperforms the baselines by large margins, e.g., with a 10.7% relative gain over BioBERT in macro-F1. Moreover, the performance improvement is also validated on another dataset composed of clinical trials related to COVID-19.

pdf bib
OpenUE: An Open Toolkit of Universal Extraction from Text
Ningyu Zhang | Shumin Deng | Zhen Bi | Haiyang Yu | Jiacheng Yang | Mosha Chen | Fei Huang | Wei Zhang | Huajun Chen
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Natural language processing covers a wide variety of tasks with token-level or sentence-level understandings. In this paper, we provide a simple insight that most tasks can be represented in a single universal extraction format. We introduce a prototype model and provide an open-source and extensible toolkit called OpenUE for various extraction tasks. OpenUE allows developers to train custom models to extract information from the text and supports quick model validation for researchers. Besides, OpenUE provides various functional modules to maintain sufficient modularity and extensibility. Except for the toolkit, we also deploy an online demo with restful APIs to support real-time extraction without training and deploying. Additionally, the online system can extract information in various tasks, including relational triple extraction, slot & intent detection, event extraction, and so on. We release the source code, datasets, and pre-trained models to promote future researches in


pdf bib
NLP_HZ at SemEval-2018 Task 9: a Nearest Neighbor Approach
Wei Qiu | Mosha Chen | Linlin Li | Luo Si
Proceedings of The 12th International Workshop on Semantic Evaluation

Hypernym discovery aims to discover the hypernym word sets given a hyponym word and proper corpus. This paper proposes a simple but effective method for the discovery of hypernym sets based on word embedding, which can be used to measure the contextual similarities between words. Given a test hyponym word, we get its hypernym lists by computing the similarities between the hyponym word and words in the training data, and fill the test word’s hypernym lists with the hypernym list in the training set of the nearest similarity distance to the test word. In SemEval 2018 task9, our results, achieve 1st on Spanish, 2nd on Italian, 6th on English in the metric of MAP.