Bo Wang


2020

Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder
John Pougué Biyong | Bo Wang | Terry Lyons | Alejo Nevado-Holgado
Proceedings of the 3rd Clinical Natural Language Processing Workshop

Relying on large pretrained language models such as Bidirectional Encoder Representations from Transformers (BERT) for encoding and adding a simple prediction layer has led to impressive performance in many clinical natural language processing (NLP) tasks. In this work, we present a novel extension to the Transformer architecture that incorporates the signature transform into the self-attention model. This architecture is added between the embedding and prediction layers. Experiments on a new Swedish prescription dataset show the proposed architecture to be superior to baseline models in two of the three information extraction tasks. Finally, we compare two embedding approaches: applying Multilingual BERT directly, and translating the Swedish text to English and then encoding it with a BERT model pretrained on clinical notes.

2019

DeepGeneMD: A Joint Deep Learning Model for Extracting Gene Mutation-Disease Knowledge from PubMed Literature
Feifan Liu | Xiaoyu Zheng | Bo Wang | Catarina Kiefe
Proceedings of The 5th Workshop on BioNLP Open Shared Tasks

Understanding the pathogenesis of genetic diseases through different gene activities and their relations to relevant diseases is important for new drug discovery and drug repositioning. In this paper, we present DeepGeneMD, a joint deep learning model in a multi-task learning paradigm for gene mutation-disease knowledge extraction. It adapts the state-of-the-art hierarchical multi-task learning framework for joint inference on named entity recognition (NER) and relation extraction (RE) in the context of the AGAC (Active Gene Annotation Corpus) track at the 2019 BioNLP Open Shared Tasks (BioNLP-OST). It simultaneously extracts gene-mutation-related activities, diseases, and their relations from the published scientific literature. In DeepGeneMD, we explore task decomposition to create auxiliary subtasks so that more interactions between different learning subtasks can be leveraged in model training. Our model achieves an average F1 score of 0.45 on recognizing gene activities and disease entities, ranking 2nd in the AGAC NER task, and an average F1 score of 0.35 on extracting relations, ranking 1st in the AGAC RE task.

2018

OpenNMT System Description for WNMT 2018: 800 words/sec on a single-core CPU
Jean Senellart | Dakun Zhang | Bo Wang | Guillaume Klein | Jean-Pierre Ramatchandirin | Josep Crego | Alexander Rush
Proceedings of the 2nd Workshop on Neural Machine Translation and Generation

We present a system description of the OpenNMT Neural Machine Translation entry for the WNMT 2018 evaluation. In this work, we developed a heavily optimized NMT inference model targeting a high-performance CPU system. The final system uses a combination of four techniques, all of which lead to significant speed-ups in combination: (a) sequence distillation, (b) architecture modifications, (c) precomputation, particularly of vocabulary, and (d) CPU-targeted quantization. This work achieves the fastest performance of the shared task, and it led to the development of new features that have been integrated into OpenNMT and made available to the community.

2017

TDParse: Multi-target-specific sentiment recognition on Twitter
Bo Wang | Maria Liakata | Arkaitz Zubiaga | Rob Procter
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Existing target-specific sentiment recognition methods consider only a single target per tweet, and have been shown to miss nearly half of the actual targets mentioned. We present a corpus of UK election tweets, with an average of 3.09 entities per tweet and more than one type of sentiment in half of the tweets. This requires a method for multi-target-specific sentiment recognition, which we develop by using the context around a target as well as syntactic dependencies involving the target. We present results of our method on both a benchmark corpus of single targets and the multi-target election corpus, showing state-of-the-art performance on both corpora and outperforming previous approaches to the multi-target sentiment task as well as deep learning models for single-target sentiment.

TOTEMSS: Topic-based, Temporal Sentiment Summarisation for Twitter
Bo Wang | Maria Liakata | Adam Tsakalidis | Spiros Georgakopoulos Kolaitis | Symeon Papadopoulos | Lazaros Apostolidis | Arkaitz Zubiaga | Rob Procter | Yiannis Kompatsiaris
Proceedings of the IJCNLP 2017, System Demonstrations

We present a system for time-sensitive, topic-based summarisation of the sentiment around target entities and topics in collections of tweets. We describe the main elements of the system and illustrate its functionality with two examples of sentiment analysis of topics related to the 2017 UK general election.

SYSTRAN Purely Neural MT Engines for WMT2017
Yongchao Deng | Jungi Kim | Guillaume Klein | Catherine Kobus | Natalia Segal | Christophe Servan | Bo Wang | Dakun Zhang | Josep Crego | Jean Senellart
Proceedings of the Second Conference on Machine Translation

2015

WarwickDCS: From Phrase-Based to Target-Specific Sentiment Recognition
Richard Townsend | Adam Tsakalidis | Yiwei Zhou | Bo Wang | Maria Liakata | Arkaitz Zubiaga | Alexandra Cristea | Rob Procter
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2010

All in Strings: a Powerful String-based Automatic MT Evaluation Metric with Multiple Granularities
Junguo Zhu | Muyun Yang | Bo Wang | Sheng Li | Tiejun Zhao
Coling 2010: Posters

2009

References Extension for the Automatic Evaluation of MT by Syntactic Hybridization
Bo Wang | Tiejun Zhao | Muyun Yang | Sheng Li
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

A Statistical Machine Translation Model Based on a Synthetic Synchronous Grammar
Hongfei Jiang | Muyun Yang | Tiejun Zhao | Sheng Li | Bo Wang
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Bootstrapping Both Product Features and Opinion Words from Chinese Customer Reviews with Cross-Inducing
Bo Wang | Houfeng Wang
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

Diagnostic Evaluation of Machine Translation Systems Using Automatically Constructed Linguistic Check-Points
Ming Zhou | Bo Wang | Shujie Liu | Mu Li | Dongdong Zhang | Tiejun Zhao
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)