Chao Sun


2020

基于神经网络的连动句识别(Recognition of serial-verb sentences based on Neural Network)
Chao Sun (孙超) | Weiguang Qu (曲维光) | Tingxin Wei (魏庭新) | Yanhui Gu (顾彦慧) | Bin Li (李斌) | Junsheng Zhou (周俊生)
Proceedings of the 19th Chinese National Conference on Computational Linguistics

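Serial-verb sentences are sentences that contain a serial-verb construction, a distinctive syntactic structure of Chinese that is very common and frequently used in Modern Chinese. Both their grammatical structure and their semantic relations are complex, which raises many problems for recognition. This paper studies the recognition of serial-verb sentences and proposes a recognition method based on neural networks. The method has two steps: first, simple rules are used to preprocess the corpus; second, the task is treated as text classification, with sentences encoded by BERT and a multi-layer CNN and a BiLSTM jointly extracting features for classification, which completes the recognition task. In experiments on a manually annotated corpus, the method reaches an accuracy of 92.71% and an F1 score of 87.41%.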
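The abstract reports both accuracy and F1 for what is framed as a binary classification task (serial-verb sentence or not). As a minimal sketch of how these two figures are typically computed, and why they can differ, the following uses a hypothetical helper `binary_metrics` and made-up labels; none of the names or data come from the paper.

```python
def binary_metrics(gold, pred):
    """Return (accuracy, precision, recall, f1) for binary labels.

    Label 1 is assumed to mean "serial-verb sentence", 0 anything else.
    This is an illustrative helper, not the paper's evaluation script.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("at least one example is required")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)

    accuracy = (tp + tn) / len(gold)
    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical toy labels: 3 positive sentences out of 10.
gold = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
accuracy, precision, recall, f1 = binary_metrics(gold, pred)
print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} f1={f1:.2%}")
```

Accuracy also counts correctly rejected negatives, so when positives are the minority class it can sit well above F1, which only considers the positive class.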

2017

Facebook sentiment: Reactions and Emojis
Ye Tian | Thiago Galery | Giulio Dulcinati | Emilia Molimpakis | Chao Sun
Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media

Emojis are used frequently in social media. A widely assumed view is that emojis express the emotional state of the user, which has led to research focusing on the expressiveness of emojis independent of the linguistic context. We argue that emojis and the linguistic text can modify each other's meaning. The overall communicated meaning is not a simple sum of the two channels. In order to study this interplay of meaning, we need data indicating the overall sentiment of the entire message as well as the stand-alone sentiment of the emojis. We propose that Facebook Reactions are a good data source for such a purpose. FB reactions (e.g. “Love” and “Angry”) indicate the readers’ overall sentiment, against which we can investigate the types of emojis used in the comments under different reaction profiles. We present a data set of 21,000 FB posts (57 million reactions and 8 million comments) from public media pages across four countries.