Aschern at SemEval-2020 Task 11: It Takes Three to Tango: RoBERTa, CRF, and Transfer Learning
Proceedings of the Fourteenth Workshop on Semantic Evaluation
We describe our system for SemEval-2020 Task 11 on Detection of Propaganda Techniques in News Articles. We developed ensemble models using RoBERTa-based neural architectures, additional CRF layers, transfer learning between the two subtasks, and advanced post-processing to handle the multi-label nature of the task, the consistency between nested spans, repetitions, and labels from similar spans in training. We achieved sizable improvements over baseline fine-tuned RoBERTa models, and the official evaluation ranked our system 3rd (almost tied with the 2nd) out of 36 teams on the span identification subtask with an F1 score of 0.491, and 2nd (almost tied with the 1st) out of 31 teams on the technique classification subtask with an F1 score of 0.62.
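As a rough illustration of how an F1 score over predicted spans can be computed, the sketch below measures character-level overlap between predicted and gold spans. It is a simplified assumption for illustration only, not the official task scorer: the function names `span_chars` and `char_f1` are hypothetical, spans are assumed to be half-open `(start, end)` character offsets within a single article, and technique labels are ignored.

```python
def span_chars(spans):
    """Expand half-open (start, end) character spans into a set of offsets."""
    covered = set()
    for start, end in spans:
        covered.update(range(start, end))
    return covered


def char_f1(predicted, gold):
    """Character-overlap F1 between two lists of spans (simplified sketch)."""
    pred_chars = span_chars(predicted)
    gold_chars = span_chars(gold)
    overlap = len(pred_chars & gold_chars)

    # Guard against empty predictions or empty gold annotations.
    precision = overlap / len(pred_chars) if pred_chars else 0.0
    recall = overlap / len(gold_chars) if gold_chars else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Predicted span covers offsets 0-9, gold span covers offsets 5-14.
    print(char_f1([(0, 10)], [(5, 15)]))
```

Under this simplification, overlapping or nested predicted spans are merged into one set of covered characters, so a character is never counted twice.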
Extract and Aggregate: A Novel Domain-Independent Approach to Factual Data Verification
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)
The growth of the Internet has led to a large amount of information being published online, and much of it is known to be inaccurate. As a result, fact-checking has become a significant topic over the last five years, and verifying factual data is widely accepted to be a challenge even for experts. This paper presents a domain-independent fact-checking system that can solve the fact verification problem end to end or at individual stages. The proposed model combines several advanced text analysis methods, such as BERT and InferSent. We carry out a theoretical and empirical study of the system's features. Experiments on the FEVER and Fact Checking Challenge test collections show that our model achieves scores on par with state-of-the-art models tailored to specific datasets.