Jan Wira Gotama Putra


2020

pdf bib
TIARA: A Tool for Annotating Discourse Relations and Sentence Reordering
Jan Wira Gotama Putra | Simone Teufel | Kana Matsumura | Takenobu Tokunaga
Proceedings of the 12th Language Resources and Evaluation Conference

This paper introduces TIARA, a new publicly available web-based annotation tool for discourse relations and sentence reordering. Annotation tasks such as these, which are based on relations between large textual objects, are inherently hard to visualise without either cluttering the display and/or confusing the annotators. TIARA deals with the visual complexity during the annotation process by systematically simplifying the layout, and by offering interactive visualisation, including coloured links, indentation, and dual-view. TIARA’s text view allows annotators to focus on the analysis of logical sequencing between sentences. A separate tree view allows them to review their analysis in terms of the overall discourse structure. The dual-view gives it an edge over other discourse annotation tools and makes it particularly attractive as an educational tool (e.g., for teaching students how to argue more effectively). As it is based on standard web technologies and can be easily customised to other annotation schemes, it can be easily used by anybody. Apart from the project it was originally designed for, in which hundreds of texts were annotated by three annotators, TIARA has already been adopted by a second discourse annotation study, which uses it in the teaching of argumentation.

2017

pdf bib
Evaluating text coherence based on semantic similarity graph
Jan Wira Gotama Putra | Takenobu Tokunaga
Proceedings of TextGraphs-11: the Workshop on Graph-based Methods for Natural Language Processing

Coherence is a crucial feature of text because it is indispensable for conveying its communication purpose and meaning to its readers. In this paper, we propose an unsupervised text coherence scoring based on graph construction in which edges are established between semantically similar sentences represented by vertices. The sentence similarity is calculated based on the cosine similarity of semantic vectors representing sentences. We provide three graph construction methods establishing an edge from a given vertex to a preceding adjacent vertex, to a single similar vertex, or to multiple similar vertices. We evaluated our methods in the document discrimination task and the insertion task by comparing our proposed methods to the supervised (Entity Grid) and unsupervised (Entity Graph) baselines. In the document discrimination task, our method outperformed the unsupervised baseline but could not do the supervised baseline, while in the insertion task, our method outperformed both baselines.