Bayu Distiawan

Also published as: Bayu Distiawan Trisedya


2019

Neural Relation Extraction for Knowledge Base Enrichment
Bayu Distiawan Trisedya | Gerhard Weikum | Jianzhong Qi | Rui Zhang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We study relation extraction for knowledge base (KB) enrichment. Specifically, we aim to extract entities and their relationships from sentences in the form of triples and map the elements of the extracted triples to an existing KB in an end-to-end manner. Previous studies focus on the extraction itself and rely on Named Entity Disambiguation (NED) to map triples into the KB space. This way, NED errors may cause extraction errors that affect the overall precision and recall. To address this problem, we propose an end-to-end relation extraction model for KB enrichment based on a neural encoder-decoder model. We collect high-quality training data by distant supervision with co-reference resolution and paraphrase detection. We propose an n-gram based attention model that captures multi-word entity names in a sentence. Our model employs jointly learned word and entity embeddings to support named entity disambiguation. Finally, our model uses a modified beam search and a triple classifier to help generate high-quality triples. Our model outperforms state-of-the-art baselines by 15.51% and 8.38% in terms of F1 score on two real-world datasets.

2018

GTR-LSTM: A Triple Encoder for Sentence Generation from RDF Data
Bayu Distiawan Trisedya | Jianzhong Qi | Rui Zhang | Wei Wang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

A knowledge base is a large repository of facts that are mainly represented as RDF triples, each of which consists of a subject, a predicate (relationship), and an object. The RDF triple representation offers a simple interface for applications to access the facts. However, this representation is not in a natural language form, which is difficult for humans to understand. We address this problem by proposing a system to translate a set of RDF triples into natural sentences based on an encoder-decoder framework. To preserve as much information from RDF triples as possible, we propose a novel graph-based triple encoder. The proposed encoder encodes not only the elements of the triples but also the relationships both within a triple and between the triples. Experimental results show that the proposed encoder achieves a consistent improvement over the baseline models by up to 17.6%, 6.0%, and 16.4% in three common metrics BLEU, METEOR, and TER, respectively.

2014

Automatically Building a Corpus for Sentiment Analysis on Indonesian Tweets
Alfan Farizki Wicaksono | Clara Vania | Bayu Distiawan | Mirna Adriani
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

2012

A GrAF-compliant Indonesian Speech Recognition Web Service on the Language Grid for Transcription Crowdsourcing
Bayu Distiawan | Ruli Manurung
Proceedings of the Sixth Linguistic Annotation Workshop

2010

Developing an Online Indonesian Corpora Repository
Ruli Manurung | Bayu Distiawan | Desmond Darma Putra
Proceedings of the 24th Pacific Asia Conference on Language, Information and Computation