Ichiro Kobayashi


2020

Market Comment Generation from Data with Noisy Alignments
Yumi Hamazono | Yui Uehara | Hiroshi Noji | Yusuke Miyao | Hiroya Takamura | Ichiro Kobayashi
Proceedings of the 13th International Conference on Natural Language Generation

End-to-end data-to-text models learn the mapping between data and text from aligned pairs in the dataset. However, these alignments cannot always be obtained reliably, especially for time-series data, where comments are delivered in real time about some situation and the delivery time may lag behind the actual event time. To handle possibly noisy alignments in the dataset, we propose a neural network model with multi-timestep data and a copy mechanism, which allows the model to learn the correspondences between data and text from a dataset with noisier alignments. We focus on generating market comments in Japanese that are delivered each time an event occurs in the market. The core idea of our approach is to utilize multi-timestep data, which includes not only the latest market price data when the comment is delivered but also the data obtained at several earlier timesteps. On top of this, we employ a copy mechanism that is suited to referring to the content of data records in the market price data. We confirm the superiority of our proposal on two evaluation metrics and show that the proposed method improves the accuracy of sentence generation from time-series data.

Dynamically Updating Event Representations for Temporal Relation Classification with Multi-category Learning
Fei Cheng | Masayuki Asahara | Ichiro Kobayashi | Sadao Kurohashi
Findings of the Association for Computational Linguistics: EMNLP 2020

Temporal relation classification is a pair-wise task of identifying the relation of a temporal link (TLINK) between two mentions, i.e., events, times, and the document creation time (DCT). This pair-wise formulation leads to two crucial limitations: 1) two TLINKs involving a common mention do not share information; 2) existing models with independent classifiers for each TLINK category (E2E, E2T and E2D) cannot make use of the whole data. This paper presents an event-centric model that manages dynamic event representations across multiple TLINKs. Our model handles the three TLINK categories with multi-task learning to leverage the full size of the data. The experimental results show that our proposal outperforms state-of-the-art models and two strong transfer learning baselines on both English and Japanese data.

Learning with Contrastive Examples for Data-to-Text Generation
Yui Uehara | Tatsuya Ishigaki | Kasumi Aoki | Hiroshi Noji | Keiichi Goshima | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 28th International Conference on Computational Linguistics

Existing models for data-to-text tasks generate fluent but sometimes incorrect sentences, e.g., “Nikkei gains” is generated when “Nikkei drops” is expected. To reduce such errors, we investigate models trained on contrastive examples, i.e., incorrect sentences or terms, in addition to correct ones. We first create rules to produce contrastive examples from correct ones by replacing frequent crucial terms such as “gain” or “drop”. We then use learning methods with several losses that exploit contrastive examples. Experiments on the market comment generation task show that 1) exploiting contrastive examples improves the capability of generating sentences with better lexical choice without degrading fluency, 2) the choice of the loss function is an important factor because performance on different metrics depends on the type of loss function, and 3) the use of examples produced by some specific rules further improves performance. Human evaluation also supports the effectiveness of using contrastive examples.

Adversarial Training for Commonsense Inference
Lis Pereira | Xiaodong Liu | Fei Cheng | Masayuki Asahara | Ichiro Kobayashi
Proceedings of the 5th Workshop on Representation Learning for NLP

We apply small perturbations to word embeddings and minimize the resultant adversarial risk to regularize the model. We exploit a novel combination of two different approaches to estimate these perturbations: 1) using the true label and 2) using the model prediction. Without relying on any human-crafted features, knowledge bases, or additional datasets other than the target datasets, our model boosts the fine-tuning performance of RoBERTa, achieving competitive results on multiple reading comprehension datasets that require commonsense inference.

2019

Learning to Select, Track, and Generate for Data-to-Text
Hayate Iso | Yui Uehara | Tatsuya Ishigaki | Hiroshi Noji | Eiji Aramaki | Ichiro Kobayashi | Yusuke Miyao | Naoaki Okazaki | Hiroya Takamura
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

We propose a data-to-text generation model with two modules, one for tracking and the other for text generation. Our tracking module selects and keeps track of salient information and memorizes which records have been mentioned. Our generation module generates a summary conditioned on the state of the tracking module. Our proposed model is considered to simulate the human-like writing process that gradually selects the information by determining the intermediate variables while writing the summary. In addition, we explore the effectiveness of writer information for generation. Experimental results show that our proposed model outperforms existing models on all evaluation metrics even without writer information. Incorporating writer information further improves the performance, contributing to content planning and surface realization.

Controlling Contents in Data-to-Document Generation with Human-Designed Topic Labels
Kasumi Aoki | Akira Miyazawa | Tatsuya Ishigaki | Tatsuya Aoki | Hiroshi Noji | Keiichi Goshima | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 12th International Conference on Natural Language Generation

We propose a data-to-document generator, based on a neural language model, that can easily control the contents of output texts. A conventional data-to-text model is useful when a reader seeks a global summary of data because it only has to describe an important part that has been extracted beforehand. However, because what users are interested in differs from user to user, it is necessary to develop a method that generates various summaries according to users’ interests. We develop a model that generates various summaries and controls their contents by providing the model with explicit reference targets as controllable factors. In the experiments, we used five-minute or one-hour charts of 9 indicators (e.g., Nikkei225) as time-series data, and daily summaries of Nikkei Quick News as textual data. We conducted comparative experiments using two pieces of information as the referential information for generation: human-designed topic labels indicating the contents of a sentence, and automatically extracted keywords.

2018

Generating Market Comments Referring to External Resources
Tatsuya Aoki | Akira Miyazawa | Tatsuya Ishigaki | Keiichi Goshima | Kasumi Aoki | Ichiro Kobayashi | Hiroya Takamura | Yusuke Miyao
Proceedings of the 11th International Conference on Natural Language Generation

Comments on a stock market often include the reason or cause of changes in stock prices, such as “Nikkei turns lower as yen’s rise hits exporters.” Generating such informative sentences requires capturing the relationship between different resources, including a target stock price. In this paper, we propose a model for automatically generating such informative market comments that refer to external resources. We evaluated our model through an automatic metric in terms of BLEU and human evaluation done by an expert in finance. The results show that our model outperforms the existing model both in BLEU scores and human judgment.

2016

Generating Natural Language Descriptions for Semantic Representations of Human Brain Activity
Eri Matsuo | Ichiro Kobayashi | Shinji Nishimoto | Satoshi Nishida | Hideki Asoh
Proceedings of the ACL 2016 Student Research Workshop

Human-like Natural Language Generation Using Monte Carlo Tree Search
Kaori Kumagai | Ichiro Kobayashi | Daichi Mochihashi | Hideki Asoh | Tomoaki Nakamura | Takayuki Nagai
Proceedings of the INLG 2016 Workshop on Computational Creativity in Natural Language Generation

A POMDP-based Multimodal Interaction System Using a Humanoid Robot
Sae Iijima | Ichiro Kobayashi
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Posters

2015

Learning Word Meanings and Grammar for Describing Everyday Activities in Smart Environments
Muhammad Attamimi | Yuji Ando | Tomoaki Nakamura | Takayuki Nagai | Daichi Mochihashi | Ichiro Kobayashi | Hideki Asoh
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

Zero-Shot Learning of Language Models for Describing Human Actions Based on Semantic Compositionality of Actions
Hideki Asoh | Ichiro Kobayashi
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

Topic-based Multi-document Summarization using Differential Evolution for Combinatorial Optimization of Sentences
Haruka Shigematsu | Ichiro Kobayashi
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

On-line Summarization of Time-series Documents using a Graph-based Algorithm
Satoko Suzuki | Ichiro Kobayashi
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing

2013

Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
Yukari Ogura | Ichiro Kobayashi
51st Annual Meeting of the Association for Computational Linguistics Proceedings of the Student Research Workshop

High-quality Training Data Selection using Latent Topics for Graph-based Semi-supervised Learning
Akiko Eriguchi | Ichiro Kobayashi
51st Annual Meeting of the Association for Computational Linguistics Proceedings of the Student Research Workshop

Event Sequence Model for Semantic Analysis of Time and Location in Dialogue System
Yasuhiro Noguchi | Satoru Kogure | Makoto Kondo | Ichiro Kobayashi | Hideki Asoh | Akira Takagi | Tatsuhiro Konishi | Yukihiro Itoh
Proceedings of the 27th Pacific Asia Conference on Language, Information, and Computation (PACLIC 27)

2011

A Latent Topic Extracting Method based on Events in a Document and its Application
Risa Kitajima | Ichiro Kobayashi
Proceedings of the ACL 2011 Student Session

1998

The Multex generator and its environment: application and development
Christian Matthiessen | Licheng Zeng | Marilyn Cross | Ichiro Kobayashi | Kazuhiro Teruya | Canzhong Wu
Natural Language Generation