Ji-Hoon Kim


2020

pdf bib
Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model
Sungrae Park | Geewook Kim | Junyeop Lee | Junbum Cha | Ji-Hoon Kim | Hwalsuk Lee
Proceedings of the 28th International Conference on Computational Linguistics

This paper introduces a method that efficiently reduces the computational cost and parameter size of Transformer. The proposed model, refer to as Group-Transformer, splits feature space into multiple groups, factorizes the calculation paths, and reduces computations for the group interaction. Extensive experiments on two benchmark tasks, enwik8 and text8, prove our model’s effectiveness and efficiency in small-scale Transformers. To the best of our knowledge, Group-Transformer is the first attempt to design Transformer with the group strategy, widely used for efficient CNN architectures.

pdf bib
Context-Aware Answer Extraction in Question Answering
Yeon Seonwoo | Ji-Hoon Kim | Jung-Woo Ha | Alice Oh
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Extractive QA models have shown very promising performance in predicting the correct answer to a question for a given passage. However, they sometimes result in predicting the correct answer text but in a context irrelevant to the given question. This discrepancy becomes especially important as the number of occurrences of the answer text in a passage increases. To resolve this issue, we propose BLANC (BLock AttentioN for Context prediction) based on two main ideas: context prediction as an auxiliary task in multi-task learning manner, and a block attention method that learns the context prediction task. With experiments on reading comprehension, we show that BLANC outperforms the state-of-the-art QA models, and the performance gap increases as the number of answer text occurrences increases. We also conduct an experiment of training the models using SQuAD and predicting the supporting facts on HotpotQA and show that BLANC outperforms all baseline models in this zero-shot setting.