Jong-Hyeok Lee


2020

POSTECH Submission on Duolingo Shared Task
Junsu Park | Hongseok Kwon | Jong-Hyeok Lee
Proceedings of the Fourth Workshop on Neural Generation and Translation

In this paper, we propose a transfer-learning-based simultaneous translation model by extending BART. We pre-trained BART on Korean Wikipedia and a Korean news dataset, and fine-tuned it on an additional web-crawled parallel corpus and the official 2020 Duolingo training dataset. In our experiments on the 2020 Duolingo test dataset, our submission achieves a weighted macro F1 score of 0.312 and ranks second among the submitted En-Ko systems.
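
The abstract above reports a weighted macro F1 score. As a rough illustration only, the sketch below assumes the metric is computed per prompt as the harmonic mean of unweighted precision and weight-based recall over accepted translations, then averaged over prompts. The function names and the data layout (`gold` mapping each prompt to weighted reference translations) are hypothetical and are not taken from the paper or the shared-task scorer.

```python
def weighted_f1(predicted, gold_weights):
    """F1 for one prompt: unweighted precision, weighted recall (assumed definition)."""
    predicted = set(predicted)
    if not predicted or not gold_weights:
        return 0.0
    # Predictions that match an accepted reference translation.
    hits = {t for t in predicted if t in gold_weights}
    precision = len(hits) / len(predicted)
    # Recall counts each matched reference by its weight rather than by 1.
    recall = sum(gold_weights[t] for t in hits) / sum(gold_weights.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_macro_f1(predictions, gold):
    """Average the per-prompt weighted F1 over all prompts in `gold`."""
    scores = [weighted_f1(predictions.get(prompt, []), weights)
              for prompt, weights in gold.items()]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: two prompts, placeholder translation strings.
gold = {"p1": {"a": 0.75, "b": 0.25}, "p2": {"x": 1.0}}
predictions = {"p1": ["a", "c"], "p2": []}
score = weighted_macro_f1(predictions, gold)
```

Under this assumed definition, a system is rewarded more for recovering frequently used translations than rare ones, while every wrong prediction costs the same precision.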

2019

Transformer-based Automatic Post-Editing Model with Joint Encoder and Multi-source Attention of Decoder
WonKee Lee | Jaehun Shin | Jong-Hyeok Lee
Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)

This paper describes POSTECH’s submission to the WMT 2019 shared task on Automatic Post-Editing (APE). In this paper, we propose a new multi-source APE model by extending Transformer. The main contributions of our study are that we 1) reconstruct the encoder to generate a joint representation of the machine translation (mt) and its source (src) context, in addition to the conventional src encoding, and 2) propose two types of multi-source attention layers that compute attention between the two encoder outputs and the decoder state. Furthermore, we train our model with various teacher-forcing ratios to alleviate exposure bias. Finally, we apply ensembling across variations of our model. Experiments on the WMT19 English-German APE data set show improvements in both TER and BLEU scores over the baseline. Our primary submission achieves -0.73 in TER and +1.49 in BLEU compared to the baseline.
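
For readers unfamiliar with TER, the score reported above, the following is a simplified sketch, not the official metric. It computes the word-level edit distance (insertions, deletions, substitutions) divided by the reference length. Full TER additionally allows block shifts, which this sketch deliberately omits; the function name `simplified_ter` is hypothetical.

```python
def simplified_ter(hypothesis, reference):
    """Word-level edit distance / reference length (shift operations omitted)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    # prev[j] holds the edit distance between the hypothesis prefix seen so far
    # and the first j reference tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        curr = [i]
        for j, r in enumerate(ref, 1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis token
                            curr[j - 1] + 1,      # insert a reference token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / len(ref)


# Toy example with placeholder sentences: one missing word in a six-word reference.
rate = simplified_ter("the cat sat on mat", "the cat sat on the mat")
```

Lower is better: a negative TER difference relative to a baseline, as in the abstract, means fewer edits are needed to reach the human post-edit.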

Decay-Function-Free Time-Aware Attention to Context and Speaker Indicator for Spoken Language Understanding
Jonggu Kim | Jong-Hyeok Lee
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

To capture salient contextual information for spoken language understanding (SLU) of a dialogue, we propose time-aware models that automatically learn the latent time-decay function of the history without a manual time-decay function. We also propose a method to identify and label the current speaker to improve SLU accuracy. In experiments on the benchmark dataset used in Dialog State Tracking Challenge 4, the proposed models achieved significantly higher F1 scores than the state-of-the-art contextual models. Finally, we analyze the effectiveness of the introduced models in detail. The analysis demonstrates that each of the proposed methods is individually effective in improving SLU accuracy.

2018

Multi-encoder Transformer Network for Automatic Post-Editing
Jaehun Shin | Jong-Hyeok Lee
Proceedings of the Third Conference on Machine Translation: Shared Task Papers

This paper describes POSTECH’s submission to the WMT 2018 shared task on Automatic Post-Editing (APE). We propose a new neural end-to-end post-editing model based on the transformer network. We modified the encoder-decoder attention to reflect the relation between the machine translation output, the source, and the post-edited translation in the APE problem. Experiments on the WMT17 English-German APE data set show an improvement in both TER and BLEU scores over the best result of the WMT17 APE shared task. Our primary submission achieves -4.52 TER and +6.81 BLEU on the PBSMT task and -0.13 TER and +0.40 BLEU on the NMT task compared to the baseline.

2017

Predictor-Estimator using Multilevel Task Learning with Stack Propagation for Neural Quality Estimation
Hyun Kim | Jong-Hyeok Lee | Seung-Hoon Na
Proceedings of the Second Conference on Machine Translation

2016

A Recurrent Neural Networks Approach for Estimating the Quality of Machine Translation Output
Hyun Kim | Jong-Hyeok Lee
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Recurrent Neural Network based Translation Quality Estimation
Hyun Kim | Jong-Hyeok Lee
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2014

Postech’s System Description for Medical Text Translation Task
Jianri Li | Se-Jong Kim | Hwidong Na | Jong-Hyeok Lee
Proceedings of the Ninth Workshop on Statistical Machine Translation

2011

Multi-Word Unit Dependency Forest-based Translation Rule Extraction
Hwidong Na | Jong-Hyeok Lee
Proceedings of Fifth Workshop on Syntax, Semantics and Structure in Statistical Translation

Beyond Chart Parsing: An Analytic Comparison of Dependency Chart Parsing Algorithms
Meixun Jin | Hwidong Na | Jong-Hyeok Lee
Proceedings of the 12th International Conference on Parsing Technologies

2010

Evaluating Multilanguage-Comparability of Subjectivity Analysis Systems
Jungi Kim | Jin-Ji Li | Jong-Hyeok Lee
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

Method of Extracting Is-A and Part-Of Relations Using Pattern Pairs in Mass Corpus
Se-Jong Kim | Yong-Hun Lee | Jong-Hyeok Lee
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 1

Chinese Syntactic Reordering for Adequate Generation of Korean Verbal Phrases in Chinese-to-Korean SMT
Jin-Ji Li | Jungi Kim | Dong-Il Kim | Jong-Hyeok Lee
Proceedings of the Fourth Workshop on Statistical Machine Translation

Discovering the Discriminative Views: Measuring Term Weights for Sentiment Analysis
Jungi Kim | Jin-Ji Li | Jong-Hyeok Lee
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Annotation Guidelines for Chinese-Korean Word Alignment
Jin-Ji Li | Dong-Il Kim | Jong-Hyeok Lee
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

For a language pair such as Chinese and Korean, which belong to entirely different language families in terms of typology and genealogy, the correspondences needed for word alignment are often obscure. We present annotation guidelines for Chinese-Korean word alignment based on a contrastive analysis of morpho-syntactic encodings. We discuss the differences in verbal systems that cause most of the linking obscurities in the annotation process. A systematic comparison of verbal systems is conducted by analyzing morpho-syntactic encodings. The viewpoint of grammatical category allows us to define consistent and systematic instructions for linguistically distant languages such as Chinese and Korean. The scope of our guidelines is limited to alignment between Chinese and Korean, but the instruction methods exemplified in this paper are also applicable to developing systematic and comprehensible alignment guidelines for other languages with similarly divergent linguistic phenomena.

Search Result Clustering Using Label Language Model
Yeha Lee | Seung-Hoon Na | Jong-Hyeok Lee
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II

Automatic Extraction of English-Chinese Transliteration Pairs using Dynamic Window and Tokenizer
Chengguo Jin | Seung-Hoon Na | Dong-Il Kim | Jong-Hyeok Lee
Proceedings of the Sixth SIGHAN Workshop on Chinese Language Processing

2005

Chunking Using Conditional Random Fields in Korean Texts
Yong-Hun Lee | Mi-Young Kim | Jong-Hyeok Lee
Second International Joint Conference on Natural Language Processing: Full Papers

Two-Phase Shift-Reduce Deterministic Dependency Parser of Chinese
Meixun Jin | Mi-Young Kim | Jong-Hyeok Lee
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

Segmentation of Chinese Long Sentences Using Commas
Meixun Jin | Mi-Young Kim | Dongil Kim | Jong-Hyeok Lee
Proceedings of the Third SIGHAN Workshop on Chinese Language Processing

Term Extraction from Korean Corpora via Japanese
Atsushi Fujii | Tetsuya Ishikawa | Jong-Hyeok Lee
Proceedings of CompuTerm 2004: 3rd International Workshop on Computational Terminology

2003

An empirical study for generating zero pronoun in Korean based on Cost-based centering model
Ji-Eun Roh | Jong-Hyeok Lee
Proceedings of the Australasian Language Technology Workshop 2003

S-clause segmentation for efficient syntactic analysis using decision trees
Mi-Young Kim | Jong-Hyeok Lee
Proceedings of the Australasian Language Technology Workshop 2003

Resolving Sense Ambiguity of Korean Nouns Based on Concept Co-occurrence Information
You-Jin Chung | Jong-Hyeok Lee
Proceedings of the Australasian Language Technology Workshop 2003

Conceptual Schema Approach to Natural Language Database Access
In-Su Kang | Seung-Hoon Na | Jong-Hyeok Lee
Proceedings of the Australasian Language Technology Workshop 2003

2002

Syllable-Pattern-Based Unknown-Morpheme Segmentation and Estimation for Hybrid Part-of-Speech Tagging of Korean
Gary Geunbae Lee | Jeongwon Cha | Jong-Hyeok Lee
Computational Linguistics, Volume 28, Number 1, March 2002

Word Sense Disambiguation in a Korean-to-Japanese MT System Using Neural Networks
You-Jin Chung | Sin-Jae Kang | Kyong-Hi Moon | Jong-Hyeok Lee
COLING-02: Machine Translation in Asia

A Knowledge Based Approach to Identification of Serial Verb Construction in Chinese-to-Korean Machine Translation System
Dong-il Kim | Zheng Cui | Jinji Li | Jong-Hyeok Lee
COLING-02: The First SIGHAN Workshop on Chinese Language Processing

2001

Semi-Automatic Practical Ontology Construction by Using a Thesaurus, Computational Dictionaries, and Large Corpora
Sin-Jae Kang | Jong-Hyeok Lee
Proceedings of the ACL 2001 Workshop on Human Language Technology and Knowledge Management

2000

Representation and Recognition Method for Multi-Word Translation Units in Korean-to-Japanese MT System
Kyonghi Moon | Jong-Hyeok Lee
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1998

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS
Byeongchang Kim | WonIl Lee | Geunbae Lee | Jong-Hyeok Lee
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Identifying Syntactic Role of Antecedent in Korean Relative Clause Using Corpus and Thesaurus Information
Hui-Feng Li | Jong-Hyeok Lee | Geunbae Lee
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS
Byeongchang Kim | WonIl Lee | Geunbae Lee | Jong-Hyeok Lee
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Identifying Syntactic Role of Antecedent in Korean Relative Clause Using Corpus and Thesaurus Information
Hui-Feng Li | Jong-Hyeok Lee | Geunbae Lee
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

Generalized unknown morpheme guessing for hybrid POS tagging of Korean
Jeongwon Cha | Geunbae Lee | Jong-Hyeok Lee
Sixth Workshop on Very Large Corpora

1994

Table-driven Neural Syntactic Analysis of Spoken Korean
WonIl Lee | Geunbae Lee | Jong-Hyeok Lee
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics