Chuan-Jie Lin


2020

TOCP: A Dataset for Chinese Profanity Processing
Hsu Yang | Chuan-Jie Lin
Proceedings of the Second Workshop on Trolling, Aggression and Cyberbullying

This paper introduced TOCP, a larger dataset of Chinese profanity. This dataset contains natural sentences collected from social media sites, the profane expressions appearing in the sentences, and their rephrasing suggestions which preserve their meanings in a less offensive way. We proposed several baseline systems using neural network models to test this benchmark. We trained embedding models on a profanity-related dataset and proposed several profanity-related features. Our baseline systems achieved an F1-score of 86.37% in profanity detection and an accuracy of 77.32% in profanity rephrasing.

2019

Expanding English and Chinese Dictionaries by Wikipedia Titles
Wei-Ting Chen | Yu-Te Wang | Chuan-Jie Lin
Proceedings of the 3rd International Conference on Natural Language and Speech Processing

2018

台語古詩朗誦系統 (A Taiwanese Text-to-Speech System for Ancient Poems) [In Chinese]
Yu-Lin Tsai | Chao-Hsiang Huang | Chuan-Jie Lin
Proceedings of the 30th Conference on Computational Linguistics and Speech Processing (ROCLING 2018)

Detecting Grammatical Errors in the NTOU CGED System by Identifying Frequent Subsentences
Chuan-Jie Lin | Shao-Heng Chen
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

The main goal of the Chinese grammatical error diagnosis task is to detect word errors in sentences written by Chinese-learning students. Our previous system generated error-corrected sentences as candidates and measured their sentence likelihood based on a large-scale Chinese n-gram dataset. This year we further tried to identify long, frequently seen subsentences and label them as correct in order to avoid proposing too many error candidates. Two new methods for suggesting missing and selection errors were also tested.

2017

NTOUA at IJCNLP-2017 Task 2: Predicting Sentiment Scores of Chinese Words and Phrases
Chuan-Jie Lin | Hao-Tsung Chang
Proceedings of the IJCNLP 2017, Shared Tasks

This paper describes the approaches to sentiment score prediction in the NTOU DSA system participating in DSAP this year. The modules that predict scores for words are adapted from our system from last year. The approach to predicting scores for phrases is a keyword-based machine learning method. Our system performs well in predicting scores of phrases.

Rephrasing Profanity in Chinese Text
Hui-Po Su | Zhen-Jie Huang | Hao-Tsung Chang | Chuan-Jie Lin
Proceedings of the First Workshop on Abusive Language Online

This paper proposes a system that can detect and rephrase profanity in Chinese text. Rather than just masking detected profanity, we want to revise the input sentence by using inoffensive words while keeping the original meaning. Twenty-nine such rephrasing rules were devised after observing sentences on real-world social websites. The overall accuracy of the proposed system is 85.56%.

2016

Using Wikipedia and Semantic Resources to Find Answer Types and Appropriate Answer Candidate Sets in Question Answering
Po-Chun Chen | Meng-Jie Zhuang | Chuan-Jie Lin
Proceedings of the Open Knowledge Base and Question Answering Workshop (OKBQA 2016)

This paper proposes a new idea that uses Wikipedia categories as answer types and defines candidate sets inside Wikipedia. The focus of a given question is searched in the hierarchy of Wikipedia main pages. Our searching strategy combines head-noun matching and synonym matching provided in semantic resources. The set of answer candidates is determined by the entry hierarchy in Wikipedia and the hyponymy hierarchy in WordNet. The experimental results show that the approach finds smaller candidate sets while achieving better performance, especially for the ARTIFACT and ORGANIZATION types, where it outperforms state-of-the-art Chinese factoid QA systems.

Generating and Scoring Correction Candidates in Chinese Grammatical Error Diagnosis
Shao-Heng Chen | Yu-Lin Tsai | Chuan-Jie Lin
Proceedings of the 3rd Workshop on Natural Language Processing Techniques for Educational Applications (NLPTEA2016)

Grammatical error diagnosis is an essential part of a language-learning tutoring system. Based on the data sets of Chinese grammar error detection tasks, we proposed a system which measures the likelihood of correction candidates generated by deleting or inserting characters or words, moving substrings to different positions, substituting prepositions with other prepositions, or substituting words with their synonyms or similar strings. Sentence likelihood is measured based on the frequencies of substrings from the space-removed version of Google n-grams. The evaluation on the training set shows that the Missing-related and Selection-related candidate generation methods have promising performance. Our final system achieved a precision of 30.28% and a recall of 62.85% at the identification level, evaluated on the test set.

2015

A Study on Chinese Spelling Check Using Confusion Sets and N-gram Statistics
Chuan-Jie Lin | Wei-Cheng Chu
International Journal of Computational Linguistics & Chinese Language Processing, Volume 20, Number 1, June 2015-Special Issue on Chinese as a Foreign Language

NTOU Chinese Spelling Check System in Sighan-8 Bake-off
Wei-Cheng Chu | Chuan-Jie Lin
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing

NTOU Chinese Grammar Checker for CGED Shared Task
Chuan-Jie Lin | Shao-Heng Chen
Proceedings of the 2nd Workshop on Natural Language Processing Techniques for Educational Applications

2014

NTOU Chinese Spelling Check System in CLP Bake-off 2014
Wei-Cheng Chu | Chuan-Jie Lin
Proceedings of The Third CIPS-SIGHAN Joint Conference on Chinese Language Processing

2013

NTOU Chinese Spelling Check System in SIGHAN Bake-off 2013
Chuan-Jie Lin | Wei-Cheng Chu
Proceedings of the Seventh SIGHAN Workshop on Chinese Language Processing

2012

Strategies of Processing Japanese Names and Character Variants in Traditional Chinese Text
Chuan-Jie Lin | Jia-Cheng Zhan | Yen-Heng Chen | Chien-Wei Pao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 17, Number 3, September 2012

2010

Tourism-Related Opinion Mining
Chuan-Jie Lin | Pin-Hsien Chao
Proceedings of the 22nd Conference on Computational Linguistics and Speech Processing (ROCLING 2010)

Tourism-Related Opinion Detection and Tourist-Attraction Target Identification
Chuan-Jie Lin | Pin-Hsien Chao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 15, Number 1, March 2010

2006

Question Pre-Processing in a QA System on Internet Discussion Groups
Chuan-Jie Lin | Chun-Hung Cho
Proceedings of the Workshop on Task-Focused Summarization and Question Answering

2003

以網際網路內容為基礎之問答系統 “Why” 問句研究 (The Study of Why Questions in Web-based Question-Answering Systems) [In Chinese]
Tean-Zuo Shen | Chuan-Jie Lin | Hsin-Hsi Chen
Proceedings of Research on Computational Linguistics Conference XV

2001

簡易影片字幕文字辨識法及其詢答應用 (A Simple Method for Video OCR and Its Application on Question Answering) [In Chinese]
Chuan-Jie Lin | Che-Chia Liu | Hsin-Hsi Chen
Proceedings of Research on Computational Linguistics Conference XIV

A Simple Method for Chinese Video OCR and Its Application to Question Answering
Chuan-Jie Lin | Che-Chia Liu | Hsin-Hsi Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 6, Number 2, August 2001

2000

A Multilingual News Summarizer
Hsin-Hsi Chen | Chuan-Jie Lin
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

A Mandarin to Taiwanese Min Nan Machine Translation System with Speech Synthesis of Taiwanese Min Nan
Chuan-Jie Lin | Hsin-Hsi Chen
International Journal of Computational Linguistics & Chinese Language Processing, Volume 4, Number 1, February 1999