Mirna Adriani


Speech-Emotion Detection in an Indonesian Movie
Fahmi Fahmi | Meganingrum Arista Jiwanggi | Mirna Adriani
Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL)

The growing demand for automatic emotion recognition systems in the Human-Computer Interaction field has driven research in speech emotion detection. Despite this growth, there is still little research on automatic speech emotion detection in Bahasa Indonesia, and the area lacks a standard corpus. This study proposes several approaches to detecting emotion in the spoken dialogs of an Indonesian movie by classifying them into four emotion classes: happiness, sadness, anger, and neutral. Two speech data representations are used: statistical and temporal (sequence) representations. The classification is performed with an Artificial Neural Network (ANN), a Recurrent Neural Network (RNN) with a Long Short-Term Memory (LSTM) variation, word embedding, and a hybrid of the three. In the one-vs-rest scenario with speech-transcript pairs, the best accuracies for each emotion class, obtained with a hybrid of the non-temporal and embedding approaches, are 1) happiness: 76.31%; 2) sadness: 86.46%; 3) anger: 82.14%; and 4) neutral: 68.51%. Multiclass classification yielded 64.66% precision, 66.79% recall, and a 64.83% F1-score.
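The two kinds of scores reported in the abstract above can be illustrated with a minimal sketch. It assumes macro averaging over the four classes (the abstract does not say which averaging was used) and derives the one-vs-rest view by collapsing multiclass labels, whereas the study may have trained separate binary classifiers. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

CLASSES = ["happiness", "sadness", "anger", "neutral"]


def one_vs_rest_accuracy(gold: List[str], pred: List[str], target: str) -> float:
    """Accuracy of the binary decision 'target' vs. 'anything else'."""
    hits = sum((g == target) == (p == target) for g, p in zip(gold, pred))
    return hits / len(gold)


def macro_prf(gold: List[str], pred: List[str], classes: List[str]) -> Dict[str, float]:
    """Macro-averaged precision, recall and F1 (averaging scheme is an assumption)."""
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels, invented purely for illustration.
gold = ["happiness", "sadness", "anger", "neutral", "neutral", "anger"]
pred = ["happiness", "neutral", "anger", "neutral", "anger", "anger"]

per_class_accuracy = {c: one_vs_rest_accuracy(gold, pred, c) for c in CLASSES}
multiclass_scores = macro_prf(gold, pred, CLASSES)
```

Under macro averaging, the F1-score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.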


Normalization of Indonesian-English Code-Mixed Twitter Data
Anab Maulana Barik | Rahmad Mahendra | Mirna Adriani
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

Twitter is an excellent source of data for NLP research, as it offers a tremendous amount of textual data. However, processing tweets to extract meaningful information is very challenging, for at least two reasons: (i) the use of nonstandard words and informal writing, and (ii) code-mixing, the combination of multiple languages in a single tweet conversation. Most previous work has addressed these two issues as separate, isolated tasks. In this study, we work on the normalization task for code-mixed Twitter data, specifically Indonesian-English. We propose a pipeline that consists of four modules: tokenization, language identification, lexical normalization, and translation. Another contribution is a gold standard of Indonesian-English code-mixed data for each module.
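The four-module pipeline named in the abstract above can be sketched roughly as follows. This is a toy illustration only: the lexicons, the dictionary-lookup logic, the function names, and the English-to-Indonesian translation direction are all assumptions made for the example, not the authors' models or resources.

```python
import re
from typing import List, Tuple

# Hypothetical toy resources; a real system would use trained models or large lexicons.
ID_LEXICON = {"aku", "gk", "tidak", "suka", "ini", "bgt", "banget"}
EN_LEXICON = {"movie", "this", "u", "you"}
NORMALIZATION = {
    "id": {"gk": "tidak", "bgt": "banget"},
    "en": {"u": "you"},
}
EN_TO_ID = {"movie": "film", "this": "ini", "you": "kamu"}


def tokenize(tweet: str) -> List[str]:
    """Module 1: split a lowercased tweet into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", tweet.lower())


def identify_language(token: str) -> str:
    """Module 2: tag a token as 'id', 'en', or 'other' by lexicon lookup."""
    if token in ID_LEXICON:
        return "id"
    if token in EN_LEXICON:
        return "en"
    return "other"


def normalize(token: str, lang: str) -> str:
    """Module 3: map a nonstandard form to its standard form, if known."""
    return NORMALIZATION.get(lang, {}).get(token, token)


def translate(token: str, lang: str) -> str:
    """Module 4: translate English tokens to Indonesian (assumed direction)."""
    return EN_TO_ID.get(token, token) if lang == "en" else token


def normalize_tweet(tweet: str) -> str:
    """Run the four modules in the order listed in the abstract."""
    tagged: List[Tuple[str, str]] = [(t, identify_language(t)) for t in tokenize(tweet)]
    normalized = [(normalize(t, lang), lang) for t, lang in tagged]
    return " ".join(translate(t, lang) for t, lang in normalized)


result = normalize_tweet("Aku gk suka movie ini bgt")
```

With these toy lexicons, `result` is "aku tidak suka film ini banget": the informal tokens "gk" and "bgt" are normalized, and the English token "movie" is translated.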


Automatically Building a Corpus for Sentiment Analysis on Indonesian Tweets
Alfan Farizki Wicaksono | Clara Vania | Bayu Distiawan | Mirna Adriani
Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing


Predicting Answer Location Using Shallow Semantic Analogical Reasoning in a Factoid Question Answering System
Hapnes Toba | Mirna Adriani | Hisar Maruli Manurung
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation