Mahmoud Azab


2020

LifeQA: A Real-life Dataset for Video Question Answering
Santiago Castro | Mahmoud Azab | Jonathan Stroud | Cristina Noujaim | Ruoyao Wang | Jia Deng | Rada Mihalcea
Proceedings of the 12th Language Resources and Evaluation Conference

We introduce LifeQA, a benchmark dataset for video question answering that focuses on day-to-day real-life situations. Current video question answering datasets consist of movies and TV shows. However, it is well-known that these visual domains are not representative of our day-to-day lives. Movies and TV shows, for example, benefit from professional camera movements, clean editing, crisp audio recordings, and scripted dialog between professional actors. While these domains provide a large amount of data for training models, their properties make them unsuitable for testing real-life question answering systems. Our dataset, by contrast, consists of video clips that represent only real-life scenarios. We collect 275 such video clips and over 2.3k multiple-choice questions. In this paper, we analyze the challenging but realistic aspects of LifeQA, and we apply several state-of-the-art video question answering models to provide benchmarks for future research. The full dataset is publicly available at https://lit.eecs.umich.edu/lifeqa/.

2019

Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines
Mahmoud Azab | Stephane Dadian | Vivi Nastase | Larry An | Rada Mihalcea
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We introduce a new dataset consisting of natural language interactions annotated with medical family histories, obtained during interactions with a genetic counselor and through crowdsourcing, following a questionnaire created by experts in the domain. We describe the data collection process and the annotations performed by medical professionals, including illness and personal attributes (name, age, gender, family relationships) for the patient and their family members. An initial system that performs argument identification and relation extraction shows promising results – average F-score of 0.87 on complex sentences on the targeted relations.

Representing Movie Characters in Dialogues
Mahmoud Azab | Noriyuki Kojima | Jia Deng | Rada Mihalcea
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

We introduce a new embedding model to represent movie characters and their interactions in a dialogue by encoding in the same representation the language used by these characters as well as information about the other participants in the dialogue. We evaluate the performance of these new character embeddings on two tasks: (1) character relatedness, using a dataset we introduce consisting of a dense character interaction matrix for 4,378 unique character pairs over 22 hours of dialogue from eighteen movies; and (2) character relation classification, for fine- and coarse-grained relations, as well as sentiment relations. Our experiments show that our model significantly outperforms the traditional Word2Vec continuous bag-of-words and skip-gram models, demonstrating the effectiveness of the character embeddings we introduce. We further show how these embeddings can be used in conjunction with a visual question answering system to improve over previous results.

2018

Speaker Naming in Movies
Mahmoud Azab | Mingzhe Wang | Max Smith | Noriyuki Kojima | Jia Deng | Rada Mihalcea
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework. To evaluate the performance of our model, we introduce a new dataset consisting of six episodes of the Big Bang Theory TV show and eighteen full movies covering different genres. Our experiments show that our multimodal model significantly outperforms several competitive baselines on the average weighted F-score metric. To demonstrate the effectiveness of our framework, we design an end-to-end memory network model that leverages our speaker naming model and achieves state-of-the-art results on the subtitles task of the MovieQA 2017 Challenge.

2015

Using Word Semantics To Assist English as a Second Language Learners
Mahmoud Azab | Chris Hokamp | Rada Mihalcea
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2013

An English Reading Tool as a NLP Showcase
Mahmoud Azab | Ahmed Salama | Kemal Oflazer | Hideki Shima | Jun Araki | Teruko Mitamura
The Companion Volume of the Proceedings of IJCNLP 2013: System Demonstrations

An NLP-based Reading Tool for Aiding Non-native English Readers
Mahmoud Azab | Ahmed Salama | Kemal Oflazer | Hideki Shima | Jun Araki | Teruko Mitamura
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

Dudley North visits North London: Learning When to Transliterate to Arabic
Mahmoud Azab | Houda Bouamor | Behrang Mohit | Kemal Oflazer
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies