Goran Nenadic

Also published as: Goran Nenadić


A Framework for Evaluation of Machine Reading Comprehension Gold Standards
Viktor Schlegel | Marco Valentino | Andre Freitas | Goran Nenadic | Riza Batista-Navarro
Proceedings of the 12th Language Resources and Evaluation Conference

Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of the gold standards used to evaluate them. There is only a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the linguistic features present, the required reasoning and background knowledge, and factual correctness on the one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other. We propose a qualitative annotation schema for the former and a set of approximate metrics for the latter. In a first application of the framework, we analyse modern MRC gold standards and present our findings: the absence of features that contribute towards lexical ambiguity, the varying factual correctness of the expected answers and the presence of lexical cues, all of which potentially lower the reading comprehension complexity and quality of the evaluation data.

An efficient representation of chronological events in medical texts
Andrey Kormilitzin | Nemanja Vaci | Qiang Liu | Hao Ni | Goran Nenadic | Alejo Nevado-Holgado
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis

In this work we addressed the problem of capturing sequential information contained in longitudinal electronic health records (EHRs). Clinical notes, which are a particular type of EHR data, are a rich source of information, and practitioners often develop clever solutions to maximise the sequential information contained in free text. We proposed a systematic methodology for learning from chronological events available in clinical notes. The proposed methodological path signature framework creates a non-parametric hierarchical representation of sequential events of any type and can be used as features for downstream statistical learning tasks. The methodology was developed and externally validated using the largest secondary care mental health EHR data in the UK on the specific task of predicting survival risk of patients diagnosed with Alzheimer’s disease. The signature-based model was compared to a common survival random forest model. Our results showed a 15.4% increase in risk prediction AUC at the time point of 20 months after the first admission to a specialist memory clinic, and the signature method outperformed the baseline mixed-effects model by 13.2%.


MedNorm: A Corpus and Embeddings for Cross-terminology Medical Concept Normalisation
Maksim Belousov | William G. Dixon | Goran Nenadic
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

The medical concept normalisation task aims to map textual descriptions to standard terminologies such as SNOMED-CT or MedDRA. Existing publicly available datasets annotated using different terminologies cannot simply be merged and utilised, and therefore become less valuable when developing machine learning-based concept normalisation systems. To address that, we designed a data harmonisation pipeline and engineered a corpus of 27,979 textual descriptions simultaneously mapped to both MedDRA and SNOMED-CT, sourced from five publicly available datasets across biomedical and social media domains. The pipeline can be used in the future to integrate new datasets into the corpus and could also be applied in relevant data curation tasks. We also described a method to merge different terminologies into a single concept graph preserving their relations, and demonstrated that a representation learning approach based on random walks on a graph can efficiently encode both hierarchical and equivalent relations and capture semantic similarities not only between concepts inside a given terminology but also between concepts from different terminologies. We believe that making a corpus and embeddings for cross-terminology medical concept normalisation available to the research community would contribute to a better understanding of the task.


Inferring Methodological Meta-knowledge from Large Biomedical Corpora
Goran Nenadic
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Keynote Speeches and Invited Talks


Mining temporal footprints from Wikipedia
Michele Filannino | Goran Nenadic
Proceedings of the First AHA!-Workshop on Information Discovery in Text


ManTIME: Temporal expression identification and normalization in the TempEval-3 challenge
Michele Filannino | Gavin Brown | Goran Nenadic
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)


An Exploration of Mining Gene Expression Mentions and Their Anatomical Locations from Biomedical Text
Martin Gerner | Goran Nenadic | Casey M. Bergman
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing

Using SVMs with the Command Relation features to identify negated events in biomedical literature
Farzaneh Sarafraz | Goran Nenadic
Proceedings of the Workshop on Negation and Speculation in Natural Language Processing


Biomedical Event Detection using Rules, Conditional Random Fields and Parse Tree Distances
Farzaneh Sarafraz | James Eales | Reza Mohammadi | Jonathan Dickerson | David Robertson | Goran Nenadic
Proceedings of the BioNLP 2009 Workshop Companion Volume for Shared Task


Towards a terminological resource for biomedical text mining
Goran Nenadic | Naoki Okazaki | Sophia Ananiadou
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

One of the main challenges in biomedical text mining is the identification of terminology, which is a key factor for accessing and integrating the information stored in literature. Manual creation of biomedical terminologies cannot keep pace with the data that becomes available. Still, many of them have been used in attempts to recognise terms in literature, but their suitability for text mining has been questioned as substantial re-engineering is needed to tailor the resources for automatic processing. Several approaches have been suggested to automatically integrate and map between resources, but the problems of extensive variability of lexical representations and ambiguity have been revealed. In this paper we present a methodology to automatically maintain a biomedical terminological database, which contains automatically extracted terms, their mutual relationships, features and possible annotations that can be useful in text processing. In addition to TermDB, a database used for terminology management and storage, we present the following modules that are used to populate the database: TerMine (recognition, extraction and normalisation of terms from literature), AcroTerMine (extraction and clustering of acronyms and their long forms), AnnoTerm (annotation and classification of terms), and ClusTerm (extraction of term associations and clustering of terms).

Annotation and Disambiguation of Semantic Types in Biomedical Text: A Cascaded Approach to Named Entity Recognition
Dietrich Rebholz-Schuhmann | Harald Kirsch | Sylvain Gaudan | Miguel Arregui | Goran Nenadic
Proceedings of the 5th Workshop on NLP and XML (NLPXML-2006): Multi-Dimensional Markup in Natural Language Processing


Enhancing automatic term recognition through recognition of variation
Goran Nenadic | Sophia Ananiadou | John McNaught
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

Exploring Balkanet Shared Ontology for Multilingual Conceptual Indexing
Sofia Stamou | Goran Nenadic | Dimitris Christodoulakis
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)


Using Domain-Specific Verbs for Term Classification
Irena Spasic | Goran Nenadic | Sophia Ananiadou
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine

Selecting Text Features for Gene Name Classification: from Documents to Terms
Goran Nenadic | Simon Rice | Irena Spasic | Sophia Ananiadou | Benjamin Stapley
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine

Morpho-syntactic Clues for Terminological Processing in Serbian
Goran Nenadić | Irena Spasić | Sophia Ananiadou
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages

An Integrated Term-Based Corpus Query System
Irena Spasic | Goran Nenadic | Kostas Manios | Sophia Ananiadou
10th Conference of the European Chapter of the Association for Computational Linguistics


Automatic Discovery of Term Similarities Using Pattern Mining
Goran Nenadić | Irena Spasić | Sophia Ananiadou
COLING-02: COMPUTERM 2002: Second International Workshop on Computational Terminology

Tuning Context Features with Genetic Algorithms
Irena Spasić | Goran Nenadić | Sophia Ananiadou
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

Automatic Acronym Acquisition and Term Variation Management within Domain-Specific Texts
Goran Nenadić | Irena Spasić | Sophia Ananiadou
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

A Methodology for Terminology-based Knowledge Acquisition and Integration
Hideki Mima | Sophia Ananiadou | Goran Nenadic | Jun-Ichi Tsujii
COLING 2002: The 19th International Conference on Computational Linguistics