Verbal memory is affected by numerous clinical conditions and most neuropsychological and clinical examinations evaluate it. However, a bottleneck exists in such endeavors because traditional methods require expert human review, and usually only a couple of test versions exist, thus limiting the frequency of administration and clinical applications. The present study overcomes this bottleneck by automating the administration, transcription, analysis and scoring of story recall. A large group of healthy participants (n = 120) and patients with mental illness (n = 105) interacted with a mobile application that administered a wide range of assessments, including verbal memory. The resulting speech generated by participants when retelling stories from the memory task was transcribed using automatic speech recognition tools, which was compared with human transcriptions (overall word error rate = 21%). An assortment of surface-level and semantic language-based features were extracted from the verbal recalls. A final set of three features were used to both predict expert human ratings with a ridge regression model (r = 0.88) and to differentiate patients from healthy individuals with an ensemble of logistic regression classifiers (accuracy = 76%). This is the first ‘outside of the laboratory’ study to showcase the viability of the complete pipeline of automated assessment of verbal memory in naturalistic settings.