Dementia (Diagnosis), Alzheimer Dementia (AD), Vascular Dementia (VaD), Lewy Body Dementia (LBD), Frontotemporal Dementia (FTD), Mild Cognitive Impairment (MCI), Depression - Major Depressive Disorder
Conditions
Keywords
artificial intelligence, speech-based artificial intelligence, artificial intelligence in dementia diagnostics, artificial intelligence for dementia screening, artificial intelligence for dementia classification, speech based artificial intelligence, dementia, Vascular dementia (VaD), Alzheimer dementia (AD), Lewy Body Dementia (LBD), Frontotemporal Dementia (FTD), Mild Cognitive Impairment (MCI), Depression - Major Depressive disorder, Dementia (diagnosis)
Brief summary
The goal of this observational study is to develop and test an artificial intelligence (AI) model that can detect signs of dementia and related conditions from speech recordings. The main question is whether a speech-based AI model can correctly tell apart people with normal memory and thinking from those with cognitive impairment. The study will also explore whether the AI can distinguish dementia from depression, separate different dementia subtypes, and identify which people with Mild Cognitive Impairment (MCI) are likely to develop dementia. Participants will complete short memory and speech tasks while being recorded. The AI model will analyze these recordings to learn patterns linked to different diagnoses. At the end of the study, its accuracy will be tested on new participants.
Detailed description
Background: Dementia is a growing public health challenge, and early and accurate diagnosis is essential for effective care and for potential future disease-modifying treatments. Current diagnostic pathways are resource-intensive and associated with long waiting times. Speech reflects cognitive functioning, and recent international studies have shown that AI can detect dementia-related patterns in speech recordings with promising accuracy. This study aims to develop and validate a speech-based AI model in a Danish setting, providing a non-invasive and scalable screening tool for use in primary care.

Phase one: This protocol describes the first phase of the study, which is expected to be completed in two separate phases. In phase one, we seek to train an AI model to analyse speech data from participants with cognitive impairment and compare it with speech data from healthy control participants, as detailed in this protocol. If the method is validated, we will continue to phase two.

Future work: In phase two, we expect to conduct an external validation. The AI model analysis will be performed on 200 participants in the primary care sector referred for dementia evaluation. The results of the AI analysis will be compared against the final gold standard consensus diagnosis. Phase two will have a separate protocol, which will be developed based on the results from phase one.

Elaboration of time perspective: Other (hybrid design). Most participants will be included in a cross-sectional case-control study (single speech recording). For participants with MCI, follow-up data will be collected within the study period to assess progression to dementia, allowing evaluation of the model's ability to distinguish progressive from non-progressive MCI.
Interventions
Participants will be recorded during the test in order to allow the AI to learn and analyze speech patterns.
Participants will be asked to describe the Cookie Theft Picture from the Boston Diagnostic Aphasia Examination. The task will take 2 minutes. Participants will be recorded during the speech task in order to allow the AI to learn and analyze the speech patterns.
Participants will be asked to retell the fairy-tale Cinderella, based on pictures that summarize the fairy-tale. In case Cinderella is not known, participants are asked to tell a story with a start, a middle and an end based on the provided pictures. The task will take 4 minutes. Participants will be recorded during the speech task in order to allow the AI to learn and analyze the speech patterns.
Two fluency tasks from the Addenbrooke's Cognitive Examination. First, the participant is asked to name as many animals as possible within 1 minute. Next, the participant is asked to name as many words starting with S as possible within 1 minute. Participants will be recorded during the tasks in order to allow the AI to learn and analyze speech patterns.
For healthy controls, an MRI will be conducted to provide comparable imaging and as part of screening to ensure they do not meet exclusion criteria (neuroradiological findings that could affect cognitive functions). For participants in the follow-up or new referral cohorts, imaging will be performed as part of the standard diagnostic battery and results will be obtained from the electronic patient record.
Healthy control participants will undergo a standard blood test panel commonly used in dementia diagnostics. The panel includes complete blood counts, inflammatory markers, kidney and liver function markers, thyroid-stimulating hormone (TSH), vitamin B12 and folate. These tests are performed to exclude underlying medical conditions that could mimic cognitive impairment. For participants in the follow-up or new referral cohorts, blood sampling will be performed as part of the standard diagnostic battery and results will be obtained from the electronic patient record.
Depression screening is performed on healthy controls to rule out depression, using either the Geriatric Depression Scale (GDS) for participants \> 65 years of age or the Major Depression Inventory (MDI) for participants \< 65 years of age. For participants in the follow-up or new referral cohorts, depression screening will be performed as part of the standard diagnostic battery and results will be obtained from the electronic patient record.
Healthy controls will undergo a standard somatic and neurological examination to exclude conditions that may affect cognition. This includes basic neurological assessment and clinical evaluation of general health status. For participants in the follow-up or new referral cohorts, a somatic and neurological examination will be performed as part of the standard diagnostic battery and results will be obtained from the electronic patient record.
Sponsors
Study design
Eligibility
Inclusion criteria
* Age \> 50 years
* Fluent in Danish
* Minimum 7 years of schooling

For participants from the follow-up cohort:

* A consensus diagnosis of either AD, VaD, LBD, FTD, MCI or depression established at the memory clinic within 6 months prior to enrollment

For participants from the healthy control cohort:

* No known cognitive impairment or affective disorder
Exclusion criteria
* Significantly impaired vision or hearing (to the extent that the participant cannot participate in the linguistic AI analysis)
* Participants unable to give consent

For participants from the follow-up and new referrals cohorts:

* MMSE score \< 16
* Participants with multiple diagnoses (e.g. mixed dementia or AD with concurrent depression)

For participants from the new referrals cohort:

* Participants falling outside of the six categories included in the study (AD, VaD, LBD, FTD, MCI, Depression)
* Participants where it is obvious at baseline that they will not fall within the above categories (can be excluded before clinical consensus diagnosis is given)

For participants from the healthy control cohort:

* MMSE \<26 and ACE \<90
* GDS score indicating depression (6 or higher)
* Clinical, laboratory or neuroradiological findings that could affect cognitive functions
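
The numeric criteria above can be sketched as a simple screening check. This is only an illustration, not the study's screening procedure: the `Candidate` fields and function names are hypothetical, only criteria with explicit thresholds are encoded, and the MMSE/ACE rule is implemented literally as written ("and"), which is an assumption about how the criterion is meant to be read.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical record holding only the numeric screening fields."""
    age: int
    fluent_danish: bool
    years_schooling: int
    mmse: int  # Mini-Mental State Examination score
    ace: int   # Addenbrooke's Cognitive Examination score
    gds: int   # Geriatric Depression Scale score


def meets_common_inclusion(c: Candidate) -> bool:
    """Inclusion criteria shared by all cohorts: age > 50, Danish, >= 7 years of schooling."""
    return c.age > 50 and c.fluent_danish and c.years_schooling >= 7


def healthy_control_exclusions(c: Candidate) -> List[str]:
    """Return the numeric exclusion reasons that apply to a healthy control."""
    reasons = []
    # Assumption: both scores must be below threshold, as the criterion is worded.
    if c.mmse < 26 and c.ace < 90:
        reasons.append("MMSE < 26 and ACE < 90")
    if c.gds >= 6:
        reasons.append("GDS score of 6 or higher")
    return reasons


# Invented example values, not study data.
eligible = Candidate(age=67, fluent_danish=True, years_schooling=10, mmse=29, ace=95, gds=2)
too_young = Candidate(age=50, fluent_danish=True, years_schooling=10, mmse=29, ace=95, gds=2)
depressed = Candidate(age=70, fluent_danish=True, years_schooling=9, mmse=28, ace=93, gds=6)
```

Criteria that require clinical judgment (for example impaired vision or hearing, ability to consent, or neuroradiological findings) are deliberately left out of the sketch.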
Design outcomes
Primary
| Measure | Time frame | Description |
|---|---|---|
| Accuracy of AI model in classifying cognitive impairment vs. unimpaired cognition | At baseline (speech recording) | Measured by sensitivity, specificity, and AUC-ROC of AI predictions compared to clinical consensus diagnosis, using baseline speech recordings from participants. Model performance will be measured after database lock at study completion. |
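
As an illustration of the metrics named in the primary outcome, the following minimal sketch computes sensitivity, specificity, and the area under the ROC curve in plain Python. The function names, the 0.5 decision threshold, and the toy values are assumptions made for the example, not part of the protocol or its results; the consensus diagnosis is coded as 1 for cognitive impairment and 0 for unimpaired cognition.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = impaired."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random impaired participant receives a
    higher score than a random unimpaired one (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy values for illustration only.
consensus = [1, 1, 1, 0, 0, 0]                 # clinical consensus diagnosis
ai_scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]     # model output probabilities
ai_labels = [1 if s >= 0.5 else 0 for s in ai_scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(consensus, ai_labels)
auc = auc_roc(consensus, ai_scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC summarizes performance across all thresholds, which is why the three are reported together.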
Secondary
| Measure | Time frame | Description |
|---|---|---|
| Sub-classification of Mild Cognitive Impairment (MCI) into progressive vs. non-progressive | At baseline (speech recording) and up to 12 months after enrollment (to determine progression) | Measured by sensitivity, specificity, and AUC-ROC of AI predictions compared to clinical consensus diagnosis, using baseline speech recordings from participants. Model performance will be measured after database lock at study completion. Progression is defined as a new dementia diagnosis during the study period. |
| Classification of dementia subtypes (AD, VaD, LBD, FTD) | At baseline (speech recording) | Measured by sensitivity, specificity, and AUC-ROC of AI predictions compared to clinical consensus diagnosis, using baseline speech recordings from participants. Model performance will be measured after database lock at study completion. |
| Comparison with established biomarkers | At baseline, or at time of biomarker testing if performed after baseline | Differences in diagnostic accuracy between AI predictions and state-of-the-art biomarkers for dementia diagnosis |
| Feature importance analysis | At baseline (speech recording) | Feature importance will be evaluated using interpretability analyses (e.g. permutation importance, SHAP values, and/or ablation of feature groups) to quantify the contribution of acoustic and linguistic features to the model's predictions. |
| Accuracy for dementia vs. depression | At baseline (speech recording) | Measured by sensitivity, specificity, and AUC-ROC of AI predictions compared to clinical consensus diagnosis, using baseline speech recordings from participants. Model performance will be measured after database lock at study completion. |
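
Permutation importance, one of the interpretability analyses listed for the feature importance outcome, can be sketched as follows. The idea is to shuffle one feature at a time and measure how much model accuracy drops. Everything here is a hypothetical stand-in: the `toy_model`, the two features, and the data are invented for illustration and say nothing about the study's actual model or features.

```python
import random
from typing import Callable, List, Sequence


def accuracy(model: Callable[[Sequence[float]], int], X: List[List[float]], y: List[int]) -> float:
    """Fraction of rows for which the model predicts the correct label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def permutation_importance(model, X, y, n_repeats: int = 20, seed: int = 0) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row: Sequence[float]) -> int:
    """Hypothetical classifier that uses only feature 0 (e.g. a pause-rate measure)."""
    return 1 if row[0] > 0.5 else 0


# Invented toy data: feature 0 is informative, feature 1 is ignored by the model.
X = [[0.9, 0.3], [0.8, 0.7], [0.7, 0.1], [0.2, 0.9], [0.1, 0.4], [0.3, 0.6]]
y = [1, 1, 1, 0, 0, 0]
importances = permutation_importance(toy_model, X, y)
```

A feature the model never uses gets an importance of exactly zero, while shuffling an informative feature can only lower accuracy here because the toy model is perfect on the unshuffled data.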
Other
| Measure | Time frame | Description |
|---|---|---|
| Contribution of individual speech tasks to AI model performance | At baseline (speech recording) | Contribution of individual speech tasks will be evaluated by comparing model performance (e.g. accuracy, sensitivity, specificity, AUC-ROC) when trained and tested on subsets of speech tasks (memory tests, story recall, picture description). This will identify which tasks provide the strongest diagnostic signal. |
| Number of tasks required for optimal accuracy | At baseline (speech recording) | Evaluation of whether a reduced set of speech tasks provides accuracy comparable to the full test battery. |
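
One way to frame the task-reduction question is to search for the smallest subset of speech tasks whose performance stays within a chosen tolerance of the full battery. The sketch below assumes a hypothetical `score_for` function that returns a performance value (such as AUC-ROC) for a model trained and tested on a given subset; the task names follow the grouping in the table above, and the scores are made-up placeholder numbers, not study results.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Sequence, Tuple

TASKS = ("memory_tests", "story_recall", "picture_description")


def smallest_sufficient_subset(
    score_for: Callable[[FrozenSet[str]], float],
    tasks: Sequence[str],
    tolerance: float,
) -> Tuple[FrozenSet[str], float]:
    """Return the smallest task subset scoring within `tolerance` of the full battery.

    Among subsets of the same size, the best-scoring one is returned.
    """
    full_score = score_for(frozenset(tasks))
    for size in range(1, len(tasks) + 1):
        candidates = [frozenset(c) for c in combinations(tasks, size)]
        acceptable = [c for c in candidates if full_score - score_for(c) <= tolerance]
        if acceptable:
            best = max(acceptable, key=score_for)
            return best, score_for(best)
    return frozenset(tasks), full_score


# Placeholder scores for illustration only (NOT results from the study).
PLACEHOLDER_SCORES = {
    frozenset({"memory_tests"}): 0.68,
    frozenset({"story_recall"}): 0.74,
    frozenset({"picture_description"}): 0.70,
    frozenset({"memory_tests", "story_recall"}): 0.78,
    frozenset({"memory_tests", "picture_description"}): 0.76,
    frozenset({"story_recall", "picture_description"}): 0.81,
    frozenset(TASKS): 0.82,
}

subset, score = smallest_sufficient_subset(PLACEHOLDER_SCORES.__getitem__, TASKS, tolerance=0.02)
```

In practice `score_for` would wrap model training and evaluation on held-out data, and the tolerance would need to be specified in advance to avoid choosing it after seeing the results.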
Countries
Denmark