Tele-ophthalmology-enabled and artificial intelligence-ready referral pathway for community optometry referrals of retinal disease: HERMES cluster randomised trial with a diagnostic accuracy study
Sharma A, Hussain R, Learoyd AE, Aristidou A, Soomro T, Blandford A, Lawrenson JG, Grimaldi G, Douiri A, Kernohan A, Robinson T, Moradi N, Dinah C, Minos E, Sim D, Aslam T, Manna A, Denniston AK, Patel PJ, Keane PA, Bunce C, Vale L, Balaskas K
Record ID 32018014805
English
Authors' objectives:
Community-based optometrists, a major provider of primary eye care in the United Kingdom, are the main source of referrals to hospital eye services. The widespread introduction of optical coherence tomography devices in community practices provides community-based optometrists with an opportunity to identify a broader range of treatable diseases. Standard referral pathways do not effectively filter unnecessary referrals, with misclassification of urgency, and erroneous diagnoses. To assess the effectiveness of a teleophthalmology referral pathway between community-based optometrists and hospital eye services for retinal diseases. To measure the accuracy of an artificial intelligence decision support system for diagnosis and referral management of retinal disease.
Ophthalmic services are challenged by an ageing demographic and the associated rise in common retinal diseases. Despite an increasing demand, there is an anticipated global shortfall of ophthalmologists. Community-based optometrists (COs) are essential in managing and referring people to hospital eye services (HES). Optical coherence tomography (OCT) devices have been increasingly installed in community practices. Although OCT expanded optometrists’ scope of practice, its widespread use caused an increase in overall and unnecessary referrals. Efforts to streamline incoming referrals are vital to help alleviate pressures on HES and to improve patients’ access to timely diagnosis and treatment. This could be achieved with teleophthalmology, which is the use of medical information exchanged from one site to another via electronic communications to improve a patient’s eye health. A systematic review revealed a gap in evidence from well-designed randomised controlled trials (RCTs) for validation of teleophthalmology referrals.
The HERMES study assesses the clinical- and cost-effectiveness of teleophthalmology referrals compared to standard care, in reducing unnecessary hospital visits and misclassified ‘urgent’ referrals through a cluster RCT (cRCT). Additionally, there have been significant developments in applying machine learning, specifically deep learning, in medical imaging for disease diagnosis. The HERMES study included a prospective observational study, reviewing artificial intelligence decision support systems (AI DSS), specifically the Moorfields-Google DeepMind model (‘Octane’) (Google DeepMind, DeepMind Technologies Limited, London), compared to human assessors in terms of its accuracy for retinal diagnosis and referral decisions. Its use as an AI DSS was tested for its generalisability in a diverse clinical care environment. To assess the effectiveness and cost-effectiveness of a digital referral pathway between community optometry and HES for referral of retinal disease enabled by a teleophthalmology platform. To measure the diagnostic and referral accuracy of the Octane AI DSS in the context of referral pathways between community optometry and HES.
Authors' results and conclusions:
Teleophthalmology significantly reduces the proportion of false-positive urgent referrals by 59% compared to standard care in referred participants. Due to the observed low event rate for false positive referrals, teleophthalmology’s role for reducing false positives overall was inconclusive. No significant difference between arms for safety of referral decisions (false negatives) was found. After accounting for external factors, the time to consultation demonstrated both clinically and statistically significant benefits for the teleophthalmology arm. The time to treatment showed a clinically significant benefit. Of 396 recruited participants, the Octane artificial intelligence model processed images contributed by 204 participants (51.5%). For referral decisions, the model showed comparable sensitivity and specificity against its own preset referral rules (rule-based reference standard) (post hoc analysis), but it showed inferior sensitivity and specificity when compared to human expert assessors making these referral decisions (clinical reference standard) (primary AI analysis). The artificial intelligence model presented challenges relating to its generalisability in a real-world evaluation context. Asynchronous teleophthalmology reduces the number of unnecessary urgent referrals, the main drivers of increasing hospital capacity pressures, provides more appropriate referral-to-treatment times and is more cost-effective compared to standard care. The Octane artificial intelligence model could not process images from 48.5% of study participants. Compared to hospital-based experts for referral decisions, Octane was less accurate at making routine and urgent referral decisions and of similar accuracy to community optometrists. One hundred and thirty-six participants were recruited to the control arm, and 158 participants were recruited to the intervention arm. 
Asynchronous teleophthalmology with clinician triage significantly reduced the proportion of false-positive urgent referrals by 59.5% as compared to standard care in referred participants. It lowered the proportion of false-positive referrals (for urgent/routine referrals) and of false-negative referrals. In terms of wrong referral urgency (reported as means with CIs), the standard pathway had 25.7% (14.1% to 37.3%) more incorrect referral decisions than the teleophthalmology pathway, a significant difference at the 5% level. Both arms showed high sensitivity for detecting the need for a referral; however, specificity was significantly lower in the control arm [41.2% (18.4% to 67.1%)] versus the intervention arm [93.9% (79.8% to 99.3%)]. For urgent referrals only, there was greater accuracy in the intervention arm [sensitivity: 96.3% (81.0% to 99.9%), specificity: 99.2% (95.8% to 100%)] versus the control arm [sensitivity: 73.7% (48.8% to 90.9%), specificity: 79.5% (71.0% to 86.4%)]. There was also a significant increase in the proportion of wrong diagnoses in the standard pathway versus the teleophthalmology pathway [right eye: 19.7% (6.1% to 33.3%); left eye: 18.8% (4.3% to 33.2%)]. Focusing on nAMD, the control arm had 54.2% (30.3% to 78.0%) more false-positive referrals than the intervention arm. Additionally, the intervention arm had greater accuracy when making an nAMD-related referral [sensitivity: 90.9% (70.8% to 98.9%), specificity: 100% (97.3% to 100%)] than the control arm [sensitivity: 57.9% (33.5% to 79.7%), specificity: 88.9% (81.7% to 93.9%)]. The standard pathway also led to more false-positive nAMD diagnoses [53.6% (33.8% to 72.5%)] versus the teleophthalmology pathway [0 (0%)]. Following a sensitivity analysis, the mean time to consultation was significantly lower in the intervention arm [53 (51 to 55) days] versus the control arm [89 (87 to 91) days] (p = 0.039).
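The sensitivity and specificity estimates above are reported with 95% confidence intervals. The record does not state how those intervals were computed or give the underlying counts, so the following is only a minimal sketch of one common approach, the exact (Clopper-Pearson) binomial interval, using the standard library alone. The counts 7 out of 17 are a hypothetical illustration chosen because they give 41.2%; the function names are likewise illustrative.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_increasing(f, iters: int = 60) -> float:
    """Bisection root of a function that is increasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided confidence interval for a proportion x / n."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Hypothetical counts (not reported in this record): 7 correct out of 17.
low, high = clopper_pearson(7, 17)
print(f"{7 / 17:.1%} ({low:.1%} to {high:.1%})")
```

With these assumed counts the interval should come out close to the one quoted for control-arm specificity, which illustrates how wide an exact interval is when the denominator is small.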
The mean time to treatment was lower in the intervention arm [55 (52 to 57) days] versus the control arm [90 (87 to 94) days] (p = 0.151). Assessment of safe triage of rare diseases showed each arm referred all suspected cases to HES. For the AI study, 204 out of 396 participants from 17 of the 29 participating CO practices were included, with 201 of the 204 participants (98.5%) coming from 14 sites. Two OCT device manufacturers found in CO practices were supported; of their images, 30% were not suitable for processing by Octane due to image size or format. When compared to human assessors (clinical RS), the sensitivity of the Octane AI model was 96.4% (92.4% to 98.7%) and the specificity was 20.0% (8.4% to 36.9%). The AI model was less accurate when deciding referral urgency for required referrals (routine vs. urgent), with wrong referral urgency by AI of 14.8% (9.0% to 22.3%) and by human assessors 2.8% (0.6% to 7.8%). Of note, human assessors considered information from OCT scans, clinical history and patient preferences when making referral decisions, whereas the AI model considered information solely from OCT scans. When recommending urgent referrals, the Octane model is less accurate than hospital-based experts in the teleophthalmology arm, with wrong referral urgency by AI of 25% (5.5% to 57.2%) and by hospital-based experts of 0%, yet the overall absolute numbers of wrong urgent referrals are small; the model is of a similar accuracy to community optometrists for urgent referrals, with wrong referral urgency by AI of 66.7% (34.9% to 90.1%) and by CO of 62.5% (35.4% to 84.8%). In a post-hoc analysis, the Octane model showed good referral accuracy when applying its own preset referral rules (rule-based RS) for recommending a referral versus no referral. If the AI model offers a referral to a patient, there is a 100% (98.1% to 100%) probability that a referral is truly needed (PPV).
If a referral is not offered by the AI, however, there is a 69.2% (38.6% to 90.9%) probability that a referral is truly not needed (NPV). The absolute number of no-referral decisions was small in both pathways, however, and PPV is increased and NPV decreased by the high prevalence of participants requiring a referral. In a post-hoc analysis, the Octane model shows good referral accuracy against the rule-based RS for recommending urgent referral, with modest reduction in referral efficiency (PPV); that is, if the Octane model offers an urgent referral to a patient, there is a 78.4% (61.8% to 90.2%) probability that an urgent referral is truly needed. For referral urgency, the Octane model chose the wrong referral urgency a comparable number of times to CO [AI: 22.7% (11.5% to 37.8%) vs. CO: 22.7% (11.5% to 37.8%)] and a greater number compared to hospital-based experts [AI: 9.4% (3.5% to 19.3%) vs. HES: 1.6% (0.0% to 8.4%)]. In terms of diagnostic accuracy for the diagnosis of retinal disease, the Octane model exhibits moderate sensitivity [68.1% (62.5% to 73.4%)] when detecting the same diagnosis as the clinical RS, which improves to 82.6% (72.9% to 89.9%) when compared to the rule-based RS. The post-implementation substudy recruited 17 patients overall. Barriers to implementation of the Manchester Electronic eye Referral System included training gaps, inadequate communication channels among primary care, secondary care and technology suppliers and an insufficient support network. The CEA of the economic decision model showed the HERMES pathway to have a greater effect at a lower cost than the standard pathway, meaning it was the dominant intervention. The DCE results showed the public had a greater preference for a more effective intervention that would be delivered more quickly. Importantly, they had a greater preference for obtaining a correct diagnosis compared to a reduction in the waiting time.
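The remark above that a high prevalence of participants requiring referral raises PPV and lowers NPV follows from Bayes' rule. As a minimal sketch, the functions below recompute both values for a fixed test at several prevalences. The sensitivity, specificity and prevalence figures are hypothetical and are not taken from the HERMES data.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a positive (refer) decision is truly positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a negative (no referral) decision is truly negative."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical test characteristics, held fixed while prevalence varies.
SENS, SPEC = 0.95, 0.90
for prev in (0.10, 0.50, 0.90):
    print(f"prevalence {prev:.0%}: "
          f"PPV {ppv(SENS, SPEC, prev):.1%}, NPV {npv(SENS, SPEC, prev):.1%}")
```

In this illustration the same test yields a much higher PPV and a much lower NPV at 90% prevalence than at 10%, which is why predictive values from a population in which most participants need referral should not be carried over to a lower-prevalence setting.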
A CBA, based on the DCE results, demonstrated a net benefit for the HERMES pathway compared with the standard pathway of £992 for every patient seen. Deterministic and probabilistic sensitivity analyses were carried out to assess the robustness of the conclusions, and in these analyses the conclusions were found not to change. These results provide strong evidence for the efficiency of the HERMES pathway compared with standard care. These data are significant as previous studies reviewing different referral interventions, including asynchronous teleophthalmology, provide limited information on this key area. The HCI-related in-depth interview study with participants, optometrists and ophthalmologists showed that they generally saw value in implementing teleophthalmology, through improved efficiency and the ability to provide and receive feedback. The normalisation process theory analysis highlighted the need to consider multiple factors when developing and implementing teleophthalmology platforms, especially if aiming to have them adopted and normalised at a large scale. Asynchronous teleophthalmology can reduce the number of unnecessary referrals, provide more appropriate referral timescales and is more cost-effective compared to standard care. Its role in reducing unnecessary referrals overall was inconclusive. When comparing the Octane AI system for referral decisions to hospital-based human expert assessors, it was less accurate for making routine and urgent referral decisions and was of similar accuracy to CO for making urgent referral decisions. The AI model presented challenges relating to its generalisability in a real-world evaluation context.
Authors' methods:
A multicentre, superiority cluster randomised controlled trial to assess the effectiveness of a teleophthalmology referral pathway. A prospective, observational diagnostic accuracy study to measure the performance of an artificial intelligence decision support system. A comprehensive economic evaluation was conducted. United Kingdom-based community optometry practices with an optical coherence tomography device and hospital eye services. Adults requiring referral for retinal disease in the opinion of the community-based optometrists. Community optometry practices were randomised 1 : 1 to standard care or teleophthalmology. Referrals sent via the teleophthalmology platform were remotely reviewed by human experts based at the corresponding hospital eye services. A referral decision was provided within 48 hours. Suitable optical coherence tomography scans were solely processed by the artificial intelligence decision support system (the ‘Octane’ model). The cluster randomised controlled trial’s primary outcome was the proportion of false-positive referrals (not required or not urgent) per arm in overall participants and in referred-only participants against an independent reference standard. Secondary outcomes included the proportion of wrong diagnosis, wrong referral urgency, false-negative referrals, safely triaged referrals for rare diseases, time from referral to consultation and treatment and cost-effectiveness of teleophthalmology. The primary outcome for the artificial intelligence study was the sensitivity and specificity of artificial intelligence referral decisions against the reference standard.
Limitations included technical limitations in optometry practices and a lack of ethnicity data.
A multicentre, superiority cRCT to assess the clinical- and cost-effectiveness of a teleophthalmology referral pathway for retinal disease. A prospective, observational diagnostic accuracy (validation) study to measure the performance of Octane AI DSS for diagnostic and referral support.
A human–computer interaction (HCI) assessment via a theoretically informed, qualitative study to explore participants’ and healthcare professionals’ perspectives on teleophthalmology and AI DSS. A small-scale exploratory post-implementation observational study of real-life teleophthalmology. Community optometry practices with an OCT device and HES based in the UK. Adults (≥ 18 years) attending an eye examination with a macular OCT scan at the participating optometry practice were recruited if there was a suspicion of retinal disease in the opinion of CO. Conditions included neovascular age-related macular degeneration (nAMD), dry age-related macular degeneration, diabetic retinopathy, macular oedema, central serous chorioretinopathy, vitreoretinal interface abnormalities, genetic eye disease and any other retinal condition not requiring an emergency referral. Participants were required to give consent. Individuals with known retinal comorbidities in either eye triggering a referral and cases where acquisition of a good-quality OCT scan was not possible were excluded. Twenty-six community optometry practices (clusters) in the catchment areas of four HES sites in the UK were randomised 1 : 1 to standard care or teleophthalmology for the cRCT. Practices randomised to the control arm continued to refer participants with suspicion of retinal disease to HES using their standard method. Practices randomised to the intervention arm referred participants with suspicion of retinal disease to HES using the teleophthalmology referral platform. Upon receiving informed written consent, COs uploaded their clinical findings, imaging (OCT scans), diagnosis and referral decision to the teleophthalmology referral platform. Human experts based at the corresponding HES reviewed every case remotely. A referral decision was provided to participants and COs within 48 hours. A hospital appointment was arranged if required.
The efficiency of teleophthalmology was assessed by comparing CO and teleophthalmology recommendations against a reference standard (RS). For the AI study, a subset of OCT scans was processed by the Octane AI DSS. The Octane AI only used the OCT scans as input without any other clinical information. Its diagnosis and referral recommendations were compared against an independent RS. Additionally, a model-based economic evaluation was conducted, comprising a model-based cost-effectiveness analysis (CEA), a discrete choice experiment (DCE) and a cost–benefit analysis (CBA) that used the results of both the CEA and the DCE. An HCI evaluation was conducted; this involved a combination of situated observations and semistructured interviews with healthcare professionals and participants to investigate their perspectives on teleophthalmology models of care and AI DSS.
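Comparing referral recommendations against a reference standard, as described above, amounts to tallying a two-by-two table for each decision of interest. The sketch below does this for the urgent-referral decision. The three-level labels ("none", "routine", "urgent"), the function name and the example cases are all hypothetical and serve only to show the bookkeeping; they are not the study's data or its analysis code.

```python
def urgent_accuracy(pairs):
    """Tally urgent-referral decisions against a reference standard.

    pairs: iterable of (decision, reference) labels, each one of
    "none", "routine" or "urgent" (assumed labels for this sketch).
    """
    tp = fp = fn = tn = 0
    for decision, reference in pairs:
        said_urgent = decision == "urgent"
        is_urgent = reference == "urgent"
        if said_urgent and is_urgent:
            tp += 1          # urgent referral that was truly urgent
        elif said_urgent:
            fp += 1          # false-positive urgent referral
        elif is_urgent:
            fn += 1          # urgent case that was missed
        else:
            tn += 1          # correctly not marked urgent
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "sensitivity": sensitivity, "specificity": specificity}


# Hypothetical (decision, reference standard) pairs.
cases = [
    ("urgent", "urgent"), ("urgent", "routine"), ("routine", "routine"),
    ("none", "none"), ("routine", "urgent"), ("urgent", "urgent"),
]
print(urgent_accuracy(cases))
```

The same tally can be repeated per arm, or with "any referral" in place of "urgent", to obtain the other accuracy measures named in the outcomes.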
Details
Project Status:
Completed
URL for project:
https://www.journalslibrary.nihr.ac.uk/programmes/hta/NIHR127773
Year Published:
2025
URL for published report:
https://www.journalslibrary.nihr.ac.uk/hta/QNDF3325
URL for additional information:
English
English language abstract:
An English language summary is available
Publication Type:
Full HTA
Country:
England, United Kingdom
DOI:
10.3310/QNDF3325
MeSH Terms
- Telemedicine
- Optometry
- Ophthalmology
- Retinal Diseases
- Referral and Consultation
- Artificial Intelligence
- Tomography, Optical Coherence
Contact
Organisation Name:
NIHR Health Technology Assessment programme
Contact Address:
NIHR Journals Library, National Institute for Health and Care Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK
Contact Name:
journals.library@nihr.ac.uk
Contact Email:
journals.library@nihr.ac.uk
This is a bibliographic record of a published health technology assessment from a member of INAHTA or other HTA producer. No evaluation of the quality of this assessment has been made for the HTA database.