Patient-reported outcome measures for monitoring primary care patients with depression: the PROMDEP cluster RCT and economic evaluation

Kendrick T, Dowrick C, Lewis G, Moore M, Leydon GM, Geraghty AWA, Griffiths G, Zhu S, Yao GL, May C, Gabbay M, Dewar-Haggart R, Williams S, Bui L, Thompson N, Bridewell L, Trapasso E, Patel T, McCarthy M, Khan N, Page H, Corcoran E, Hahn JS, Bird M, Logan MX, Ching BCF, Tiwari R, Hunt A, Stuart B
Record ID 32018011304
Authors' objectives: Guidelines on the management of depression recommend that practitioners use patient-reported outcome measures for the follow-up monitoring of symptoms, but there is a lack of evidence of benefit in terms of patient outcomes. The objective was to test the use of the Patient Health Questionnaire-9 as a patient-reported outcome measure for monitoring depression, together with training practitioners in interpreting scores and giving patients feedback. Depression is common and costly. It can lead to chronic disability, poor quality of life, suicide, and high service use and costs. National Institute for Health and Care Excellence guidelines recommend different treatments for more severe and less severe depression, but general practitioners, who treat the majority of people with depression in primary care, are often inaccurate in their global clinical assessments of depression severity, and treatment is not targeted to patients most likely to benefit. The National Institute for Health and Care Excellence recommends that practitioners consider using validated patient-reported outcome measures to inform treatment at diagnosis and follow-up of people with depression, but there is insufficient evidence that these measures improve depression management and outcomes for patients in primary care.
Authors' results and conclusions: Three hundred and two patients were recruited in intervention arm practices and 227 patients were recruited in control practices. Primary outcome data were collected for 252 (83.4%) and 195 (85.9%), respectively. No significant difference in Beck Depression Inventory, 2nd edition, score was found at 12 weeks (adjusted mean difference –0.46, 95% confidence interval –2.16 to 1.26). Nor were significant differences found in Beck Depression Inventory, 2nd edition, score at 26 weeks, social functioning, patient satisfaction or adverse events. EuroQol-5 Dimensions, five-level version, quality-of-life scores favoured the intervention arm at 26 weeks (adjusted mean difference 0.053, 95% confidence interval 0.013 to 0.093). However, quality-adjusted life-years over 26 weeks were not significantly greater (difference 0.0013, 95% confidence interval –0.0157 to 0.0182). Costs were lower in the intervention arm but, again, not significantly (–£163, 95% confidence interval –£349 to £28). Cost-effectiveness and cost–utility analyses, therefore, suggested that the intervention was dominant over usual care, but with considerable uncertainty around the point estimates. Patients valued using the Patient Health Questionnaire-9 to compare scores at baseline and follow-up, whereas practitioner views were more mixed, with some considering it too time-consuming. We found no evidence of improved depression management or outcome at 12 weeks from using the Patient Health Questionnaire-9, but patients’ quality of life was better at 26 weeks, perhaps because feedback of Patient Health Questionnaire-9 scores increased their awareness of improvement in their depression and reduced their anxiety. Further research in primary care should evaluate patient-reported outcome measures including anxiety symptoms, administered remotely, with algorithms delivering clear recommendations for changes in treatment.
Practices and patients: As the number of patients recruited per practice was smaller than anticipated, we recruited significantly more than our target of 113 practices, eventually reaching a total of 189, but 48 practices subsequently withdrew (24 in each arm), so the final number of active practices was 141: 72 intervention and 69 control (28 above our original target). Practice characteristics were well balanced by arm. Of 11,468 patients approached in consultations or through mailed invitations, 1058 (9.2%) returned reply slips about the study: 574 (10.6% of those approached) in the intervention arm and 484 (8.0% of those approached) in the control arm. After the exclusion of patients declining to participate, ineligible at screening or uncontactable, 529 patients were assessed at baseline: 302 (5.5% of those approached) in the intervention arm and 227 (3.8% of those approached) in the control arm. The ratio of intervention to control arm patients recruited was, therefore, 1.3 to 1, which may have reflected lower motivation to take part among control arm practices. Of 529 patients recruited, 453 (85.6%) were followed up at 12 weeks: 254 intervention arm (84.1%) and 199 control arm (87.7%) patients. At the 26-week point, 414 patients (78.3%) were followed up: 230 intervention arm (76.2%) and 184 control arm (81.1%). Medical records data were collected for 259 intervention arm patients (85.8%) and 201 control arm patients (88.5%). The mean BDI-II score for depressive symptoms at baseline was higher in the intervention arm, at 24.1 (standard deviation 8.89) than in the control arm, at 22.4 (standard deviation 9.52). Baseline anxiety and quality-of-life scores were also worse in the intervention arm. Control arm patients were more likely to have had two or more previous depressive episodes. Demographic characteristics were relatively well balanced.
We found no benefit from using the Patient Health Questionnaire-9 in relation to the primary outcome of depression on the Beck Depression Inventory, 2nd edition, at the 12-week follow-up. There were also no significant differences found between the arms in the secondary outcomes of Beck Depression Inventory, 2nd edition, scores at 26 weeks, work and social functioning, patient satisfaction, medication use, or contacts with mental health services, although all the differences found in these measures were in the direction of favouring the intervention arm. However, we did find a significant benefit in terms of improved quality of life at 26 weeks, for similar overall service costs. We also found evidence of benefit in a categorical analysis comparing rates of remission of depression at 26 weeks, although this result should be treated with caution as it was from a post hoc analysis. Cost-effectiveness acceptability curves showed that the probability of the intervention being cost-effective, at the lower and higher thresholds adopted by the National Institute for Health and Care Excellence of £20,000 and £30,000 per quality-adjusted life-year, was 77% and 72%, respectively. We found that patients valued using the Patient Health Questionnaire-9 to identify changes in their scores. The mechanism by which feedback of scores might improve patients’ quality of life, despite not changing the management of their depression, might be through increasing their awareness of improvement in their symptoms over time, supporting personal reflection on their progress to recovery.
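As a rough illustration of why the point estimates imply dominance, the incremental net monetary benefit (threshold × QALY difference − cost difference) can be computed from the reported point estimates (cost difference –£163, quality-adjusted life-year difference 0.0013). The sketch below is in Python and is not the authors' analysis: the function name is hypothetical, and the calculation ignores the uncertainty that the cost-effectiveness acceptability curves are designed to capture.

```python
def incremental_nmb(threshold, delta_qaly, delta_cost):
    """Incremental net monetary benefit of intervention versus control.

    A positive value means the intervention is cost-effective at the
    given willingness-to-pay threshold (GBP per QALY).
    """
    return threshold * delta_qaly - delta_cost


# Point estimates reported in the record (intervention minus control).
DELTA_COST = -163.0   # GBP over 26 weeks
DELTA_QALY = 0.0013   # QALYs over 26 weeks

for threshold in (20_000, 30_000):
    nmb = incremental_nmb(threshold, DELTA_QALY, DELTA_COST)
    print(f"GBP {threshold:,}/QALY: incremental NMB is about GBP {nmb:.0f}")
```

Because the cost difference is negative and the QALY difference is positive, the net monetary benefit is positive at any non-negative threshold, which is what "dominant" means for a point estimate. The wide confidence intervals on both quantities explain why the probability of cost-effectiveness is well below 100%.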
Authors' methods: The study design was a parallel-group, cluster-randomised superiority trial with 1 : 1 allocation to intervention and control arms. The setting was UK primary care (141 group general practices in England and Wales). Inclusion criteria were age ≥ 18 years with a new episode of depressive disorder or symptoms. Patients were recruited mainly through regular medical records searches but also opportunistically at consultations for new episodes of depression. Exclusion criteria were current treatment for depression; dementia; psychosis; substance misuse; or a significant risk of suicide. The intervention was administration of the Patient Health Questionnaire-9 as a patient-reported outcome measure soon after diagnosis and at follow-up 10–35 days later. Patients were given written feedback on their Patient Health Questionnaire-9 scores and potential treatments to discuss with their general practitioners. Practitioners were trained in interpreting Patient Health Questionnaire-9 scores and taking them into account in treatment decisions. The primary outcome was depressive symptoms on the Beck Depression Inventory, 2nd edition, at 12 weeks. Secondary outcomes were Beck Depression Inventory, 2nd edition, scores at 26 weeks; social functioning (on the Work and Social Adjustment Scale) and quality of life (on the EuroQol-5 Dimensions, five-level) at 12 and 26 weeks; service use including antidepressant treatment and primary and secondary care contacts over 26 weeks to calculate NHS costs; and patient satisfaction at 26 weeks (on the Medical Informant Satisfaction Scale).
For our sample size calculation, we assumed a baseline mean Beck Depression Inventory, 2nd edition, score of 24.0 with a standard deviation of 10.0 (derived from a feasibility study), and mean scores at 12-week follow-up of 14.0 in the intervention arm and 17.0 in the control arm. The anticipated difference of 3.0 points (effect size of 0.3) represented the minimum clinically important difference on the Beck Depression Inventory, 2nd edition. At the 5% level of significance, to have 90% power to detect that difference we calculated we needed 235 patients analysed per arm. We aimed to recruit a mean of six patients per practice and assumed an intracluster correlation coefficient of 0.03 (from the feasibility study), which gave a cluster design effect of 1.15, meaning we needed 270 per arm. We assumed a 20% loss to follow-up at 12 weeks, so the total sample size needed was 270 × 2/0.8 and our original target sample size was a total of 676 patients recruited, from 113 practices, by three recruitment centres (the University of Southampton, the University of Liverpool and University College London). We subsequently revised the target sample size on finding a significant correlation coefficient of > 0.5 between baseline and follow-up values for the primary outcome, which meant that we needed only 222 patients analysed per arm and, therefore, a target sample size of 554 patients recruited (revised 10 June 2021). Cluster randomisation of practices to intervention and control arms was carried out remotely by a Clinical Trials Unit statistician using computerised sequence generation, with minimisation by recruiting centre, size of practice and urban or rural location. Blinding of participating practitioners and patients to allocation was impossible given the nature of the intervention and the cluster-randomised design, but self-report outcome measures were used to prevent researcher rating bias, and statistical analysis was blind to allocation. 
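The arithmetic of the original sample size calculation can be retraced with a short Python sketch. It rests on assumptions that are not the authors' method: the per-arm formula is the standard normal approximation for comparing two means (the record does not say which formula or software was used), and the function names are hypothetical. The normal approximation gives a figure close to, but not identical with, the 235 per arm reported.

```python
import math
from statistics import NormalDist


def n_per_arm_normal(effect_size, alpha=0.05, power=0.90):
    """Per-arm sample size for a two-sided comparison of two means,
    using the normal approximation (an assumption, not the report's method)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def design_effect(mean_cluster_size, icc):
    """Inflation factor for cluster randomisation: 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc


# Inputs stated in the record: effect size 0.3, six patients per practice,
# intracluster correlation coefficient 0.03, 20% loss to follow-up.
approx_n = n_per_arm_normal(0.3)       # close to the reported 235
deff = design_effect(6, 0.03)          # reported as 1.15
inflated = 235 * deff                  # reported as 270 per arm
total_recruited = 270 * 2 / 0.8        # allows for 20% loss to follow-up

print(f"normal-approximation n per arm: {approx_n}")
print(f"design effect: {deff:.2f}")
print(f"per arm after clustering: {inflated:.2f}")
print(f"total to recruit: {total_recruited:.0f}")
```

The final step gives roughly 675, one fewer than the stated target of 676; the record does not explain the difference, which may simply reflect rounding.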
Differences between intervention and control arms in the outcomes of depressive symptoms, social functioning and quality of life measured at 12- and 26-week follow-up were analysed using linear mixed models, adjusting for baseline depression; duration of depression; history of depression; baseline anxiety; sociodemographic factors (gender, age, socioeconomic position, housing, education, marital status and dependants), and clustering including a random effect for practice. Patient satisfaction, quality of life (quality-adjusted life-years) and costs were compared between the arms over the 26 weeks’ study follow-up period. Differences between the arms in the process of care for depression were also analysed, including patients’ self-reported use of antidepressants at the 12- and 26-week follow-up points, and medication and contacts with mental health services (community mental health nurses, counsellors, psychologists, psychiatrists, other therapists and social workers) recorded in practice medical records over the 26 weeks’ follow-up. A health economic evaluation was undertaken from an NHS and Personal Social Services perspective. The outcomes were expressed as incremental cost per point improvement in the Beck Depression Inventory, 2nd edition, clinical outcome (cost-effectiveness analysis), and incremental cost per quality-adjusted life-year gained (cost–utility analysis). The primary analysis at 26 weeks used a generalised linear mixed model to estimate the differences in costs and quality-adjusted life-years (using the EuroQol-5 Dimensions, five-level to calculate patient utilities), adjusted for baseline quality of life; baseline anxiety; sociodemographic factors; and practice as a random effect. Incremental cost-effectiveness ratios and a cost-effectiveness acceptability curve were generated using non-parametric bootstrapping. 
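The idea behind a bootstrapped cost-effectiveness acceptability curve can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the trial's analysis: it resamples patients within each arm and ignores both the covariate adjustment and the practice-level clustering that the generalised linear mixed model handled, the function name `ceac_probability` is hypothetical, and the patient-level data are synthetic values generated only to make the example run.

```python
import random


def ceac_probability(costs_i, qalys_i, costs_c, qalys_c,
                     threshold, n_boot=2000, seed=0):
    """Proportion of bootstrap replicates in which the intervention has
    positive incremental net monetary benefit at the given threshold."""
    rng = random.Random(seed)

    def mean(values):
        return sum(values) / len(values)

    wins = 0
    for _ in range(n_boot):
        # Resample patients with replacement within each arm, keeping each
        # patient's cost and QALY together to preserve their correlation.
        idx_i = [rng.randrange(len(costs_i)) for _ in costs_i]
        idx_c = [rng.randrange(len(costs_c)) for _ in costs_c]
        d_cost = (mean([costs_i[k] for k in idx_i])
                  - mean([costs_c[k] for k in idx_c]))
        d_qaly = (mean([qalys_i[k] for k in idx_i])
                  - mean([qalys_c[k] for k in idx_c]))
        if threshold * d_qaly - d_cost > 0:
            wins += 1
    return wins / n_boot


# Synthetic data for illustration only (NOT trial data).
data_rng = random.Random(42)
costs_int = [max(0.0, data_rng.gauss(900, 600)) for _ in range(200)]
qalys_int = [data_rng.gauss(0.36, 0.08) for _ in range(200)]
costs_ctl = [max(0.0, data_rng.gauss(1000, 600)) for _ in range(200)]
qalys_ctl = [data_rng.gauss(0.36, 0.08) for _ in range(200)]

for threshold in (20_000, 30_000):
    p = ceac_probability(costs_int, qalys_int, costs_ctl, qalys_ctl, threshold)
    print(f"GBP {threshold:,}/QALY: P(cost-effective) = {p:.2f}")
```

Plotting the returned probability against a range of thresholds gives the acceptability curve. A cluster-aware version would resample whole practices rather than individual patients.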
Qualitative interviews with participating practitioners and patients in both arms were conducted to reflect on their involvement in the trial and analysed using reflexive thematic analysis. Intervention arm participants were asked about barriers, facilitators, benefits and problems related to using the Patient Health Questionnaire-9, including questions derived from the normalisation process theory framework. Baseline differences in depression, anxiety and quality-of-life scores may have reflected selection bias due to the cluster randomised design. We did not quite achieve the revised sample size target of 554 patients, falling short by 25, but the follow-up rate of 84.5% was better than the 80% predicted and so we gathered primary outcome data on 447 patients, exceeding the target of 444 and sufficient to answer the main research question with precision. It was not possible to blind participants and researchers to allocation to intervention or control arms given the pragmatic open and cluster randomised design, but we used self-report measures to avoid observer bias, and the analyses were carried out blind to allocation. We endeavoured to carry out the baseline assessments and administer the first Patient Health Questionnaire-9 as soon as possible after the patients first presented symptoms, but this was sometimes delayed by 2–3 weeks. In the meantime, treatment had already been started by the general practitioner/nurse practitioner of around half of the patients, which meant that the first Patient Health Questionnaire-9 score could not be taken into account when choosing initial treatments.
Project Status: Completed
Year Published: 2024
URL for additional information: English
English language abstract: An English language summary is available
Publication Type: Full HTA
Country: England, United Kingdom
MeSH Terms
  • Depressive Disorder
  • Depression
  • Patient Reported Outcome Measures
  • Quality of Life
  • Primary Health Care
  • Cost-Effectiveness Analysis
Organisation Name: NIHR Health Technology Assessment programme
Contact Address: NIHR Journals Library, National Institute for Health and Care Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK
Contact Name:
Contact Email:
This is a bibliographic record of a published health technology assessment from a member of INAHTA or other HTA producer. No evaluation of the quality of this assessment has been made for the HTA database.