Predicting psychiatric hospitalisation using routinely-collected measures


Whilst psychiatric hospitalisation can be lifesaving (Wang & Colucci, 2017), it is also associated with a wide range of adverse clinical outcomes (Walter et al., 2019) and high economic costs (Stensland et al., 2012). Consequently, it is important to try to prevent hospital admissions where possible through available interventions such as the involvement of crisis resolution and home treatment teams.

In light of scarce resources, clinical decision-making and resource allocation of these interventions can be informed by early warning scores. These are clinical prediction models that are used to monitor patients’ health during hospital stays and to identify patients at risk of further deterioration. When this score breaches a pre-determined threshold (indicating signs of impending deterioration), a warning is triggered indicating the need for preventive intervention, thereby assisting clinical decision-making and improving patient outcomes.

Early warning scores have been used widely in physical health, but they have had limited success in mental health. Taquet et al. (2025) set out to address this gap. They built on their earlier work, which showed that clinical instability and severity are strong predictors of hospitalisation across diagnoses (Taquet et al., 2023, which Florian Walter blogged about), and developed an early warning score for psychiatric hospitalisation using measures of both clinical and functional severity and instability.

Psychiatric hospitalisation is associated with significant personal, social, and economic burden, but early warning scores may help by predicting deterioration and guiding timely interventions.

Methods

The authors used longitudinal electronic health record (EHR) data on patients from 20 US-based mental health centres. Data included sociodemographic factors (age; gender), diagnosis, and clinician-rated measures for clinical severity (Clinical Global Impression of Severity, CGI-S) and functional ability (Global Assessment of Functioning, GAF).

Patients were included if they had a diagnosis of one of the following disorders:

  • Major depressive disorder (MDD)
  • Bipolar disorder (BD)
  • Generalised anxiety disorder (GAD)
  • Schizophrenia or schizoaffective disorder (SCZ)
  • Attention deficit hyperactivity disorder (ADHD)
  • Personality disorder (PD).

Included patients also had to have at least five measurements of CGI-S and GAF within any 180 consecutive days before any psychiatric hospitalisation.

Predictors for psychiatric hospitalisation within 180 days included: age, diagnosis, gender, clinical severity (average of CGI-S scores), clinical instability (visit-to-visit fluctuation in CGI-S scores), functional severity (average of GAF scores), and functional instability (visit-to-visit fluctuation in GAF scores).
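To make the distinction between severity and instability concrete, here is a minimal Python sketch. It assumes instability is measured as the root mean square of successive differences (RMSSD), which is one common way to quantify visit-to-visit fluctuation; the paper's exact definition may differ. The function names and the scores are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean


def severity(scores):
    """Severity: the plain average of a patient's repeated scores."""
    return mean(scores)


def instability(scores):
    """Instability as RMSSD: root mean square of successive differences.

    Assumption: RMSSD stands in for "visit-to-visit fluctuation" here;
    it is not confirmed to be the definition used in the paper.
    """
    if len(scores) < 2:
        raise ValueError("need at least two visits")
    diffs = [later - earlier for earlier, later in zip(scores, scores[1:])]
    return sqrt(mean(d * d for d in diffs))


# Two hypothetical patients with five CGI-S ratings each (invented values).
stable = [4, 4, 4, 4, 4]
fluctuating = [2, 6, 2, 6, 4]

print(severity(stable), instability(stable))
print(severity(fluctuating), instability(fluctuating))
```

Both hypothetical patients have the same average score, so a model that only sees severity cannot tell them apart, whereas the instability measure separates them clearly.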

Fifteen sites with the most recent data (30,493 patients) were used to develop the model, and five sites with more historical data (6,556 patients) were then used to validate the model. This temporal split allowed for a more suitable test of model transportability to other settings.

Several Cox proportional hazards models were developed:

  1. “Unadjusted” model: Including all predictors
  2. “Adjusted” model: As above, but also adjusting for the probability of psychiatric hospitalisation at each site (given differing tendencies to hospitalise patients across sites)
  3. “Baseline” model: Including only diagnosis, gender, and age
  4. “Clinical benchmark” model: Including only diagnosis, gender, age and clinical severity – to reflect data that typically informs clinician decision-making.
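For readers less familiar with this type of model, the general textbook form of a Cox proportional hazards model is shown below. This is the generic formulation, not the fitted model from the paper; the symbols are notation introduced here for illustration.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right)
```

Here h_0(t) is the baseline hazard of hospitalisation at time t, x_1 to x_p are the predictors, and exp(β_k) is the hazard ratio associated with a one-unit increase in predictor x_k. The four models listed above differ only in which predictors are included in x.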

For internal and external validation, model performance was primarily assessed by discrimination using the C-index. This measure quantifies the probability that the model assigns a higher predicted score to an individual who is hospitalised sooner, compared to an individual who is hospitalised later or never hospitalised in the study period. A C-index value of 0.5 indicates that the model’s discrimination is no better than chance, values between 0.70 and 0.80 are considered “good”, and those above 0.80 are considered “excellent”.
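The definition above can be sketched in a few lines of Python. This is a simplified version of Harrell's C-index written for illustration; it is not the authors' code, it ignores pairs with tied event times, and the function name and toy data are hypothetical.

```python
def c_index(times, events, risks):
    """Simplified Harrell's C-index for right-censored data.

    times  : follow-up time until hospitalisation or censoring
    events : 1 if hospitalised at that time, 0 if censored
    risks  : predicted risk scores (higher = expected to be hospitalised sooner)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # each pair is anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:  # i was hospitalised before j's event or censoring
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): two patients hospitalised, two censored at 180 days.
times = [30, 90, 180, 180]
events = [1, 1, 0, 0]
risks = [0.9, 0.4, 0.5, 0.1]

print(c_index(times, events, risks))  # 4 of 5 comparable pairs are ranked correctly
```

In this toy example, the only misranked pair is the patient hospitalised at day 90 (risk 0.4) and a never-hospitalised patient who was given a higher risk (0.5).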

Additionally, they applied the model to each diagnosis separately and computed discrimination performance to evaluate the transdiagnostic validity of the model. They also assessed the model separately in white and non-white people in order to assess the fairness of the model (i.e., whether model performance varies across these demographic groups).

Results

The study included 37,049 patients: 30,493 in the development dataset used to build the model, and 6,556 in the validation dataset used to test the model.

The unadjusted model that used all predictors achieved a C-index of 0.74 (95%CI: 0.72 to 0.76) when tested on the same data it was trained on (internal validation) and a C-index of 0.80 (95%CI: 0.78 to 0.82) when tested on new data from different clinics (external validation). This means the model could reliably distinguish between patients who were more or less likely to be hospitalised. When the model was adjusted to account for the probability of psychiatric hospitalisation at each site, its performance improved even further, reaching a C-index of 0.84 (95%CI: 0.82 to 0.86) in external validation. The similar (and marginally increased) C-index in external validation compared to internal validation suggests that the model is robust and could work well in real-world clinical settings.

The discrimination performance of the adjusted prediction model was significantly greater than both the “baseline” model (greater mean C-index by 0.18, 95%CI: 0.14 to 0.23, p<0.001) and the “clinical benchmark” model (greater mean C-index by 0.15, 95%CI: 0.11 to 0.20, p<0.001). This suggests that the prediction model is better at estimating the risk of psychiatric hospitalisation when including certain measures (i.e., clinical instability and functional instability) beyond those that clinicians are likely to use for decision-making (i.e., diagnosis, gender, age and clinical severity).

Discrimination performance remained good across all diagnostic categories, with the C-indexes ranging from 0.74 (schizophrenia or schizoaffective disorder) to 0.81 (major depressive disorder and generalised anxiety disorder) in the adjusted models, suggesting that the model works well across different mental health conditions.

Further, the prediction model performed fairly across white and non-white ethnicities, with no significant differences found in discrimination performance when assessing the model separately in these subgroups. This suggests that the model is equitable across demographic groups.

The prediction model showed good discrimination between (a) individuals who were hospitalised sooner and (b) individuals who were hospitalised later or not at all.

Conclusions

The authors concluded that they have developed a prediction model for the 6-month risk of psychiatric hospitalisation using readily available factors, showing good performance in both internal and external validation. They say that their prediction model can:

“facilitate evidence-based clinical decision-making [and] help target effective interventions to the patients most likely to benefit from them.”

Importantly, their prediction model also performed well across diagnoses and fairly across white and non-white people.

The prediction model accurately estimated 6-month hospitalisation risk using routine clinical data, supporting fair, diagnosis-wide use in guiding timely and targeted interventions.

Strengths and limitations

One of the key strengths of this study is that the developed clinical prediction model only requires the use of readily available factors (age, gender, diagnosis) and two single-item clinical measures which are often routinely collected in clinical care without the need for specific training. Therefore, the model could be implemented into clinical settings with no significant additional burden to clinicians.

Other strengths include: an appropriately large sample size for the development of the prediction model; the use of real-world data which makes the findings more generalisable compared to trial-derived data; the inclusion of a range of psychiatric diagnoses to show the model’s transdiagnostic generalisability unlike previous prediction modelling studies; and clear and thorough reporting of methodology.

Whilst the authors did assess fairness by evaluating the model separately in individuals of white and non-white ethnicities, this approach is limited as it only captures whether the model is discriminating similarly among individuals within the same subgroup (ethnicity), but not whether it discriminates fairly across these subgroups (i.e., comparing the risks assigned to a white individual and a non-white individual). Further, whilst not including ethnicity as a predictor has been called for due to potentially negative consequences such as the exacerbation of health disparities (Vyas et al., 2020), it remains important to assess the inclusion and removal of such sensitive predictors as they may also improve the discrimination and fairness of a prediction model (Khor et al., 2023).
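The difference between within-subgroup and across-subgroup discrimination can be illustrated with a small Python sketch. It is deliberately simplified: it treats hospitalisation as a yes/no outcome and ignores time-to-event and censoring, so it is not the C-index used in the paper. The groups, scores, and function name are all invented for illustration.

```python
def concordance(cases, controls):
    """Share of (case, control) pairs where the case has the higher score.

    Ties count as half. This is a binary-outcome simplification, not the
    time-to-event C-index reported in the study.
    """
    pairs = [(c, k) for c in cases for k in controls]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c, k in pairs)
    return wins / len(pairs)


# Hypothetical risk scores, invented for illustration only.
group_a = {"hospitalised": [0.9, 0.8], "not_hospitalised": [0.6, 0.5]}
group_b = {"hospitalised": [0.4, 0.3], "not_hospitalised": [0.2, 0.1]}

# Discrimination assessed separately within each group.
within_a = concordance(group_a["hospitalised"], group_a["not_hospitalised"])
within_b = concordance(group_b["hospitalised"], group_b["not_hospitalised"])

# Cross-group pairs: hospitalised patients in B versus non-hospitalised patients in A.
across = concordance(group_b["hospitalised"], group_a["not_hospitalised"])

print(within_a, within_b, across)
```

In this invented example the model ranks patients perfectly within each group, yet every hospitalised patient in group B receives a lower score than every non-hospitalised patient in group A. A subgroup-by-subgroup evaluation alone would not reveal this.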

The use of routinely collected variables increases the clinical utility of the model and overcomes a common barrier (extra burden on clinicians) to model implementation.

Implications for practice

The developed clinical prediction model shows promise for implementation into clinical settings with little burden to clinicians, given the use of brief and readily available variables as well as the transparency of the model. However, this would first require integration into clinical workflows as a practical tool (for example, through an electronic health record system or an app) in accordance with implementation governance and local regulations. Clinicians would then be able to input new measurements of the CGI-S and GAF, which would continually update patients’ early warning scores, and to use these scores to inform (and not determine) their clinical decision-making in conjunction with their own judgement of all relevant contextual factors.

The improved performance of the main adjusted prediction model over the “clinical benchmark” model suggests that the inclusion of clinical and functional instability may offer clinicians a useful second opinion when presented with repeated measurements where trends are not necessarily clear. Research has shown, however, that clinicians perceive early warning scores of deterioration as both useful in mitigating cognitive biases and clinical uncertainty, and harmful in reducing their capacity to act on their own judgement (e.g., if a risk score has not breached a threshold to warrant a response) (Blythe et al., 2024). Consequently, the real-world implementation of clinical prediction models requires careful consideration and ethical safeguards, as such models may be used to justify the refusal of potential resources for individuals with mental health difficulties.

The model developed in this study still requires prospective validation in other settings to assess its generalisability and transportability, as well as assessments of its clinical utility (does it effectively identify patients who will benefit the most from available interventions?), before it can have more tangible implications for clinical practice. Nonetheless, the authors have shown that there is scope for beneficial individualised prediction for hospitalisation through capturing longitudinal, routinely-collected measures.

The prediction model has promise to be used as an adjunct to clinician judgement for improved decision-making and treatment stratification.

Statement of interests

I have no conflicts of interest.

Links

Primary paper

Taquet M, Fazel S & Rush A J (2025) Transdiagnostic early warning score for psychiatric hospitalisation: development and evaluation of a prediction model. BMJ Mental Health, 28(1).

Other references

Blythe R, Naicker S, White N et al. (2024) Clinician perspectives and recommendations regarding design of clinical prediction models for deteriorating patients in acute care. BMC Medical Informatics and Decision Making, 24(1), 241.

Khor S, Haupt E C, Hahn E E et al. (2023) Racial and ethnic bias in risk prediction models for colorectal cancer recurrence when race and ethnicity are omitted as predictors. JAMA Network Open, 6(6), e2318495.

Stensland M, Watson P R & Grazier K L (2012) An examination of costs, charges, and payments for inpatient psychiatric treatment in community hospitals. Psychiatric Services, 63(7), 666–671.

Taquet M, Griffiths K, Palmer E O et al. (2023) Early trajectory of clinical global impression as a transdiagnostic predictor of psychiatric hospitalisation: a retrospective cohort study. The Lancet Psychiatry, 10(5), 334–341.

Vyas D A, Eisenstein L G & Jones D S (2020) Hidden in plain sight – reconsidering the use of race correction in clinical algorithms. The New England Journal of Medicine, 383(9), 874–882.

Walter F, Carr M J, Mok P L H et al. (2019) Multiple adverse outcomes following first discharge from inpatient psychiatric care: a national cohort study. The Lancet Psychiatry, 6(7), 582–589.

Walter F (2023) Clinical severity and instability as predictors for psychiatric hospitalisation: can one size fit all? The Mental Elf, 13 Oct 2023.

Wang D W L & Colucci E (2017) Should compulsory admission to hospital be part of suicide prevention strategies? BJPsych Bulletin, 41(3), 169–171.
