-
Author
You Min Lee
Discovery PI
Dr. Stephen Marder
-
Project Co-Author
-
Abstract Title
From Innovation to Implementation: NLP for Predicting Psychosis Conversion in Clinical High-Risk Populations—A Review of Methodological Evolution, Translational Barriers, and Future Directions
-
Discovery AOC Petal or Dual Degree Program
Basic, Clinical, & Translational Research
-
Abstract
Background/Significance: Artificial Intelligence (AI) is increasingly used in schizophrenia research to improve early detection, symptom monitoring, medication adherence, and relapse risk stratification. In schizophrenia, disorganized speech is considered the diagnostic hallmark of formal thought disorder. Natural Language Processing (NLP), a subfield of AI, has emerged as a promising source of speech-based biomarkers for predicting early psychosis, notably during the prodromal Clinical High Risk for Psychosis (CHR-P) phase. Current assessments for CHR-P rely on clinician ratings, which may be subjective, time-intensive, and biased. Early detection during this window may improve patient outcomes by enabling interventions that slow or prevent progression.
Objective: This review traces the methodological evolution of NLP from Latent Semantic Analysis through word embeddings to transformer-based models, then discusses the critical barriers to clinical translation.
Methods: The PubMed, Web of Science, and Google Scholar databases were searched through March 2026 using the terms “natural language processing,” “NLP,” “linguistics,” “automated,” “clinical high risk,” “psychosis,” “schizophrenia,” “prediction,” and “clinical utility.” In addition to the primary database search, backward citation tracking was used to identify additional articles that met the inclusion criteria but were not captured by the initial search.
Identified papers were categorized into themes for synthesis, including:
-
Foundational Predictive Modeling
-
Modern Technical Evolution
-
Implementation Challenges
Results/Discussion: Despite decades of “proof-of-concept” NLP studies reporting high prediction accuracy, clinical utility remains virtually nonexistent. Models have become far more sophisticated over the years, yet the field remains in the early validation stage, with clinical implementation limited by concerns over generalizability, construct validity, algorithmic bias, and transparency. NLP coherence measures fail to generalize across languages and penalize speakers from racial minority groups independent of clinical status. One study found a significant correlation between coherence measures and average sentence length, which was also associated with higher scholastic achievement and racial identity. Most importantly, studies have not addressed what standardized interventions should follow a positive NLP screen. Future research should not only prioritize large-scale validation but also focus on developing standardized, effective interventions for CHR-P that give NLP predictions clinical meaning. Further, the development framework must include clear regulatory pathways, clinician-friendly interfaces, and clinical action protocols. NLP is a promising automated tool for predicting psychosis conversion, but fulfilling that promise demands not just technological innovation but also a realistic and transparent translational framework.
-