Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

Shi, Jianlin; Gan, Qiwei; Hanchrow, Elizabeth; Bowles, Annie; Stanley, John; Bress, Adam P.; Cohen, Jordana B.; Alba, Patrick R.

Computer Science > Computation and Language

arXiv:2504.12494 (cs)

[Submitted on 16 Apr 2025]

Title:Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

Authors:Jianlin Shi, Qiwei Gan, Elizabeth Hanchrow, Annie Bowles, John Stanley, Adam P. Bress, Jordana B. Cohen, Patrick R. Alba

View PDF

Abstract:Clinical natural language processing (NLP) is increasingly in demand in both clinical research and operational practice. However, most of the state-of-the-art solutions are transformers-based and require high computational resources, limiting their accessibility. We propose a hybrid NLP framework that integrates rule-based filtering, a Support Vector Machine (SVM) classifier, and a BERT-based model to improve efficiency while maintaining accuracy. We applied this framework in a dementia identification case study involving 4.9 million veterans with incident hypertension, analyzing 2.1 billion clinical notes. At the patient level, our method achieved a precision of 0.90, a recall of 0.84, and an F1-score of 0.87. Additionally, this NLP approach identified over three times as many dementia cases as structured data methods. All processing was completed in approximately two weeks using a single machine with dual A40 GPUs. This study demonstrates the feasibility of hybrid NLP solutions for large-scale clinical text analysis, making state-of-the-art methods more accessible to healthcare organizations with limited computational resources.

Comments:	This manuscript has been submitted to AMIA 2025 annual symposium (this https URL)
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.12494 [cs.CL]
	(or arXiv:2504.12494v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.12494

Submission history

From: Jianlin Shi [view email]
[v1] Wed, 16 Apr 2025 21:24:38 UTC (276 KB)

Computer Science > Computation and Language

Title:Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators