
Currently submitted to: JMIR Medical Education

Date Submitted: Mar 27, 2024

Warning: This is an author submission that is not peer-reviewed or edited. Preprints - unless they show as "accepted" - should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

Evaluation of ChatGPT-Generated Differential Diagnosis for Common Diseases with Atypical Presentation

  • Kiyoshi Shikino; 
  • Taro Shimizu; 
  • Yuki Otsuka; 
  • Masaki Tago; 
  • Takahashi Hiromizu; 
  • Takashi Watari; 
  • Yosuke Sasaki; 
  • Gemmei Iizuka; 
  • Hiroki Tamura; 
  • Koichi Nakashima; 
  • Kotaro Kunitomo; 
  • Morika Suzuki; 
  • Sayaka Aoyama; 
  • Shintaro Kosaka; 
  • Teiko Kawahigashi; 
  • Tomohiro Matsumoto; 
  • Fumina Orihara; 
  • Toru Morikawa; 
  • Toshinori Nishizawa; 
  • Yoji Hoshina; 
  • Yu Yamamoto; 
  • Yuichiro Matsuo; 
  • Yuto Unoki; 
  • Hirofumi Kimura; 
  • Midori Tokushima; 
  • Satoshi Watanuki; 
  • Takuma Saito; 
  • Fumio Otsuka; 
  • Yasuharu Tokuda

ABSTRACT

Background:

Despite substantial advances in medical knowledge and diagnostic techniques, misdiagnosis remains a significant public health issue, contributing to mortality and morbidity worldwide. Artificial intelligence (AI), especially models such as the Generative Pre-trained Transformer (GPT), has shown promise in enhancing diagnostic accuracy. However, the effectiveness of these AI models in diagnosing atypical presentations of common diseases has not been extensively explored.

Objective:

This study aimed to assess the diagnostic accuracy of the AI model ChatGPT-4 in generating differential diagnoses for atypical presentations of common diseases, and to understand its reliance on patient history during the diagnostic process.

Methods:

We utilized 25 clinical vignettes from the Journal of Generalist Medicine that presented atypical manifestations of common diseases. Two general medicine physicians categorized the cases based on atypicality. ChatGPT-4 was then employed to generate differential diagnoses, based on the clinical information provided. The concordance between AI-generated and final diagnoses was measured, with a focus on the top-ranked disease (top 1) and the top five differential diagnoses (top 5).
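
As an illustration of the top 1 and top 5 concordance measures described above, the following minimal Python sketch computes the fraction of cases whose final diagnosis appears among the first k entries of a ranked differential list. It is not the authors' procedure: the function names, the exact string matching after normalization, and the three toy cases are all assumptions made for illustration. Matching in the study may have relied on physician judgment rather than string comparison.

```python
from typing import Sequence


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def top_k_concordance(
    final_diagnoses: Sequence[str],
    ranked_lists: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Fraction of cases whose final diagnosis is among the top-k ranked candidates.

    Assumption: a "match" is exact string equality after normalization.
    """
    if len(final_diagnoses) != len(ranked_lists):
        raise ValueError("one ranked list is required per case")
    if not final_diagnoses:
        raise ValueError("at least one case is required")
    hits = sum(
        normalize(final) in {normalize(d) for d in ranked[:k]}
        for final, ranked in zip(final_diagnoses, ranked_lists)
    )
    return hits / len(final_diagnoses)


# Hypothetical toy data, NOT taken from the study.
finals = ["appendicitis", "pulmonary embolism", "hypothyroidism"]
ranked = [
    ["Appendicitis", "gastroenteritis", "diverticulitis"],
    ["pneumonia", "Pulmonary  Embolism", "heart failure"],
    ["depression", "anemia", "sleep apnea"],
]

top1 = top_k_concordance(finals, ranked, k=1)
top5 = top_k_concordance(finals, ranked, k=5)
print(f"top 1: {top1:.0%}, top 5: {top5:.0%}")
```

In this toy set, one of three cases matches at rank 1 and two of three match within the top 5, so the top 5 rate can never be lower than the top 1 rate.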

Results:

ChatGPT-4’s diagnostic accuracy decreased with an increase in atypical presentation. For Category 1 (C1) cases, the concordance rates were 17% for the top 1 and 67% for the top 5. Categories 3 (C3) and 4 (C4) showed a 0% concordance for top 1, and markedly lower rates for the top 5, indicating difficulties in handling highly atypical cases.

Conclusions:

ChatGPT-4 demonstrates potential as an auxiliary tool for diagnosing typical and mildly atypical presentations of common diseases. However, its performance declines with greater atypicality. The findings of this study underscore the need for AI systems to encompass a broader range of linguistic capabilities, cultural understanding, and diverse clinical scenarios to improve diagnostic utility in real-world settings.

Clinical Trial: NA


Citation

Please cite as:

Shikino K, Shimizu T, Otsuka Y, Tago M, Hiromizu T, Watari T, Sasaki Y, Iizuka G, Tamura H, Nakashima K, Kunitomo K, Suzuki M, Aoyama S, Kosaka S, Kawahigashi T, Matsumoto T, Orihara F, Morikawa T, Nishizawa T, Hoshina Y, Yamamoto Y, Matsuo Y, Unoki Y, Kimura H, Tokushima M, Watanuki S, Saito T, Otsuka F, Tokuda Y

Evaluation of ChatGPT-Generated Differential Diagnosis for Common Diseases with Atypical Presentation

JMIR Preprints. 27/03/2024:58758

URL: https://preprints.jmir.org/preprint/58758

Per the author's request the PDF is not available.
