CPU 2024

Submission Details


Title

Is ChatGPT a Reliable Tool for Prostate Cancer Assessment Compared to Global Clinical Guidelines?

Abstract

Objective: To evaluate the reliability of ChatGPT versions 3.5 and 4.0 in answering questions about prostate cancer (PC) for patients and physicians, compared with the major clinical guidelines.
Methods: This cross-sectional study used 15 questions based on the three main global guidelines for prostate cancer (NCCN, AUA, and EAU). The questions were posed to ChatGPT versions 3.5 and 4.0 in September 2023, and the responses were evaluated by 9 urologists specialized in urologic oncology. The specialists rated each ChatGPT response on a Likert scale, and the concordance between the ChatGPT 3.5 and 4.0 responses and the specialists' evaluations was assessed using Cohen's Kappa and Weighted Kappa tests.
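As an illustration of the concordance statistics named in the Methods, the following is a minimal sketch of Cohen's Kappa and a linearly weighted Kappa for two sets of ordinal ratings. The abstract does not state how ratings were paired or which weighting scheme was used, so the function name `cohen_kappa`, the linear weights, the 1-to-5 scale, and the example ratings are all assumptions chosen for illustration, not the study's data or code.

```python
def cohen_kappa(ratings_a, ratings_b, categories, weights=None):
    """Cohen's kappa for two raters over ordered categories.

    weights=None gives the unweighted statistic; weights="linear" penalizes
    disagreements in proportion to their distance on the ordinal scale.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters need the same non-zero number of ratings")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    n = len(ratings_a)
    index = {c: i for i, c in enumerate(categories)}

    # Joint proportions: observed[i][j] = share of items rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]                        # rater A marginals
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]   # rater B marginals

    def disagreement(i, j):
        if weights == "linear":
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    obs = sum(disagreement(i, j) * observed[i][j] for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp == 0:
        return float("nan")  # undefined when chance disagreement is zero
    return 1.0 - obs / exp


# Hypothetical Likert ratings (1 to 5) from two raters for five responses.
likert = [1, 2, 3, 4, 5]
rater_1 = [1, 2, 3, 4, 5]
rater_2 = [2, 3, 4, 5, 5]
print("unweighted:", cohen_kappa(rater_1, rater_2, likert))
print("linear weighted:", cohen_kappa(rater_1, rater_2, likert, weights="linear"))
```

In this made-up example the two raters are mostly one point apart, so the weighted statistic credits the near misses that the unweighted statistic treats as complete disagreement. That is the usual reason for reporting both on an ordinal scale.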

Results: ChatGPT 4.0 had a higher average than 3.5 for all 43 answers in the NCCN guideline, except for one answer to question 10. Only 7 of 44 answers were rated above Poor or Slight; the remaining 37 were classified as Poor or Slight. Both versions were evaluated as a recommendation tool for patients and physicians: of 44 responses, 22 from ChatGPT 3.5 had a low Likert average (below 3), compared with only 2 from ChatGPT 4.0.

Conclusion: LLMs such as ChatGPT have the potential to become important tools for guiding prostate cancer patients and physicians, although their reliability may be affected by the source of their information.

Keywords

Prostate Cancer; ChatGPT; Artificial Intelligence

Area

General

Category

Cross-sectional studies

Authors

Marcelo Langer Wroclawski, Alexandre Kyoshi Hidaka, Felipe Placco Araujo Glina, Khalil Smaid, Rafael Tourinho-Barbosa, Arie Carneiro, Marcio Covas Moschovas, Marcelo Langer Wroclawski