This week, two publications ignited debate in the digital health world. An editorial in Nature Medicine, one of the world's most influential medical journals, warns of the surge of users turning to consumer AI for health questions.[1] And a consumer survey published on 22 April 2026 delivers a stark figure: reliability falls below 35% in real-world conditions.[2]
The benchmarks behind chatbot vendors' marketing rely on medical multiple-choice questions (MCQs): textbook cases with a single correct answer. On this controlled ground, ChatGPT, Copilot and Perplexity score close to 95%. Impressive.
In clinical reality, patients do not present as textbook cases. They arrive with vague symptoms, multiple comorbidities, ongoing treatments, psychosocial factors and family histories. AI cannot see them, cannot ask follow-up questions as a physician would, and has no access to their longitudinal data. This is where reliability collapses.
The Nature Medicine editorial (27 April 2026) points precisely to this gap: very good laboratory performance, but very limited real clinical utility outside textbook cases.
OpenAI, Microsoft and Perplexity have all launched consumer health products recently, capitalising on the public trust their brands command. The positioning is attractive: available 24/7, free or low-cost, no appointment needed.
But behind this positioning lie several structural problems:
When a Swiss patient shares health information with ChatGPT Health or Copilot Health, that data passes through the servers of US companies. The US CLOUD Act allows federal authorities to demand access to it, even when it is hosted outside the United States.
In Switzerland, health data are sensitive personal data under Art. 5(c) nFADP. Transmitting them to a non-compliant foreign service may constitute a legal violation.[4]
On 20 April 2026, WHO/Europe published its first-ever snapshot of AI in healthcare across the EU's 27 member states.[3] The report is measured: it acknowledges the potential benefits while stressing the need to balance innovation with safeguards, skills and public trust.
| Criterion | Consumer AI | Professional medical AI |
|---|---|---|
| Patient context | ❌ Unknown | ✓ Integrated |
| Medical record | ❌ Absent | ✓ Available |
| Medical validation | ❌ None | ✓ Mandatory |
| Data privacy | ⚠ Variable / CLOUD Act | ✓ Swiss · nFADP |
| Liability | ❌ None | ✓ Physician + provider |
| Real-world reliability | < 35% | ~85% |
| Specialisation | ❌ Generalist | ✓ Medical |
The demand for medical information outside consultation hours is real and legitimate. Patients have the right to seek to understand their health. But consumer AI creates an illusion of medical competence that can lead to serious errors.
Simple messages to convey to patients:
- Consumer AI can explain general medical notions (definitions, mechanisms, medication information), but it cannot diagnose or prescribe.
- A chatbot does not know your history, treatments, allergies or test results, and its answers are not validated by a healthcare professional.
- Be cautious about sharing identifiable health data with consumer services that are not compliant with Swiss data protection law.
- For any symptom, diagnosis or treatment decision, the reference remains the physician.
See also our article on liability in the event of AI-related medical error.
Can ChatGPT really be used for medical advice?
ChatGPT and similar tools can provide useful general medical information — definitions, mechanisms, medication information. But they do not know the patient's context, have no access to the medical record, and their responses are not validated by a healthcare professional. The April 2026 consumer survey shows reliability falls below 35% in real conditions. They are useful as general information sources, not as diagnostic or therapeutic advice tools.
What is the difference between consumer AI and professional medical AI?
The fundamental difference is context. Consumer AI answers questions without knowing the patient's age, medical history, current treatments, allergies or test results. Professional medical AI is integrated into the clinical workflow: it knows the patient record, its responses are intended for the healthcare professional (not the patient directly), and every recommendation is subject to physician validation.
Are data shared with ChatGPT Health or Copilot Health protected?
This is one of the most serious risks. OpenAI and Microsoft are US companies subject to the CLOUD Act — US authorities can demand data access even when hosted outside the US. In Switzerland, health data are sensitive personal data under the nFADP. Transmitting them to a non-compliant foreign service may constitute a legal violation.
Why do AI tools seem reliable in tests but not in real conditions?
Benchmarks use textbook cases from medical MCQs — questions with a single correct, well-defined answer. In clinical reality, patients present with vague symptoms, comorbidities and complex life contexts. AI cannot see the patient, cannot ask follow-up questions as a physician would, and has no access to longitudinal data. The Nature Medicine editorial (April 2026) specifically highlights this gap between laboratory performance and real clinical performance.
Integrated clinical context, mandatory medical validation, Swiss hosting compliant with nFADP. Not a consumer chatbot — a professional tool.
Try free →