
A new Google-backed study has found that a growing number of people are turning to AI chatbots like ChatGPT and Gemini for health advice but often ask questions in ways that lead to biased or misleading answers. The research, titled What's Up, Doc?, was published on arXiv and draws on a dataset known as HealthChat-11K. Conducted by researchers from Google, UNC Chapel Hill, Duke University, and the University of Washington, the study raises fresh doubts about the reliability of AI in healthcare. More importantly, it shows that how users phrase their questions, particularly about treatment, can significantly affect the accuracy of a chatbot's answers.
How AI Chatbots Are Becoming the New Digital Doctors
Driven by rising healthcare costs and limited access to physicians, more people are turning to AI chatbots for fast, convenient medical advice. The study reports that nearly 31% of U.S. adults in 2024 used generative AI for health-related queries. But instead of asking about symptoms or general health information, most users focused on treatment advice and often framed questions in ways that could bias the response. Researchers found that nearly a quarter of treatment-related questions included leading phrases like “Is this the best drug for me?” or “This should work, right?”
This triggers what’s known as “sycophancy bias,” where the AI tends to agree with the user in order to appear helpful, even if doing so produces inaccurate or unsafe recommendations. The chats were also usually short and lacked important context, such as medical history or symptom severity. In some cases, users asked about inappropriate treatments, which the chatbot partially validated because of how the question was framed. The conclusion? While chatbots are increasingly seen as helpful “digital assistants,” they are not doctors. And without better training on how to respond to vague, biased, or emotionally charged questions, AI health advice could easily lead users down the wrong path.
Emotional Cues and Poor Context Trip Up AI in Health Chats
The study didn’t just examine the facts; it also looked at the emotional tone of health-related chatbot conversations. Researchers found that users occasionally expressed frustration, confusion, or even gratitude during these exchanges, and that such moments often marked turning points, either ending the interaction or leading to repetitive cycles in which the user kept challenging the chatbot’s responses. This highlights a key limitation: most AI models still lack the ability to read and respond appropriately to emotional cues, something vital in health-related interactions. And the problem runs deeper than tone. Many users simply didn’t provide enough medical context, yet expected accurate, personalized answers. For example, someone might ask about the best medication for blood pressure without sharing existing health conditions or current prescriptions.
Many of these systems assume the user can describe their condition clearly, and they often miss the nuance that sound medical advice requires. When a user supplies skewed or partial information, the AI may not know how to follow up on it, making it more likely to offer advice that could be seriously harmful. The researchers argue that developers now face the task of teaching chatbots to detect missing context, recognize emotional cues, and respond with a clarifying question rather than an agreeable answer.
Smarter Questions, Safer Answers: What’s Next for Health AI?
The research underscores a simple but powerful idea: how you ask health questions matters as much as what you ask. Poorly framed or emotionally charged questions can lead even advanced chatbots to give flawed advice. To address this, the researchers suggest building AI that can recognize when context is missing, ask follow-up questions, and handle emotional signals more responsibly. In the meantime, AI should not be treated as a substitute for an actual doctor. Be careful about the words you use when asking AI about your health. A better question could mean a safer answer, and that might make all the difference.