IHI report: The main use cases for genAI, their risks and ways to mitigate

While generative AI holds much promise in alleviating burdens in healthcare, concerns around transparency, bias and clinical deskilling remain, a new report says.

The IHI Lucian Leape Institute, a think tank within the Institute for Healthcare Improvement, convened an expert panel to build out the report. The panel included representatives from Big Tech players like Amazon, Google and Microsoft as well as leaders from organizations such as Harvard Medical School, The Leapfrog Group and Kaiser Permanente. Together, their goal was to identify the main areas where generative AI is likely to be used as well as its challenges and the potential ways to mitigate them.

“There's a lot of conversation about AI in healthcare in general,” IHI President and CEO Kedar Mate, M.D., told Fierce Healthcare. Mate is also president of the Leape Institute. “Before that got fully underway, we wanted to be sure that concerns … were being thought about,” he said.

Most generative AI promises detailed in the report came as no surprise: the ability to reduce clinician burnout, improve diagnostic accuracy and reduce the costs of care. The risks the report identified included the depersonalization of care, inaccuracies and bias, challenges with integration and workforce deskilling.

The 50-page report contained a multitude of specific recommendations for various stakeholders across the healthcare ecosystem. It identified six best practices to guide the ongoing development of generative AI tools and their integration into clinical care delivery: 

  1. Serve and safeguard the patient

  2. Learn with, engage and listen to clinicians

  3. Evaluate and ensure AI efficacy and freedom from bias

  4. Establish strict AI governance, oversight and guidance for health systems and the federal government

  5. Be intentional with the design, implementation and ongoing evaluation of AI tools

  6. Engage in collaborative learning across health systems 

“Key groups must work together with intention and discipline to implement genAI in ways that enhance patient safety,” the report said. 

The report built its recommendations around three areas anticipated to be the primary use cases for generative AI in the near future: documentation support, clinical decision support and patient chatbots.

Generative AI for documentation support is already in fairly wide use, Mate noted, with patient chatbots not far behind. But clinical decision support tools will take longer to roll out, in part due to trust and validation concerns and in part because a solid regulatory framework for approving algorithms is still needed.

“That type of technology still needs to evolve,” Mate said. Administrative uses, by contrast, are already further along. “That doesn’t face the same kind of regulatory pressure because they’re not patient-facing,” Mate said.

Documentation support 

GenAI tools can be, and already are being, used in administrative settings, from developing patient history summaries to transcribing conversations to drafting responses to patient messages. These offerings, the report said, can reduce the clinical documentation burden, resolve inaccuracies in the EHR, improve the accessibility of documentation and strengthen communication.

But several risks and challenges exist for these use cases. There might be a lack of patient transparency or informed consent on the use of a tool. Clinicians may have to spend more time manually reviewing flagged inaccuracies. And, importantly, even if a tool frees up a clinician’s time, that free time might be replaced by new expectations, such as seeing a greater volume of patients. 

“If we take that newly freed up time … will that actually create more burnout in the end, is the concern,” Mate said. 

As for who should, or is likely to, be responsible for ensuring that doesn’t happen, Mate believes the task will fall to health systems’ AI governance committees rather than to outside regulators.

What’s more, the onus will be on these committees not only to be proactive and thoughtful about procuring AI tools, but also to meaningfully audit them on an ongoing basis, since the tools are continuously learning.

Additionally, deskilling was identified as a risk of using generative AI for documentation support. Over-reliance on the tools may lead to unnoticed errors and inadequate oversight, the report said. To combat this, exercises should be built into clinician workflows to keep clinicians practiced at manually performing tasks that have been automated. One example cited in the report is TSA agents, who are periodically shown images of weapons in luggage to test their level of attention and promote vigilance.

Mate believes deskilling is a real concern, because, as the industry evolves, so, too, do approaches to teaching. For instance, to become a licensed doctor, Mate took a closed-book exam. But a decade later, when he re-certified, it was an open-book exam. “In just 10 years’ time … a very significant difference had emerged,” Mate reflected. Thus, it is inevitable that the way of thinking in medical schools will continue to evolve with the emergence of AI technologies. 

“I do think this risk, or benefit, of changing skills is very high,” Mate said.

Clinical decision support 

Generative AI tools could offer diagnostic support and recommendations, provide early detection of changes to patient conditions or suggest treatment plans. These applications can potentially improve accuracy, save clinicians time and possibly reduce costs. But many risks related to generative AI clinical decision support remain, per the report.

There is not yet much evidence supporting their effectiveness. Rigorous evaluations of clinical decision support tools, mostly conducted before the advent of generative AI, have yielded only small improvements in clinician behaviors, the report highlighted. There are some successful AI-based clinical decision support examples, but they have involved supervised machine learning. One example is AI-based computer-aided detection in radiology, now routinely used in mammography.

Additionally, any excitement about tools is tempered by evidence that they can contribute to alert fatigue and frustrations with EHRs. There are also concerns about clinical over-reliance, compliance and automation bias. 

At the same time, it’s unrealistic to expect clinicians to double-check the accuracy of every single generative AI recommendation, per Mate. There need to be new digital ways to audit algorithms.

“The volume of these clinical decision supports that AI can offer... will become a new cognitive burden and load that clinicians will have to carry,” Mate cautioned. And these tools must be promoted as an aid to, not a replacement for, clinical decision-making.

Chatbots supporting patients 

Generative AI as a patient-facing chatbot can help collect data and support patient triage, respond to basic patient questions and facilitate care navigation. These functions can expand access to care, democratize access to understandable information and offer more accurate and reliable data, the report said.

But there are concerns about the ethics of technology that mimics a human and about whether its use is properly disclosed to users. There are also concerns about chatbots’ accuracy, the loss of human connection and information flow, that is, how to ensure that a patient triaged to a high-risk category actually gets the care they need.

The report offered several mitigation strategies, including clearly disclosing a chatbot’s function to patients, designing chatbots with guardrails and ensuring escalation pathways for chatbot users. An example of a guardrail is defining certain prompts that a chatbot cannot respond to. Maintaining human oversight of chatbots also remains crucial, the report said. That includes routine auditing of chatbot performance and clinical review of conversations to ensure that patients’ needs are being met.
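To make the guardrail and escalation concepts concrete, here is a minimal, hypothetical sketch, not drawn from the report, of how a patient-facing chatbot might screen incoming messages: prompts on a blocked-topic list get a refusal, messages matching high-risk keywords are routed to a human clinician rather than answered by the model, and everything else is answered with an upfront AI disclosure. The keyword lists and function names are illustrative assumptions, not anything the report prescribes.

```python
# Hypothetical sketch of chatbot guardrails and an escalation pathway.
# Illustrative only; the IHI report does not prescribe any implementation.

HIGH_RISK_KEYWORDS = {"chest pain", "suicidal", "overdose"}   # escalate to a human
BLOCKED_TOPICS = {"dosage change", "diagnosis"}               # chatbot must not answer

def route_message(message: str) -> dict:
    """Decide whether a patient message is escalated, refused, or answered by the AI."""
    text = message.lower()

    # Escalation pathway: high-risk language goes straight to a clinician queue.
    if any(keyword in text for keyword in HIGH_RISK_KEYWORDS):
        return {
            "action": "escalate_to_clinician",
            "reply": "Connecting you with a member of your care team now.",
        }

    # Guardrail: topics the chatbot is not allowed to respond to.
    if any(topic in text for topic in BLOCKED_TOPICS):
        return {
            "action": "refuse",
            "reply": "I can't help with that here. Please contact your clinic directly.",
        }

    # Otherwise the message can be passed to the generative model,
    # with the disclosure that the patient is talking to an AI tool.
    return {
        "action": "answer_with_ai",
        "disclosure": "You are chatting with an automated assistant.",
    }

# Example: a message containing high-risk language is escalated, not answered.
print(route_message("I have chest pain and shortness of breath"))
```

In this sketch the escalation check runs before anything else, reflecting the report's concern that high-risk patients actually reach care rather than being handled by the chatbot alone.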

Patient representatives involved in the report were largely favorable toward the use of AI in care, Mate said. They are hopeful about its potential to remove the “chronic headaches” of patient safety and quality, things like human errors in drug interactions or handoffs. But they insist on transparency around AI use.

“The patient community was very clear on the fact that at least for now… there would be a clear annotation to whatever work product is created that was aided and augmented by an AI technology,” Mate said.

“So much around how this technology is going to be deployed in our environments is not worked out, and I hope this paper and the various audiences in it … will help,” Mate concluded.