Good algorithms do not guarantee clinical adoption. Here is what actually builds trust between AI systems and the physicians who use them.
If you ask most clinical AI companies what their primary adoption challenge is, they will tell you about data access, EHR integration, and regulatory clearance. These are real challenges. But they are not the rate-limiting factor for most platforms that have already cleared those hurdles. The rate-limiting factor is physician trust.
I have spent the past three years in direct conversations with oncologists at our twelve partner health systems - not selling to them, but listening to them after deployment. What I have heard consistently is not that oncologists distrust AI in principle. Most of them have accepted that machine learning models can identify patterns in large datasets that exceed human cognitive capacity. What they distrust, specifically and reasonably, is being asked to act on an alert from a system they do not understand, whose failure modes they cannot anticipate, and whose recommendations could expose them to malpractice liability if the alert proves to be wrong.
That trust problem is not solved by publishing better AUC scores. It is solved by doing the hard, unglamorous work of meeting physicians where they are - understanding their specific concerns, earning their confidence through transparency about limitations, and demonstrating over time that the platform's errors are the kinds of errors that an experienced oncologist would recognize and catch before they reach the patient. This is the human side of clinical AI, and it is where most companies underinvest.
Across our twelve deployments, oncologists consistently ask three questions when they first encounter a Pegasi alert, and how we answer those questions determines whether they engage with the platform or dismiss it.
"Why did the system flag this patient?" This is a request for interpretability - for the causal reasoning behind the alert, not just the probability score. Physicians are trained to reason in mechanistic terms: this symptom suggests this pathophysiology, which implies this diagnostic pathway. A platform that returns a number ("75% probability of Stage I colorectal cancer") without explaining what signals drove that number is asking the clinician to trust a black box. The question is reasonable and the demand for transparency is appropriate. Pegasi's alert interface addresses this by listing the top contributing signals - "CEA trend: 34% increase over 90 days; CT finding: 0.8 cm nodule with lobulated margin; Family history: first-degree relative with colorectal cancer before age 50" - in plain language alongside the probability estimate. The clinician can assess whether those signals make sense clinically, which both provides a validity check and builds confidence in the system's reasoning over time.
"What happens when you are wrong?" This is a question about failure modes, and it is more sophisticated than it sounds. Oncologists want to know whether the system's errors will be randomly distributed across the case space, or whether they will be concentrated in specific patient types where the model is systematically less reliable. A system that fails randomly is manageable. A system that systematically fails for, say, elderly patients with complex comorbidities or patients from underrepresented populations with limited training data representation is a system that could create systematic harm while appearing to perform well on aggregate metrics. This concern is legitimate and consistent with the literature on algorithmic bias in healthcare AI. Our response is to publish stratified performance metrics by age, sex, race, and comorbidity burden - not to claim the model is perfect, but to show where it is more and less reliable and let clinicians calibrate their level of scrutiny accordingly.
"Who is responsible if I follow your recommendation and the patient has a bad outcome?" This is the liability question, and it is the one that AI vendors most commonly try to sidestep. The honest answer is that clinical judgment remains entirely the physician's responsibility. Pegasi is a decision support tool, not a diagnostic authority. The legal framework for clinical AI decision support treats it the same way it treats any diagnostic test result: the clinician's obligation is to exercise sound professional judgment in interpreting and acting on the information provided, not to blindly follow any single input. We make this explicit in our BAA and in the onboarding training for every physician who uses the platform. The goal is to clarify the accountability structure, not to obscure it.
Trust with physicians builds in a predictable arc, and understanding that arc helps set appropriate expectations for both the health system and the clinical team.
In the first four to six weeks of deployment, utilization is low and skepticism is high. This is normal. Most clinicians are pattern-matching the new tool against prior experiences with EHR alert systems, which have conditioned them to be dismissive - a reflex developed for good reasons given the poor signal-to-noise ratios of most legacy clinical decision support. The best thing that can happen in this period is a high-confidence alert that the clinician would have reached independently through their own clinical reasoning. That moment of "the system got it right, and I can see exactly why" is the first brick in the trust foundation.
Between weeks six and sixteen, utilization typically increases as clinicians who had positive early experiences share them with colleagues informally. This peer-to-peer trust propagation is far more powerful than any training session or product demonstration. An oncologist who tells a colleague, "I had a case last week where I might have missed a Stage I finding for another few months, but the Pegasi alert caught it and here is what it was tracking" is doing trust-building work that no vendor communication can replicate.
By month six, the physicians who have adopted Pegasi as a routine part of their workflow are typically the ones who experienced an alert-to-diagnosis case early - where following up on a platform alert led to a confirmed early-stage diagnosis in a patient who presented asymptomatically. This is the proof point that converts intellectually convinced physicians into active champions. One concrete outcome, with a specific patient and a specific diagnosis, outweighs all the model validation papers we can publish.
One of the least-discussed aspects of clinical AI design is cognitive load. Oncologists are already operating at high cognitive load. The average oncologist manages 150-200 active patients, reads imaging reports and pathology results daily, attends tumor board meetings, manages chemotherapy protocols, and handles an inbox of patient messages that modern EHR design has made structurally overwhelming. Adding a new alert system to this environment is not neutral - it either reduces cognitive load by aggregating and prioritizing information, or it increases it by adding another source of notifications that require processing.
Pegasi's design philosophy on cognitive load is explicit: every alert must either (a) surface something the oncologist would not otherwise see, or (b) surface something they would eventually see but significantly faster. An alert that tells an oncologist something they already knew, in a format that requires as much time to read as their normal workflow would have taken to reach the same conclusion, has a cognitive load cost with no clinical benefit. It trains the clinician to ignore future alerts.
This design principle has concrete implications for alert frequency and calibration. We work with each partner institution's clinical team to set alert sensitivity thresholds appropriate for their patient population and workflow capacity. A community oncology practice with two physicians and a high-volume panel cannot absorb the same alert volume as a 15-physician academic medical center with dedicated alert response protocols. Threshold calibration is not one-size-fits-all, and the health systems where Pegasi has the highest long-term utilization rates are the ones where we invested the most time in institution-specific calibration during the first three months after go-live.
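One simple way to express that institution-specific calibration is to work backward from the alert volume a clinical team says it can actually review. The sketch below does this with a score-quantile threshold; the function name, parameters, and example numbers are hypothetical, and a real calibration would also account for per-alert-type budgets and seasonal volume swings.

```python
import numpy as np

def threshold_for_capacity(validation_scores: np.ndarray,
                           weekly_patient_volume: int,
                           max_alerts_per_week: int) -> float:
    """Pick a score threshold whose expected alert volume stays within the
    number of alerts the clinical team has agreed it can review each week.

    validation_scores: model scores on a representative patient sample.
    weekly_patient_volume: patients scored per week at this institution.
    max_alerts_per_week: review capacity agreed with the clinical team.
    """
    # Fraction of scored patients the practice can absorb as alerts.
    alert_budget = min(1.0, max_alerts_per_week / weekly_patient_volume)
    # Threshold at the (1 - budget) quantile of observed scores, so roughly
    # that fraction of patients will exceed it.
    return float(np.quantile(validation_scores, 1.0 - alert_budget))

# A two-physician community practice versus a 15-physician academic center
# (illustrative numbers, not real deployment parameters):
# community_threshold = threshold_for_capacity(scores, weekly_patient_volume=400,
#                                              max_alerts_per_week=10)
# academic_threshold  = threshold_for_capacity(scores, weekly_patient_volume=1500,
#                                              max_alerts_per_week=120)
```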
One of the more difficult design challenges in clinical AI is handling disagreement between the platform's recommendation and the clinician's own assessment. If the platform flags a case as high priority and the oncologist reviews it and disagrees, how that interaction is handled has significant implications for trust in both directions.
The current Pegasi workflow allows clinicians to dismiss an alert with a documented reason from a structured list (false positive based on clinical context; patient already being managed for this finding; patient preference; other). This dismissal is logged and reviewed in the monthly quality dashboard. When a clinician consistently dismisses a particular alert type, it is a signal either that the alert criteria need calibration (reducing false positives for that patient type) or that there is a clinical practice pattern that the Pegasi clinical team should understand. In some cases, the dismissal pattern reveals a genuine error in the model's behavior that requires retraining. In others, it reveals a clinical preference difference between institutions that should be addressed through threshold customization. Both outcomes improve the system over time.
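In code, that structured dismissal workflow reduces to a small amount of record-keeping plus a monthly aggregation. The sketch below is a simplified illustration with hypothetical names; the aggregation step is what turns individual dismissals into a calibration signal for the quality dashboard.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List

class DismissalReason(Enum):
    FALSE_POSITIVE_CLINICAL_CONTEXT = "false positive based on clinical context"
    ALREADY_MANAGED = "patient already being managed for this finding"
    PATIENT_PREFERENCE = "patient preference"
    OTHER = "other"

@dataclass
class Dismissal:
    alert_type: str
    clinician_id: str
    reason: DismissalReason
    timestamp: datetime
    note: str = ""

def dismissal_summary(dismissals: List[Dismissal]) -> Counter:
    """Aggregate dismissals by (alert type, reason) for the monthly quality
    dashboard, so recurring patterns - for example, one alert type repeatedly
    marked as a clinical-context false positive - surface for threshold
    review or retraining discussions."""
    return Counter((d.alert_type, d.reason) for d in dismissals)
```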
The feedback loop between clinician dismissals and model improvement is not just a technical feature. It is a trust-building mechanism. When an oncologist sees that their documented dismissal reasons have led to a reduction in that type of false positive over subsequent months, they experience the platform as responsive to their judgment rather than imposing itself on them. That experience - of the AI learning from and respecting the clinician's expertise - is foundational to the kind of collaborative relationship that makes clinical AI genuinely useful rather than merely tolerated.