Insights
The Quiet Consensus: Surgical AI’s Next Decade Will Be a Data Decade

Date
Jun 12, 2026
Author
Prashanth Ray
I sat through a webinar last week with five senior figures in surgical AI and robotics. A urological surgeon-pioneer of the robotic era. A surgical data scientist. A clinical researcher working at the intersection of training and AI deployment. A robotic urology pioneer with thirty years in the operating room. And the engineer often credited with founding the modern field of surgical robotics research. Five people who have been in this work, in some cases, since before "surgical AI" was a phrase.
They were there to discuss autonomy, training, regulation, and the future of the surgical team. What they ended up saying, in different language, from different angles, over the course of seventy minutes, was the same thing.
The binding constraint on autonomous and AI-assisted surgery is not the algorithms. It is not the robots. It is the data.
Specifically: the data is not granular enough. It is not diverse enough. It does not capture causality, only correlation. It does not record counterfactuals, what would have happened if the surgeon had made a different choice. And it does not yet describe the tissue dynamics that surgeons are actually responding to when they operate.
Each panellist reached this from their own discipline. The clinician-researcher emphasised the counterfactual gap, the absent data on what did not happen. The data scientist emphasised interpretability, the need for features extracted from the data that mean something causally, not just statistically. The robotic surgeon emphasised "tissue thermodynamics," the under-studied physics of how tissues actually behave under instrumentation. The engineer emphasised the need to correlate technical performance to clinical outcomes, a problem he described as still largely unsolved after thirty-five years in the field. The pioneer surgeon emphasised that we are "getting overconfident," that we do not know what we do not know, and that most of what we do not know is locked inside the data we are not yet capturing.
Listening to this, two things became clear.
The first is that the surgical AI community is converging on the same diagnostic. Not loudly, not as a slogan, and not in a way that lends itself to a press release. But the alignment is real, and it is broad. When the godfather of surgical robotics research, two senior clinical urologists, a leading data scientist, and a US robotic urology pioneer all separately point to the same constraint, the field has, in effect, decided what the next decade is about.
The second is that almost everything the panel said was true of open surgery in an even more acute form.
This part went largely unstated, because the panel's clinical examples were drawn from the modalities the speakers work in: laparoscopic, robotic, urological. Every diagnostic they offered, however, applies to open surgery several times over.
The data is not granular enough. In laparoscopic surgery, there is at least an endoscope capturing video as a by-product of the procedure. In open surgery, the seventy-five percent of global procedures that do not pass through an instrument port, there is, by default, no capture at all. The granularity problem in laparoscopic data is the problem of structuring data that exists. The granularity problem in open surgical data is the prior problem of generating it.
The data is not diverse enough. The most-cited public surgical video datasets are all from a handful of European and North American centres. Open surgery in a Tier-2 Indian hospital, in a rural African operating theatre, in a Latin American teaching hospital, in the procedures and contexts where the global majority of surgery actually happens: none of this is in the data the field currently learns from. Whatever generalisability problem the panel was naming, the open-surgery component of it is an order of magnitude larger.
The data does not capture counterfactuals. This is true in laparoscopic surgery and almost completely true in open surgery. In open procedures, the surgeon's decision tree at any given moment is largely held in private cognition; the alternative paths considered, and the reasons for not taking them, mostly evaporate at the moment of the next decision. Without capture, there is no counterfactual data; without counterfactual data, there is no causal inference; without causal inference, surgical AI remains correlational. The chain is unforgiving, and the bottleneck is the first link.
There is a moment in the webinar, near the end, when the engineer who founded the modern field of surgical robotics observes that the most important capability for any partially autonomous surgical system may be the ability to say "I am out of my depth here."
This is a striking framing because it inverts how the field usually talks about autonomy. The question is not whether the system can act on its own. The question is whether the system knows when it should not.
That capability, the ability to recognise out-of-distribution situations, depends on a particular kind of data substrate. The system has to have seen enough of the in-distribution cases to know what looks normal, and it has to have seen enough of the edge cases to recognise when the situation has slipped beyond them. Neither part of this is currently possible in open surgery, because the in-distribution archive is empty and the edge cases are by definition uncatalogued.
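To make the dependence on an archive concrete, here is a minimal sketch of one simple way a system could decide to say "I am out of my depth here." It is an illustration under stated assumptions, not something the panel described or a method any real surgical system is known to use: the three numeric features are made up, the archive is synthetic random data, and the function names (fit_reference, novelty_score, calibrate_threshold, decide) are hypothetical. The technique is a plain distance-from-the-archive score with a threshold calibrated on known cases.

```python
import math
import random
import statistics

def fit_reference(samples):
    """Per-feature mean and standard deviation of the in-distribution archive."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # guard against zero spread
    return means, stds

def novelty_score(x, means, stds):
    """Root-mean-square z-score: how far x sits from the archive, in std units."""
    total = sum(((v - m) / s) ** 2 for v, m, s in zip(x, means, stds))
    return math.sqrt(total / len(x))

def calibrate_threshold(samples, means, stds, quantile=0.99):
    """Pick the score below which roughly `quantile` of known cases fall."""
    scores = sorted(novelty_score(x, means, stds) for x in samples)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]

def decide(x, means, stds, threshold):
    """Return 'proceed' for familiar inputs, 'defer_to_surgeon' otherwise."""
    if novelty_score(x, means, stds) <= threshold:
        return "proceed"
    return "defer_to_surgeon"

# Synthetic stand-in for an archive of captured cases (assumption: 3 made-up features).
rng = random.Random(0)
archive = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(500)]
means, stds = fit_reference(archive)
threshold = calibrate_threshold(archive, means, stds)

print(decide([0.1, -0.2, 0.3], means, stds, threshold))  # close to the archive centre
print(decide([8.0, -9.0, 7.5], means, stds, threshold))  # far from anything in the archive
```

Every step in the sketch reads from the archive: the reference statistics, the threshold, and therefore the decision to defer. With no archive there is nothing to calibrate against, which is the situation the paragraph above describes for open surgery.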
Autonomy, in the calibrated sense the field is starting to mean by it, is downstream of capture.
So the quiet consensus the webinar revealed is, in a deeper sense, the consensus that surgical AI's next decade will be a data decade. The algorithms exist. The architectures exist. The hardware, at the high end, exists. What is missing is the substrate, in the right granularity, with the right diversity, with the right counterfactual annotation, captured from the modalities and the settings that have not yet been instrumented.
This is the work we have set ourselves at Nuevata. Iris is the capture layer for surgeon point-of-view in open procedures. Mira is the structured intelligence layer that turns Iris footage into a corpus the field can learn from.


