Healthcare has no shortage of data. Electronic health records (EHRs), claims data, insurance premiums, and financial systems generate enormous volumes every day. Yet despite the promise of AI and advanced analytics, healthcare organizations often struggle to turn that data into action.
In a recent conversation with Federico Arroyo, a consultant and instructor at the California Institute of Technology’s Center for Technology and Management Education, we surfaced recurring challenges that highlight where projects stall—and where progress is truly being made. Federico divides his professional efforts between consulting for diverse organizations and teaching at Caltech, which allows him to test new frameworks in industry while translating insights into accessible, applicable content for practitioners in training.
For many data scientists, healthcare problems initially look like optimization puzzles. But unlike an e-commerce churn model or a fraud-detection algorithm, success in healthcare requires stepping into the patient’s shoes.
Take clinical trial design. On paper, it’s about efficiency: lowering costs and shortening cycles. In reality, it’s about the patient journey—what happens from the moment someone shows up on Day 1 to the final visit. Models that ignore this experience risk producing elegant answers that fail in practice.
As Federico noted: “Junior data scientists often miss the patient’s perspective by focusing only on optimization.” Closing this domain knowledge gap is critical.
Healthcare leaders are under pressure to innovate, and sometimes that ambition stretches too far.
One example: a client asked to use large language models (LLMs) to write entire FDA submissions, documents of 100 to 200 pages that are critical for drug approvals. The idea is bold, but the risks are enormous. Even a small error or hallucination could derail approval.
The practical path isn’t wholesale replacement but incremental adoption: automating certain sections, using AI to assist human writers, and building toward reliability step by step. In this space, MVPs are far safer than moonshots.
Preventative care is widely seen as a way to reduce healthcare costs: identify at-risk patients earlier, and fewer end up in the ER for expensive treatments.
But here’s the problem: the data is incomplete. We don’t have information on the people who don’t show up to doctors’ offices, making predictive modeling extremely difficult.
The first step is integration—bringing together claims, EHRs, and financial data that typically sit in disconnected systems. This unified view enables risk models that can at least flag patients who look similar to those who ended up needing acute care.
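To make the integration step concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the field names (`er_visits`, `chronic_conditions`, `missed_payments`, `acute_care`), the toy records, the distance threshold, and the nearest-centroid similarity rule are hypothetical stand-ins, not a description of any real system or of a production risk model.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical extracts from three disconnected systems, keyed by patient_id.
claims = [
    {"patient_id": "p1", "er_visits": 3, "acute_care": True},
    {"patient_id": "p2", "er_visits": 4, "acute_care": True},
    {"patient_id": "p3", "er_visits": 3, "acute_care": False},
    {"patient_id": "p4", "er_visits": 0, "acute_care": False},
]
ehr = [
    {"patient_id": "p1", "chronic_conditions": 4},
    {"patient_id": "p2", "chronic_conditions": 3},
    {"patient_id": "p3", "chronic_conditions": 3},
    {"patient_id": "p4", "chronic_conditions": 0},
]
billing = [
    {"patient_id": "p1", "missed_payments": 2},
    {"patient_id": "p2", "missed_payments": 1},
    {"patient_id": "p3", "missed_payments": 2},
    {"patient_id": "p4", "missed_payments": 0},
]

# Illustrative feature set; a real model would choose these with clinicians.
FEATURES = ("er_visits", "chronic_conditions", "missed_payments")


def integrate(*sources):
    """Merge rows from every source into one unified record per patient_id."""
    unified = defaultdict(dict)
    for source in sources:
        for row in source:
            unified[row["patient_id"]].update(row)
    return dict(unified)


def feature_vector(record):
    """Turn a unified record into numbers; missing fields default to 0 (a simplification)."""
    return [float(record.get(name, 0)) for name in FEATURES]


def flag_similar(unified, threshold=2.0):
    """Flag patients with no acute-care history who sit close to the acute-care average."""
    acute = [feature_vector(r) for r in unified.values() if r.get("acute_care")]
    if not acute:
        return []
    # Average feature values across patients who ended up in acute care.
    centroid = [sum(column) / len(acute) for column in zip(*acute)]
    flagged = []
    for patient_id, record in unified.items():
        if record.get("acute_care"):
            continue
        distance = sqrt(
            sum((a - b) ** 2 for a, b in zip(feature_vector(record), centroid))
        )
        if distance <= threshold:
            flagged.append(patient_id)
    return sorted(flagged)


unified_view = integrate(claims, ehr, billing)
print(flag_similar(unified_view))  # ['p3']
```

Note what the sketch cannot do: a patient who appears in none of the three sources never enters `unified_view`, so no amount of modeling will flag them. That is the incomplete-data problem described above, and it is why integration is only the first step.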
The next hurdle is engagement: reaching those patients and ensuring they take action. That can mean personalized outreach (where AI can help), better call-center processes, or even community-based programs.
The point: AI and ML are useful here, but they’re only effective when paired with solid data integration and human-centered engagement strategies.
Even when executives agree on the need for integration, healthcare providers often face a hidden barrier: vendor contracts.
Many EHR vendor contracts explicitly forbid data extraction, locking hospitals into outdated architectures. On the other side, cloud platforms like Snowflake or Azure are sometimes pitched alongside large, multi-million-dollar consulting roadmaps that feel out of reach.
But scalable solutions don’t have to be expensive. At 205 Data Lab, we’ve seen first-hand that leaner, warehouse-first implementations can replace big-ticket roadmaps with real, working solutions delivered faster and at a fraction of the cost. The challenge isn’t just technical—it’s navigating vendor lock-in and making sure investments are proportional to business value.
Despite the challenges, progress is happening through incremental wins. Some examples:
In each of these cases, data integration is the foundation. Without connecting the disparate systems, the analytics and AI layers can’t deliver meaningful results. With integration in place, however, these small steps save time, reduce cost, and improve outcomes. Over time, they compound into measurable impact for both patients and providers.
The future of healthcare data isn’t about flashy AI promises. It’s about solving the real blockers: domain knowledge gaps, unrealistic adoption paths, and vendor lock-in that slows teams down.
We’ve seen that once data is cleaned and integrated, architectures get simpler. Analytics become faster. And real innovation—whether it’s automation, personalization, or better decision-making—finally becomes possible.
At 205 Data Lab, that’s exactly where we focus: building lean, warehouse-first solutions that strip away complexity and unlock progress that sticks.