Patient Data Security in Healthcare AI Platforms

January 28, 2025  |  12 min read

The question we hear most often from hospital CIOs and clinical informatics directors before deploying any AI platform is not "does your model perform well?" It is "how do you protect patient data?" The answer matters far more than the performance metrics, because a breach involving Protected Health Information (PHI) carries consequences that no diagnostic accuracy score can offset.

This article describes the security architecture Pegasi built for healthcare AI, why we made specific design choices, and what questions any health system should be asking AI vendors before signing a contract.

The Fundamental Security Challenge in Clinical AI

Traditional enterprise software security focuses on protecting data at rest and in transit. Healthcare AI has a more complex problem: the model itself must process sensitive data to function, which means PHI enters the computational pipeline. Every design decision — where computation happens, how data flows, what gets logged — has security implications.

There are two broad architectural approaches healthcare AI vendors take. The first is cloud processing: PHI is sent to the vendor's cloud infrastructure, processed there, and the results are returned to the health system. The second is on-premises or in-environment processing: the AI model runs inside the health system's own secure environment, and PHI never leaves that perimeter. These approaches have very different security and compliance profiles.

Pegasi's In-Environment Architecture

Pegasi's diagnostic platform processes all PHI within the health system's own secure environment. The architecture works as follows:

  1. Model deployment: Pegasi's AI models are deployed as containerized services within the health system's existing infrastructure — whether that is an on-premises data center or the health system's HIPAA-compliant cloud environment (typically AWS GovCloud or Azure Government).
  2. Data processing boundary: Patient data from PACS, LIS, and EHR systems is processed entirely within this deployed environment. No PHI traverses the public internet to Pegasi's infrastructure.
  3. Result delivery: Diagnostic outputs — risk scores, biomarker flags, report summaries — are written directly to the health system's EMR through approved HL7 FHIR or HL7 v2 interfaces, never routed through Pegasi servers.
  4. Model updates: When Pegasi releases updated model weights, they are delivered as signed, encrypted packages through a secure artifact registry. The update process does not require any PHI to be transmitted.
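To make the result-delivery step concrete, here is a minimal sketch of the kind of FHIR R4 Observation payload a platform like this might write to the EMR. The coding system, code values, and patient identifier are illustrative assumptions, not Pegasi's actual interface contract; real deployments would use codes agreed with the health system's EHR team.

```python
import json

def build_risk_score_observation(patient_id: str, score: float) -> dict:
    """Build a minimal FHIR R4 Observation carrying a diagnostic risk score.

    Field choices (coding system, code, units) are illustrative assumptions;
    a production integration would use the site-specific codes and profiles
    negotiated with the health system's interface team.
    """
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                # Hypothetical local code for the AI risk score.
                "system": "http://example.org/fhir/ai-codes",
                "code": "risk-score",
                "display": "AI diagnostic risk score",
            }]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {"value": score, "unit": "score"},
    }

# The payload would be POSTed to the health system's own FHIR endpoint,
# inside the secure environment -- never to a vendor-hosted server.
obs = build_risk_score_observation("12345", 0.82)
print(json.dumps(obs, indent=2))
```

The key architectural point is in the last comment: the HTTP destination is an endpoint the health system controls, so the payload never crosses the environment boundary.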

Encryption Standards

Within the deployed environment, all data — at rest and in transit between platform components — is encrypted using industry-standard algorithms:

  • At rest: AES-256 encryption for all stored data, including model input staging areas and intermediate computation results. Keys are managed through the health system's existing key management infrastructure (AWS KMS, Azure Key Vault, or on-premises HSM).
  • In transit: TLS 1.3 for all internal service-to-service communication within the deployed environment. Mutual TLS (mTLS) is available for environments requiring bidirectional authentication.
  • Model weights: Model files are encrypted at rest and signature-verified before loading to prevent tampering. Verification uses SHA-256 checksums signed with Pegasi's code-signing certificate.
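The checksum half of the model-weight verification step can be sketched in a few lines. This is a simplified illustration, not Pegasi's implementation: it shows only the SHA-256 comparison, and assumes the expected digest has itself already been signature-verified against the vendor's code-signing certificate before being trusted.

```python
import hashlib
import os
import tempfile
from pathlib import Path

def verify_model_checksum(weights_path: Path, expected_sha256: str) -> bool:
    """Compare a model file's SHA-256 digest against a published checksum.

    Sketch only: in production the expected checksum would additionally be
    signature-verified against the vendor's code-signing certificate, and a
    mismatch would abort model loading entirely.
    """
    digest = hashlib.sha256()
    with open(weights_path, "rb") as f:
        # Stream in chunks so multi-gigabyte weight files fit in memory.
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256.lower()

# Illustrative usage with a throwaway file standing in for model weights.
fd, path = tempfile.mkstemp()
os.close(fd)
tmp = Path(path)
tmp.write_bytes(b"fake model weights")
expected = hashlib.sha256(b"fake model weights").hexdigest()

ok = verify_model_checksum(tmp, expected)        # digest matches
bad = verify_model_checksum(tmp, "0" * 64)       # tampered/wrong digest
os.remove(tmp)
```

Streaming the file through the hash rather than reading it whole is the idiomatic choice here, since production weight files are typically far larger than available memory headroom.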

Access Controls and Audit Logging

Access to Pegasi's platform components follows the principle of least privilege:

  • Role-based access control (RBAC) with roles mapped to clinical functions: ordering physician, reviewing oncologist, radiologist, administrator
  • Single Sign-On (SSO) integration with the health system's identity provider (Okta, Azure AD, Ping Identity) using SAML 2.0 or OpenID Connect
  • Multi-factor authentication (MFA) required for all administrative access
  • Session timeout enforced at 15 minutes of inactivity for clinical interfaces
  • Complete audit log of every data access event, stored in an append-only log that cannot be modified or deleted by platform administrators
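The least-privilege and session-timeout rules above can be sketched as a single authorization check. The role names follow the clinical roles listed here, but the permission sets and function shape are hypothetical illustrations, not Pegasi's actual RBAC configuration.

```python
from datetime import datetime, timedelta

# Hypothetical role-to-permission mapping; real grant sets would come from
# the health system's clinical workflow configuration.
ROLE_PERMISSIONS = {
    "ordering_physician": {"view_result", "order_study"},
    "reviewing_oncologist": {"view_result", "annotate_result"},
    "radiologist": {"view_result", "annotate_result", "sign_report"},
    "administrator": {"manage_users"},  # note: no clinical-data permission
}

SESSION_TIMEOUT = timedelta(minutes=15)

def is_allowed(role: str, action: str,
               last_activity: datetime, now: datetime) -> bool:
    """Least-privilege check: the action must be in the role's grant set,
    and the session must not have been idle past the 15-minute timeout."""
    if now - last_activity > SESSION_TIMEOUT:
        return False  # session expired; force re-authentication
    return action in ROLE_PERMISSIONS.get(role, set())

now = datetime(2025, 1, 28, 12, 0)
assert is_allowed("radiologist", "sign_report",
                  now - timedelta(minutes=5), now)
assert not is_allowed("administrator", "view_result",
                      now - timedelta(minutes=5), now)
assert not is_allowed("radiologist", "sign_report",
                      now - timedelta(minutes=20), now)
```

Note that the administrator role deliberately carries no clinical-data permission: separating platform administration from patient-data access is what makes "no one at the vendor can see PHI" enforceable rather than aspirational.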

Audit logs capture: user identity, access timestamp, patient record identifier (de-identified in logs), action type, and system component accessed. Logs are retained for a minimum of 6 years to satisfy HIPAA audit requirements and are exportable to the health system's SIEM.
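One common way to make a log append-only in practice is hash chaining: each record embeds the hash of the previous one, so any later modification or deletion breaks the chain and is detectable. The sketch below uses the audit fields named above; the exact record format and chaining scheme are assumptions for illustration, not Pegasi's implementation.

```python
import hashlib
import json
from datetime import datetime, timezone

def append_audit_entry(log: list, user: str, patient_ref: str,
                       action: str, component: str) -> dict:
    """Append a tamper-evident audit record to an in-memory log.

    Each entry embeds the SHA-256 hash of the previous entry, so editing or
    deleting any record invalidates every hash after it. `patient_ref` is
    assumed to be a de-identified log identifier, per the text above.
    """
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    record = {
        "user": user,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "patient_ref": patient_ref,
        "action": action,
        "component": component,
        "prev_hash": prev_hash,
    }
    # Deterministic serialization so the hash is reproducible on audit.
    record["entry_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    log.append(record)
    return record

audit_log: list = []
append_audit_entry(audit_log, "dr.lee", "anon-4821", "view_result", "viewer")
append_audit_entry(audit_log, "dr.lee", "anon-4821", "annotate_result", "viewer")
assert audit_log[1]["prev_hash"] == audit_log[0]["entry_hash"]
```

A production system would anchor the chain in append-only storage (for example, a WORM-configured object store) so that even storage administrators cannot rewrite history, and would export the records to the SIEM as described above.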

HIPAA Compliance Framework

Pegasi operates as a Business Associate under HIPAA. Before any deployment, we execute a Business Associate Agreement (BAA) with the covered entity. Our BAA compliance program includes:

  • Annual HIPAA risk analysis and risk management documentation per 45 CFR §164.308(a)(1)
  • Workforce security training with role-specific HIPAA modules, completed annually by all Pegasi staff with PHI access
  • Incident response plan with defined breach notification timelines (60-day notification to covered entity as required, plus internal escalation within 24 hours)
  • Physical safeguards for any Pegasi facilities where PHI could theoretically be accessed (relevant for remote support sessions)
  • Subcontractor BAAs with all third-party service providers who could access PHI (our cloud infrastructure providers)

SOC 2 Type II Certification

Pegasi maintains SOC 2 Type II certification, covering the Trust Services Criteria for Security, Availability, and Confidentiality. The Type II designation means our controls were tested by an independent auditor across a 12-month observation period — not just designed on paper, but operating effectively over time. Health systems can request a copy of our most recent SOC 2 report under NDA by contacting privacy@pegasiio.com.

Questions to Ask Any Healthcare AI Vendor

If you are evaluating AI platforms for clinical use, here are the questions that cut through marketing language to reveal actual security posture:

  1. "Where exactly is PHI processed?" — Demand a data flow diagram. "In a HIPAA-compliant cloud" is not the same as "within your environment."
  2. "Who at your company can access our patient data and under what conditions?" — The answer should be "no one, by design" with a clear explanation of why that is technically enforced, not just a policy.
  3. "What is your breach notification process and timeline?" — Look for a specific SLA, not "we comply with HIPAA."
  4. "Can we see your most recent SOC 2 report?" — Any vendor who can't produce this on request has not completed the audit.
  5. "How are model updates delivered without transmitting PHI?" — The update process is often overlooked but creates a recurring security touchpoint.
  6. "What happens to our data if we terminate the contract?" — You need a specific data deletion and certification procedure.

Security Is a Design Constraint, Not an Afterthought

The vendors who treat security as a compliance checkbox will produce documentation that sounds thorough and architectures that are not. The vendors who treat security as a core design constraint will make different product decisions from the start — choosing in-environment processing over cloud APIs, choosing append-only audit logs over editable records, choosing mTLS over TLS, choosing to limit their own access to patient data rather than maximizing it for model improvement.

At Pegasi, we made the choice early that we would not build our training data pipeline on access to production patient records from our deployed health system partners. Our models are trained on de-identified datasets from consented research programs. Our production deployments are designed so that Pegasi has no technical ability to access PHI from health system deployments, even if we wanted to. That constraint shapes everything downstream.

Security in healthcare AI is not optional and it is not separable from the clinical value proposition. If you have specific questions about Pegasi's security architecture for your institution's evaluation, contact our security team at privacy@pegasiio.com. We are happy to walk through our full technical security documentation with your IT and compliance teams.
