Architecture

System components, data flow, and layer boundaries for Patient Memory.

System Diagram

[Figure: Patient Memory architecture diagram]

Design principles

These principles are load-bearing. They explain implementation choices that might otherwise seem arbitrary.

Connectors are the only integration point. Every source format has exactly one connector. Each connector's sole job is to transform its format into a common representation. Everything downstream is format-agnostic - adding a new source type (HL7 v2, DICOM, patient-reported) requires only a new connector.
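One way to picture the one-connector-per-format rule is a small registry keyed by format name. The sketch below is illustrative only: `ClinicalRecord`, `CONNECTORS`, `connector`, and `connect` are hypothetical names, and the real common representation is richer than this.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ClinicalRecord:
    """Hypothetical stand-in for the common representation."""
    source_format: str
    entities: List[dict] = field(default_factory=list)


# Exactly one connector per source format, keyed by format name.
CONNECTORS: Dict[str, Callable[[str], ClinicalRecord]] = {}


def connector(fmt: str):
    """Register a connector and refuse a second one for the same format."""
    def register(fn):
        if fmt in CONNECTORS:
            raise ValueError(f"format {fmt!r} already has a connector")
        CONNECTORS[fmt] = fn
        return fn
    return register


@connector("fhir")
def fhir_connector(payload: str) -> ClinicalRecord:
    # Accepts a Bundle or a single resource, as JSON text.
    data = json.loads(payload)
    if data.get("resourceType") == "Bundle":
        resources = [e["resource"] for e in data.get("entry", []) if "resource" in e]
    else:
        resources = [data]
    entities = [{"kind": r.get("resourceType"), "id": r.get("id")} for r in resources]
    return ClinicalRecord("fhir", entities)


def connect(fmt: str, payload: str) -> ClinicalRecord:
    """Downstream code only ever sees ClinicalRecord, never the raw format."""
    return CONNECTORS[fmt](payload)
```

Under this shape, supporting a new source type means registering one more function; nothing that consumes `ClinicalRecord` changes.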

Provenance is non-negotiable. Every merged entity carries a complete audit trail: which sources contributed, what conflicts existed, how each conflict was resolved, and which layer made the final decision. This is a first-class design constraint, not logging.

Entity resolution escalates, never guesses. The deduplication pipeline runs through four layers of increasing cost. A pair only escalates when the current layer can't decide confidently. The most expensive layer - LLM adjudication - is reserved for genuinely ambiguous cases.
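The escalation rule can be sketched as an ordered list of layers, each of which either returns a verdict or declines and passes the pair on. The layer names and matching logic below are assumptions made for illustration; only the idea of four layers of increasing cost, ending in LLM adjudication, comes from this page. The last layer here is a stub, not a real adjudicator.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Verdict(Enum):
    MATCH = "match"
    DISTINCT = "distinct"


# A layer returns a verdict when it is confident, or None to escalate.
Layer = Callable[[dict, dict], Optional[Verdict]]


def kind_check(a: dict, b: dict) -> Optional[Verdict]:
    # Entities of different kinds are never duplicates of each other.
    return Verdict.DISTINCT if a.get("kind") != b.get("kind") else None


def exact_code(a: dict, b: dict) -> Optional[Verdict]:
    if a.get("code") and a.get("code") == b.get("code"):
        return Verdict.MATCH
    return None


def normalized_name(a: dict, b: dict) -> Optional[Verdict]:
    na = a.get("name", "").strip().lower()
    nb = b.get("name", "").strip().lower()
    return Verdict.MATCH if na and na == nb else None


def adjudicator_stub(a: dict, b: dict) -> Optional[Verdict]:
    # Placeholder for the most expensive layer; always decides.
    return Verdict.DISTINCT


# Hypothetical ordering, cheapest first.
LAYERS: List[Tuple[str, Layer]] = [
    ("kind_check", kind_check),
    ("exact_code", exact_code),
    ("normalized_name", normalized_name),
    ("adjudicator", adjudicator_stub),
]


def resolve(a: dict, b: dict) -> Tuple[Verdict, str]:
    """Return the verdict and the name of the layer that decided."""
    for name, layer in LAYERS:
        verdict = layer(a, b)
        if verdict is not None:
            return verdict, name
    raise RuntimeError("the final layer must always decide")
```

Returning the deciding layer's name alongside the verdict is what lets the provenance trail record which layer made the final call.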

The VFS gives agents a filesystem metaphor. Rather than exposing a graph query interface, the system wraps the resolved graph in a virtual file tree. Agents navigate what's there through progressive directory browsing, without needing to know what queries to issue upfront.

Ingestion is additive, storage is persistent. New sources accumulate on top of existing data. Sending a new encounter or a follow-up lab result does not require re-submitting the full patient history. Each ingest call resolves the new data against what is already in the graph and merges the result in place, keeping the patient record current without discarding anything already consolidated.
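A minimal sketch of additive ingestion, assuming each incoming entity already carries a resolved identity key. The `key` and `source` fields and the last-write-wins field merge are hypothetical simplifications; the real pipeline resolves conflicts through entity resolution rather than overwriting.

```python
from typing import Dict, List


def ingest(graph: Dict[str, dict], incoming: List[dict]) -> Dict[str, dict]:
    """Merge new entities into the existing graph in place; nothing is dropped."""
    for entity in incoming:
        key = entity["key"]  # hypothetical resolved identity key
        node = graph.setdefault(key, {"key": key, "sources": []})
        for name, value in entity.items():
            if name == "source":
                # Accumulate contributing sources instead of replacing them.
                if value not in node["sources"]:
                    node["sources"].append(value)
            elif name != "key":
                node[name] = value  # naive last-write-wins, for illustration only
    return graph
```

Calling `ingest` a second time with only a follow-up result leaves every previously consolidated node in place and touches only the nodes the new data resolves to.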

Layer 1 - Connect

The Connect layer is the only place where source format matters. Each connector reads its native format and produces a standardized clinical representation - a set of entities, events, relationships, documents, and narrative sections. Everything downstream operates on this contract exclusively.

Connector   Input
FHIR        FHIR R4 Bundle or single resource (JSON)
CDA         HL7 CDA / C-CDA XML
Document    PDFs, images, any unstructured file

Connectors also emit extraction warnings for missing codes, unparseable fields, or ambiguous data. These are surfaced via GET /patients/{patientId}/ingest/status and do not halt ingestion.
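The warn-and-continue behavior might look like the following inside a connector. This is a sketch under assumptions: the input shape, the warning wording, and the function name `extract_conditions` are all hypothetical.

```python
from typing import List, Tuple


def extract_conditions(resources: List[dict]) -> Tuple[List[dict], List[str]]:
    """Extract what can be extracted; record problems instead of raising."""
    entities: List[dict] = []
    warnings: List[str] = []
    for index, resource in enumerate(resources):
        name = resource.get("name")
        if not name:
            # Nothing usable in this resource: warn and move on.
            warnings.append(f"resource {index}: no name, skipped")
            continue
        if not resource.get("code"):
            # Still ingested, but flagged for review.
            warnings.append(f"resource {index}: missing code for {name!r}")
        entities.append({"name": name, "code": resource.get("code")})
    return entities, warnings
```

The warnings list travels alongside the entities, so a bad field degrades one record rather than failing the whole ingest call.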

Layer 2 - Consolidate

Consolidation takes all connector outputs and produces a single deduplicated, relationship-enriched graph. It runs in three sequential stages.

Stage 1 - Entity resolution

Entity resolution deduplicates clinical entities across all sources. The same condition appearing in a FHIR bundle and two CDA documents produces one node in the graph, not three. See Entity Resolution.

Every merge decision produces a provenance trace recording contributing sources, any field-level conflicts, the conflict resolution strategy applied, and which layer made the final call. See Provenance and Auditability.
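As a rough picture of what such a trace holds, the dataclasses below mirror the four items listed above. All field names, the strategy name `most_recent`, and the layer label are hypothetical; the actual schema is described in Provenance and Auditability.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class FieldConflict:
    field_name: str
    candidates: List[Any]  # competing values seen across sources
    chosen: Any            # value kept on the merged entity
    strategy: str          # e.g. "most_recent" (hypothetical strategy name)


@dataclass
class ProvenanceTrace:
    entity_id: str
    sources: List[str]
    conflicts: List[FieldConflict] = field(default_factory=list)
    decided_by: str = "unknown"  # which resolution layer made the final call

    def summary(self) -> str:
        resolved = ", ".join(f"{c.field_name} via {c.strategy}" for c in self.conflicts)
        return (f"{self.entity_id}: {len(self.sources)} source(s), "
                f"decided by {self.decided_by}; conflicts: {resolved or 'none'}")
```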

Stage 2 - Relationship inference

After entities are deduplicated, the relationship resolver infers typed edges using the clinical knowledge base. Explicit references from source data (FHIR reasonReference, CDA entryRelationship) are preserved as-is. Inferred edges are only added when no explicit edge of the same type already exists between a pair. See Relationship Inference for the full node and edge type reference.
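The precedence rule for explicit over inferred edges can be sketched as a keyed merge. The `Edge` shape is hypothetical, and treating a pair as directed (source, target) is an assumption; `treats` is an invented edge type used only for the example.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    source: str
    target: str
    kind: str    # edge type, e.g. "symptom_of"
    origin: str  # "explicit" or "inferred"


def merge_edges(explicit: List[Edge], inferred: List[Edge]) -> List[Edge]:
    """Keep every explicit edge; add an inferred edge only when no edge of
    the same type already exists between the same pair."""
    result = list(explicit)
    seen = {(e.source, e.target, e.kind) for e in explicit}
    for edge in inferred:
        key = (edge.source, edge.target, edge.kind)
        if key not in seen:
            seen.add(key)
            result.append(edge)
    return result
```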

Stage 3 - Post-processing

Post-processing is a heuristic pass that fixes common EHR data quality issues that entity resolution alone cannot address:

  • Conditions with ICD-10 codes Z80–Z84 are reclassified from condition to family_history
  • Conditions whose text contains "hx of" or "history of", or whose code is Z87.x, have their status set to resolved
  • Symptoms are linked to their likely parent conditions via symptom_of edges
  • Care coordination entries and referral records are reclassified to care_plan

Every post-processing action is logged as a PostProcessAction and visible in the reconciliation report (GET /patients/{patientId}/reconciliation). See Review Reconciliation.
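The two code-based heuristics can be sketched as follows. The entity shape, the fields on `PostProcessAction`, and the action strings are assumptions; only the code ranges, the phrases, and the resulting type and status values come from the list above.

```python
import re
from dataclasses import dataclass
from typing import List

FAMILY_HISTORY = re.compile(r"^Z8[0-4](\.|$)")  # ICD-10 Z80 through Z84
PERSONAL_HISTORY = re.compile(r"^Z87(\.|$)")    # ICD-10 Z87.x
HISTORY_PHRASES = ("hx of", "history of")


@dataclass
class PostProcessAction:
    entity_id: str
    action: str
    reason: str


def post_process(entity: dict) -> List[PostProcessAction]:
    """Apply the heuristics in place and return a log of what changed."""
    actions: List[PostProcessAction] = []
    if entity.get("type") != "condition":
        return actions
    code = entity.get("code", "").upper()
    name = entity.get("name", "").lower()
    if FAMILY_HISTORY.match(code):
        entity["type"] = "family_history"
        actions.append(PostProcessAction(entity["id"], "reclassify:family_history", f"ICD-10 {code}"))
    elif PERSONAL_HISTORY.match(code) or any(p in name for p in HISTORY_PHRASES):
        entity["status"] = "resolved"
        actions.append(PostProcessAction(entity["id"], "set_status:resolved", code or name))
    return actions
```

Returning the actions rather than mutating silently keeps each heuristic change reportable.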

Layer 3 - Serve

The Serve layer exposes the consolidated graph through a Virtual File System (VFS): a navigable directory tree that agents browse the same way they browse folders. No files exist on disk. Each path has a resolver function that materializes content from the live graph on demand.

/patient/{id}/
├── conditions/
│   └── active/
│       └── {slug}/
│           ├── _story.md    ← longitudinal condition narrative
│           └── _raw.json    ← structured entity + relationships
├── medications/
│   ├── current/
│   └── discontinued/
├── labs/
│   ├── latest
│   └── trends/
│       └── {loinc-slug}
├── encounters/
│   └── {year}/
├── timeline/
│   └── {year}/
├── allergies/
├── directives/
├── insurance/
├── family_history/
├── sources/
└── memory/
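The "no files on disk" design can be sketched as a table that maps paths to resolver functions over the live graph. Everything here is illustrative: the graph shape, the `ROUTES` table, and the slug rule are assumptions, and paths are shown relative to a single patient root.

```python
import re
from typing import Callable, Dict, List

Graph = Dict[str, List[dict]]


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumerics to hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def list_root(graph: Graph) -> List[str]:
    return ["conditions/", "medications/", "labs/"]


def list_active_conditions(graph: Graph) -> List[str]:
    return sorted(
        slugify(c["name"]) + "/"
        for c in graph.get("conditions", [])
        if c.get("status") == "active"
    )


# Each path has a resolver; nothing is materialized until it is called.
ROUTES: Dict[str, Callable[[Graph], List[str]]] = {
    "/": list_root,
    "/conditions/active/": list_active_conditions,
}


def browse(graph: Graph, path: str) -> List[str]:
    return ROUTES[path](graph)
```

Because each listing is computed at call time, a condition added to the graph appears in the next `browse` call with no cache to invalidate.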

The _story.md file at any condition path is the centerpiece: it assembles medications, monitoring labs, complications, comorbidities, and a progression timeline into a single Markdown document, built lazily from the current graph state on each read.
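A simplified sketch of how such a narrative could be assembled on read. The graph shape, the edge type names (`treated_by`, `monitored_by`, `complication`, `comorbid_with`), and the heading layout are all assumptions made for the example, not the actual schema.

```python
from typing import List

# Section title -> hypothetical edge type that feeds it.
SECTIONS = [
    ("Medications", "treated_by"),
    ("Monitoring labs", "monitored_by"),
    ("Complications", "complication"),
    ("Comorbidities", "comorbid_with"),
]


def build_story(graph: dict, condition_id: str) -> str:
    """Assemble a Markdown narrative from the current graph state."""
    node = graph["nodes"][condition_id]
    lines: List[str] = [f"# {node['name']}", ""]
    for title, edge_type in SECTIONS:
        related = [
            graph["nodes"][e["target"]]["name"]
            for e in graph["edges"]
            if e["source"] == condition_id and e["type"] == edge_type
        ]
        if related:  # omit empty sections
            lines.append(f"## {title}")
            lines.extend(f"- {name}" for name in related)
            lines.append("")
    events = sorted(node.get("events", []), key=lambda ev: ev["date"])
    if events:
        lines.append("## Timeline")
        lines.extend(f"- {ev['date']}: {ev['text']}" for ev in events)
        lines.append("")
    return "\n".join(lines)
```

Nothing is stored between reads in this sketch, so the document always reflects whatever the graph contains at the moment it is requested.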

API surface

Two surfaces expose the VFS and ingest capabilities:

MCP tools (agent-facing)

See MCP Tools.

Tool                                          Description
browse_patient(path)                          Directory listing or file preview at a VFS path
read_patient(path, format?, token_budget?)    File content in narrative, structured, or compact format
search_patient(patientId, query)              BM25 full-text search across all indexed content
get_patient_info(patientId)                   Patient demographics and pipeline statistics

REST endpoints

See REST API.

Method   Path                               Description
GET      /patients                          List all registered patients
GET      /patients/:id                      Patient metadata and pipeline stats
GET      /patients/:id/vfs                  Browse the VFS
GET      /patients/:id/read                 Read a VFS path
GET      /patients/:id/resolution           Full entity resolution report with provenance
GET      /patients/:id/reconciliation       Reconciliation summary: inferred relationships + post-process actions
POST     /patients/:id/ingest/fhir          Ingest a FHIR R4 Bundle
POST     /patients/:id/ingest/cda           Ingest a CDA / C-CDA XML document
POST     /patients/:id/ingest/document      Register an unstructured document by metadata
POST     /patients/:id/ingest               Batch ingest (all three types in one call)
GET      /patients/:id/ingest/status        Current ingest state and extraction warnings
POST     /patients/:id/ingest/reset         Clear all patient data
DELETE   /patients/:id                      Remove patient from registry
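For orientation, here is how two of these endpoints might be addressed from Python's standard library. The base URL is a placeholder, and sending the bundle as the raw JSON request body is an assumption; see REST API for the actual request and response formats. The functions only build requests and do not send them.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # placeholder, not taken from the docs


def ingest_fhir_request(patient_id: str, bundle: dict) -> Request:
    """Build POST /patients/:id/ingest/fhir (body format is an assumption)."""
    url = f"{BASE_URL}/patients/{quote(patient_id, safe='')}/ingest/fhir"
    return Request(
        url,
        data=json.dumps(bundle).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ingest_status_request(patient_id: str) -> Request:
    """Build GET /patients/:id/ingest/status."""
    url = f"{BASE_URL}/patients/{quote(patient_id, safe='')}/ingest/status"
    return Request(url, method="GET")
```

Passing either object to `urllib.request.urlopen` would perform the call against a running server.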
