How Patient Memory deduplicates clinical entities across multiple sources using a 4-layer escalating pipeline.

What it is

A complex patient record typically has the same condition, medication, or allergy represented multiple times across different sources, with different codes, different display text, and sometimes contradictory details.

Entity resolution is the process of deciding: are these two records the same real-world thing?

The 4-Layer Pipeline

Patient Memory uses an escalating cascade that only moves to more expensive methods when cheaper ones can't decide.

Layer	Method	Handles
1	Deterministic code matching	Entities sharing standard codes (SNOMED, RxNorm, ICD-10, LOINC)
2	NLP normalization + fuzzy matching	Same concept with different display text or minor variation
3	Embedding similarity	Semantically similar entities with no shared codes
4	LLM adjudication	Genuinely ambiguous cases requiring clinical judgment

A pair of entities escalates to the next layer only when the current layer cannot confidently resolve it. Layer 4 runs on at most about 15 pairs per patient, which keeps LLM cost bounded.

In practice, ~40% of pairs resolve in Layer 1 for free, and ~70% resolve before reaching the LLM. The LLM sees only the hard cases where clinical judgment is genuinely needed.
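The escalation logic can be illustrated with a minimal sketch. It is not Patient Memory's implementation: the `Entity` shape, the 0.9 fuzzy threshold, the layer names other than `deterministic-code`, and the choice to leave a pair unresolved once the LLM budget is spent are all assumptions made for illustration. Layer 3 (embedding similarity) is omitted, and Layer 4 is passed in as a plain callable.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, FrozenSet, List, Optional, Tuple

@dataclass(frozen=True)
class Entity:
    display: str
    codes: FrozenSet[Tuple[str, str]] = frozenset()  # e.g. {("SNOMED", "44054006")}

# A layer answers True (same entity), False (different), or None (cannot decide).
Verdict = Optional[bool]
Layer = Callable[[Entity, Entity], Verdict]

def code_match(a: Entity, b: Entity) -> Verdict:
    """Layer 1: a shared standard code settles the pair; otherwise abstain."""
    return True if a.codes & b.codes else None

def fuzzy_match(a: Entity, b: Entity, threshold: float = 0.9) -> Verdict:
    """Layer 2: normalize case and whitespace, then compare display text."""
    def norm(s: str) -> str:
        return " ".join(s.lower().split())
    ratio = SequenceMatcher(None, norm(a.display), norm(b.display)).ratio()
    return True if ratio >= threshold else None

@dataclass
class LlmBudget:
    remaining: int = 15  # per-patient cap on Layer 4 calls

def resolve_pair(a: Entity, b: Entity,
                 cheap_layers: List[Tuple[str, Layer]],
                 llm: Layer, budget: LlmBudget) -> Tuple[str, Verdict]:
    """Try each cheap layer in order; call the LLM only if all of them abstain."""
    for name, layer in cheap_layers:
        verdict = layer(a, b)
        if verdict is not None:
            return name, verdict
    if budget.remaining > 0:
        budget.remaining -= 1
        return "llm", llm(a, b)
    # Assumption: once the budget is spent, the pair is left unmerged.
    return "unresolved", None

LAYERS = [("deterministic-code", code_match), ("fuzzy-text", fuzzy_match)]
```

Because each layer abstains with `None` rather than guessing, adding or reordering layers does not change the contract: the first layer that returns a definite verdict is the one recorded as having resolved the pair.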

Provenance

Every merged entity carries a full audit trail (see Provenance and Auditability for the complete structure):

{
  "sources": [
    { "type": "fhir", "origin": "FHIR Bundle", "reliability": 0.85, "ref": "Condition/abc" },
    {
      "type": "cda",
      "origin": "Summary_20230907.xml",
      "reliability": 0.8,
      "ref": "observation/456"
    }
  ],
  "conflicts": [],
  "resolvedBy": "deterministic-code",
  "reasoning": "Both entities share SNOMED code 44054006.",
  "confidence": 0.9
}

When sources conflict (e.g., different medication doses), the conflict is recorded with both values and the resolution strategy used. Every merge decision is fully auditable.

Conflict Resolution

When sources disagree on an attribute value, the pipeline uses one of these strategies:

Strategy	How it resolves
Corroboration	Selects the value that the majority of sources agree on
Recency	Prefers the value from the most recent source
Reliability	Weights each source's value by the source's reliability score
LLM judgment	For complex conflicts, an LLM evaluates both values with full patient context
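As a rough sketch of how the first three strategies could work on a single attribute, consider the code below. The `SourcedValue` type, the `recorded` date, the fallback order (corroboration first, then reliability), and the shape of the returned conflict record are assumptions for illustration; only the `reliability` score mirrors a field shown in the provenance example. LLM judgment is left out.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

@dataclass
class SourcedValue:
    value: Any          # must be hashable, e.g. a dose string
    reliability: float  # source reliability score, as in the provenance record
    recorded: date      # hypothetical timestamp of the source

def corroboration(values: List[SourcedValue]) -> Optional[Any]:
    """Return the value a strict majority of sources agree on, else None."""
    top, count = Counter(v.value for v in values).most_common(1)[0]
    return top if count > len(values) / 2 else None

def recency(values: List[SourcedValue]) -> Any:
    """Return the value from the most recently recorded source."""
    return max(values, key=lambda v: v.recorded).value

def reliability(values: List[SourcedValue]) -> Any:
    """Sum reliability scores per distinct value and return the heaviest."""
    weights: Dict[Any, float] = {}
    for v in values:
        weights[v.value] = weights.get(v.value, 0.0) + v.reliability
    return max(weights, key=weights.get)

def resolve_conflict(attribute: str,
                     values: List[SourcedValue]) -> Tuple[Any, Optional[dict]]:
    """Pick a winner and describe the conflict; no record if sources agree."""
    distinct = list(dict.fromkeys(v.value for v in values))
    if len(distinct) == 1:
        return distinct[0], None
    winner, strategy = corroboration(values), "corroboration"
    if winner is None:  # assumed fallback when there is no majority
        winner, strategy = reliability(values), "reliability"
    record = {"attribute": attribute, "values": distinct,
              "strategy": strategy, "resolved": winner}
    return winner, record
```

Returning the conflict record alongside the winning value keeps every competing value and the strategy name together, which is the information the provenance trail needs.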

Post-Processing

After entity deduplication, three passes run:

Family history reclassification: conditions encoded as "hx of X" are moved to the family history subgraph
Status inference: resolved and historical conditions are flagged based on textual cues
Symptom linking: symptoms are linked to likely parent conditions via the clinical knowledge base

Auditing Resolution Decisions

The full resolution report (GET /patients/{patientId}/resolution) returns every entity in the graph with its complete provenance trace. Use this to debug unexpected merge or non-merge outcomes. See Audit Entity Resolution.

For a higher-level reconciliation summary (stats, inferred relationships, post-processing actions), use GET /patients/{patientId}/reconciliation. See Review the Reconciliation Summary.
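A small client for the two endpoints might look like the sketch below. The paths come from this page; the base URL, the bearer-token `Authorization` header, and the assumption that both endpoints return JSON are illustrative guesses, so check them against your deployment.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

def endpoint_url(base_url: str, patient_id: str, report: str = "resolution") -> str:
    """Build the report URL; report is "resolution" or "reconciliation"."""
    return f"{base_url.rstrip('/')}/patients/{quote(patient_id, safe='')}/{report}"

def fetch_report(base_url: str, patient_id: str, token: str,
                 report: str = "resolution") -> dict:
    """GET the report and decode the JSON body (auth scheme is an assumption)."""
    request = Request(
        endpoint_url(base_url, patient_id, report),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `fetch_report(base_url, patient_id, token, "reconciliation")` would request the summary view, while the default returns the full per-entity provenance trace.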