Entity Resolution
How Patient Memory deduplicates clinical entities across multiple sources using a 4-layer escalating pipeline.
What it is
A complex patient record typically has the same condition, medication, or allergy represented multiple times across different sources, with different codes, different display text, and sometimes contradictory details.
Entity resolution is the process of deciding: are these two records the same real-world thing?
The 4-Layer Pipeline
Patient Memory uses an escalating cascade that only moves to more expensive methods when cheaper ones can't decide.
| Layer | Method | Handles |
|---|---|---|
| 1 | Deterministic code matching | Entities sharing standard codes (SNOMED, RxNorm, ICD-10, LOINC) |
| 2 | NLP normalization + fuzzy matching | Same concept with different display text or minor variation |
| 3 | Embedding similarity | Semantically similar entities with no shared codes |
| 4 | LLM adjudication | Genuinely ambiguous cases requiring clinical judgment |
A pair of entities escalates to the next layer only when the current layer cannot resolve it confidently. Layer 4 is capped at roughly 15 pairs per patient to keep cost bounded.
In practice, about 40% of pairs resolve in Layer 1 for free, and about 70% resolve before reaching the LLM. The LLM sees only the hard cases where clinical judgment is genuinely needed.
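The escalation logic can be illustrated with a minimal sketch. It is not Patient Memory's implementation: the `Entity` shape, the layer function names, the `"fuzzy-text"`, `"llm"`, and `"unresolved"` labels, and the 0.9 similarity threshold are all assumptions made for illustration, and the embedding layer is omitted for brevity. Only the `"deterministic-code"` label and the budget of about 15 LLM pairs come from this page.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Optional


@dataclass(frozen=True)
class Entity:
    display: str
    codes: frozenset = frozenset()  # e.g. {("SNOMED", "44054006")}


# A layer returns True (same), False (different), or None (cannot decide).
Layer = Callable[[Entity, Entity], Optional[bool]]


def layer1_codes(a: Entity, b: Entity) -> Optional[bool]:
    """Deterministic: any shared (system, code) pair counts as a match."""
    return True if a.codes & b.codes else None


def _normalize(text: str) -> str:
    # Lowercase, drop commas, collapse whitespace.
    return " ".join(text.lower().replace(",", " ").split())


def layer2_fuzzy(a: Entity, b: Entity, threshold: float = 0.9) -> Optional[bool]:
    """Text similarity after normalization; the threshold is an assumed value."""
    ratio = SequenceMatcher(None, _normalize(a.display), _normalize(b.display)).ratio()
    return True if ratio >= threshold else None


def resolve_pairs(pairs, cheap_layers, llm_layer: Layer, llm_budget: int = 15):
    """Try cheap layers in order; spend the LLM budget only on undecided pairs."""
    results, llm_calls = [], 0
    for a, b in pairs:
        verdict, resolved_by = None, "unresolved"
        for name, layer in cheap_layers:
            verdict = layer(a, b)
            if verdict is not None:
                resolved_by = name
                break
        if verdict is None and llm_calls < llm_budget:
            llm_calls += 1
            verdict = llm_layer(a, b)
            if verdict is not None:
                resolved_by = "llm"
        results.append((a, b, verdict, resolved_by))
    return results


CHEAP_LAYERS = [("deterministic-code", layer1_codes), ("fuzzy-text", layer2_fuzzy)]
```

Each layer returns `None` when it cannot decide, so adding an embedding layer to this sketch would mean appending one more entry to `CHEAP_LAYERS` before the LLM step.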
Provenance
Every merged entity carries a full audit trail (see Provenance and Auditability for the complete structure):
    {
      "sources": [
        {
          "type": "fhir",
          "origin": "FHIR Bundle",
          "reliability": 0.85,
          "ref": "Condition/abc"
        },
        {
          "type": "cda",
          "origin": "Summary_20230907.xml",
          "reliability": 0.8,
          "ref": "observation/456"
        }
      ],
      "conflicts": [],
      "resolvedBy": "deterministic-code",
      "reasoning": "Both entities share SNOMED code 44054006.",
      "confidence": 0.9
    }

When sources conflict (e.g., different medication doses), the conflict is recorded with both values and the resolution strategy used. Every merge decision is fully auditable.
Conflict Resolution
When sources disagree on an attribute value, the pipeline uses one of these strategies:
| Strategy | How it resolves the conflict |
|---|---|
| Corroboration | Selects the value agreed upon by the majority of sources |
| Recency | Prefers the value from the most recent source |
| Reliability | Weights sources by their reliability score |
| LLM judgment | Has an LLM evaluate both values with full patient context; used for complex conflicts |
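The three non-LLM strategies can be sketched as small functions. This is illustrative only: the `SourcedValue` type, its `recorded` date field, and the strict-majority rule for corroboration are assumptions, and this page does not specify how the pipeline chooses between strategies.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class SourcedValue:
    value: str
    reliability: float  # same 0..1 scale as "reliability" in the provenance record
    recorded: date      # hypothetical field: when the source recorded the value


def by_corroboration(candidates: Sequence[SourcedValue]) -> Optional[str]:
    """Return the value reported by a strict majority of sources, else None."""
    value, count = Counter(c.value for c in candidates).most_common(1)[0]
    return value if count * 2 > len(candidates) else None


def by_recency(candidates: Sequence[SourcedValue]) -> str:
    """Return the value from the most recently recorded source."""
    return max(candidates, key=lambda c: c.recorded).value


def by_reliability(candidates: Sequence[SourcedValue]) -> str:
    """Sum reliability per distinct value and return the heaviest one."""
    weights = defaultdict(float)
    for c in candidates:
        weights[c.value] += c.reliability
    return max(weights, key=weights.get)
```

With three sources reporting a dose, two agreeing on an older value and one newer source disagreeing, corroboration and reliability would favor the agreed value while recency would favor the newer one, which is why the strategy used is worth recording alongside the conflict.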
Post-Processing
After entity deduplication, three passes run:
- Family history reclassification: conditions encoded as "hx of X" are moved to the family history subgraph
- Status inference: resolved and historical conditions are flagged based on textual cues
- Symptom linking: symptoms are linked to likely parent conditions via the clinical knowledge base
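As one illustration of these passes, status inference from textual cues could look like the sketch below. The cue patterns, the status labels, and the `"active"` default are hypothetical; the actual cue list is not documented on this page.

```python
import re

# Hypothetical cue patterns, checked in order; first match wins.
_STATUS_CUES = [
    ("resolved", re.compile(r"\b(resolved|in remission)\b", re.IGNORECASE)),
    ("historical", re.compile(r"\b(status post|s/p|previous|prior)\b", re.IGNORECASE)),
]


def infer_status(display: str, default: str = "active") -> str:
    """Return a status label based on cues found in the display text."""
    for status, pattern in _STATUS_CUES:
        if pattern.search(display):
            return status
    return default
```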
Auditing Resolution Decisions
The full resolution report (GET /patients/{patientId}/resolution) returns every entity in the graph with its complete provenance trace. Use this to debug unexpected merge or non-merge outcomes. See Audit Entity Resolution.
For a higher-level reconciliation summary (stats, inferred relationships, post-processing actions), use GET /patients/{patientId}/reconciliation. See Review the Reconciliation Summary.
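A minimal sketch of calling either endpoint with the Python standard library follows. The base URL, the bearer-token header, and the assumption that responses are JSON are placeholders; only the two paths come from this page.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://api.example.com"  # hypothetical host; substitute your deployment


def report_url(patient_id: str, kind: str = "resolution") -> str:
    """Build the URL for the resolution report or the reconciliation summary."""
    if kind not in ("resolution", "reconciliation"):
        raise ValueError(f"unknown report kind: {kind}")
    return f"{BASE_URL}/patients/{quote(patient_id, safe='')}/{kind}"


def fetch_report(patient_id: str, kind: str = "resolution", token: Optional[str] = None):
    """GET the report and decode it as JSON (response format is assumed)."""
    headers = {"Accept": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # auth scheme is an assumption
    with urlopen(Request(report_url(patient_id, kind), headers=headers)) as resp:
        return json.load(resp)
```

Calling `fetch_report("pt-123")` would request the full resolution report, and `fetch_report("pt-123", kind="reconciliation")` the higher-level summary.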