Ingest a CDA Document

POST a CDA or C-CDA XML document and handle common parsing gotchas.

Where to start

Send a CDA or C-CDA XML document to register it against a patient. The patientId in the URL is your registry key (the stable identifier used in all subsequent calls for this patient).

Make the request

curl -X POST "https://api.<workspace-id>.clinia.cloud/patients/{patientId}/ingest/cda" \
  -H "X-Clinia-API-Key: <clinia-api-key>" \
  -H "Content-Type: text/plain" \
  --data-binary @summary.xml

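Use --data-binary rather than -d. When reading from a file, -d strips carriage returns and newlines, while --data-binary sends the file byte for byte, preserving XML whitespace and the document's original encoding.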
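For callers scripting this from Python rather than the shell, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: build_ingest_request is a hypothetical helper name, and the URL pattern, header names, and content type are copied from the curl example above. The function only builds the request; passing the result to urllib.request.urlopen would send it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_ingest_request(workspace_id: str, patient_id: str,
                         api_key: str, cda_bytes: bytes) -> Request:
    """Build (but do not send) the POST shown in the curl example.

    Hypothetical helper; the endpoint shape is taken from this guide.
    """
    # Percent-encode the patient ID so characters like "/" cannot alter the path.
    url = (f"https://api.{workspace_id}.clinia.cloud"
           f"/patients/{quote(patient_id, safe='')}/ingest/cda")
    return Request(
        url,
        data=cda_bytes,  # raw XML bytes, sent unchanged like --data-binary
        headers={"X-Clinia-API-Key": api_key, "Content-Type": "text/plain"},
        method="POST",
    )
```

Reading the file in binary mode (open("summary.xml", "rb").read()) keeps the bytes identical to what is on disk, which mirrors the --data-binary behavior.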

A successful response:

{
  "ok": true,
  "source": "cda",
  "stats": { "itemsScanned": 28, "entitiesExtracted": 9, "eventsExtracted": 4 },
  "warnings": 1,
  "patient": { "id": "123", "name": "Jane Smith" }
}
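A caller will usually want to check ok and surface the warning count rather than discard the body. The sketch below assumes the response has exactly the shape shown above, with warnings as a count; summarize_ingest is a hypothetical helper name, and any field missing from the body is treated as zero.

```python
import json


def summarize_ingest(response_text: str) -> str:
    """Return a one-line summary of an ingest response body."""
    body = json.loads(response_text)
    if not body.get("ok"):
        return "ingest failed"
    stats = body.get("stats", {})
    summary = (
        f"{stats.get('entitiesExtracted', 0)} entities, "
        f"{stats.get('eventsExtracted', 0)} events "
        f"from {stats.get('itemsScanned', 0)} items"
    )
    # In the example response, "warnings" is a count, not a list.
    warnings = body.get("warnings", 0)
    if warnings:
        summary += f"; {warnings} warning(s)"
    return summary
```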

Request details

The body must be the raw CDA/C-CDA XML, sent with Content-Type: text/plain. Do not JSON-encode the XML for this endpoint; wrapping documents in JSON applies only to the batch endpoint described below.

OID-to-URI code system mapping. CDA documents use OIDs for code systems. The extractor maps the common OIDs automatically:

OID                       System
2.16.840.1.113883.6.96    SNOMED CT
2.16.840.1.113883.6.90    ICD-10-CM
2.16.840.1.113883.6.88    RxNorm
2.16.840.1.113883.6.1     LOINC
2.16.840.1.113883.6.69    NDC

Codes using unrecognized OIDs are still extracted but cannot be matched by the deterministic layer of entity resolution. They may still resolve via NLP normalization or embedding similarity in layers 2–3. Each unresolved entity contributes a warning to the ingest response.
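To anticipate these warnings before sending, a document can be scanned locally for codeSystem attributes that fall outside the table above. This is a rough pre-flight sketch, assuming the five OIDs listed here are the complete set the extractor maps (the guide calls them "the common OIDs", so the real list may be longer); unmapped_code_systems is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# OIDs copied from the mapping table in this guide; the service may map more.
KNOWN_OIDS = {
    "2.16.840.1.113883.6.96": "SNOMED CT",
    "2.16.840.1.113883.6.90": "ICD-10-CM",
    "2.16.840.1.113883.6.88": "RxNorm",
    "2.16.840.1.113883.6.1": "LOINC",
    "2.16.840.1.113883.6.69": "NDC",
}


def unmapped_code_systems(cda_xml) -> set:
    """Return codeSystem OIDs used in the document that are not in KNOWN_OIDS."""
    root = ET.fromstring(cda_xml)
    # codeSystem is an un-namespaced attribute on coded elements.
    found = {el.get("codeSystem") for el in root.iter() if el.get("codeSystem")}
    return found - set(KNOWN_OIDS)
```

A non-empty result does not mean the ingest will fail; it only indicates codes that would bypass the deterministic matching layer.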

Encoding. CDA files are commonly encoded as Windows-1252 or ISO-8859-1. If the XML declaration declares a non-UTF-8 encoding, convert the file before sending:

iconv -f windows-1252 -t utf-8 summary.xml | \
  curl -X POST "https://api.<workspace-id>.clinia.cloud/patients/{patientId}/ingest/cda" \
    -H "X-Clinia-API-Key: <clinia-api-key>" \
    -H "Content-Type: text/plain" \
    --data-binary @-
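The same conversion can be sketched in Python for cases where the source encoding varies from file to file. The helper below reads the encoding from the XML declaration instead of hardcoding windows-1252, and also rewrites the declaration to UTF-8. Rewriting the declaration is a precaution, not a documented requirement: this guide does not say whether the service inspects it. to_utf8 is a hypothetical name, and the sketch assumes an ASCII-compatible source encoding (it will not handle UTF-16).

```python
import re

# Matches the start of an XML declaration up to and including the encoding name.
_DECL = re.compile(r'^(<\?xml[^>]*?encoding\s*=\s*["\'])([A-Za-z0-9._-]+)', re.S)


def to_utf8(raw: bytes) -> bytes:
    """Re-encode an XML document as UTF-8 based on its declared encoding."""
    # The declaration itself is plain ASCII, so latin-1 is safe for sniffing.
    head = raw[:200].decode("latin-1")
    match = _DECL.match(head)
    declared = match.group(2) if match else "utf-8"  # XML default is UTF-8
    if declared.lower() in {"utf-8", "utf8"}:
        return raw
    text = raw.decode(declared)
    # Update the declaration so it agrees with the new byte encoding.
    text = _DECL.sub(lambda m: m.group(1) + "UTF-8", text, count=1)
    return text.encode("utf-8")
```

Files with no declaration, or one that already declares UTF-8, are returned unchanged.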

Error responses

Status    Cause
400       Request body is not recognized as CDA XML
422       XML could not be parsed (malformed structure or missing required CDA header elements)

On a 422, check that the document has a valid ClinicalDocument root element and a structuredBody or nonXMLBody.
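The check described above can be run locally before a request is sent. This is a minimal sketch that covers only the two conditions named here (a ClinicalDocument root and a structuredBody or nonXMLBody somewhere in the tree); it does not validate the required header elements, and the server's actual validation may be stricter. check_cda_structure is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip the '{namespace}' prefix ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def check_cda_structure(cda_xml) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(cda_xml)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    if _local(root.tag) != "ClinicalDocument":
        problems.append(f"root element is {_local(root.tag)}, not ClinicalDocument")
    names = {_local(el.tag) for el in root.iter()}
    if not names & {"structuredBody", "nonXMLBody"}:
        problems.append("no structuredBody or nonXMLBody found")
    return problems
```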

Check ingest status

Each ingest call runs entity resolution synchronously, so data is queryable immediately on return. Use the status endpoint to confirm readiness after ingesting multiple sources in sequence, or to inspect extraction warnings:

curl "https://api.<workspace-id>.clinia.cloud/patients/{patientId}/ingest/status" \
  -H "X-Clinia-API-Key: <clinia-api-key>"

Example response:

{
  "ready": true,
  "sources": [
    { "label": "Summary_20230907.xml", "stats": { "itemsScanned": 28, "entitiesExtracted": 9, "eventsExtracted": 4 } }
  ],
  "patient": { "id": "123", "name": "Jane Smith" },
  "loadStats": { "entitiesExtracted": 9, "eventsExtracted": 4, "relationshipsExtracted": 6 },
  "loadMs": 620
}

ready: true means the pipeline has run and the VFS is queryable. When the response includes a non-empty warnings field (omitted from the example above), each entry describes a resource that was skipped during extraction, typically a section with an unrecognized OID or missing required fields.
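After ingesting several sources in sequence, one way to confirm that all of them were loaded is to compare the sources list against the labels you expect. The sketch below assumes the status body has the shape shown in the example and that each source's label is known to the caller; how labels are assigned is not specified in this guide, and missing_sources is a hypothetical helper name.

```python
def missing_sources(status: dict, expected_labels: list) -> list:
    """Return the expected labels that the status response does not report."""
    if not status.get("ready"):
        # Nothing can be considered loaded until the pipeline has run.
        return list(expected_labels)
    loaded = {source.get("label") for source in status.get("sources", [])}
    return [label for label in expected_labels if label not in loaded]
```

An empty result means the patient is ready and every expected source appears in the response.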

Batch ingest

Use the batch endpoint to submit multiple source types in a single call. Entity resolution runs once over all sources rather than once per source call:

curl -X POST "https://api.<workspace-id>.clinia.cloud/patients/{patientId}/ingest" \
  -H "X-Clinia-API-Key: <clinia-api-key>" \
  -H "Content-Type: application/json" \
  -d '{ "cda": ["<ClinicalDocument>...</ClinicalDocument>"] }'

The batch body accepts any combination of fhir, cda, and document fields. See the REST API reference for the full request shape.
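Because the batch body embeds each CDA document as a JSON string, quotes and newlines in the XML must be escaped, which is error-prone to do by hand in a shell string. A minimal sketch of building the body with a JSON library follows. It fills only the cda field, since that is the only field whose shape is shown here; build_batch_body is a hypothetical helper name.

```python
import json


def build_batch_body(cda_documents: list) -> str:
    """Serialize CDA XML strings into the batch request body.

    json.dumps escapes double quotes, backslashes, and newlines in the XML.
    """
    return json.dumps({"cda": list(cda_documents)})
```

The returned string is what would be sent as the request body with Content-Type: application/json.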
