Generating a Comprehensive, Multi-Modal, Ontology-Driven Value Set for Renal Replacement Therapy (RRT) Identification in Critical Care Databases
This document outlines a comprehensive, multi-modal strategy for creating a definitive value set that identifies all instances of Renal Replacement Therapy (RRT) in the MIMIC-IV and eICU-CRD databases. Accurate and exhaustive identification of RRT is paramount for the integrity of the Sepsis-Associated Acute Kidney Injury (SA-AKI) cohort: initiation of RRT within the first 24 hours of ICU admission is a critical exclusion criterion, because early RRT fundamentally alters the natural history of the disease.
The primary challenge in this task is the fragmented nature of RRT evidence in electronic health records (EHRs). RRT is not documented in a single, standardized field but is scattered across disparate data sources, including procedure records, charted machine parameters, specific fluid and medication inputs, and administrative billing codes.1 A simplistic approach, such as a hardcoded list of procedure codes or item identifiers, is almost certain to be incomplete: it will miss part of the spectrum of RRT modalities, leading to patient misclassification and compromised study validity.
The strategy detailed herein is founded on the principle of triangulation. It leverages multiple, independent modalities of evidence—lexical, procedural, observational, and administrative—and queries multiple ontological sources (SNOMED CT, ICD, LOINC, UMLS) to maximize sensitivity, ensuring all true RRT events are captured. Simultaneously, this approach provides the rich metadata necessary to maintain high precision by allowing researchers to filter results based on the type and strength of the evidence. This robust methodology will produce a definitive RRT value set, forming a reliable foundation for cohort definition and subsequent clinical analysis.
This section establishes the lexical foundation for the entire identification process. These case-insensitive terms and phrases will be used to query free-text columns (e.g., mimic.d_items.label, eicu.treatment.treatmentstring) and to identify initial seed concepts within the available ontologies.
This list captures the formal names, common clinical shorthand, and international spelling variations for all major RRT modalities. This is the primary set of terms for identifying documented procedures. The diversity of terminology used in clinical practice is extensive, encompassing intermittent, continuous, and hybrid therapies.2
A significant challenge arises from the interchangeable use of terms for "hybrid" therapies. Modalities such as Prolonged Intermittent Renal Replacement Therapy (PIRRT), Sustained Low-Efficiency Dialysis (SLED), Slow Low-Efficiency Daily Dialysis (SLEDD), and Extended Daily Dialysis (EDD) are often used synonymously in clinical practice to describe therapies that are longer in duration than conventional intermittent hemodialysis (IHD) but are not continuous.3 For the purpose of identifying RRT initiation, the fine-grained distinction between these hybrid modalities is less critical than correctly classifying them as a group separate from standard IHD or CRRT. Therefore, the final lookup table should map all these lexical variants to a single, canonical rrt_modality, such as 'PIRRT/SLED', to simplify analysis and avoid creating artificial distinctions that the source data cannot reliably support.
Search Terms:
- renal replacement therapy
- hemodialysis
- haemodialysis
- intermittent hemodialysis
- IHD
- peritoneal dialysis
- PD
- continuous renal replacement therapy
- CRRT
- continuous veno-venous hemofiltration
- CVVH
- continuous venovenous hemofiltration
- continuous veno-venous hemodialysis
- CVVHD
- continuous venovenous hemodialysis
- continuous veno-venous hemodiafiltration
- CVVHDF
- continuous venovenous hemodiafiltration
- slow continuous ultrafiltration
- SCUF
- ultrafiltration
- hemofiltration
- hemodiafiltration
- prolonged intermittent renal replacement therapy
- PIRRT
- sustained low-efficiency dialysis
- SLED
- slow low-efficiency daily dialysis
- SLEDD
- extended daily dialysis
- EDD
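A minimal sketch of how these terms might be applied to free-text labels, assuming Python's `re` module is used for matching. The dictionary is a hypothetical subset of the list above, and the grouping of CVVH, CVVHD, CVVHDF, and SCUF under 'CRRT' is an illustrative assumption; only the 'PIRRT/SLED' grouping is stated in the text. Word-boundary anchors are used so that short abbreviations such as "PD" do not match inside longer words.

```python
import re

# Hypothetical subset of the search terms, mapped to a canonical rrt_modality.
# Only the 'PIRRT/SLED' grouping comes from the text; the rest is illustrative.
TERM_TO_MODALITY = {
    "intermittent hemodialysis": "IHD",
    "IHD": "IHD",
    "peritoneal dialysis": "PD",
    "PD": "PD",
    "CRRT": "CRRT",
    "CVVH": "CRRT",
    "CVVHD": "CRRT",
    "CVVHDF": "CRRT",
    "SCUF": "CRRT",
    "PIRRT": "PIRRT/SLED",
    "SLED": "PIRRT/SLED",
    "SLEDD": "PIRRT/SLED",
    "EDD": "PIRRT/SLED",
    "sustained low-efficiency dialysis": "PIRRT/SLED",
    "extended daily dialysis": "PIRRT/SLED",
}


def compile_term(term):
    # \b anchors keep "PD" from matching inside words such as "Update"
    return re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)


TERM_PATTERNS = [(compile_term(t), m) for t, m in TERM_TO_MODALITY.items()]


def modalities_in(label):
    """Return the sorted canonical modalities whose terms occur in the label."""
    return sorted({m for pattern, m in TERM_PATTERNS if pattern.search(label)})


print(modalities_in("CRRT - CVVH"))
print(modalities_in("sled therapy started"))
print(modalities_in("Update pending"))
```

Word boundaries reduce, but do not remove, false positives: an abbreviation such as "IHD" or "EDD" can still carry an unrelated clinical meaning, so matches on bare abbreviations are best reviewed before they enter the lookup table.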
Manufacturer and machine model names are powerful tools for discovering RRT-related identifiers (e.g., itemids) in dictionary tables like mimic.d_items. While the presence of a machine name like "Fresenius" in a clinical note is weak evidence of active RRT on its own, searching for it within structured data dictionary labels is an excellent way to find specific supplies or machine settings that might otherwise be missed.5 This two-step process—using machine names for discovery to populate the lookup table, then using the discovered identifiers for targeted searching—is a core component of this robust strategy.
Search Terms:
- dialysis machine
- dialyzer
- hemofilter
- Fresenius
- Gambro
- Baxter
- Prisma
- Prismaflex
- Aquarius
- NxStage
- Tablo
- 4008
- 5008
- 6008
- DIANOVA
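The discovery step described above can be sketched as follows, assuming an in-memory SQLite copy of a dictionary table shaped like `d_items`. The itemids and labels in the toy rows are invented for illustration and do not come from MIMIC-IV; `discover_itemids` is a hypothetical helper name.

```python
import sqlite3

MACHINE_TERMS = ["Prismaflex", "Prisma", "NxStage", "Fresenius", "hemofilter"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE d_items (itemid INTEGER PRIMARY KEY, label TEXT, linksto TEXT)")
# Toy rows: itemids and labels are invented for illustration only
conn.executemany("INSERT INTO d_items VALUES (?, ?, ?)", [
    (900001, "Prismaflex Filter Pressure", "chartevents"),
    (900002, "NxStage Dialysate Rate", "chartevents"),
    (900003, "Heart Rate", "chartevents"),
])


def discover_itemids(conn, terms):
    """Step 1: collect dictionary rows whose label contains any machine term."""
    found = {}
    for term in terms:
        rows = conn.execute(
            # SQLite LIKE is case-insensitive for ASCII, similar to ILIKE
            "SELECT itemid, label FROM d_items WHERE label LIKE '%' || ? || '%'",
            (term,),
        ).fetchall()
        for itemid, label in rows:
            entry = found.setdefault(itemid, {"label": label, "matched_terms": []})
            entry["matched_terms"].append(term)
    return found


candidates = discover_itemids(conn, MACHINE_TERMS)
for itemid, info in sorted(candidates.items()):
    print(itemid, info["label"], info["matched_terms"])
```

Step 2 then uses only the reviewed itemids for targeted queries against the event tables, so the machine names themselves never act as direct evidence of RRT.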
The administration of fluids used exclusively for RRT constitutes strong evidence of therapy. This is particularly true for CRRT, which requires large volumes of sterile replacement fluid and dialysate.9 The evolution of CRRT solutions from lactate-based to bicarbonate-based, and the more recent introduction of phosphate-containing solutions, provides highly specific search terms.9 For example, a search for a modern CRRT solution like "Phoxilium" not only confirms RRT but also suggests a specific, modern modality. Similarly, a documented infusion of "trisodium citrate" running concurrently with high-volume fluid exchange is a near-certain indicator of CRRT with regional citrate anticoagulation, a common practice in the ICU.10
Search Terms:
- dialysate
- dialysis fluid
- hemodialysis solution
- replacement fluid
- substitution solution
- effluent
- ultrafiltrate
- anticoagulation
- citrate
- trisodium citrate
- regional citrate
- heparin
- Prismasate
- Prismocal
- Phoxilium
- Hemosol
- Nutrineal
- bibag
The type of vascular access provides crucial context, particularly for differentiating between acute and chronic RRT. For this study's objective of excluding patients with RRT initiated within 24 hours of ICU admission, terms related to the insertion of temporary catheters are of the highest value. The presence of a mature arteriovenous (AV) fistula or graft signifies chronic end-stage renal disease (ESRD) and is not, by itself, evidence of a new RRT initiation event in the ICU.11 In contrast, a procedure note or a procedureevents entry for the insertion of a non-tunneled catheter (e.g., vas-cath) is a powerful leading indicator that acute RRT is being planned or initiated. This distinction should be captured in the final lookup table, for instance, by assigning different rrt_modality categories such as 'Access_Insertion_Acute' versus 'Access_Chronic'.
Search Terms:
- dialysis catheter
- hemodialysis catheter
- vas-cath
- vascath
- permacath
- permcath
- trialysis
- Quinton catheter
- Mahurkar catheter
- Tesio catheter
- arteriovenous fistula
- AV fistula
- AVF
- arteriovenous graft
- AV graft
- AVG
- PD catheter
- peritoneal catheter
- Tenckhoff catheter
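A small sketch of the acute versus chronic split, assuming only the assignments the text states explicitly: non-tunneled catheters (vas-cath) count as acute, and AV fistulas or grafts count as chronic. The 'Access_Unclassified' label and the function names are hypothetical; the remaining catheter terms are deliberately left unclassified here because the text does not assign them to either group.

```python
import re

# Assignments taken from the text; all other access terms stay unclassified
ACUTE_TERMS = ("vas-cath", "vascath")
CHRONIC_TERMS = (
    "arteriovenous fistula", "AV fistula", "AVF",
    "arteriovenous graft", "AV graft", "AVG",
)


def _contains_any(terms, text):
    # Word boundaries keep "AVF" or "AVG" from matching inside longer tokens
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE)
        for term in terms
    )


def classify_access(text):
    """Map an access-related label or note fragment to an rrt_modality category."""
    if _contains_any(ACUTE_TERMS, text):
        return "Access_Insertion_Acute"
    if _contains_any(CHRONIC_TERMS, text):
        return "Access_Chronic"
    return "Access_Unclassified"  # hypothetical label: needs manual review


print(classify_access("Insertion of R IJ vas-cath"))
print(classify_access("Left arm AV fistula, mature"))
print(classify_access("Tenckhoff catheter in place"))
```

Checking the acute terms first means a note that mentions both an existing fistula and a new vas-cath is treated as a possible acute initiation, which is the conservative choice for an exclusion criterion.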
This section details the specific, actionable strategies for querying each available ontology to systematically build a comprehensive set of RRT-related concepts.
SNOMED CT provides the most comprehensive, logically structured hierarchy of clinical procedures, making it the ideal starting point for identifying RRT modalities.13 A recursive traversal from well-chosen parent concepts is the most robust method to capture all procedural variations. The available database schema confirms the presence of the necessary tables: snomed_concept, snomed_description, and snomed_relationship.1
The strategy involves two steps:
- Identify Seed Concepts: Rather than starting from a single, overly broad term, this strategy uses a curated list of high-level concepts for the three main RRT families. This ensures comprehensive yet relevant traversal of the hierarchy.
- Recursive Traversal: A recursive Common Table Expression (CTE) in SQL will be used to query the snomed_relationship table. The query will start with the seed concepts and iteratively find all descendant concepts linked to them by an "Is a" relationship (type_id = 116680003).
Table B1: SNOMED CT Seed Concepts for RRT
Modality Family | Seed Concept ID | Preferred Term | Rationale |
---|---|---|---|
Hemodialysis (Intermittent & Hybrid) | 399668008 | Hemodialysis procedure (procedure) | Captures all forms of IHD, SLED, PIRRT, etc., under a single parent. |
Peritoneal Dialysis | 43075005 | Peritoneal dialysis procedure (procedure) | Top-level concept for all PD-related procedures (e.g., CAPD, CCPD).14 |
Continuous RRT | 714749008 | Continuous renal replacement therapy (procedure) | The specific parent for CVVH, CVVHD, CVVHDF, and SCUF.14 |
General Dialysis | 108241001 | Dialysis procedure (procedure) | A broader parent to be used as a catch-all, with careful review of its children to exclude non-RRT concepts.14 |
SQL Snippet: Recursive SNOMED CT Traversal
SQL
WITH RECURSIVE rrt_hierarchy AS (
-- 1. Anchor Members: Start with our chosen seed concepts
SELECT
c.id AS concept_id,
d.term AS concept_name,
c.id AS base_concept_id,
d.term AS base_concept_name,
0 AS level
FROM snomed_ct.snomed_concept c
JOIN snomed_ct.snomed_description d ON c.id = d.concept_id
WHERE c.id IN (399668008, 43075005, 714749008, 108241001) -- Seed concepts from Table B1
AND c.active = TRUE
AND d.active = TRUE
AND d.type_id = 900000000000003001 -- Fully Specified Name
UNION ALL
-- 2. Recursive Step: Find all children of the concepts found so far
SELECT
r.source_id AS concept_id,
d.term AS concept_name,
rh.base_concept_id,
rh.base_concept_name,
rh.level + 1
FROM snomed_ct.snomed_relationship r
JOIN rrt_hierarchy rh ON r.destination_id = rh.concept_id
JOIN snomed_ct.snomed_description d ON r.source_id = d.concept_id
WHERE r.type_id = 116680003 -- 'Is a' relationship
AND r.active = TRUE
AND d.active = TRUE
AND d.type_id = 900000000000003001
)
SELECT DISTINCT concept_id, concept_name, base_concept_id, base_concept_name
FROM rrt_hierarchy;
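The traversal logic can be checked on a toy graph before it is run against the full SNOMED CT tables. The sketch below assumes an in-memory SQLite table with invented concept ids (1 to 6) and a simplified `relationship` table; it uses `UNION` rather than `UNION ALL` in the recursive step, a variant that drops repeated concepts as they are found, which matters because a concept with two parents is reached twice.

```python
import sqlite3

IS_A = 116680003  # SNOMED CT 'Is a' relationship type, as in the query above

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relationship (source_id INTEGER, destination_id INTEGER, "
    "type_id INTEGER, active INTEGER)"
)
# Toy graph with invented concept ids: concept 4 has two parents (2 and 3)
conn.executemany("INSERT INTO relationship VALUES (?, ?, ?, ?)", [
    (2, 1, IS_A, 1),
    (3, 1, IS_A, 1),
    (4, 2, IS_A, 1),
    (4, 3, IS_A, 1),
    (5, 4, IS_A, 0),  # inactive relationship: must be ignored
    (6, 1, 999, 1),   # not an 'Is a' relationship: must be ignored
])

DESCENDANTS_SQL = """
WITH RECURSIVE h(concept_id) AS (
    SELECT ?
    UNION
    SELECT r.source_id
    FROM relationship r
    JOIN h ON r.destination_id = h.concept_id
    WHERE r.type_id = ? AND r.active = 1
)
SELECT concept_id FROM h ORDER BY concept_id
"""


def descendants(conn, seed):
    """Return the seed concept plus all active 'Is a' descendants."""
    return [row[0] for row in conn.execute(DESCENDANTS_SQL, (seed, IS_A))]


print(descendants(conn, 1))
```

With `UNION ALL`, as in the query above, concept 4 would appear once per path and the final `SELECT DISTINCT` removes the repeats; both forms return the same set of concepts.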
ICD codes are essential for capturing RRT evidence from billing and administrative data sources, such as mimic.procedures_icd.1 It is necessary to identify both procedure codes (ICD-9-CM, ICD-10-PCS) and diagnosis/status codes (ICD-10-CM).
ICD-10-PCS (Procedures): The key to identifying RRT procedures in ICD-10-PCS lies in the root operation. Most RRT procedures fall under Section 5 (Extracorporeal or Systemic Assistance and Performance). Specifically, the table 5A1D corresponds to the Performance of Urinary system function.15 Within this table, the 5th character, which specifies duration, is critical for distinguishing between intermittent, prolonged intermittent, and continuous modalities.18
ICD-10-CM (Diagnoses): These codes do not represent procedures but rather patient status or encounters for care. Z codes are particularly important. For instance, Z99.2 indicates a chronic dependence on dialysis, while Z49.x codes specify an encounter for dialysis care, providing strong circumstantial evidence.19
Mapping SNOMED CT to ICD: While the snomed_ct.snomed_icd_map table exists, a more comprehensive and robust strategy involves leveraging the UMLS Metathesaurus (umls.mrconso) as a central mapping hub.21 The mrconso table links concepts (via a Concept Unique Identifier, or CUI) to their representations in dozens of source vocabularies (identified by the SAB column).23 The recommended mapping workflow is:
- Take a SNOMED CT concept ID from the hierarchy traversal.
- Find its corresponding CUI in umls.mrconso where SAB = 'SNOMEDCT_US'.
- Use that CUI to find all associated rows where SAB is 'ICD10PCS', 'ICD10CM', or 'ICD9CM'.
This approach leverages the full semantic linking power of the UMLS, which is superior to relying on a single, potentially less current, direct map file.
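The three-step workflow above amounts to a self-join of mrconso on CUI. The sketch below assumes an in-memory SQLite table with the MRCONSO columns CUI, SAB, CODE, and STR; the CUIs are invented, and the pairing of the SNOMED CT concept with an ICD-10-PCS code under one CUI is a toy assumption made only to exercise the join. `map_to_icd` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mrconso (cui TEXT, sab TEXT, code TEXT, str TEXT)")
# Toy rows: CUIs are invented, and the shared CUI in the first two rows is
# an illustrative assumption, not a statement about real UMLS content
conn.executemany("INSERT INTO mrconso VALUES (?, ?, ?, ?)", [
    ("C9000001", "SNOMEDCT_US", "714749008", "Continuous renal replacement therapy"),
    ("C9000001", "ICD10PCS", "5A1D90Z", "Performance of Urinary Filtration, Continuous"),
    ("C9000002", "SNOMEDCT_US", "108241001", "Dialysis procedure"),
])

CROSSWALK_SQL = """
SELECT DISTINCT t.sab, t.code
FROM mrconso s
JOIN mrconso t ON t.cui = s.cui
WHERE s.sab = 'SNOMEDCT_US'
  AND s.code = ?
  AND t.sab IN ('ICD10PCS', 'ICD10CM', 'ICD9CM')
ORDER BY t.sab, t.code
"""


def map_to_icd(conn, snomed_id):
    """Return (SAB, code) pairs that share a CUI with the SNOMED CT concept."""
    return conn.execute(CROSSWALK_SQL, (snomed_id,)).fetchall()


print(map_to_icd(conn, "714749008"))
print(map_to_icd(conn, "108241001"))
```

A concept that returns no rows, like the second call here, simply has no ICD atom under the same CUI; such concepts can still be covered by the direct map table or by the curated code lists in Tables B2 and B3.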
Table B2: Key ICD-10-PCS Codes for RRT
Code | Description | RRT Modality | Source |
---|---|---|---|
5A1D70Z | Performance of Urinary Filtration, Intermittent, Less than 6 Hours Per Day | IHD | 18 |
5A1D80Z | Performance of Urinary Filtration, Prolonged Intermittent, 6-18 hours Per Day | PIRRT/SLED | 18 |
5A1D90Z | Performance of Urinary Filtration, Continuous, Greater than 18 hours Per Day | CRRT | 18 |
3E1M39Z | Irrigation of Peritoneal Cavity using Dialysate, Percutaneous Approach | PD | 26 |
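Table B2 can be expressed as a small decoder that reads the duration character of a 5A1D code. This is a sketch under the assumption that only the four codes in the table are in scope; `modality_from_pcs` is a hypothetical helper name, and any other code returns `None` rather than a guess.

```python
# 5th character (duration) of 5A1D codes, per Table B2
DURATION_TO_MODALITY = {"7": "IHD", "8": "PIRRT/SLED", "9": "CRRT"}


def modality_from_pcs(code):
    """Return the rrt_modality for an ICD-10-PCS code in Table B2, else None."""
    code = code.strip().upper()
    if code == "3E1M39Z":
        return "PD"
    # 5A1D + duration character + '0' (Filtration) + 'Z' (no qualifier)
    if len(code) == 7 and code.startswith("5A1D") and code[5:] == "0Z":
        return DURATION_TO_MODALITY.get(code[4])
    return None


for example in ("5A1D70Z", "5a1d80z", "5A1D90Z", "3E1M39Z", "0DB64Z3"):
    print(example, modality_from_pcs(example))
```

Returning `None` for unlisted codes keeps the decoder from silently assigning a modality to a code that has not been reviewed.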
Table B3: Key ICD-10-CM Codes for RRT
Code | Description | Implication | Source |
---|---|---|---|
Z99.2 | Dependence on renal dialysis | Chronic RRT status | 20 |
Z49.01 | Encounter for fitting and adjustment of extracorporeal dialysis catheter | Active RRT care | 19 |
Z49.31 | Encounter for adequacy testing for hemodialysis | Active RRT care | 29 |
N18.6 | End-stage renal disease | Chronic status, often implies RRT (use with Z99.2) | 19 |
LOINC is essential for identifying RRT from charted data (e.g., in mimic.chartevents), which represents a distinct and powerful evidence stream separate from procedures or billing. Machine settings such as blood flow rates and dialysate flow rates, as well as circuit pressures, are definitive proof that an RRT circuit is active and running.
The strategy is to identify key LOINC panels and individual terms related to RRT machine parameters. The LOINC panel 99707-2 | Continuous renal replacement therapy panel is a primary target, as it contains a wealth of relevant child codes for CRRT.31
Table B4: Key LOINC Codes for RRT Machine Parameters & Observations
LOINC | Long Common Name | Evidence Type | Source |
---|---|---|---|
99711-4 | Blood flow rate Renal replacement therapy circuit | Machine Setting | 31 |
99712-2 | Dialysate flow rate Renal replacement therapy circuit | Machine Setting | 31 |
99713-0 | Post-filter replacement fluid rate Renal replacement therapy circuit | Machine Setting | 31 |
99720-5 | Transmembrane pressure Renal replacement therapy circuit | Machine Setting | 31 |
99718-9 | Effluent pressure Renal replacement therapy circuit | Machine Setting | 31 |
99735-3 | Ultrafiltrate volume removed 1 hour | Fluid Balance | 31 |
99708-0 | Continuous renal replacement therapy mode | Modality Specifier | 31 |
LL6149-0 | (Answer List) CRRT mode | Modality Specifier | 34 |
83064-6 | Calcium.ionized [Moles/volume] in Blood drawn from CRRT circuit | Lab (Anticoagulation) | 35 |
The UMLS Semantic Network provides a powerful mechanism to filter lexically-derived concepts, ensuring they are of the correct semantic type (e.g., a procedure, a device) and not a homonym from an unrelated domain. A broad lexical search for a term like "graft" could return concepts related to surgery, botany, or politics; filtering by semantic type ensures only medically relevant concepts are retained.
The strategy is to take the candidate terms and concepts from lexical searches, find their CUIs in umls.mrconso, and then query the umls.mrsty table to retrieve their assigned Semantic Types (TUIs). Only concepts with TUIs from a predefined, curated list will be retained. This step dramatically increases the precision of the lexical search component. Based on the project's needs to identify procedures, devices, and fluids, the following semantic types are most relevant.36
Table B5: Key UMLS Semantic Types (TUIs) for RRT Filtering
TUI | Semantic Type Name | Application to RRT | Source |
---|---|---|---|
T061 | Therapeutic or Preventive Procedure | Core RRT procedures (hemodialysis, CVVH, etc.) | 37 |
T074 | Medical Device | Dialysis machines, dialyzers, circuits | 36 |
T203 | Drug Delivery Device | Dialysis catheters, ports | 42 |
T121 | Pharmacologic Substance | Dialysate, replacement fluid, citrate, heparin | 36 |
T034 | Laboratory or Test Result | Charted values like effluent rate, pressures | 39 |
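The semantic-type filter reduces to a set membership test once the (CUI, TUI) pairs have been read from umls.mrsty. The sketch below uses the TUIs from Table B5; the CUIs are invented, and T002 (Plant) and T170 (Intellectual Product) stand in for out-of-scope semantic types. `filter_by_semantic_type` is a hypothetical helper name.

```python
# Allowlist taken from Table B5
ALLOWED_TUIS = {"T061", "T074", "T203", "T121", "T034"}


def filter_by_semantic_type(candidate_cuis, mrsty_rows):
    """Keep candidate CUIs that have at least one allowed TUI.

    mrsty_rows is an iterable of (cui, tui) pairs, e.g. selected from umls.mrsty.
    """
    allowed_cuis = {cui for cui, tui in mrsty_rows if tui in ALLOWED_TUIS}
    return [cui for cui in candidate_cuis if cui in allowed_cuis]


# Toy rows: CUIs are invented for illustration
mrsty_rows = [
    ("C9000010", "T061"),  # a procedure concept: kept
    ("C9000011", "T002"),  # a plant concept (e.g. a botanical "graft"): dropped
    ("C9000012", "T074"),  # a device concept with a second, irrelevant type: kept
    ("C9000012", "T170"),
]
candidates = ["C9000010", "C9000011", "C9000012", "C9000013"]
print(filter_by_semantic_type(candidates, mrsty_rows))
```

A CUI with no row in mrsty at all, like the last candidate, is dropped as well, so the filter fails closed.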
This final section provides the implementation blueprint for consolidating all identified concepts and mapping them to the specific identifiers in the MIMIC-IV and eICU-CRD databases, culminating in the creation of the final lookup table.
The first step is to create a single staging table that aggregates the concepts and identifiers gathered from the four sources in Part B (SNOMED CT, ICD, LOINC, and Lexical/UMLS-filtered). The streams are combined with UNION ALL, creating a master list of all potential RRT-related concepts. Because UNION ALL does not remove duplicates, a concept found through more than one stream appears once per source, which preserves its provenance for the source_ontologies column of the final table.
SQL Logic Outline: Staging Table Creation
SQL
CREATE TABLE saaki.rrt_concepts_staging AS (
-- 1. SNOMED-CT derived procedure concepts
SELECT
concept_id::VARCHAR AS concept_code,
'SNOMED_CT' AS source_ontology,
concept_name,
'Procedure' AS concept_type
FROM rrt_snomed_hierarchy -- Materialized result of the recursive query in Part B (the rrt_hierarchy CTE)
UNION ALL
-- 2. ICD-10-PCS derived procedure codes
SELECT
icd_code,
'ICD10_PCS' AS source_ontology,
long_title,
'Procedure' AS concept_type
FROM mimic.d_icd_procedures
WHERE icd_version = 10 -- d_icd_procedures holds both ICD-9 and ICD-10 codes
AND icd_code IN ('5A1D70Z', '5A1D80Z', '5A1D90Z', '3E1M39Z') -- From Table B2
UNION ALL
-- 3. LOINC-derived machine parameter concepts
SELECT
loinc_num,
'LOINC' AS source_ontology,
long_common_name,
'Machine Setting' AS concept_type
FROM loinc.loinc
WHERE loinc_num IN ('99711-4', '99712-2', '99720-5', '99708-0') -- From Table B4
UNION ALL
-- 4. Lexically-derived fluid and device concepts (after UMLS filtering)
SELECT
term,
'Lexical' AS source_ontology,
term,
'Fluid' AS concept_type
FROM lexical_fluids_table -- A temporary table of lexically-found fluids
);
This is the most critical implementation step: linking the abstract concepts from the staging table to the concrete identifiers used in the EHR databases. The database DDL provides the map for this process.1
MIMIC-IV Mapping:
- itemid Mapping: The primary task is to map the concept_name from the staging table to the mimic.d_items.label column to get the relevant itemids for procedureevents, inputevents, and chartevents. A simple equality join will fail due to variations in clinical terminology (e.g., SNOMED CT's "Continuous venovenous hemofiltration" vs. a likely d_items label of "CRRT - CVVH"). Therefore, a multi-pass, pattern-based joining strategy is required. This involves attempting a direct join on cleaned, lowercased strings, followed by generating ILIKE patterns from both full names and abbreviations (e.g., label ILIKE '%cvvh%' or label ILIKE '%hemofiltration%') for any concepts that remain unmatched.
- ICD Code Mapping: This is a straightforward join between the list of identified ICD codes and the mimic.procedures_icd.icd_code column.
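The multi-pass itemid mapping can be prototyped in memory before it is written as SQL. The sketch below assumes toy `d_items` rows with invented itemids and uses word-boundary matching in the second pass, a stricter variant of the `ILIKE '%...%'` patterns described above; `normalize` and `match_concepts` are hypothetical helper names.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation so 'CRRT - CVVH' becomes 'crrt cvvh'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_concepts(concepts, d_items):
    """Map each concept name to (pass_name, itemids).

    concepts: list of (concept_name, [abbreviations])
    d_items:  list of (itemid, label)
    """
    by_label = {}
    for itemid, label in d_items:
        by_label.setdefault(normalize(label), []).append(itemid)

    matches = {}
    for name, abbreviations in concepts:
        # Pass 1: direct join on cleaned, lowercased strings
        hits, how = by_label.get(normalize(name), []), "exact"
        if not hits:
            # Pass 2: full name or abbreviation found as whole words in the label
            keys = [normalize(name)] + [normalize(a) for a in abbreviations]
            hits = [
                itemid for itemid, label in d_items
                if any(re.search(r"\b" + re.escape(k) + r"\b", normalize(label))
                       for k in keys)
            ]
            how = "pattern"
        if hits:
            matches[name] = (how, sorted(hits))
    return matches


# Toy dictionary rows: itemids are invented for illustration
d_items = [(900010, "CRRT - CVVH"), (900011, "Peritoneal Dialysis"), (900012, "Heart Rate")]
concepts = [
    ("Continuous venovenous hemofiltration", ["CVVH"]),
    ("Peritoneal dialysis", ["PD"]),
    ("Hemoperfusion", []),
]
result = match_concepts(concepts, d_items)
print(result)
```

Recording which pass produced each match gives reviewers a simple way to prioritize: pattern matches deserve a manual look, while exact matches rarely do.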
eICU-CRD Mapping:
- treatmentstring Mapping: The hierarchical nature of this column (e.g., renal|dialysis|CRRT) is a major advantage. Specific and efficient ILIKE patterns such as '%renal|dialysis%' can be generated to capture broad categories, with more specific patterns like '%renal|dialysis|crrt%' for individual modalities.
- intakeoutput Mapping: Similar ILIKE patterns will be used on the celllabel and cellpath columns to find terms indicative of RRT, such as 'dialysis', 'crrt', 'effluent', or 'ultrafiltrate'.
- Direct Flag: The eicu.apacheapsvar.dialysis = 1 flag is a high-confidence, direct identifier that requires no complex mapping and serves as an excellent validation point.
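The pattern-based eICU mapping can be tested offline by translating each ILIKE pattern into a regular expression. This is a sketch, assuming the two patterns named above and toy treatment strings that follow the `renal|dialysis|CRRT` shape; the strings are illustrative, not extracted from eICU-CRD, and `ilike_to_regex` and `labels_for` are hypothetical helper names.

```python
import re


def ilike_to_regex(pattern):
    """Translate a SQL ILIKE pattern (% and _ wildcards) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # '|' must be escaped to stay literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


# Patterns from the text, mapped to an rrt_modality
EICU_PATTERNS = {
    "%renal|dialysis%": "RRT_Generic",
    "%renal|dialysis|crrt%": "CRRT",
}
COMPILED = [(ilike_to_regex(p), m) for p, m in EICU_PATTERNS.items()]


def labels_for(treatmentstring):
    """Return the sorted modalities whose pattern matches the whole string."""
    return sorted({m for rx, m in COMPILED if rx.fullmatch(treatmentstring)})


print(labels_for("renal|dialysis|CRRT"))
print(labels_for("renal|dialysis|hemodialysis"))
print(labels_for("cardiovascular|shock|vasopressors"))
```

Because the broad pattern also matches every specific string, a row can carry both a generic and a specific label; downstream logic should prefer the most specific modality available.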
The final step is to consolidate all mapped identifiers from both MIMIC-IV and eICU-CRD into the final saaki.rrt_lookup table. This table will serve as the definitive value set for the cohort definition.
Table C1: Final saaki.rrt_lookup Table Schema
Column Name | Data Type | Description | Example |
---|---|---|---|
identifier_type | VARCHAR | The type of identifier, specifying the database, table, and column. | MIMIC_ITEMID, ICD10_PCS, EICU_TREATMENT_STRING_PATTERN, EICU_DIALYSIS_FLAG |
identifier_value | VARCHAR | The actual value of the identifier. For patterns, this will be an ILIKE string. | 225802, 5A1D90Z, %renal\|dialysis% |
rrt_modality | VARCHAR | A standardized category for the RRT type or evidence stream. | CRRT, IHD, PD, PIRRT/SLED, RRT_Generic, Machine_Setting, Fluid, Access_Insertion_Acute, Access_Chronic |
source_ontologies | TEXT[] | An array of ontologies or methods that led to this identifier's inclusion. | {'SNOMED_CT', 'Lexical'}, {'ICD10PCS'} |
concept_name | VARCHAR | The preferred name of the source concept from the ontology or lexical list. | Continuous venovenous hemodialysis, Performance of Urinary Filtration, Continuous |
Final SQL Logic Outline: Populating the Lookup Table
SQL
CREATE TABLE saaki.rrt_lookup (
    identifier_type VARCHAR(50) NOT NULL,
    identifier_value VARCHAR(255) NOT NULL,
    rrt_modality VARCHAR(50),
    source_ontologies TEXT[], -- array type, so the ARRAY[...] values below are valid
    concept_name VARCHAR(255),
    PRIMARY KEY (identifier_type, identifier_value)
);
-- Example INSERT statements for different evidence streams
INSERT INTO saaki.rrt_lookup (identifier_type, identifier_value, rrt_modality, source_ontologies, concept_name)
VALUES
-- MIMIC ItemID for a CRRT procedure
('MIMIC_ITEMID', '225802', 'CRRT', ARRAY['SNOMED_CT', 'Lexical'], 'Continuous venovenous hemodialysis'),
-- MIMIC ICD-10-PCS code for IHD
('ICD10_PCS', '5A1D70Z', 'IHD', ARRAY['ICD10PCS'], 'Performance of Urinary Filtration, Intermittent, Less than 6 Hours Per Day'),
-- eICU treatment string pattern for any dialysis
('EICU_TREATMENT_STRING_PATTERN', '%renal|dialysis%', 'RRT_Generic', ARRAY['Lexical'], 'Renal Dialysis'),
-- eICU direct flag for dialysis (a direct database flag, so no ontology source is recorded)
('EICU_DIALYSIS_FLAG', '1', 'RRT_Generic', ARRAY[]::TEXT[], 'APACHE APS Dialysis Flag'),
-- MIMIC ItemID for a CRRT machine setting
('MIMIC_ITEMID', '224149', 'Machine_Setting', ARRAY['LOINC', 'Lexical'], 'Access Pressure');
--... additional INSERTs for all identified mappings...
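One way the lookup table might be applied to the exclusion criterion is sketched below, assuming in-memory SQLite stand-ins for `rrt_lookup`, `icustays`, and `procedureevents` with only the columns the query needs. The stays, timestamps, and the itemid 900099 are invented; the 24-hour window starting at `intime` is an assumption about how the criterion would be operationalized.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rrt_lookup (identifier_type TEXT, identifier_value TEXT, rrt_modality TEXT);
CREATE TABLE icustays (stay_id INTEGER, intime TEXT);
CREATE TABLE procedureevents (stay_id INTEGER, itemid INTEGER, starttime TEXT);

INSERT INTO rrt_lookup VALUES ('MIMIC_ITEMID', '225802', 'CRRT');

-- Toy stays, all admitted at the same invented time
INSERT INTO icustays VALUES
  (1, '2150-01-01 08:00:00'),
  (2, '2150-01-01 08:00:00'),
  (3, '2150-01-01 08:00:00');

INSERT INTO procedureevents VALUES
  (1, 225802, '2150-01-01 20:00:00'),  -- 12 h after admission: early RRT
  (2, 225802, '2150-01-03 09:00:00'),  -- 49 h after admission: outside window
  (3, 900099, '2150-01-01 09:00:00');  -- invented itemid, not in the lookup
""")

EARLY_RRT_SQL = """
SELECT DISTINCT i.stay_id
FROM icustays i
JOIN procedureevents p ON p.stay_id = i.stay_id
JOIN rrt_lookup l
  ON l.identifier_type = 'MIMIC_ITEMID'
 AND l.identifier_value = CAST(p.itemid AS TEXT)  -- lookup values are stored as text
WHERE p.starttime >= i.intime
  AND p.starttime < datetime(i.intime, '+24 hours')
ORDER BY i.stay_id
"""

early_rrt_stays = [row[0] for row in conn.execute(EARLY_RRT_SQL)]
print(early_rrt_stays)
```

Storing every identifier as text in `identifier_value` is what makes the cast in the join necessary; in exchange, item ids, ICD codes, and ILIKE patterns can share one table and one primary key.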
The accurate identification of Renal Replacement Therapy is a foundational requirement for credible research using large critical care databases. The fragmented and non-standardized nature of RRT documentation presents a significant methodological challenge. The multi-modal, ontology-driven strategy detailed in this report provides a robust and comprehensive solution to this challenge.
By systematically triangulating evidence from lexical searches, procedural hierarchies in SNOMED CT, administrative codes in ICD, and observational data in LOINC, this approach ensures maximum sensitivity in detecting RRT events. The subsequent use of the UMLS Semantic Network to filter results and the careful mapping to specific EHR data fields ensures high precision. The final lookup table, saaki.rrt_lookup, will not only be an exhaustive list of identifiers but will also be enriched with metadata detailing the modality, evidence type, and ontological source of each entry. This provides the necessary granularity for nuanced analysis and serves as a durable, reproducible asset for defining the SA-AKI cohort and for future renal-related research. The implementation of this strategy will significantly enhance the validity and reliability of study findings by minimizing patient misclassification at the cohort definition stage.