Generating a Comprehensive, Multi-Modal, Ontology-Driven Value Set for Renal Replacement Therapy (RRT) Identification in Critical Care Databases

Introduction and Strategic Overview

This document outlines a comprehensive, multi-modal strategy for creating a definitive value set that identifies all instances of Renal Replacement Therapy (RRT) within the MIMIC-IV and eICU-CRD databases. Accurate and exhaustive identification of RRT is paramount for the integrity of the Sepsis-Associated Acute Kidney Injury (SA-AKI) cohort: RRT initiated within the first 24 hours of ICU admission fundamentally alters the natural history of the disease, so it serves as a critical exclusion criterion.

The primary challenge in this task is the fragmented nature of RRT evidence in electronic health records (EHRs). RRT is not documented in a single, standardized field but is scattered across disparate data sources, including procedure records, charted machine parameters, specific fluid and medication inputs, and administrative billing codes.1 A simplistic approach, such as a hardcoded list of procedure codes or item identifiers, will almost certainly be incomplete and miss part of the spectrum of RRT modalities, leading to patient misclassification and compromised study validity.

The strategy detailed herein is founded on the principle of triangulation. It leverages multiple, independent modalities of evidence—lexical, procedural, observational, and administrative—and queries multiple ontological sources (SNOMED CT, ICD, LOINC, UMLS) to maximize sensitivity, ensuring all true RRT events are captured. Simultaneously, this approach provides the rich metadata necessary to maintain high precision by allowing researchers to filter results based on the type and strength of the evidence. This robust methodology will produce a definitive RRT value set, forming a reliable foundation for cohort definition and subsequent clinical analysis.

Part A: An Exhaustive List of Lexical Search Terms

This section establishes the lexical foundation for the entire identification process. These case-insensitive terms and phrases will be used to query free-text columns (e.g., mimic.d_items.label, eicu.treatment.treatmentstring) and to identify initial seed concepts within the available ontologies.

Core Procedure Names and Modalities

This list captures the formal names, common clinical shorthand, and international spelling variations for all major RRT modalities. This is the primary set of terms for identifying documented procedures. The diversity of terminology used in clinical practice is extensive, encompassing intermittent, continuous, and hybrid therapies.2

A significant challenge arises from the interchangeable use of terms for "hybrid" therapies. Modalities such as Prolonged Intermittent Renal Replacement Therapy (PIRRT), Sustained Low-Efficiency Dialysis (SLED), Slow Low-Efficiency Daily Dialysis (SLEDD), and Extended Daily Dialysis (EDD) are often used synonymously in clinical practice to describe therapies that are longer in duration than conventional intermittent hemodialysis (IHD) but are not continuous.3 For the purpose of identifying RRT initiation, the fine-grained distinction between these hybrid modalities is less critical than correctly classifying them as a group separate from standard IHD or CRRT. Therefore, the final lookup table should map all these lexical variants to a single, canonical rrt_modality, such as 'PIRRT/SLED', to simplify analysis and avoid creating artificial distinctions that the source data cannot reliably support.

Search Terms:

renal replacement therapy
hemodialysis
haemodialysis
intermittent hemodialysis
IHD
peritoneal dialysis
PD
continuous renal replacement therapy
CRRT
continuous veno-venous hemofiltration
CVVH
continuous venovenous hemofiltration
continuous veno-venous hemodialysis
CVVHD
continuous venovenous hemodialysis
continuous veno-venous hemodiafiltration
CVVHDF
continuous venovenous hemodiafiltration
slow continuous ultrafiltration
SCUF
ultrafiltration
hemofiltration
hemodiafiltration
prolonged intermittent renal replacement therapy
PIRRT
sustained low-efficiency dialysis
SLED
slow low-efficiency daily dialysis
SLEDD
extended daily dialysis
EDD
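
The mapping of lexical variants to one canonical rrt_modality can be sketched as follows. This is a minimal illustration, not the project's implementation: the dictionary covers only part of the list above, and the names MODALITY_BY_TERM and canonical_modality are hypothetical. Grouping SCUF and the CVVH variants under 'CRRT' follows the SNOMED CT seed table in Part B.

```python
import re

# Canonical rrt_modality for a subset of the search terms above.
# Hybrid therapies collapse to 'PIRRT/SLED', as the text recommends.
MODALITY_BY_TERM = {
    "intermittent hemodialysis": "IHD",
    "ihd": "IHD",
    "peritoneal dialysis": "PD",
    "pd": "PD",
    "continuous renal replacement therapy": "CRRT",
    "crrt": "CRRT",
    "cvvh": "CRRT",
    "cvvhd": "CRRT",
    "cvvhdf": "CRRT",
    "scuf": "CRRT",
    "prolonged intermittent renal replacement therapy": "PIRRT/SLED",
    "pirrt": "PIRRT/SLED",
    "sustained low-efficiency dialysis": "PIRRT/SLED",
    "sled": "PIRRT/SLED",
    "sledd": "PIRRT/SLED",
    "extended daily dialysis": "PIRRT/SLED",
    "edd": "PIRRT/SLED",
}


def _pattern(term):
    # Word boundaries stop short abbreviations such as 'pd' or 'edd'
    # from matching inside unrelated words.
    return re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)


# Longest terms first, so the most specific phrase is tried before its parts.
_PATTERNS = [
    (_pattern(term), modality)
    for term, modality in sorted(MODALITY_BY_TERM.items(), key=lambda kv: -len(kv[0]))
]


def canonical_modality(text):
    """Return the canonical modality of the first matching term, else None."""
    for pattern, modality in _PATTERNS:
        if pattern.search(text):
            return modality
    return None
```

Word-boundary matching removes substring accidents, but two- and three-letter abbreviations such as PD or EDD can still carry unrelated meanings, so hits on them are best treated as candidates for the semantic filtering described in Part B.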

Machine, Device, and Manufacturer Names

Manufacturer and machine model names are powerful tools for discovering RRT-related identifiers (e.g., itemids) in dictionary tables like mimic.d_items. While the presence of a machine name like "Fresenius" in a clinical note is weak evidence of active RRT on its own, searching for it within structured data dictionary labels is an excellent way to find specific supplies or machine settings that might otherwise be missed.5 This two-step process—using machine names for discovery to populate the lookup table, then using the discovered identifiers for targeted searching—is a core component of this robust strategy.

Search Terms:

dialysis machine
dialyzer
hemofilter
Fresenius
Gambro
Baxter
Prisma
Prismaflex
Aquarius
NxStage
Tablo
4008
5008
6008
DIANOVA
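
The two-step process described above (discover identifiers from dictionary labels, then search events by identifier) might look like this in outline. The rows, itemids, and function names are invented for illustration, and only a few of the listed terms are used.

```python
# A few of the machine and device terms from the list above.
MACHINE_TERMS = ["prismaflex", "prisma", "fresenius", "nxstage", "hemofilter", "dialyzer"]


def discover_itemids(dictionary_rows, terms=MACHINE_TERMS):
    """Step 1: return {itemid: label} for dictionary labels containing any term."""
    hits = {}
    for itemid, label in dictionary_rows:
        lowered = label.lower()
        if any(term in lowered for term in terms):
            hits[itemid] = label
    return hits


def events_with_discovered_ids(event_rows, discovered):
    """Step 2: keep only events whose itemid was discovered in step 1."""
    return [row for row in event_rows if row[1] in discovered]


# Hypothetical dictionary and event rows; the itemids are made up.
d_items = [(9001, "Prismaflex Filter Pressure"), (9002, "Heart Rate"), (9003, "Dialyzer Type")]
events = [("stay-1", 9001), ("stay-2", 9002), ("stay-3", 9003)]

found = discover_itemids(d_items)
matched = events_with_discovered_ids(events, found)
```

Numeric model names such as 4008 are left out of this sketch on purpose: as bare substrings they can collide with unrelated numbers, so they probably need manual review of each hit.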

RRT-Specific Fluids, Solutions, and Anticoagulants

The administration of fluids used exclusively for RRT constitutes strong evidence of therapy. This is particularly true for CRRT, which requires large volumes of sterile replacement fluid and dialysate.9 The evolution of CRRT solutions from lactate-based to bicarbonate-based, and the more recent introduction of phosphate-containing solutions, provides highly specific search terms.9 For example, a search for a modern CRRT solution like "Phoxilium" not only confirms RRT but also suggests a specific, modern modality. Similarly, a documented infusion of "trisodium citrate" running concurrently with high-volume fluid exchange is a near-certain indicator of CRRT with regional citrate anticoagulation, a common practice in the ICU.10

Search Terms:

dialysate
dialysis fluid
hemodialysis solution
replacement fluid
substitution solution
effluent
ultrafiltrate
anticoagulation
citrate
trisodium citrate
regional citrate
heparin
Prismasate
Prismocal
Phoxilium
Hemosol
Nutrineal
bibag

Vascular Access Terminology

The type of vascular access provides crucial context, particularly for differentiating between acute and chronic RRT. For this study's objective of excluding patients with RRT initiated within 24 hours of ICU admission, terms related to the insertion of temporary catheters are of the highest value. The presence of a mature arteriovenous (AV) fistula or graft signifies chronic end-stage renal disease (ESRD) and is not, by itself, evidence of a new RRT initiation event in the ICU.11 In contrast, a procedure note or a procedureevents entry for the insertion of a non-tunneled catheter (e.g., vas-cath) is a powerful leading indicator that acute RRT is being planned or initiated. This distinction should be captured in the final lookup table, for instance, by assigning different rrt_modality categories such as 'Access_Insertion_Acute' versus 'Access_Chronic'.

Search Terms:

dialysis catheter
hemodialysis catheter
vas-cath
vascath
permacath
permcath
trialysis
Quinton catheter
Mahurkar catheter
Tesio catheter
arteriovenous fistula
AV fistula
AVF
arteriovenous graft
AV graft
AVG
PD catheter
peritoneal catheter
Tenckhoff catheter

Part B: A Multi-Ontology Querying Strategy

This section details the specific, actionable strategies for querying each available ontology to systematically build a comprehensive set of RRT-related concepts.

SNOMED CT: The Core Procedural Hierarchy

SNOMED CT provides the most comprehensive, logically structured hierarchy of clinical procedures, making it the ideal starting point for identifying RRT modalities.13 A recursive traversal from well-chosen parent concepts is the most robust method to capture all procedural variations. The available database schema confirms the presence of the necessary tables: snomed_concept, snomed_description, and snomed_relationship.1

The strategy involves two steps:

Identify Seed Concepts: Rather than starting from a single, overly broad term, this strategy uses a curated list of high-level concepts for the three main RRT families. This ensures comprehensive yet relevant traversal of the hierarchy.
Recursive Traversal: A recursive Common Table Expression (CTE) in SQL queries the snomed_relationship table. The query starts with the seed concepts and iteratively finds all descendant concepts linked by an 'Is a' relationship (type_id = 116680003).

Table B1: SNOMED CT Seed Concepts for RRT

Modality Family	Seed Concept ID	Preferred Term	Rationale
Hemodialysis (Intermittent & Hybrid)	399668008	Hemodialysis procedure (procedure)	Captures all forms of IHD, SLED, PIRRT, etc., under a single parent.
Peritoneal Dialysis	43075005	Peritoneal dialysis procedure (procedure)	Top-level concept for all PD-related procedures (e.g., CAPD, CCPD).14
Continuous RRT	714749008	Continuous renal replacement therapy (procedure)	The specific parent for CVVH, CVVHD, CVVHDF, and SCUF.14
General Dialysis	108241001	Dialysis procedure (procedure)	A broader parent to be used as a catch-all, with careful review of its children to exclude non-RRT concepts.14

SQL Snippet: Recursive SNOMED CT Traversal

SQL

WITH RECURSIVE rrt_hierarchy AS (
    -- 1. Anchor members: start with the chosen seed concepts
    SELECT
        c.id AS concept_id,
        d.term AS concept_name,
        c.id AS base_concept_id,
        d.term AS base_concept_name,
        0 AS level
    FROM snomed_ct.snomed_concept c
    JOIN snomed_ct.snomed_description d ON c.id = d.concept_id
    WHERE c.id IN (399668008, 43075005, 714749008, 108241001) -- Seed concepts from Table B1
      AND c.active = TRUE
      AND d.active = TRUE
      AND d.type_id = 900000000000003001 -- Fully Specified Name

    UNION ALL

    -- 2. Recursive step: find all children of the concepts found so far
    SELECT
        r.source_id AS concept_id,
        d.term AS concept_name,
        rh.base_concept_id,
        rh.base_concept_name,
        rh.level + 1
    FROM snomed_ct.snomed_relationship r
    JOIN rrt_hierarchy rh ON r.destination_id = rh.concept_id
    JOIN snomed_ct.snomed_description d ON r.source_id = d.concept_id
    WHERE r.type_id = 116680003 -- 'Is a' relationship
      AND r.active = TRUE
      AND d.active = TRUE
      AND d.type_id = 900000000000003001 -- Fully Specified Name
)
SELECT DISTINCT concept_id, concept_name, base_concept_id, base_concept_name
FROM rrt_hierarchy;
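
To see what the traversal returns when a concept has more than one parent, the same idea can be exercised on a toy hierarchy. This is a breadth-first sketch in Python with made-up identifiers, not a port of the query: unlike the SQL, which keeps one row per concept and seed, it records each concept once, under the first seed that reaches it.

```python
from collections import deque


def descendants(seeds, is_a_edges):
    """Return {concept: (seed, depth)} for each seed and everything below it.

    is_a_edges holds (child, parent) pairs, mirroring source_id and
    destination_id of an 'Is a' relationship row.
    """
    children = {}
    for child, parent in is_a_edges:
        children.setdefault(parent, []).append(child)

    reached = {}
    queue = deque((seed, seed, 0) for seed in seeds)
    while queue:
        concept, seed, depth = queue.popleft()
        if concept in reached:  # already visited through another parent
            continue
        reached[concept] = (seed, depth)
        for child in children.get(concept, []):
            queue.append((child, seed, depth + 1))
    return reached


# Toy hierarchy with invented identifiers; 'D' has two parents (B and C).
edges = [("B", "A"), ("C", "A"), ("D", "B"), ("D", "C"), ("E", "D")]
result = descendants(["A"], edges)
```

Here 'D' is reachable along two paths but appears once, which is the role SELECT DISTINCT plays in the query above.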

ICD Procedure and Diagnosis Codes: The Administrative Layer

ICD codes are essential for capturing RRT evidence from billing and administrative data sources, such as mimic.procedures_icd.1 It is necessary to identify both procedure codes (ICD-9-CM, ICD-10-PCS) and diagnosis/status codes (ICD-10-CM).

ICD-10-PCS (Procedures): The key to identifying RRT procedures in ICD-10-PCS lies in the root operation. Most RRT procedures fall under Section 5 (Extracorporeal or Systemic Assistance and Performance). Specifically, the table 5A1D corresponds to the Performance of Urinary system function.15 Within this table, the 5th character, which specifies duration, is critical for distinguishing between intermittent, prolonged intermittent, and continuous modalities.18

ICD-10-CM (Diagnoses): These codes do not represent procedures but rather patient status or encounters for care. Z codes are particularly important. For instance, Z99.2 indicates a chronic dependence on dialysis, while Z49.x codes specify an encounter for dialysis care, providing strong circumstantial evidence.19

Mapping SNOMED CT to ICD: While the snomed_ct.snomed_icd_map table exists, a more comprehensive and robust strategy involves leveraging the UMLS Metathesaurus (umls.mrconso) as a central mapping hub.21 The mrconso table links concepts (via a Concept Unique Identifier, or CUI) to their representations in dozens of source vocabularies (identified by the SAB column).23 The recommended mapping workflow is:

Take a SNOMED CT concept ID from the hierarchy traversal.
Find its corresponding CUI in umls.mrconso where SAB = 'SNOMEDCT_US'.
Use that CUI to find all associated rows where SAB is 'ICD10PCS', 'ICD10CM', or 'ICD9CM'.
This approach leverages the full semantic linking power of the UMLS, which is superior to relying on a single, potentially less current, direct map file.
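
The three-step workflow above amounts to a self-join of mrconso on CUI. The sketch below runs that join against an in-memory SQLite table so it can be tried without a UMLS installation. It assumes a reduced mrconso with only the CUI, SAB, CODE, and STR columns, and every CUI and code in the sample rows is a placeholder rather than a real identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mrconso (cui TEXT, sab TEXT, code TEXT, str TEXT)")

# Placeholder rows: the CUIs and codes below are invented for the demo.
conn.executemany(
    "INSERT INTO mrconso VALUES (?, ?, ?, ?)",
    [
        ("C_DEMO_1", "SNOMEDCT_US", "S-100", "demo procedure"),
        ("C_DEMO_1", "ICD10PCS", "P-100", "demo procedure, ICD-10-PCS"),
        ("C_DEMO_1", "MSH", "M-100", "demo procedure, MeSH"),
        ("C_DEMO_2", "SNOMEDCT_US", "S-200", "unrelated concept"),
        ("C_DEMO_2", "ICD10CM", "D-200", "unrelated concept, ICD-10-CM"),
    ],
)

# Step 2 (SNOMED code -> CUI) and step 3 (CUI -> ICD codes) in one self-join.
CROSSWALK_SQL = """
SELECT DISTINCT s.code AS snomed_code, t.sab AS target_sab, t.code AS target_code
FROM mrconso s
JOIN mrconso t ON t.cui = s.cui
WHERE s.sab = 'SNOMEDCT_US'
  AND s.code = ?
  AND t.sab IN ('ICD10PCS', 'ICD10CM', 'ICD9CM')
ORDER BY t.sab, t.code
"""


def icd_codes_for(snomed_code):
    """Return (snomed_code, target_sab, target_code) rows sharing a CUI."""
    return conn.execute(CROSSWALK_SQL, (snomed_code,)).fetchall()
```

A CUI-level join only links codes that the UMLS treats as the same concept, so an ICD code that is related but not synonymous will not appear; such cases would still need the direct map or manual review.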

Table B2: Key ICD-10-PCS Codes for RRT

Code	Description	RRT Modality	Source
5A1D70Z	Performance of Urinary Filtration, Intermittent, Less than 6 Hours Per Day	IHD	18
5A1D80Z	Performance of Urinary Filtration, Prolonged Intermittent, 6-18 hours Per Day	PIRRT/SLED	18
5A1D90Z	Performance of Urinary Filtration, Continuous, Greater than 18 hours Per Day	CRRT	18
3E1M39Z	Irrigation of Peritoneal Cavity using Dialysate, Percutaneous Approach	PD	26
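
Because the duration character carries the modality, the codes in Table B2 can be classified by position rather than by a lookup on full code strings. A minimal sketch, with the hypothetical function name modality_from_pcs and covering only the codes listed in Table B2:

```python
# 5th character (duration) of codes in ICD-10-PCS table 5A1D, per Table B2.
DURATION_TO_MODALITY = {"7": "IHD", "8": "PIRRT/SLED", "9": "CRRT"}


def modality_from_pcs(code):
    """Map an ICD-10-PCS code to an rrt_modality, or None if not covered."""
    code = code.strip().upper()
    if len(code) != 7:
        return None
    if code.startswith("5A1D"):
        # Index 4 is the 5th character, which encodes duration.
        return DURATION_TO_MODALITY.get(code[4])
    if code == "3E1M39Z":
        return "PD"
    return None
```

Codes in table 5A1D with any other duration character return None here and would need separate review before being assigned a modality.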

Table B3: Key ICD-10-CM Codes for RRT

Code	Description	Implication	Source
Z99.2	Dependence on renal dialysis	Chronic RRT status	20
Z49.01	Encounter for fitting and adjustment of extracorporeal dialysis catheter	Active RRT care	19
Z49.31	Encounter for adequacy testing for hemodialysis	Active RRT care	29
N18.6	End-stage renal disease	Chronic status, often implies RRT (use with Z99.2)	19

LOINC: The Observational and Machine Parameter Layer

LOINC is essential for identifying RRT from charted data (e.g., in mimic.chartevents), which represents a distinct and powerful evidence stream separate from procedures or billing. Machine settings such as blood flow rates and dialysate flow rates, as well as circuit pressures, are definitive proof that an RRT circuit is active and running.

The strategy is to identify key LOINC panels and individual terms related to RRT machine parameters. The LOINC panel 99707-2 | Continuous renal replacement therapy panel is a primary target, as it contains a wealth of relevant child codes for CRRT.31

Table B4: Key LOINC Codes for RRT Machine Parameters & Observations

LOINC	Long Common Name	Evidence Type	Source
99711-4	Blood flow rate Renal replacement therapy circuit	Machine Setting	31
99712-2	Dialysate flow rate Renal replacement therapy circuit	Machine Setting	31
99713-0	Post-filter replacement fluid rate Renal replacement therapy circuit	Machine Setting	31
99720-5	Transmembrane pressure Renal replacement therapy circuit	Machine Setting	31
99718-9	Effluent pressure Renal replacement therapy circuit	Machine Setting	31
99735-3	Ultrafiltrate volume removed 1 hour	Fluid Balance	31
99708-0	Continuous renal replacement therapy mode	Modality Specifier	31
LL6149-0	(Answer List) CRRT mode	Modality Specifier	34
83064-6	Calcium.ionized [Moles/volume] in Blood drawn from CRRT circuit	Lab (Anticoagulation)	35

UMLS: The Semantic Unifier

The UMLS Semantic Network provides a powerful mechanism to filter lexically-derived concepts, ensuring they are of the correct semantic type (e.g., a procedure, a device) and not a homonym from an unrelated domain. A broad lexical search for a term like "graft" could return concepts related to surgery, botany, or politics; filtering by semantic type ensures only medically relevant concepts are retained.

The strategy is to take the candidate terms and concepts from lexical searches, find their CUIs in umls.mrconso, and then query the umls.mrsty table to retrieve their assigned Semantic Types (TUIs). Only concepts with TUIs from a predefined, curated list will be retained. This step dramatically increases the precision of the lexical search component. Based on the project's needs to identify procedures, devices, and fluids, the following semantic types are most relevant.36

Table B5: Key UMLS Semantic Types (TUIs) for RRT Filtering

TUI	Semantic Type Name	Application to RRT	Source
T061	Therapeutic or Preventive Procedure	Core RRT procedures (hemodialysis, CVVH, etc.)	37
T074	Medical Device	Dialysis machines, dialyzers, circuits	36
T203	Drug Delivery Device	Dialysis catheters, ports	42
T121	Pharmacologic Substance	Dialysate, replacement fluid, citrate, heparin	36
T034	Laboratory or Test Result	Charted values like effluent rate, pressures	39
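
The filtering step can be sketched as a set intersection between each candidate's semantic types and the curated list in Table B5. The CUIs below are invented, the (cui, tui) pairs stand in for rows of umls.mrsty, and T002 ('Plant') is used only as an example of a type outside the list.

```python
# Semantic types retained for RRT concepts, from Table B5.
ALLOWED_TUIS = {"T061", "T074", "T203", "T121", "T034"}


def filter_by_semantic_type(candidate_cuis, mrsty_rows, allowed=ALLOWED_TUIS):
    """Keep CUIs carrying at least one allowed TUI.

    mrsty_rows are (cui, tui) pairs; a CUI may have several semantic types.
    CUIs with no semantic type on record are dropped.
    """
    tuis_by_cui = {}
    for cui, tui in mrsty_rows:
        tuis_by_cui.setdefault(cui, set()).add(tui)
    return [cui for cui in candidate_cuis if tuis_by_cui.get(cui, set()) & allowed]


# Invented CUIs for two senses of "graft" plus a device concept.
mrsty = [("C_GRAFT_PROC", "T061"), ("C_GRAFT_PLANT", "T002"), ("C_MACHINE", "T074")]
kept = filter_by_semantic_type(
    ["C_GRAFT_PROC", "C_GRAFT_PLANT", "C_MACHINE", "C_UNKNOWN"], mrsty
)
```

Keeping a concept when any one of its types is allowed favours sensitivity; requiring all types to be allowed would be the stricter alternative.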

Part C: A Robust Mapping and Table Creation Strategy

This final section provides the implementation blueprint for consolidating all identified concepts and mapping them to the specific identifiers in the MIMIC-IV and eICU-CRD databases, culminating in the creation of the final lookup table.

Unifying the Concept Space

The first step is to create a single staging table that aggregates all unique concepts and their identifiers gathered from the four sources in Part B (SNOMED CT, ICD, LOINC, and Lexical/UMLS-filtered). This unification is achieved by combining the results from each stream with UNION ALL, creating a master list of all potential RRT-related concepts.

SQL Logic Outline: Staging Table Creation

SQL

CREATE TABLE saaki.rrt_concepts_staging AS (
    -- 1. SNOMED CT-derived procedure concepts
    SELECT
        concept_id::VARCHAR AS concept_code,
        'SNOMED_CT' AS source_ontology,
        concept_name,
        'Procedure' AS concept_type
    FROM rrt_snomed_hierarchy -- Materialized result of the recursive query in Part B
    UNION ALL
    -- 2. ICD-10-PCS procedure codes
    SELECT
        icd_code,
        'ICD10_PCS' AS source_ontology,
        long_title,
        'Procedure' AS concept_type
    FROM mimic.d_icd_procedures
    WHERE icd_version = 10 -- Assumes the icd_version column of MIMIC-IV's d_icd_procedures
      AND icd_code IN ('5A1D70Z', '5A1D80Z', '5A1D90Z', '3E1M39Z') -- From Table B2
    UNION ALL
    -- 3. LOINC-derived machine parameter concepts
    SELECT
        loinc_num,
        'LOINC' AS source_ontology,
        long_common_name,
        'Machine Setting' AS concept_type
    FROM loinc.loinc
    WHERE loinc_num IN ('99711-4', '99712-2', '99720-5', '99708-0') -- Subset of Table B4
    UNION ALL
    -- 4. Lexically derived fluid concepts (after UMLS filtering)
    SELECT
        term,
        'Lexical' AS source_ontology,
        term,
        'Fluid' AS concept_type
    FROM lexical_fluids_table -- A temporary table of lexically found fluids
);

Mapping to EHR Identifiers

This is the most critical implementation step: linking the abstract concepts from the staging table to the concrete identifiers used in the EHR databases. The database DDL provides the map for this process.1

MIMIC-IV Mapping:

itemid Mapping: The primary task is to map the concept_name from the staging table to the mimic.d_items.label column to get the relevant itemids for procedureevents, inputevents, and chartevents. A simple equality join will fail due to variations in clinical terminology (e.g., SNOMED CT's "Continuous venovenous hemofiltration" vs. a likely d_items label of "CRRT - CVVH"). Therefore, a multi-pass, pattern-based joining strategy is required. This involves attempting a direct join on cleaned, lowercased strings, followed by generating ILIKE patterns from both full names and abbreviations (e.g., label ILIKE '%cvvh%' or label ILIKE '%hemofiltration%') for any concepts that remain unmatched.
ICD Code Mapping: This is a straightforward join between the list of identified ICD codes and the mimic.procedures_icd.icd_code column.
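
The multi-pass label matching described above can be prototyped outside the database before it is turned into SQL. The sketch below is one possible reading of it, with hypothetical itemids and function names: pass 1 compares normalized strings exactly, and pass 2 falls back to substring tests that behave like ILIKE '%...%' patterns built from the full name and its abbreviations.

```python
import re


def normalize(text):
    """Lowercase and collapse every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def match_labels(concepts, labels):
    """Match concept names to dictionary labels in two passes.

    concepts: {concept_name: [abbreviations]}; labels: {itemid: label}.
    Returns {concept_name: (pass_name, sorted itemids)} for matched concepts.
    """
    by_normalized = {}
    for itemid, label in labels.items():
        by_normalized.setdefault(normalize(label), []).append(itemid)

    results = {}
    for name, abbreviations in concepts.items():
        exact = by_normalized.get(normalize(name))
        if exact:  # pass 1: exact match on cleaned strings
            results[name] = ("exact", sorted(exact))
            continue
        needles = [normalize(name)] + [normalize(a) for a in abbreviations]
        hits = sorted(
            itemid
            for itemid, label in labels.items()
            if any(needle in normalize(label) for needle in needles)
        )
        if hits:  # pass 2: substring match, like ILIKE '%...%'
            results[name] = ("pattern", hits)
    return results


# Hypothetical itemids; "CRRT - CVVH" is the label example used in the text.
labels = {1: "Peritoneal Dialysis", 2: "CRRT - CVVH", 3: "Heart Rate"}
concepts = {
    "peritoneal dialysis": ["pd"],
    "Continuous venovenous hemofiltration": ["cvvh"],
}
matches = match_labels(concepts, labels)
```

Recording which pass produced each match keeps the weaker pattern-based hits separable, which matters because short abbreviations can match unrelated labels as substrings.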

eICU-CRD Mapping:

treatmentstring Mapping: The hierarchical nature of this column (e.g., renal|dialysis|CRRT) is a major advantage. Specific and efficient ILIKE patterns such as '%renal|dialysis%' can be generated to capture broad categories, with more specific patterns like '%renal|dialysis|crrt%' for individual modalities.
intakeoutput Mapping: Similar ILIKE patterns will be used on the celllabel and cellpath columns to find terms indicative of RRT, such as 'dialysis', 'crrt', 'effluent', or 'ultrafiltrate'.
Direct Flag: The eicu.apacheapsvar.dialysis = 1 flag is a high-confidence, direct identifier that requires no complex mapping and serves as an excellent validation point.
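
When these patterns are tested outside the database, note that the pipe separator is a regular-expression metacharacter, so a pattern such as '%renal|dialysis%' cannot be pasted into a regex unchanged. The following sketch translates an ILIKE pattern into an equivalent regex. It assumes only the % and _ wildcards are used and ignores ILIKE escape characters; the example treatment strings are illustrative.

```python
import re


def ilike_to_regex(pattern):
    """Translate a SQL ILIKE pattern ('%' and '_' wildcards) into a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            # re.escape keeps '|' literal instead of regex alternation.
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def ilike(value, pattern):
    """Return True if value matches pattern the way ILIKE would."""
    return ilike_to_regex(pattern).fullmatch(value) is not None
```

For example, ilike("renal|dialysis|CRRT", "%renal|dialysis%") is true, while the narrower pattern "%renal|dialysis|crrt%" rejects a string for a different modality.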

Final Lookup Table Creation and Schema

The final step is to consolidate all mapped identifiers from both MIMIC-IV and eICU-CRD into the final saaki.rrt_lookup table. This table will serve as the definitive value set for the cohort definition.

Table C1: Final saaki.rrt_lookup Table Schema

Column Name	Data Type	Description	Example
identifier_type	VARCHAR	The type of identifier, specifying the database, table, and column.	MIMIC_ITEMID, ICD10_PCS, EICU_TREATMENT_STRING_PATTERN, EICU_DIALYSIS_FLAG
identifier_value	VARCHAR	The actual value of the identifier. For patterns, this will be an ILIKE string.	225802, 5A1D90Z, %renal
rrt_modality	VARCHAR	A standardized category for the RRT type or evidence stream.	CRRT, IHD, PD, PIRRT/SLED, Machine_Setting, Fluid, Access_Insertion
source_ontologies	TEXT	An array of ontologies or methods that led to this identifier's inclusion.	{'SNOMED_CT', 'Lexical'}, {'ICD10PCS'}
concept_name	VARCHAR	The preferred name of the source concept from the ontology or lexical list.	Continuous venovenous hemodialysis, Performance of Urinary Filtration, Continuous
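
Since the same identifier can be reached through several evidence streams, rows have to be merged on (identifier_type, identifier_value) before they are loaded, with the contributing sources collected into source_ontologies. One way to do this is sketched below. The function name and the input row shape are assumptions, and keeping the first modality and name seen is a simplification that a real pipeline might replace with an explicit priority rule.

```python
def merge_lookup_rows(rows):
    """Collapse rows sharing (identifier_type, identifier_value) into one.

    rows: dicts with identifier_type, identifier_value, rrt_modality,
    source_ontology and concept_name. The first modality and name seen are
    kept; source ontologies are unioned and returned as a sorted list.
    """
    merged = {}
    for row in rows:
        key = (row["identifier_type"], row["identifier_value"])
        if key not in merged:
            merged[key] = {
                "identifier_type": key[0],
                "identifier_value": key[1],
                "rrt_modality": row["rrt_modality"],
                "source_ontologies": set(),
                "concept_name": row["concept_name"],
            }
        merged[key]["source_ontologies"].add(row["source_ontology"])
    return [
        dict(entry, source_ontologies=sorted(entry["source_ontologies"]))
        for entry in merged.values()
    ]


# The same itemid found through two streams, plus one ICD-10-PCS code.
rows = [
    {"identifier_type": "MIMIC_ITEMID", "identifier_value": "224149",
     "rrt_modality": "Machine_Setting", "source_ontology": "LOINC",
     "concept_name": "Access Pressure"},
    {"identifier_type": "MIMIC_ITEMID", "identifier_value": "224149",
     "rrt_modality": "Machine_Setting", "source_ontology": "Lexical",
     "concept_name": "Access Pressure"},
    {"identifier_type": "ICD10_PCS", "identifier_value": "5A1D70Z",
     "rrt_modality": "IHD", "source_ontology": "ICD10PCS",
     "concept_name": "Performance of Urinary Filtration, Intermittent, Less than 6 Hours Per Day"},
]
lookup_rows = merge_lookup_rows(rows)
```

Merging before the load keeps each identifier to a single row, which is what a primary key on (identifier_type, identifier_value) requires.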

Final SQL Logic Outline: Populating the Lookup Table

SQL

CREATE TABLE saaki.rrt_lookup (
    identifier_type VARCHAR(50) NOT NULL,
    identifier_value VARCHAR(255) NOT NULL,
    rrt_modality VARCHAR(50),
    source_ontologies TEXT, -- Array values are stored in their text form, e.g. {LOINC,Lexical}
    concept_name VARCHAR(255),
    PRIMARY KEY (identifier_type, identifier_value)
);

-- Example INSERT statements for different evidence streams
INSERT INTO saaki.rrt_lookup (identifier_type, identifier_value, rrt_modality, source_ontologies, concept_name)
VALUES
    -- MIMIC ItemID for a CRRT procedure (sources as in the Table C1 example)
    ('MIMIC_ITEMID', '225802', 'CRRT', ARRAY['SNOMED_CT', 'Lexical'], 'Continuous venovenous hemodialysis'),

    -- MIMIC ICD-10-PCS code for IHD (source as in the Table C1 example)
    ('ICD10_PCS', '5A1D70Z', 'IHD', ARRAY['ICD10PCS'], 'Performance of Urinary Filtration, Intermittent, Less than 6 Hours Per Day'),

    -- eICU treatment string pattern for any dialysis
    ('EICU_TREATMENT_STRING_PATTERN', '%renal|dialysis%', 'RRT_Generic', ARRAY['Lexical'], 'Renal Dialysis'),

    -- eICU direct flag for dialysis (no source ontology listed)
    ('EICU_DIALYSIS_FLAG', '1', 'RRT_Generic', NULL, 'APACHE APS Dialysis Flag'),

    -- MIMIC ItemID for a CRRT machine setting
    ('MIMIC_ITEMID', '224149', 'Machine_Setting', ARRAY['LOINC', 'Lexical'], 'Access Pressure');

-- ... additional INSERTs for all identified mappings ...

Conclusion

The accurate identification of Renal Replacement Therapy is a foundational requirement for credible research using large critical care databases. The fragmented and non-standardized nature of RRT documentation presents a significant methodological challenge. The multi-modal, ontology-driven strategy detailed in this report provides a robust and comprehensive solution to this challenge.

By systematically triangulating evidence from lexical searches, procedural hierarchies in SNOMED CT, administrative codes in ICD, and observational data in LOINC, this approach is designed to maximize sensitivity in detecting RRT events. The subsequent use of the UMLS Semantic Network to filter results and the careful mapping to specific EHR data fields support high precision. The final lookup table, saaki.rrt_lookup, will not only be an exhaustive list of identifiers but will also be enriched with metadata detailing the modality, evidence type, and ontological source of each entry. This provides the necessary granularity for nuanced analysis and serves as a durable, reproducible asset for defining the SA-AKI cohort and for future renal-related research. Implementing this strategy should improve the validity and reliability of study findings by minimizing patient misclassification at the cohort definition stage.
