Clinical Research Data Management Course

A data dictionary is a structured document that describes the variables in a dataset or database. It is one of the most important outputs of the CRF design process. While the CRF shows what the user sees, the data dictionary explains how the data are defined, coded, validated, and documented. It supports database development, statistical analysis, quality control, sharing, and archival.
A basic data dictionary records, for each variable, the variable name, variable label, form or instrument, field type, response options, validation rules, units, branching logic, required status, and notes. More detailed dictionaries may also include the source document, collection timepoint, derivation rules, permissible ranges, missing value rules, data type, coding standard, and references to protocol sections. In REDCap, the data dictionary can be downloaded as a CSV file, edited, and uploaded again to create or revise instruments.
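To make the structure concrete, the following is a minimal sketch of a data dictionary stored as CSV and loaded in Python. The column names, the three example variables, and the hemoglobin range are hypothetical and simplified for teaching; a real REDCap export uses its own column headers. The helper names load_dictionary and parse_choices are also assumptions. The response options are written as pipe-separated "code, label" pairs, the style REDCap uses for choice fields.

```python
import csv
import io

# Hypothetical, simplified column names and illustrative values only.
DICTIONARY_CSV = """variable_name,form_name,field_type,field_label,choices,validation_min,validation_max,units,required
participant_id,demographics,text,Participant ID,,,,,y
sex,demographics,radio,Sex at birth,"1, Male | 2, Female",,,,y
hb,laboratory,number,Hemoglobin,,3,25,g/dL,y
"""

# Columns that every dictionary row must provide in this sketch.
REQUIRED_COLUMNS = {"variable_name", "form_name", "field_type", "field_label"}


def load_dictionary(text):
    """Parse dictionary CSV text into a dict keyed by variable name."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    entries = {}
    for row in reader:
        name = row["variable_name"]
        if name in entries:
            # Duplicate variable names make a dataset ambiguous.
            raise ValueError(f"duplicate variable name: {name}")
        entries[name] = row
    return entries


def parse_choices(raw):
    """Turn '1, Male | 2, Female' into {'1': 'Male', '2': 'Female'}."""
    choices = {}
    for part in raw.split("|"):
        if part.strip():
            code, label = part.split(",", 1)
            choices[code.strip()] = label.strip()
    return choices


dictionary = load_dictionary(DICTIONARY_CSV)
sex_codes = parse_choices(dictionary["sex"]["choices"])
print(sex_codes)
print(dictionary["hb"]["units"])
```

Keeping the dictionary in a machine-readable form like this means the same file can drive database setup, validation checks, and the documentation shared with analysts.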
Metadata are data about data. They provide context that allows data to be interpreted correctly. For example, a dataset may contain a variable called hb, but without metadata it may not be clear whether this means hemoglobin, what unit is used, which device or laboratory method produced the value, when it was measured, what range is expected, or how missing values were handled. Metadata transform values into interpretable research evidence.
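The hb example can be sketched in code. The block below attaches a small metadata record to the variable and uses it to interpret raw values. The class name VariableMetadata, the function interpret, the expected range of 3 to 25 g/dL, and the missing value code -999 are all illustrative assumptions, not values taken from any particular study or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableMetadata:
    """Minimal metadata needed to interpret one numeric variable."""
    name: str
    label: str
    units: str
    minimum: float
    maximum: float
    missing_code: str


# Illustrative values only; real ranges and codes come from the protocol.
HB = VariableMetadata(
    name="hb",
    label="Hemoglobin",
    units="g/dL",
    minimum=3.0,
    maximum=25.0,
    missing_code="-999",
)


def interpret(raw: str, meta: VariableMetadata) -> str:
    """Describe a raw value using the variable's metadata."""
    raw = raw.strip()
    if raw == "" or raw == meta.missing_code:
        return f"{meta.label}: missing"
    try:
        value = float(raw)
    except ValueError:
        return f"{meta.label}: not a number ({raw!r})"
    if not meta.minimum <= value <= meta.maximum:
        return (
            f"{meta.label}: {value} {meta.units} is outside the expected "
            f"range {meta.minimum}-{meta.maximum}"
        )
    return f"{meta.label}: {value} {meta.units}"


for raw_value in ["13.5", "135", "-999"]:
    print(interpret(raw_value, HB))
```

Without the metadata, the value 135 looks like any other number. With it, the value is flagged as outside the expected range, which might prompt a check of whether it was entered in g/L rather than g/dL.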
Good metadata are essential for FAIR (findable, accessible, interoperable, reusable) data practice. Data are hard to find and cannot be interoperable or reusable if future users cannot understand what the variables mean. Good metadata are also essential for reproducibility. If an analysis is repeated two years later, the analyst should be able to identify which dataset version was used, what each variable meant, how categories were coded, and how derived variables were produced.
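One possible way to record which dataset version and which dictionary version an analysis used is a small manifest with checksums. The sketch below is an assumption about how this could be done, not a prescribed method. The function names, the manifest keys, the version labels, and the example file contents are hypothetical.

```python
import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(dataset_bytes, dictionary_bytes,
                   dataset_version, dictionary_version):
    """Record versions and checksums for a dataset and its dictionary."""
    return {
        "dataset_version": dataset_version,
        "dataset_sha256": sha256_of(dataset_bytes),
        "dictionary_version": dictionary_version,
        "dictionary_sha256": sha256_of(dictionary_bytes),
    }


def matches(manifest, dataset_bytes, dictionary_bytes) -> bool:
    """Check that the files at hand are the ones the manifest describes."""
    return (
        manifest["dataset_sha256"] == sha256_of(dataset_bytes)
        and manifest["dictionary_sha256"] == sha256_of(dictionary_bytes)
    )


# Hypothetical file contents; in practice these would be read from disk.
dataset = b"participant_id,hb\n001,13.5\n"
data_dictionary = b"variable_name,units\nhb,g/dL\n"

manifest = build_manifest(dataset, data_dictionary, "lock-1", "v1.2")
print(json.dumps(manifest, indent=2, sort_keys=True))
print(matches(manifest, dataset, data_dictionary))
```

Storing the manifest alongside the analysis code lets a later analyst confirm that the files being reused are the same ones the original analysis ran on.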
In clinical research, data dictionaries should be treated as controlled documents. They should be reviewed by data managers, investigators, statisticians, and where appropriate, monitors or sponsors. Changes should be documented. If a variable definition changes during a study, the change may affect interpretation of data collected before and after the change. Without documentation, such changes can be invisible but damaging.
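Changes between two versions of a data dictionary can be made visible by comparing them field by field. The following is a minimal sketch, assuming each version has been loaded as a mapping from variable name to its attributes. The function name diff_dictionaries, the attribute names, and the example versions are hypothetical.

```python
def diff_dictionaries(old, new):
    """List differences between two dictionary versions.

    Each change is a tuple: (variable, what_changed, before, after).
    """
    changes = []
    for name in sorted(old.keys() - new.keys()):
        changes.append((name, "removed", None, None))
    for name in sorted(new.keys() - old.keys()):
        changes.append((name, "added", None, None))
    for name in sorted(old.keys() & new.keys()):
        attributes = sorted(old[name].keys() | new[name].keys())
        for attr in attributes:
            before = old[name].get(attr)
            after = new[name].get(attr)
            if before != after:
                changes.append((name, attr, before, after))
    return changes


# Hypothetical versions: the unit for hb changes and a variable is added.
version_1 = {
    "hb": {"label": "Hemoglobin", "units": "g/dL", "validation_max": "25"},
}
version_2 = {
    "hb": {"label": "Hemoglobin", "units": "g/L", "validation_max": "250"},
    "visit_date": {"label": "Visit date", "units": "", "validation_max": ""},
}

change_log = diff_dictionaries(version_1, version_2)
for change in change_log:
    print(change)
```

In this example the unit change for hb appears explicitly in the change log, so values collected before and after the change are less likely to be pooled by mistake.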