Course Content
Clinical Research Data Management Course

Clinical research data often include sensitive personal information. These may include names, contact details, dates of birth, medical histories, laboratory results, HIV status, pregnancy status, genetic information, geolocation, or other identifiers. Even when direct identifiers are removed, combinations of the remaining variables (for example age, sex, and facility) may still create re-identification risk. Data protection is therefore not an optional administrative requirement but a core ethical obligation.
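
The re-identification risk described above can be checked before a dataset is released. The following is a minimal sketch, not a complete disclosure-control method: it counts how many records share each combination of indirect identifiers and flags combinations shared by fewer than k records. The function name, the variable names, the example records, and the threshold k are all illustrative assumptions.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical de-identified records: no names, yet one combination is unique.
records = [
    {"age_band": "30-39", "sex": "F", "facility": "A"},
    {"age_band": "30-39", "sex": "F", "facility": "A"},
    {"age_band": "30-39", "sex": "F", "facility": "A"},
    {"age_band": "70-79", "sex": "M", "facility": "B"},
]

# k=3 is an assumed threshold chosen only to keep the example small.
risky = small_groups(records, ["age_band", "sex", "facility"], k=3)
print(risky)
```

Any combination that appears in the result describes so few participants that someone with local knowledge might recognize them, so those variables may need to be coarsened or withheld.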

Ethical data management begins with respect for participants. Participants should be informed about what data will be collected, why they are collected, how they will be protected, who may access them, how long they will be stored, and whether they may be shared for future research. Data managers do not usually lead the consent process, but they must ensure that systems respect consent conditions. If participants decline future data sharing, this decision must be documented and honored.
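
One way a system can respect consent conditions is to filter every sharing extract against the recorded consent decision. The sketch below assumes a hypothetical lookup of participant ID to future-sharing consent; the field names and IDs are invented for illustration. A participant with no recorded decision is treated as not having agreed.

```python
def shareable_records(records, consent_by_participant):
    """Keep only records whose participant explicitly agreed to future sharing."""
    return [
        record
        for record in records
        # A missing consent entry is treated the same as a refusal.
        if consent_by_participant.get(record["participant_id"]) is True
    ]


# Hypothetical consent register and study records.
consent_by_participant = {"P001": True, "P002": False}
records = [
    {"participant_id": "P001", "visit": 1},
    {"participant_id": "P002", "visit": 1},
    {"participant_id": "P003", "visit": 1},  # no consent decision on file
]

shared = shareable_records(records, consent_by_participant)
print([record["participant_id"] for record in shared])
```

Excluding by default means a gap in the consent register cannot silently turn into a data release.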

Confidentiality requires practical controls. These include role-based access, strong authentication, secure storage, encrypted transfers where appropriate, controlled exports, separation of identifiers from analysis data, limited access to linkage files, secure backups, and documented procedures for breach response. In REDCap, user rights determine what each user can view, edit, and export, and Data Access Groups restrict users to the records that belong to their own group. In R workflows, analysts should avoid storing identifiable data in unsecured folders or sharing raw files through informal channels.
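
Separation of identifiers from analysis data can be sketched as follows. This is an illustration under stated assumptions, not a prescribed procedure: the list of direct identifiers, the study ID format, and the example records are hypothetical. Each record receives a random study ID, the identifiers go into a linkage table, and everything else goes into the analysis table.

```python
import secrets

# Assumed list of direct identifiers; a real study would define its own.
DIRECT_IDENTIFIERS = {"name", "phone", "date_of_birth"}


def split_identifiers(records):
    """Return (linkage, analysis) tables that share only a random study ID."""
    linkage, analysis = [], []
    for record in records:
        # Random ID, so it cannot be derived from the participant's details.
        study_id = "S-" + secrets.token_hex(8)
        identifiers = {k: v for k, v in record.items() if k in DIRECT_IDENTIFIERS}
        measurements = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        linkage.append({"study_id": study_id, **identifiers})
        analysis.append({"study_id": study_id, **measurements})
    return linkage, analysis


# Hypothetical source records.
participants = [
    {"name": "Participant One", "phone": "000-0001",
     "date_of_birth": "1990-01-01", "site": "A", "hb_g_dl": 11.2},
    {"name": "Participant Two", "phone": "000-0002",
     "date_of_birth": "1985-06-15", "site": "B", "hb_g_dl": 13.4},
]

linkage, analysis = split_identifiers(participants)
print(sorted(analysis[0].keys()))
```

The linkage table is the sensitive half: it would be stored separately with access limited to a small number of named staff, while analysts work only with the analysis table.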

Data protection also has legal dimensions. Depending on the study context, teams may need to comply with institutional policies, national laws such as the Kenya Data Protection Act, sponsor requirements, ethics committee approvals, and international frameworks such as GDPR or HIPAA where applicable. A data manager does not need to be a lawyer but must know when data handling decisions require governance review.

Research ethics and data protection are especially important in multisite studies. A national or regional study may involve facilities with different infrastructure, different user groups, and varying levels of data experience. Without clear access controls and standard operating procedures, staff may see records from sites they should not access, export unnecessary identifiers, or use shared accounts. Good governance prevents these problems by defining who can access what data, for what purpose, and under what conditions.
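
The rule "who can access what data, for what purpose, and under what conditions" can be written down as explicit checks. The sketch below is a simplified model loosely inspired by site-based access groups; it is not REDCap's API, and the role names, site names, and export rule are assumptions made for illustration. Each user has an individual account, a role, and a set of permitted sites, and access is denied unless a site is listed.

```python
from dataclasses import dataclass

# Assumed policy: only this role may export data.
EXPORT_ROLES = {"data_manager"}


@dataclass(frozen=True)
class User:
    username: str          # one account per person, never shared
    role: str
    sites: frozenset       # sites this user is allowed to see


def visible_records(user, records):
    """Return only the records from sites the user is assigned to."""
    return [record for record in records if record["site"] in user.sites]


def can_export(user):
    """Allow export only for roles named in the export policy."""
    return user.role in EXPORT_ROLES


# Hypothetical users and records.
clerk = User("clerk_a", "data_entry", frozenset({"site_a"}))
manager = User("dm_central", "data_manager", frozenset({"site_a", "site_b"}))
records = [
    {"record_id": 1, "site": "site_a"},
    {"record_id": 2, "site": "site_b"},
]

print([r["record_id"] for r in visible_records(clerk, records)])
print(can_export(clerk), can_export(manager))
```

Because permitted sites must be listed explicitly, a newly created account sees nothing until someone deliberately grants access, which is the safer default for a multisite study.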

Figure 1.4: Suggested image showing layered data protection controls: consent, role-based access, audit trails, secure storage, controlled export, and archival governance