Data entry is the process through which clinical research observations, measurements, assessments, and documents become structured data in a research database. Although the phrase may sound simple, data entry is a critical stage in the research lifecycle. It is the point at which protocol-defined information is transferred from clinical practice, laboratory work, participant interviews, field activities, or source records into a system that will eventually support monitoring, analysis, reporting, and archival.
In a well-managed clinical study, data entry is not an isolated clerical task. It is part of a controlled workflow that includes source documentation, form completion, review, validation, query resolution, quality control, and auditability. The workflow should define who enters data, when data are entered, what source documents are used, how corrections are made, how discrepancies are handled, and how supervisors or data managers monitor completeness and accuracy.
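One way to make that list concrete is to record each workflow element as a named field and check that none is left blank before data entry begins. The following is a minimal sketch in Python, not a prescribed tool: the class name `EntryWorkflow`, its field names, and the example values are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class EntryWorkflow:
    """One field per element the workflow should define (illustrative names)."""
    who_enters: str = ""
    entry_timing: str = ""
    source_documents: str = ""
    correction_procedure: str = ""
    discrepancy_handling: str = ""
    monitoring: str = ""


def undefined_elements(workflow):
    """Return the names of workflow elements that are still blank."""
    return [
        f.name
        for f in fields(workflow)
        if not getattr(workflow, f.name).strip()
    ]


# A partially drafted workflow; the values are placeholders, not recommendations.
draft = EntryWorkflow(
    who_enters="Trained site data clerks",
    entry_timing="As soon as possible after the participant encounter",
    source_documents="Completed paper CRFs",
)

gaps = undefined_elements(draft)
print("Still to be defined:", gaps)
```

A check like this does not judge whether the documented procedures are adequate; it only flags elements that nobody has written down yet.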
Different studies use different data entry models. In a paper-first model, study staff complete paper CRFs during or after participant encounters, and data clerks later enter those values into REDCap or another electronic database. In an electronic-first model, staff enter data directly into an eCRF during the clinical encounter or shortly afterward. In a participant-facing survey model, participants enter data themselves through a secure survey link or mobile device. In an integrated eSource model, some data may flow from electronic health records, laboratory systems, or devices into the research database.
Each model has advantages and risks. Paper-first workflows may be practical in settings with limited connectivity or where staff are accustomed to paper documentation, but they introduce transcription risk and delay. Direct electronic capture reduces transcription and can apply validation immediately, but it requires reliable devices, connectivity, user training, and careful management of access. Participant-facing surveys may reduce staff workload but require attention to literacy, language, identity verification, privacy, and incomplete submissions. Integrated data flows can reduce duplicate entry but require strong mapping, governance, and validation between systems.
The data manager’s task is to design a workflow that fits the study context while protecting data quality and participant confidentiality. In many Kenyan and regional research settings, hybrid workflows are common. A site may enter most data directly into REDCap but temporarily use paper forms during network outages. A field team may collect data on tablets and synchronize later. A laboratory may provide results as a spreadsheet that is reviewed and imported.
These realities should be planned rather than treated as exceptions.
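As an illustration of planning for one of these cases, the sketch below shows how a laboratory spreadsheet might be screened before import, so that questionable rows go to a reviewer instead of entering the database. It is a sketch under stated assumptions, not a description of any REDCap feature: the column names, the participant ID format, and the hemoglobin plausibility range are hypothetical, and a real study would take them from its own data dictionary.

```python
import csv
import io
from datetime import date

# Hypothetical column names and plausibility range, for illustration only.
REQUIRED_COLUMNS = ["participant_id", "visit_date", "hemoglobin_g_dl"]
HEMOGLOBIN_RANGE = (3.0, 25.0)


def review_lab_rows(csv_text, known_ids):
    """Split spreadsheet rows into (accepted, rejected) before any import."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    accepted, rejected = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        problems = []
        if (row.get("participant_id") or "") not in known_ids:
            problems.append("unknown participant_id")
        try:
            date.fromisoformat(row.get("visit_date") or "")
        except ValueError:
            problems.append("visit_date is not a valid ISO date")
        try:
            value = float(row.get("hemoglobin_g_dl") or "")
            if not HEMOGLOBIN_RANGE[0] <= value <= HEMOGLOBIN_RANGE[1]:
                problems.append("hemoglobin outside plausible range")
        except ValueError:
            problems.append("hemoglobin is not a number")

        if problems:
            rejected.append((line_no, problems))
        else:
            accepted.append(row)
    return accepted, rejected


# Made-up sample data standing in for a laboratory export.
SAMPLE = """participant_id,visit_date,hemoglobin_g_dl
KE-001,2024-03-04,13.2
KE-002,04/03/2024,12.8
KE-999,2024-03-05,11.0
KE-003,2024-03-05,132
"""

accepted, rejected = review_lab_rows(
    SAMPLE, known_ids={"KE-001", "KE-002", "KE-003"}
)
for line_no, problems in rejected:
    print(f"line {line_no}: {'; '.join(problems)}")
```

Rejected rows are reported with their spreadsheet line number so that the reviewer can go back to the laboratory with a specific question; nothing is silently corrected or dropped.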