The crucial role of data management in a clinical trial is one of the aspects most often highlighted when talking about medical research.
This is not surprising, because collecting and processing data are key to achieving the objective pursued: obtaining relevant outcomes from a large amount of information in order to develop improvements in the health sector.
For that reason, it is extremely important to guarantee high-quality data management, which requires skilled staff who can ensure excellence and professionalism.
Moreover, as the clinical trials industry is growing and changing at great speed, it is fundamental to stay up to date and aware of new technologies and methods that can reduce risks to a minimum.
Nowadays, more software programs exist to streamline data management than ever before; therefore, data managers need to be trained in them and know them thoroughly.
Besides, not every study requires the same solutions; hence, it is imperative to identify each study's specific needs and be able to adapt to them.
Some of the data sources used are Electronic Data Capture (EDC), Electronic Clinical Outcome Assessment (eCOA), and central laboratory data, among others.
In addition, real-time data can be obtained through electronic tools such as wearable devices or Electronic Health Records (EHRs).
What are the activities of clinical data managers?
This relatively new profession has gained importance and recognition because of its broad scope of responsibilities, the multitasking it demands, and its varied connections with other areas of a trial.
Moreover, as mentioned above, clinical data management is dynamic and complex, which underscores the importance of its becoming a well-established discipline, a process that is still under way.
Finally, it is fair to emphasize that, in addition to the inherent complexity of this professional activity, there is an added layer of difficulty: it is a central part of the clinical research framework, a strictly regulated sector, which entails complying with a great many policies, guidelines, and standards.
In this article, the main activities of Clinical Data Managers (CDMs) are explained to give you a better understanding of what this role deals with on a daily basis, and of the importance of carefully selecting who will perform these tasks in your clinical trial.
1. Database design
The first task of a data manager is designing the clinical database, a software application built with all crucial aspects of the clinical trial in mind, that is, to meet all regulatory requirements and collect the information essential for achieving the study objectives. In order to ensure data security, a “system validation” is performed, which evaluates “system specifications, user requirements, and regulatory compliance”. Furthermore, before real data are entered into the fields of the database and e-CRF layouts, a test is run with random data to ensure that they work correctly. Study details such as objectives, sites, and patients are defined in these data entry screens.
2. CRF tracking
After the database and e-CRF layouts have been designed and the site's personnel have entered data in them, the Clinical Research Associate (CRA) and the CDMs work together to verify that the information is correct and all fields are complete. On the one hand, the CRA is responsible for confirming that the site's personnel have filled in the e-CRF coherently and accurately. On the other hand, once the CRA's task is complete, CDMs start tracking the electronic data capture tool for missing pages and inconsistent data.
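Tracking for missing pages can be pictured as comparing the pages each visit is expected to contain with the pages actually entered. The following is only a minimal sketch: the visit schedule, page names, and patient identifiers are hypothetical, and it assumes every visit is already due for every patient, which a real EDC tool would determine from the study calendar.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical visit schedule: the e-CRF pages each visit should contain.
EXPECTED_PAGES: Dict[str, Set[str]] = {
    "screening": {"demographics", "medical_history", "vital_signs"},
    "week_4": {"vital_signs", "adverse_events"},
}


def find_missing_pages(
    received: Dict[Tuple[str, str], Set[str]],
    patients: List[str],
) -> List[Tuple[str, str, str]]:
    """Return (patient, visit, page) for every expected page not yet entered."""
    missing = []
    for patient in patients:
        for visit, pages in EXPECTED_PAGES.items():
            entered = received.get((patient, visit), set())
            # Set difference leaves only the pages that were never entered.
            for page in sorted(pages - entered):
                missing.append((patient, visit, page))
    return missing


# Illustrative data: pages entered so far, keyed by (patient, visit).
received_pages = {
    ("P001", "screening"): {"demographics", "medical_history", "vital_signs"},
    ("P001", "week_4"): {"vital_signs"},
    ("P002", "screening"): {"demographics"},
}

report = find_missing_pages(received_pages, ["P001", "P002"])
for patient, visit, page in report:
    print(f"{patient} {visit}: missing page '{page}'")
```

Each row of the resulting report is a candidate for follow-up with the site.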
3. Data validation
Another important task, closely related to the previous one, is data validation. In this step, the protocol specifications play a fundamental role, because what they establish is compared with what has been entered in the e-CRF. This is done by edit check programs, which implement the logic conditions described in the Data Validation Plan (DVP) and are designed to identify inconsistencies, missing information, out-of-range values, and deviations from the protocol. As with the database, these programs are tested before they are implemented to minimize the risk of malfunction. When a discrepancy is identified, it is flagged in the system, and there are two ways of dealing with it, which are explained in the next section of this article. The e-CRFs undergo frequent data validation so that queries are resolved throughout the clinical investigation.
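To make the idea of an edit check more concrete, here is a minimal sketch in Python. It is not taken from any particular EDC system: the field names, the age and blood pressure limits, and the helper functions are all hypothetical stand-ins for conditions that a real DVP would specify.

```python
from datetime import date
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
# An edit check returns a discrepancy message, or None when the record passes.
EditCheck = Callable[[Record], Optional[str]]


def required(field: str) -> EditCheck:
    """Flag a field that was left empty."""
    def check(record: Record) -> Optional[str]:
        if record.get(field) in (None, ""):
            return f"{field}: missing value"
        return None
    return check


def in_range(field: str, low: float, high: float) -> EditCheck:
    """Flag a numeric field outside the allowed range."""
    def check(record: Record) -> Optional[str]:
        value = record.get(field)
        if value in (None, ""):
            return None  # missing values are reported by required()
        if not isinstance(value, (int, float)) or not low <= value <= high:
            return f"{field}: {value!r} outside {low}-{high}"
        return None
    return check


def not_before(field: str, reference: str) -> EditCheck:
    """Flag a date that precedes a reference date (an inconsistency)."""
    def check(record: Record) -> Optional[str]:
        value, ref = record.get(field), record.get(reference)
        if isinstance(value, date) and isinstance(ref, date) and value < ref:
            return f"{field}: {value} is before {reference} {ref}"
        return None
    return check


# Hypothetical checks; real limits would come from the protocol and the DVP.
EDIT_CHECKS: List[EditCheck] = [
    required("age"),
    in_range("age", 18, 65),
    required("systolic_bp"),
    in_range("systolic_bp", 60, 250),
    not_before("visit_date", "consent_date"),
]


def run_edit_checks(record: Record) -> List[str]:
    """Run every edit check and collect the discrepancies found."""
    discrepancies = []
    for check in EDIT_CHECKS:
        message = check(record)
        if message is not None:
            discrepancies.append(message)
    return discrepancies


entry = {
    "age": 72,
    "systolic_bp": None,
    "visit_date": date(2024, 1, 5),
    "consent_date": date(2024, 1, 10),
}
for message in run_edit_checks(entry):
    print(message)
```

Keeping each check as a small independent function makes it easier to test the checks with dummy records before they are run on real data.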
4. Discrepancy management
This point in the data management process, also called query resolution, is considered critical because it is when data are cleaned, and discrepancies are handled in different ways depending on their nature. To manage inconsistent, missing, or wrong data, three steps need to be carried out.
Firstly, CDMs review the information that has been flagged as a discrepancy to determine the correct procedure to follow. There are two options: to raise a query for the site personnel to respond to, or to generate a Self-Evident Correction (SEC) without the need for an explanation from the site.
Secondly, it is necessary to find out the reason for the discrepancy. Nowadays, the use of e-CRFs is widespread; hence, sending a Data Clarification Form to obtain an explanation is no longer common. When e-CRFs are used, investigators receive a notification of the raised query and can change the data online. However, when a paper CRF is the data capture tool, CDMs have to request the explanation from the site through these forms. The other way of resolving a discrepancy does not require the investigators' intervention, because it normally applies to spelling errors; in those cases, CDMs can correct the data after generating a SEC.
Finally, the discrepancies need to be resolved. When this is possible, investigators or data entry staff change the wrong data, and the query is recorded as closed. Queries that are no longer open will not be raised again as validation failures during future validation processes. Nevertheless, certain discrepancies sometimes cannot be resolved, and they are considered irresolvable. In those cases, they are recorded and stored in the discrepancy database with an audit trail.
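The life cycle of a discrepancy described in this section can be sketched as a small state model. This is only an illustration under simplifying assumptions: the class, field, and method names are hypothetical, and a real audit trail would also record who made each change and when.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    OPEN = "open"
    CLOSED = "closed"
    IRRESOLVABLE = "irresolvable"


@dataclass
class Discrepancy:
    field_name: str
    value: str
    message: str
    status: Status = Status.OPEN
    audit_trail: List[str] = field(default_factory=list)

    def log(self, entry: str) -> None:
        # Simplified audit trail: a real one adds the user and a timestamp.
        self.audit_trail.append(entry)

    def apply_sec(self, corrected: str) -> None:
        """Self-evident correction: the CDM fixes the value without a query."""
        self.log(f"SEC: {self.value!r} -> {corrected!r}")
        self.value = corrected
        self.status = Status.CLOSED

    def raise_query(self) -> None:
        """Ask the site to explain or correct the value."""
        self.log(f"query raised to site: {self.message}")

    def resolve(self, corrected: str) -> None:
        """The site changes the data and the query is closed."""
        self.log(f"site response: {self.value!r} -> {corrected!r}")
        self.value = corrected
        self.status = Status.CLOSED

    def mark_irresolvable(self, reason: str) -> None:
        """Keep the discrepancy on record when it cannot be resolved."""
        self.log(f"irresolvable: {reason}")
        self.status = Status.IRRESOLVABLE


# A spelling error is corrected directly through a SEC.
typo = Discrepancy("medication", "paracetmol", "unrecognized spelling")
typo.apply_sec("paracetamol")

# An implausible value needs an answer from the site.
weight = Discrepancy("weight_kg", "700", "outside expected range")
weight.raise_query()
weight.resolve("70")

# A value that cannot be verified is stored as irresolvable.
lost = Discrepancy("onset_date", "", "missing value")
lost.raise_query()
lost.mark_irresolvable("source document unavailable")

open_items = [d for d in (typo, weight, lost) if d.status is Status.OPEN]
print(len(open_items), "discrepancies still open")
```

Only discrepancies whose status is still open would be picked up again in later validation runs.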
5. Medical coding
When all data entered in the e-CRF have been reviewed and checked for correctness, accuracy, and completeness, the medical terminology needs to be classified, and CDMs must make sure the terms are coded. These terms relate to adverse events and concomitant medications; hence, a data manager's skills should include, on the one hand, “the knowledge of medical terminology, understanding of disease entities, drugs used, and a basic knowledge of the pathological processes involved”, and, on the other hand, “knowledge about the structure of electronic medical dictionaries and the hierarchy of classifications available in them”.
For this categorization, a large codelist of adverse events is used, also referred to as a dictionary or thesaurus. It is the source list against which all reported adverse events are matched in order to group them. This step ensures that adverse events reported in different ways are counted as the same kind of event. Nowadays, CDMs use many automatic matching tools called autocoders, and the task needs to be performed manually only when a term does not automatically match the codelist.
Even though computerized tools help in this process, the CDM's intervention is essential, because those tools rely on some level of text matching, which makes them dependent on the text that investigators enter in the fields. Thus, data managers correct misspellings, adjust abbreviations, and make other minor text changes to streamline coding. Also, when a match cannot be found, these professionals must apply the dictionary term most closely related to the reported one, and to do so they are aided by other tools provided by the dictionary or the coding software.
In spite of all the strategies mentioned above for coding adverse events, sometimes a term cannot be matched at all. In those cases, a query is raised for the site to provide more information about the reported term.
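The coding flow described above (automatic match, manual review of close matches, and a query when nothing fits) can be sketched with Python's standard library. This is a toy illustration, not how any commercial autocoder works: the codelist, the codes, the synonym table, and the similarity cutoff are all invented for the example, and real studies rely on licensed medical dictionaries.

```python
import difflib
import re
from typing import Optional, Tuple

# Hypothetical codelist: terms and codes are invented for illustration.
CODELIST = {
    "headache": "T001",
    "nausea": "T002",
    "abdominal pain": "T003",
    "dizziness": "T004",
}

# Hypothetical synonym table mapping reported wording to a codelist term.
SYNONYMS = {"stomach ache": "abdominal pain"}


def normalize(verbatim: str) -> str:
    """Lowercase the term, drop punctuation and digits, collapse spaces."""
    text = re.sub(r"[^a-z ]", " ", verbatim.lower())
    text = " ".join(text.split())
    return SYNONYMS.get(text, text)


def autocode(verbatim: str, cutoff: float = 0.85) -> Tuple[Optional[str], str]:
    """Return (code, outcome) where outcome is auto, manual_review, or query."""
    term = normalize(verbatim)
    if term in CODELIST:
        return CODELIST[term], "auto"
    # No exact match: propose the closest dictionary term for a CDM to confirm.
    close = difflib.get_close_matches(term, list(CODELIST), n=1, cutoff=cutoff)
    if close:
        return CODELIST[close[0]], "manual_review"
    # Nothing similar enough: the site must clarify the reported term.
    return None, "query"


for reported in ["Headache!", "Stomach ache", "naussea", "felt unwell"]:
    print(reported, "->", autocode(reported))
```

The cutoff controls how aggressive the suggestions are: a lower value proposes more candidates for manual review, while a higher value sends more terms back to the site as queries.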
6. Database lock
Reaching the database lock step means that the CDMs' tasks are almost over. At this point in the process, the final data validation is carried out to verify that all data are correct. Generally, no more changes can be made after locking. Considering how important this step is, CDMs use a pre-lock checklist to make sure all data management activities have been performed and no discrepancies remain. Once all the stakeholders approve the database lock, the data are extracted for statistical analysis.
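As a rough sketch of how a pre-lock checklist and stakeholder approvals can gate the lock, consider the following Python example. The checklist items, stakeholder roles, and class name are hypothetical, and a real system would enforce the lock at the database level rather than in application code.

```python
from typing import Dict, List


def pending_items(items: Dict[str, bool]) -> List[str]:
    """Return the names of the items that are not yet done or approved."""
    return [name for name, done in items.items() if not done]


class StudyDatabase:
    """Toy database that rejects changes once it has been locked."""

    def __init__(self) -> None:
        self.locked = False
        self.records: Dict[str, str] = {}

    def update(self, key: str, value: str) -> None:
        if self.locked:
            raise PermissionError("database is locked; no more changes allowed")
        self.records[key] = value

    def lock(self, checklist: Dict[str, bool], approvals: Dict[str, bool]) -> None:
        # Refuse to lock while any checklist item or approval is outstanding.
        blockers = pending_items(checklist) + pending_items(approvals)
        if blockers:
            raise ValueError("cannot lock, pending: " + ", ".join(blockers))
        self.locked = True


# Hypothetical checklist items and stakeholder roles.
checklist = {
    "all queries closed or documented": True,
    "medical coding complete": True,
    "final data validation run": False,
}
approvals = {"data manager": True, "statistician": True, "sponsor": True}

db = StudyDatabase()
db.update("P001.weight_kg", "70")

print("Pending before lock:", pending_items(checklist))
checklist["final data validation run"] = True
db.lock(checklist, approvals)
print("Locked:", db.locked)
```

After a successful lock, any further call to `update` raises an error, which mirrors the rule that no more changes can generally be made.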
To sum up, data management is a long, delicate, and fundamental part of clinical trial management, owing to the large amount of information gathered and the different aspects that need to be accounted for. For this reason, ongoing quality control of data processing is undertaken at regular intervals, and the CDM team frequently reviews all discrepancies to ensure that they are resolved during the course of the clinical trial.
Sofpromed is a biometrics CRO committed to providing excellence in all the services it offers. Our experienced clinical data management team consists of highly qualified, professional data managers who can guarantee accuracy and efficiency in data processing.
Lu, Z., & Su, J. (2010). Clinical data management: Current status, challenges, and future directions from industry perspectives. Open Access Journal of Clinical Trials, 93. https://doi.org/10.2147/oajct.s8172
Mosley, M., Brackett, M., Earley, S., & Henderson, D. (2009). The DAMA Guide to the Data Management Body of Knowledge (DAMA-DMBOK Guide) (1st ed.). Technics Publications, LLC.
Krishnankutty, B., Bellary, S., Kumar, B. R. P., & Moodahadu, L. S. (2012). Data management in clinical research: An overview. Indian Journal of Pharmacology, 44(2), 168. https://doi.org/10.4103/0253-7613.93842
Prokscha, S. (2011). Practical Guide to Clinical Data Management (3rd ed.). CRC Press.