Mining Metadata for Clinical Research Activities

Derek Lawrence, Senior Clinical Data Manager, has 9 years of data management and analysis experience in the health care/pharmaceutical industry. Derek serves as Rho’s Operational Service Leader in Clinical Data Management, an internal expert responsible for disseminating the application of new technology, best practices, and processes.

Metadata: An Underutilized Resource

As anyone involved in clinical database creation knows, considerable resources are devoted to the development and validation of electronic data capture (EDC) systems. Once these databases are live and clinical data begin coming in, various processes for setting up data cleaning programming, database quality review, and reporting are put into play. Unfortunately, most of these processes are manual and require the data managers, programmers, and biostatisticians to hold a series of specific conversations about the database’s setup, structure, and dynamic behavior, all of which affect how programming tasks are approached and how biostatistics should handle the data.

The way to both shorten the time spent setting up these activities and improve the accuracy of that setup is to make effective use of the project’s metadata. This metadata, or “data about data”, spans all elements of the clinical database, including:

CRF metadata
- Labels, formats, response options, entry requirements, field-level checks, etc.
Form metadata
- Source data verification (SDV), signature participation, orientation (standard vs. log), etc.
Event metadata
- Visit windows, associated CRFs, repeatability, access requirements, etc.
Query metadata
- Current status, dates, resolutions, marking groups, etc.

Establishing Usable Datasets

The first step in mining the metadata is to create machine-readable datasets from the source in question. In most commercially available EDC systems, the CRF and Event metadata contents of a project can be exported in a variety of formats (XML, Excel, etc.). To the nightly process by which clinical data are exported from our EDC studies and saved to the Rho network, we added a post-processing step in which a macro reads in the exported study metadata files and produces working datasets. From here, these elements of the clinical database are machine-readable and available for use. Other standard EDC reports provide additional sources for Form and Query metadata. These data can be extracted from the system either directly using an API (application programming interface) or by creating reports with EDC system-specific tools, which can be scheduled and saved to the network automatically. The contents of these reports can also be converted to datasets for ease of use.
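As a rough illustration of that post-processing step, the sketch below flattens an exported metadata file into one row per CRF field. It is written in Python rather than as the macro described above, and the XML layout, element names, and attribute names are all hypothetical; real EDC exports differ by vendor.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical metadata export. The element and attribute names are
# assumptions made for illustration, not any vendor's actual format.
SAMPLE_EXPORT = """\
<StudyMetadata>
  <Form name="VITALS" orientation="standard">
    <Field name="SYSBP" label="Systolic Blood Pressure" format="integer" required="yes"/>
    <Field name="DIABP" label="Diastolic Blood Pressure" format="integer" required="yes"/>
  </Form>
  <Form name="AE" orientation="log">
    <Field name="AETERM" label="Adverse Event Term" format="text" required="yes"/>
    <Field name="AEENDAT" label="End Date" format="date" required="no"/>
  </Form>
</StudyMetadata>
"""


def fields_from_export(xml_text):
    """Flatten the form/field hierarchy into one row per CRF field."""
    root = ET.fromstring(xml_text)
    rows = []
    for form in root.iter("Form"):
        for field in form.iter("Field"):
            rows.append({
                "form": form.get("name"),
                "orientation": form.get("orientation"),
                "field": field.get("name"),
                "label": field.get("label"),
                "format": field.get("format"),
                "required": field.get("required") == "yes",
            })
    return rows


def rows_to_csv(rows):
    """Serialize the flattened rows as CSV text, ready to save as a working dataset."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


field_rows = fields_from_export(SAMPLE_EXPORT)
```

Calling `rows_to_csv(field_rows)` yields a flat table that downstream programs can read without knowing anything about the original export format, which is the point of producing working datasets in the first place.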

A Wide Variety of Applications

From this point, we can automate a number of tasks that traditionally required manual review, specifications, and subject matter expertise to complete. From driving the database validation process to creating system performance metrics to programming and configuring statistical data checks, the now-accessible metadata lets us initiate a multitude of tasks more rapidly and accurately, with much of the manual component removed. We will cover some specific data monitoring and cleaning uses of study metadata in a series of future blog posts.
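To make the idea of metadata-driven checks concrete, here is a minimal Python sketch in which the checks are derived from field metadata instead of being hand-written per study. The metadata rows, the `min`/`max` range attributes, and the record layout are all assumptions for illustration and do not reflect any particular EDC system.

```python
# Hypothetical field metadata, one row per CRF field. In practice this
# would come from the working datasets built from the EDC export.
FIELD_METADATA = [
    {"form": "VITALS", "field": "SYSBP", "required": True, "min": 60, "max": 250},
    {"form": "VITALS", "field": "DIABP", "required": True, "min": 30, "max": 150},
    {"form": "VITALS", "field": "VSCOMM", "required": False, "min": None, "max": None},
]


def run_checks(record, metadata):
    """Return a list of findings for one form record, driven entirely by metadata."""
    findings = []
    for meta in metadata:
        if meta["form"] != record["form"]:
            continue
        value = record["values"].get(meta["field"])
        if value is None:
            # Only required fields generate a finding when empty.
            if meta["required"]:
                findings.append(f'{meta["field"]}: required value is missing')
            continue
        if meta["min"] is not None and value < meta["min"]:
            findings.append(
                f'{meta["field"]}: {value} is below the expected minimum of {meta["min"]}'
            )
        if meta["max"] is not None and value > meta["max"]:
            findings.append(
                f'{meta["field"]}: {value} is above the expected maximum of {meta["max"]}'
            )
    return findings


# Hypothetical record: systolic pressure out of range, diastolic not entered.
record = {"subject": "001", "form": "VITALS", "values": {"SYSBP": 300}}
findings = run_checks(record, FIELD_METADATA)
```

Because the rules live in the metadata, adding a field to the CRF extends the checks without any change to the checking code.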
