No regulation prepares you for the moment when data in a study becomes invisible.
In today’s clinical research, data is no longer collected solely within the walls of a clinic under a physician’s supervision. Trials have become decentralized, dynamic, distributed — and patients now contribute data remotely through wearables, mobile apps, and video consultations.
This technological shift, accelerated by the pandemic and backed by regulators, has created extraordinary opportunities for patient access. But at the same time, it has raised a question that no sponsor, investigator, or authority can ignore:
Who truly controls data quality in a decentralized clinical trial?
The answer isn’t a simple one — but we’ll build it, step by step, in the pages that follow.
Data validity was traditionally ensured by the investigator, within a controlled clinical setting. Today, that same responsibility must be upheld in a context where data flows in real time from multiple off-site sources. And this is not optional — it’s a requirement established by global guidelines such as ICH E6(R3), EMA documents, and the FDA’s guidance on decentralized clinical trials.
Moreover, in 2024–2025, more and more RFPs (Requests for Proposal) and partner selection processes demand clear evidence of auditability, traceability, and data compliance. Not just recruitment. Not just speed. But verifiable and continuous quality.
In a decentralized ecosystem, data quality is not just a KPI (Key Performance Indicator). It is a strategic factor of compliance, reputation, and partnership. It is what differentiates the sites invited into global projects from those that disappear from the sponsors’ radar.
This article is a practical roadmap for sites, investigators, and SMOs that don’t just want to stay in the game — but become preferred partners. We will define what “data quality” really means in the DCT era, what international standards require, and — most importantly — how to put it all into practice in a real, applicable, auditable way.
1. Why Data Quality Matters in DCTs
In a traditional clinical trial, data quality was naturally ensured through proximity: the patient was present at the site, the investigator directly observed procedures, and the monitor periodically reviewed the source documents.
In a decentralized clinical trial (DCT), this control chain is fragmented — and with it, the certainty that data is complete, coherent, and verifiable.
According to the new ICH E6(R3) guidelines, quality must be a proactive component of trial design:
“The quality of a trial should be considered at all stages of the trial process, with a focus on those activities that are essential to ensure human subject protection and the reliability of trial results.”
This principle is further reinforced by the “quality by design” concept introduced in the same document, which emphasizes that quality standards must be embedded from the start — not just checked at the end.
Similarly, the European Medicines Agency (EMA) urges sponsors to assess the impact of decentralized elements on data reliability from the earliest planning phases:
“Sponsors are expected to consider the impact of decentralised elements on data reliability and trial oversight from the earliest planning stages.”
Therefore, quality is no longer the sole responsibility of the Quality Assurance (QA) department or the monitor. It becomes a design imperative, shared by all actors involved — from the sponsor and Contract Research Organization (CRO) to the investigator and frontline support teams.
For sites, investigators, and Site Management Organizations (SMOs), this is no longer optional. Since 2024, an increasing number of Requests for Proposal (RFPs) require clear evidence of data traceability and auditability.
Strong recruitment is no longer enough. You must demonstrate that your data is reliable, complete, and aligned with international standards.
This is the real stake of data quality in DCTs: long-term visibility in front of sponsors, eligibility in international consortia, and credibility in the eyes of regulators.
2. Types of Data in Decentralized Clinical Trials
No regulation prepares you for the moment when study data becomes invisible.
In a decentralized clinical trial (DCT), data no longer flows through a single pipeline. It can be grouped into three broad functional categories: data reported directly by participants (ePRO, eConsent), data generated automatically by devices (wearables, sensors), and data produced during remote clinical interactions (telemedicine, EHR).
Each type of data has its own channel, specific risks, and distinct validation requirements. Some are entered manually, others are captured automatically. Some can be directly monitored by site staff, while others remain entirely outside the site's control.
🔍 This diversity is not only technological — it's also regulatory and logistical, as it directly impacts traceability, temporal consistency, and audit readiness.
2.1. Diversity of Data Sources in DCT
In a DCT, not all data is created at the site. It may come from eConsent platforms, wearables, ePRO mobile apps, sensors, electronic health records (EHR), or telemedicine visits. Each source has its own format, operational context, and points of vulnerability. And their integration is never automatic.
FDA (U.S. Food and Drug Administration) explicitly acknowledges these sources and requires clear validation mechanisms:
“Digital health technologies may collect data remotely... the sponsor must describe how data integrity will be maintained and how the data will be integrated with other study data.”
(FDA Guidance on DCT)
EMA (European Medicines Agency) treats these data types with the same rigor as traditional methods. For example, electronic consent must meet the same requirements as written consent:
“The same quality and content standards apply to eConsent as to traditional written consent.”
(EMA Reflection Paper)
So, the diversity of data sources in DCTs does not imply unlimited flexibility. It requires a digitally integrated ecosystem, capable of transforming this mosaic of inputs into valid, traceable, and auditable information.
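To make this integration tangible, here is a minimal sketch in Python of what a common, traceable schema can look like. The `StudyDataPoint` structure, its field names, and the `from_wearable` adapter are hypothetical illustrations, not a prescribed design: the point is that every source, whatever its format, should end up in one structure that always carries the provenance metadata an auditor will ask for.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class StudyDataPoint:
    """Common schema for any decentralized data point (hypothetical example)."""
    subject_id: str          # pseudonymized participant identifier
    source_system: str       # e.g. "wearable", "epro_app", "telemedicine"
    variable: str            # what was measured or reported
    value: str
    captured_at: datetime    # when the data was generated (device/app clock, UTC)
    received_at: datetime    # when it reached the central platform
    device_or_user: str      # who or what generated it (attributability)

def from_wearable(subject_id: str, payload: dict) -> StudyDataPoint:
    """Map a raw wearable payload into the common schema.

    The payload keys below are illustrative; a real integration would follow
    the vendor's API documentation.
    """
    return StudyDataPoint(
        subject_id=subject_id,
        source_system="wearable",
        variable=payload["metric"],
        value=str(payload["value"]),
        captured_at=datetime.fromisoformat(payload["measured_at"]),
        received_at=datetime.now(timezone.utc),
        device_or_user=payload["device_serial"],
    )

# Usage: every source gets its own adapter, but the output is always the same
# auditable structure, so a reviewer can reconstruct who generated what, and when.
point = from_wearable("SUBJ-001", {
    "metric": "heart_rate",
    "value": 72,
    "measured_at": "2025-01-15T08:30:00+00:00",
    "device_serial": "WEAR-12345",
})
print(point)
```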
However, this very diversity also creates the greatest vulnerabilities.
2.2. When Data Becomes Invisible
What does “the moment when data becomes invisible” actually mean?
It’s not a metaphor. It’s a real, well-documented risk in decentralized trials:
🟥 1. Data can no longer be verified at the source
Example: A patient completes an ePRO questionnaire via a mobile app. If the entry carries no reliable timestamp, no audit trail, and no verifiable link to the person who created it, then no one can prove that the data is original, contemporaneous, and complete — the essential attributes of ALCOA+.
🟥 2. Sensors transmit data directly to the cloud without local backup
Example: A smartwatch captures heart rate or blood pressure. If the transmission to the cloud fails and there is no local backup or failure notification, then the data is permanently lost — and the investigator may not even be aware anything is missing.
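One common mitigation for this failure mode is a store-and-forward pattern: write every reading to a local buffer first and delete it only after the cloud confirms receipt, so a failed upload becomes a delay instead of a silent loss. The Python sketch below is a simplified illustration; the file path, payload fields, and the `upload` callable are assumptions, and a real device or companion app would use its platform’s own storage and acknowledgement mechanisms.

```python
import json
from pathlib import Path

BUFFER_FILE = Path("sensor_buffer.jsonl")  # local append-only buffer (hypothetical path)

def record_reading(reading: dict) -> None:
    """Always persist the reading locally before attempting any upload."""
    with BUFFER_FILE.open("a") as f:
        f.write(json.dumps(reading) + "\n")

def flush_buffer(upload) -> int:
    """Try to upload buffered readings; keep anything the cloud did not confirm.

    `upload` is a callable returning True on confirmed receipt (assumption:
    the real transport and acknowledgement mechanism is platform-specific).
    """
    if not BUFFER_FILE.exists():
        return 0
    remaining, sent = [], 0
    for line in BUFFER_FILE.read_text().splitlines():
        reading = json.loads(line)
        if upload(reading):
            sent += 1
        else:
            remaining.append(line)  # not confirmed -> stays in the local buffer
    BUFFER_FILE.write_text("\n".join(remaining) + ("\n" if remaining else ""))
    return sent

# Usage: even if `upload` fails now, the reading survives locally for a later retry.
record_reading({"metric": "heart_rate", "value": 72, "measured_at": "2025-01-15T08:30:00Z"})
print(flush_buffer(lambda r: False), "readings confirmed by the cloud")
```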
🟥 3. Lack of unified integration (poor interoperability)
Data from wearables, eConsent, EHR, or telemedicine often arrives in different formats, on separate platforms, without shared identifiers or synchronized timelines.
🔍 The result: a reviewer from EMA or FDA cannot reconstruct the logical flow of data — which means the data becomes invisible from a scientific and regulatory standpoint.
And without traceability, you cannot prove who generated the data, when, or in what context — meaning the data source becomes unverifiable.
🔍 In clinical research, the data source is the foundation of scientific validation. If you cannot demonstrate who generated the data, when, and how, then the integrity of that dataset is compromised — no matter how clinically relevant it may appear.
✅ So what does the phrase mean:
“No regulation prepares you for the moment when data becomes invisible”?
It means that field-level risks exceed what’s written in the guidance — unless you have a solid, integrated, and well-audited digital ecosystem.
Not all data in a DCT is created equal. Some is more fragile than it seems.
Understanding each source and its vulnerabilities is essential to preventing quality loss — before the data ever becomes “invisible.”
3. Common Risks to Data Quality
Data collected in Decentralized Clinical Trials (DCTs) is not just more abundant. It’s more fragile.
The context in which it is generated — outside the clinic, without direct supervision, via distributed digital systems — introduces a series of well-documented but often underestimated risks.
📍 1. Functional incompleteness
Data may appear numerically complete but be missing at critical moments: unsaved ePROs (Electronic Patient-Reported Outcomes), telemedicine sessions without a valid timestamp, wearable data not linked to a patient ID. The problem isn’t the absence of data — it’s the missing context around it.
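A first line of defense against this contextual void can be a simple automated check that flags records arriving without the context that makes them usable. The Python sketch below assumes hypothetical field names (`subject_id`, `captured_at`, `source_system`); the actual required fields would come from the protocol and the data management plan.

```python
REQUIRED_CONTEXT = ("subject_id", "captured_at", "source_system")

def find_context_gaps(records: list[dict]) -> list[tuple[int, list[str]]]:
    """Return (record index, missing fields) for records lacking essential context."""
    gaps = []
    for i, rec in enumerate(records):
        missing = [field for field in REQUIRED_CONTEXT if not rec.get(field)]
        if missing:
            gaps.append((i, missing))
    return gaps

# Usage: the second record "exists" but is contextually void (no patient link, no timestamp).
records = [
    {"subject_id": "SUBJ-001", "captured_at": "2025-01-15T08:30:00Z",
     "source_system": "epro_app", "score": 4},
    {"subject_id": None, "captured_at": None, "source_system": "epro_app", "score": 5},
]
print(find_context_gaps(records))   # -> [(1, ['subject_id', 'captured_at'])]
```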
📍 2. Inconsistency between sources
The same patient may have contradictory data from different sources — a sensor API reports 12,000 steps, while a synced mobile app shows 8,500. Without a robust reconciliation system, this data is not just inconclusive — it becomes scientifically unusable.
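A basic reconciliation step is enough to surface such contradictions early. The Python sketch below compares the same variable from two sources and flags discrepancies above an agreed tolerance; the 10% default threshold is purely illustrative and would, in practice, be defined in the data management plan.

```python
def reconcile(variable: str, source_a: float, source_b: float,
              tolerance_pct: float = 10.0) -> dict:
    """Flag a cross-source discrepancy larger than the agreed tolerance.

    The 10% default is illustrative only; the real threshold should come
    from the protocol or the data management plan.
    """
    reference = max(abs(source_a), abs(source_b)) or 1.0
    deviation_pct = abs(source_a - source_b) / reference * 100
    return {
        "variable": variable,
        "source_a": source_a,
        "source_b": source_b,
        "deviation_pct": round(deviation_pct, 1),
        "needs_review": deviation_pct > tolerance_pct,
    }

# Usage: the step-count example from the text (sensor API vs. synced mobile app).
print(reconcile("daily_steps", source_a=12_000, source_b=8_500))
# -> {'variable': 'daily_steps', ..., 'deviation_pct': 29.2, 'needs_review': True}
```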
📍 3. Delays with critical impact
Data generated outside the clinic may reach the central platform (EDC – Electronic Data Capture) too late — sometimes after the window for intervention has passed. These delays affect not only patient monitoring but also the sponsor’s ability to act in time.
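Latency can be monitored with the same metadata used for traceability: compare when a data point was generated with when it reached the EDC. In the Python sketch below, the 24-hour review window is an assumed example, not a regulatory limit.

```python
from datetime import datetime, timedelta

def arrived_too_late(captured_at: datetime, received_at: datetime,
                     window: timedelta = timedelta(hours=24)) -> bool:
    """True if the delay between data generation and EDC receipt exceeds the agreed window."""
    return (received_at - captured_at) > window

# Usage: a reading generated Monday morning but only received two days later
# is flagged, so the site can investigate the synchronization problem.
captured = datetime(2025, 1, 13, 8, 0)
received = datetime(2025, 1, 15, 9, 30)
print(arrived_too_late(captured, received))   # -> True
```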
📌 These risks are acknowledged by regulators:
ICH E6(R3)
“Important risks to data integrity can arise when key trial activities are performed outside of controlled clinical environments.”
🔹 Interpretation: ICH requires these risks to be anticipated and addressed in the design phase — not treated reactively.
FDA Guidance on DCT
“Investigators must ensure that data generated by delegated parties remains complete, consistent, and accurate...”
🔹 Interpretation: Quality cannot be outsourced. Investigators remain responsible, even for remotely collected data.
👉 These risks do not occur despite technology — they result from poorly integrated technology. They cannot be fixed with administrative patchwork, but only through an ecosystem designed to prevent — not to repair.
4. International standards for data validity
Technology can accelerate trials, but only standards make them credible.
As we've seen, data collected outside the clinic are exposed to risks of inconsistency, delays, and contextual loss. The only real protection against these risks is compliance with international data quality standards.
🔍 What does “valid data” mean in clinical research?
It means data that can be traced back to its source, that is accurate and complete, that is always available for audit, and that cannot be modified without leaving a trace.
The universal standard that defines these criteria is called ALCOA+.
✅ ALCOA+: 9 essential criteria
Every data point must be:
- Attributable – it is clear who generated the data and when
- Legible – it can be read and understood throughout its lifecycle
- Contemporaneous – it is recorded at the time it is generated
- Original – the first capture (or a certified copy) is preserved
- Accurate – it correctly reflects what was observed or reported
- Complete – nothing is missing, including repeats and corrections
- Consistent – it follows the expected sequence and timestamps
- Enduring – it is preserved on a durable, reliable medium
- Available – it can be accessed for review or audit at any time
🔹 These criteria are not theoretical — they are the minimum requirements for DCT data to be accepted by authorities. If even one is missing, the entire validation chain can be compromised.
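As a rough illustration of how a site or SMO might turn some of these criteria into automated pre-checks on incoming records, here is a Python sketch. The field names are hypothetical, only the mechanically verifiable criteria are covered, and a real implementation would live inside a validated system rather than a standalone script.

```python
def alcoa_precheck(record: dict) -> dict[str, bool]:
    """Check the ALCOA+ criteria that can be verified mechanically on a record.

    Legible, Accurate, Consistent and Enduring depend on the system and the
    clinical context, so only a subset is checked here.
    """
    return {
        "attributable": bool(record.get("created_by")),          # who generated it
        "contemporaneous": bool(record.get("captured_at")),      # when it was generated
        "original": bool(record.get("source_reference")),        # link to the source record
        "complete": all(record.get(f) is not None for f in ("subject_id", "value")),
        "available": bool(record.get("stored_in_audit_trailed_system")),
    }

# Usage: a record missing its source reference fails the "original" criterion,
# which is exactly the point at which the validation chain starts to break.
record = {"created_by": "site_nurse_07", "captured_at": "2025-01-15T08:30:00Z",
          "subject_id": "SUBJ-001", "value": 72,
          "source_reference": None, "stored_in_audit_trailed_system": True}
print(alcoa_precheck(record))
```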
🧩 ICH E6(R3): how quality is built
The international Good Clinical Practice (GCP) guideline, in its ICH E6(R3) version, goes even further:
It’s not enough to have good data; you need a system that consistently guarantees its quality.
“The sponsor should implement systems and procedures that are proportionate to the risks, to ensure adequate quality management throughout the lifecycle of the trial.”
🔹 This means that every data source — from eConsent to wearables — must be designed not just for efficiency, but for full traceability and auditability at any time.
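One way to picture “auditability at any time” is an append-only audit trail in which every entry references a hash of the previous one, so any later modification becomes detectable. The Python sketch below is a simplified illustration of that idea, not a substitute for the validated audit-trail features of an EDC or eSource system.

```python
import hashlib
import json
from datetime import datetime, timezone

def append_audit_entry(trail: list[dict], actor: str, action: str, detail: str) -> list[dict]:
    """Append an entry chained to the previous one via its SHA-256 hash."""
    previous_hash = trail[-1]["entry_hash"] if trail else "GENESIS"
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "detail": detail,
        "previous_hash": previous_hash,
    }
    # Hash the entry content together with the previous hash, then store it.
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    trail.append(entry)
    return trail

# Usage: three chained entries; editing any earlier entry would break the hash chain.
trail: list[dict] = []
append_audit_entry(trail, "epro_app", "record_created", "ePRO item 1 = 4")
append_audit_entry(trail, "site_nurse_07", "query_raised", "value confirmed with participant")
append_audit_entry(trail, "data_manager_02", "query_closed", "no change required")
print(len(trail), "audit entries, last hash:", trail[-1]["entry_hash"][:12])
```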
Standards are not a bureaucratic step. They are the invisible infrastructure that makes the difference between an accepted result and a rejected study.
Every stakeholder in the process — from sponsor and CRO to investigator and on-site teams — must understand that validity is not checked at the end. It is built from the first interaction with data.
5. The role of SMOs and clinical sites in the data ecosystem
Standards exist. But who applies them in the field?
In a DCT (Decentralized Clinical Trial), technology is not autonomous. Sites and SMOs are the ones who bring these standards to life — or ignore them, with major risks.
In a decentralized trial, data collection no longer happens only in the clinic. But responsibility for their validity still remains there.
Sites and SMOs (Site Management Organizations) are often assigned a purely logistical role, limited to recruitment and operational management. But in the context of DCTs, they become essential nodes in the data validation chain.
🎯 Why? Because they are the actors best positioned to ensure continuous oversight of what happens beyond the clinic. They can impose clear procedures for data reconciliation, verify consistency across multiple sources, identify signs of non-compliance, and intervene quickly with corrections or requests for clarification. Furthermore, they can establish internal monitoring protocols that include periodic checks, random audits, and recurring training for patients and technology providers.
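Even the “random audits” mentioned above can be made procedural and reproducible. The Python sketch below selects a documented random sample of remote records for source data verification; the 5% sampling rate and the use of a recorded seed are assumptions for illustration, not a recommended standard.

```python
import random

def select_sdv_sample(record_ids: list[str], rate: float = 0.05,
                      seed: int | None = None) -> list[str]:
    """Select a reproducible random sample of records for source data verification.

    Passing a seed (and documenting it) makes the selection reconstructable
    during an audit; the 5% rate is illustrative only.
    """
    rng = random.Random(seed)
    sample_size = max(1, round(len(record_ids) * rate))
    return sorted(rng.sample(record_ids, sample_size))

# Usage: the same documented seed always yields the same sample,
# so the check itself is traceable.
ids = [f"EPRO-{i:04d}" for i in range(1, 201)]
print(select_sdv_sample(ids, rate=0.05, seed=20250115))
```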
🔍 EMA (European Medicines Agency) clarifies this aspect:
“Investigators must maintain oversight and ensure protocol adherence, regardless of whether trial-related procedures are conducted remotely.”
🧭 Interpretation: The ultimate responsibility lies with the investigator, but it cannot be effectively exercised without the support of the site. In DCTs, sites and SMOs become the practical extension of that responsibility — they are the ones who can turn oversight into a continuous, actionable, and documented process.
🔍 FDA (Food and Drug Administration) adds an essential condition:
“Sponsors should clearly define in agreements and protocols which tasks are delegated and how oversight is maintained.”
🧭 This means that monitoring and responsibility must be documented, not just assumed. SMOs must demonstrate, through concrete procedures, how they ensure the validity of data obtained from decentralized sources — whether it’s ePRO, sensors, or telemedicine.
🧩 In the absence of this active control, discrepancies between sources go unnoticed, deviations accumulate without documentation, and by the time an audit arrives the data can no longer be defended.
📌 Therefore, sites and SMOs are not mere executors.
They are the guarantors of traceability and data coherence. And in the DCT era, this role is more complex — it no longer involves only physical oversight, but the active management of digital risk.
📍 Without visibility, there is no responsibility.
And without responsibility, decentralized data become nothing more than digital ruins — impossible to audit, impossible to validate.
In decentralized trials, it’s not the data that’s fragile. It’s the control over it.
Data quality is not a technical detail. It is the line between scientific progress and digital chaos.
In DCTs (Decentralized Clinical Trials), data no longer flows through the investigator’s hands. It must now pass through an operational structure capable of understanding it, validating it, and defending it.
Technology doesn’t fail. People do — when they deploy it without visibility.
Regulations are not missing. What’s missing is the consistent commitment to apply them — even when systems don’t give all the answers.
📌 This is where research centers and SMOs (Site Management Organizations) come in — not as cogs in a digital machine, but as critical links in the trust chain.
Without traceability to the source, any data point becomes just an assumption — and assumptions cannot be scientifically validated.
Traceability is essential, non-negotiable, and the red line between valid and unusable. If you can’t demonstrate who generated the data, when, and under what conditions, your study isn’t just fragile — it’s vulnerable to rejection, null in an audit, and unusable for regulatory or strategic decisions.
This article has outlined the challenges. In the next one, we’ll explore potential operational solutions — not as universal recipes, but as concrete reference points for those ready to turn decentralization into a true competitive advantage.
🔗 Until then, one key question remains:
Can the data you collect today truly be verified tomorrow — or are you just hoping it can?
📚 Sources:
ICH E6(R3) – Guideline for Good Clinical Practice (GCP) (Final version, January 2025)
📄 https://database.ich.org/sites/default/files/ICH_E6%28R3%29_Step4_FinalGuideline_2025_0106.pdf
EMA – Recommendation Paper on Decentralised Elements in Clinical Trials (December 2022)
📄 https://health.ec.europa.eu/system/files/2023-03/mp_decentralised-elements_clinical-trials_rec_en.pdf
FDA – Conducting Clinical Trials With Decentralized Elements (September 2024)
📄 https://www.fda.gov/media/167696/download
FDA – Digital Health Technologies for Remote Data Acquisition in Clinical Investigations (December 2023)