How negative controls can improve the quality of causal real world evidence

Dr Ben Bray Partner and Evidence Generation Lead

Sherry Wang Associate Consultant

Dr Deborah Layton Principal and Drug Safety Lead

23 July 2025

Negative controls are powerful tools for detecting bias in real world evidence (RWE) studies (Lipsitch et al., 2010). The idea is borrowed from experimental biology, where a negative control is a group that is not exposed to the treatment and should therefore show no treatment-related effect. In RWE, negative controls are exposures or outcomes that play no causal role in the primary exposure-outcome relationship under study (Arnold et al., 2016; Lipsitch et al., 2010).

Understanding how negative controls work

A valid negative control outcome is an event that shares risk factors with the outcome of interest but cannot plausibly be caused by the primary exposure. Conversely, a valid negative control exposure is not causally related to the main outcome but affects a similar population and ideally shares the same causes as the primary exposure (Figure 1). Typically employed as sensitivity analyses, negative controls help identify potential bias or residual confounding. In fact, they are one of the few methods in the RWE toolbox that can detect unmeasured confounding, one of the biggest sources of uncertainty for decision makers using RWE. If an association appears in a negative control analysis, where none should exist, this suggests that bias or confounding may also affect the primary analysis. Conversely, the absence of an association strengthens confidence in the validity of the main study findings.

Figure 1. Illustration of an ideal negative control exposure NE and negative control outcome NO for use in evaluating studies of the causal relationship between exposure A and outcome Y. U indicates confounders. Ideally, the NE should share all the same causes as A; likewise, the NO should share all the same causes as Y.
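The logic of a negative control outcome can be sketched with a small worked calculation. The example below is illustrative only: it assumes a single binary unmeasured confounder U, and all probabilities and the function name `crude_risk_ratio` are made up for this sketch. The exposure A has no effect on the negative control outcome NO, so any departure of the crude risk ratio from 1 is produced by confounding alone.

```python
def crude_risk_ratio(p_u, p_exposed_given_u, p_event_given_u):
    """Crude (unadjusted) risk ratio for an event that the exposure does not cause.

    p_u:               P(U = 1) for a binary unmeasured confounder U
    p_exposed_given_u: (P(A = 1 | U = 0), P(A = 1 | U = 1))
    p_event_given_u:   (P(NO = 1 | U = 0), P(NO = 1 | U = 1)); no dependence on A
    """
    p_u_levels = (1.0 - p_u, p_u)

    def risk(exposed):
        # Weight each level of U by how common it is within this exposure group.
        weights = [
            p_u_levels[u]
            * (p_exposed_given_u[u] if exposed else 1.0 - p_exposed_given_u[u])
            for u in (0, 1)
        ]
        total = sum(weights)
        return sum(w / total * p_event_given_u[u] for u, w in zip((0, 1), weights))

    return risk(True) / risk(False)


# Hypothetical inputs: U raises both the chance of exposure and the risk of NO.
rr_confounded = crude_risk_ratio(0.5, (0.2, 0.6), (0.05, 0.15))
# Same outcome model, but exposure no longer depends on U.
rr_unconfounded = crude_risk_ratio(0.5, (0.4, 0.4), (0.05, 0.15))

print(f"Negative control RR with confounding by U: {rr_confounded:.2f}")
print(f"Negative control RR without confounding:   {rr_unconfounded:.2f}")
```

With these assumed inputs the first risk ratio is 1.50 and the second is 1.00. An analyst who cannot observe U would still see the spurious association in the first scenario, which is the signal a negative control analysis is designed to pick up.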

Case study: Applying negative controls in post-authorisation safety studies

Negative controls are increasingly used in RWE, including pharmacoepidemiology studies, to strengthen causal inferences when randomised trials are not feasible (Zafari et al., 2023). For example, a post-authorisation safety study of AZD1222 (Vaxzevria) (European Medicines Agency, 2024) included two negative control outcomes, fractures and urinary tract infections (UTIs), to assess potential bias in observational comparisons of the risk of several Adverse Events of Special Interest (AESI) between the exposed population and each of three matched reference populations. Neither fractures nor UTIs were expected to be causally related to vaccination, making them well suited to identifying underlying methodological issues.

The negative control analyses used time-to-event (survival) methods and revealed possible selection bias due to informative censoring in two of the three matched comparator cohorts, a key limitation. Specifically, in the main comparison of the vaccinated population with a concurrent unvaccinated population, the Kaplan-Meier curves for time to first UTI or fracture separated, indicating significant bias. This suggests that the unvaccinated cohort experienced differential loss to follow-up, potentially inflating risk estimates for the vaccinated group. The same divergence was observed, although it was less pronounced, in the comparison of the vaccinated population with a pre-pandemic comparator cohort.

This case study illustrates how negative controls help contextualise and interpret observed safety signals by identifying the presence and impact of methodological biases. It also demonstrates that each exposure–negative control outcome association may have its own distinct set of biases and unmeasured confounders. Any interpretation of an observed association should therefore take into account the scale of the bias detected.
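For readers less familiar with the survival methodology mentioned above, here is a minimal sketch of a Kaplan-Meier estimator applied to a negative control outcome in two cohorts. The follow-up times and event indicators are made-up numbers for illustration, not data from the AZD1222 study, and the function name `kaplan_meier` is simply a label chosen for this sketch.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each person (for example, days)
    events: 1 if the negative control outcome occurred, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        occurred = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if occurred:
            # People censored at time t still count as at risk at time t.
            survival *= 1.0 - occurred / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy cohorts: time to first negative control event, in days.
cohort_a = kaplan_meier([30, 60, 90, 120, 150, 180], [0, 1, 0, 0, 1, 0])
cohort_b = kaplan_meier([20, 40, 60, 80, 100, 180], [1, 1, 1, 0, 1, 0])

for label, curve in (("Cohort A", cohort_a), ("Cohort B", cohort_b)):
    steps = ", ".join(f"day {t}: {s:.2f}" for t, s in curve)
    print(f"{label}: {steps}")
```

If the two curves for a negative control outcome separate clearly, as they do in this toy example, the cohorts differ in some way other than the exposure. A real analysis would add confidence intervals and a formal comparison such as a log-rank test or a Cox model.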

Addressing challenges in selecting valid negative controls

Selecting valid negative controls requires strong subject-matter expertise and relies heavily on assumptions regarding causal relationships. For example, the negative controls should share the same confounders as the exposure or outcome of interest, which can only be assessed qualitatively based on prior clinical and epidemiological knowledge. Real-world complexity and data incompleteness sometimes undermine these assumptions (Tchetgen, 2014; Yang et al., 2024). Moreover, negative control methods demand high-quality, comprehensive datasets; measurement error and missing data can significantly impair performance.

Cutting-edge advances in negative controls using AI

As in many areas of RWE, artificial intelligence (AI) can offer innovative solutions to enhance the reliability and efficiency of negative control approaches.

1. Automated control selection

AI algorithms can scan thousands of variables to identify those behaving like true “negative controls”. For example, the DANCE (Data-driven Automated Negative Control Estimation) algorithm tests each candidate variable in a dataset for independence from the exposure-outcome pathway, retaining only those that meet rigorous criteria (Kummerfeld et al., 2024). This reduces reliance on expert selection and lowers the risk of choosing invalid controls. At LCP we have also been exploring the potential for large language models to select valid negative controls and have been very impressed with the results of our initial testing.

2. Automating bias detection and result calibration

AI can enhance interpretation by using multiple negative controls to detect bias. For example, pharmacoepidemiologists can use null (no association) drug-outcome pairs to build an empirical null distribution (Yang et al., 2024). AI compares study results to that distribution, adjusting p-values or confidence intervals to account for the background level of bias. In practice, this means that if most controls show a tiny positive effect, the AI will factor that in, so the primary result is judged against the right baseline (Schuemie et al., 2016; Yang et al., 2024).
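The calibration idea described above can be sketched in a few lines. This is a simplified illustration, not the published procedure: it assumes a Gaussian empirical null and fits it with a rough method-of-moments step rather than the maximum likelihood fit used in the calibration literature. All estimates, standard errors and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def fit_empirical_null(log_rrs, ses):
    """Estimate the mean and SD of systematic error from negative control estimates."""
    mu = mean(log_rrs)
    # Observed spread = spread of systematic error + sampling error,
    # so subtract the average sampling variance (floored at zero).
    sigma2 = max(0.0, variance(log_rrs) - mean([se ** 2 for se in ses]))
    return mu, sqrt(sigma2)


def two_sided_p(log_rr, se, null_mean=0.0, null_sd=0.0):
    """Two-sided p-value against a normal null; the defaults give the usual p-value."""
    z = (log_rr - null_mean) / sqrt(null_sd ** 2 + se ** 2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical negative control estimates (log risk ratios) and standard errors.
nc_log_rrs = [0.1, 0.2, 0.3, 0.2, 0.2]
nc_ses = [0.05] * 5
null_mean, null_sd = fit_empirical_null(nc_log_rrs, nc_ses)

# Hypothetical primary estimate.
primary_log_rr, primary_se = 0.3, 0.1
p_traditional = two_sided_p(primary_log_rr, primary_se)
p_calibrated = two_sided_p(primary_log_rr, primary_se, null_mean, null_sd)

print(f"Empirical null: mean={null_mean:.3f}, sd={null_sd:.3f}")
print(f"Traditional p={p_traditional:.4f}, calibrated p={p_calibrated:.4f}")
```

Because the negative controls in this toy example are centred above zero, the primary estimate looks much less surprising once it is judged against the empirical null than against a null of exactly no effect.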

3. Tackling hidden confounding

AI can also help adjust for hidden confounders using the double negative control method, which uses one negative control for the exposure and one for the outcome. Machine learning models such as neural networks can estimate “bridge functions” that link the controls to the hidden confounders. Because these models can learn complex relationships from the data, they can produce less biased treatment effect estimates (Miao et al., 2024).
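To make the double negative control idea concrete, the sketch below uses a deliberately simple linear special case, often described as proximal two-stage least squares, rather than the neural network estimators mentioned above. Everything here is an assumption made for illustration: the simulated data, the coefficient values, and the helper names `ols` and `simulate`. The hidden confounder U is generated but never used in the estimation.

```python
import random


def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


def simulate(n, true_effect, seed):
    """Simulate (A, Z, W, Y); U confounds everything but is not returned."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        u = rng.gauss(0, 1)                      # hidden confounder
        z = u + rng.gauss(0, 1)                  # negative control exposure
        w = u + rng.gauss(0, 1)                  # negative control outcome
        a = u + rng.gauss(0, 1)                  # exposure of interest
        y = true_effect * a + 2.0 * u + rng.gauss(0, 1)
        data.append((a, z, w, y))
    return data


data = simulate(n=20000, true_effect=1.0, seed=1)
a, z, w, y = (list(col) for col in zip(*data))

# Naive regression of Y on A ignores U and is biased upwards here.
naive = ols([[1.0, ai] for ai in a], y)[1]

# Stage 1: predict the negative control outcome W from A and Z.
b0, b_a, b_z = ols([[1.0, ai, zi] for ai, zi in zip(a, z)], w)
w_hat = [b0 + b_a * ai + b_z * zi for ai, zi in zip(a, z)]
# Stage 2: regress Y on A and the predicted W, which stands in for U.
adjusted = ols([[1.0, ai, wi] for ai, wi in zip(a, w_hat)], y)[1]

print(f"True effect: 1.00, naive: {naive:.2f}, double negative control: {adjusted:.2f}")
```

In this simulation the naive estimate should land near 2 while the two-stage estimate should land near the true value of 1, because the pair of negative controls carries enough information about U to remove the confounding. Flexible models play the role of the two linear stages when the relationships are not linear.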

4. Improving data quality

AI can improve negative control studies by enhancing data quality. For example, natural language processing (NLP) can extract structured variables from unstructured text such as clinical records, revealing hidden confounders or helping to validate negative controls. AI-based imputation fills in missing data, helping to ensure that null associations are not simply an artefact of incomplete data. In summary, by improving datasets, AI techniques can indirectly strengthen the foundation on which negative control methods rely (Matthay et al., 2025).

Regulatory considerations

Under PDUFA VII (Fiscal Years 2023–2027), the FDA has committed to enhancing the use of negative controls as part of its broader initiative to modernize drug safety and improve the use of real-world evidence (RWE) (Center for Drug Evaluation and Research, 2023). Several specific actions relate to negative controls within the Sentinel Initiative, a national electronic system for monitoring the safety of FDA-regulated medical products. At the time of writing, no final publications reporting the results of this initiative are publicly available, but methods development projects are likely ongoing, with the final deliverables expected in 2027.

Wrap up

Negative controls are both fascinating and powerful tools for helping to detect bias in RWE studies. Although we are seeing more use of these, it still feels as if their potential to guide more confident use of RWE for regulatory and HTA decision making has not been fully realised. We look forward to seeing more use of negative controls “in the wild” and are excited about the prospect for methods innovation and AI to make these tools even more useful.

At LCP Health Analytics we specialise in generating robust RWE to inform critical health care decisions and are passionate about high quality science. Please drop us a note if you’d like to discuss how we can help you or to meet with our experts.

References

European Medicines Agency. (2024). A post-authorisation/post-marketing observational study to evaluate the association between exposure to AZD1222 and safety concerns using existing secondary health data sources (COVID-19) | HMA-EMA Catalogues of real-world data sources and studies. https://catalogues.ema.europa.eu/node/3319/administrative-details
Kummerfeld, E., Lim, J., & Shi, X. (n.d.). Data-driven Automated Negative Control Estimation (DANCE): Search for, Validation of, and Causal Inference with Negative Controls.
Matthay, E. C., Neill, D. B., Titus, A. R., Desai, S., Troxel, A. B., Cerdá, M., Díaz, I., Santacatterina, M., & Thorpe, L. E. (2025). Integrating Artificial Intelligence into Causal Research in Epidemiology. Current Epidemiology Reports, 12(1), 6. https://doi.org/10.1007/s40471-025-00359-5
Miao, W., Shi, X., Li, Y., & Tchetgen Tchetgen, E. J. (2024). A confounding bridge approach for double negative control inference on causal effects. Statistical Theory and Related Fields, 8(4), 262–273. https://doi.org/10.1080/24754269.2024.2390748
Schuemie, M. J., Hripcsak, G., Ryan, P. B., Madigan, D., & Suchard, M. A. (2016). Robust empirical calibration of p‐values using observational data. Statistics in Medicine, 35(22), 3883–3888. https://doi.org/10.1002/sim.6977
Yang, Q., Yang, Z., Cai, X., Zhao, H., Jia, J., & Sun, F. (2024). Advances in methodologies of negative controls: A scoping review. Journal of Clinical Epidemiology, 166, 111228. https://doi.org/10.1016/j.jclinepi.2023.111228
Zafari, Z., Park, J., Shah, C. H., dosReis, S., Gorman, E. F., Hua, W., Ma, Y., & Tian, F. (2023). The State of Use and Utility of Negative Controls in Pharmacoepidemiologic Studies. American Journal of Epidemiology, 193(3), 426–453. https://doi.org/10.1093/aje/kwad201