OHDSI Best Practices for Estimating Population-Level Effects

Note: This document is under development. Changes can be proposed and discussed via the OHDSI Forum and in the Population-Level Estimation Workgroup meetings.

General principles

  • Transparency: others should be able to reproduce your study in every detail using the information you provide.
  • Prespecify what you're going to estimate and how: this avoids hidden multiple testing (fishing expeditions, p-hacking). Run your analysis only once.
  • Validation of your analysis: you should have evidence that your analysis does what you say it does. Examples include showing that the statistics produced have nominal operating characteristics (e.g. p-value calibration), showing that specific important assumptions are met (e.g. covariate balance), and using unit tests to validate pieces of code.

Best practices (generic)

  • Write a full protocol, and make it public prior to running the study. This should include:
    • Research question and hypotheses to be tested
    • Which method(s), data, and cohort definitions will be used
    • Which analyses are primary and which are sensitivity analyses
    • Quality control
    • Amendments and Updates
  • Validate all code used to produce estimates. The purpose of validation is to ensure the code is doing what we require it to do. Possible options are:
    • Unit testing
    • Simulation
    • Double coding
    • Code review
  • Include negative controls (exposure-outcome pairs where we believe there is no effect)
  • Produce calibrated p-values
  • Make all analysis code available as open source so others can easily replicate your study
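
To illustrate how negative controls feed into calibrated p-values, here is a minimal sketch in Python. It is deliberately simplified and is not the OHDSI implementation: it assumes the empirical null is a normal distribution whose mean and standard deviation are taken directly from the negative control log effect estimates, ignoring the sampling error of each control (a real analysis would account for that when fitting the null). The function names and example numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fit_empirical_null(negative_control_log_rr):
    """Simplified empirical null: mean and SD of the negative control
    log effect estimates (assumption: their sampling error is ignored)."""
    return mean(negative_control_log_rr), stdev(negative_control_log_rr)


def traditional_p(log_rr, se_log_rr):
    """Two-sided p-value against the theoretical null (no effect, no bias)."""
    z = log_rr / se_log_rr
    return 2.0 * (1.0 - normal_cdf(abs(z)))


def calibrated_p(log_rr, se_log_rr, null_mean, null_sd):
    """Two-sided p-value against the empirical null, which adds the
    systematic error observed in the negative controls to the random error."""
    z = (log_rr - null_mean) / math.sqrt(null_sd ** 2 + se_log_rr ** 2)
    return 2.0 * (1.0 - normal_cdf(abs(z)))


# Hypothetical log relative risks estimated for five negative controls.
controls = [0.1, 0.3, 0.2, 0.4, 0.0]
null_mean, null_sd = fit_empirical_null(controls)

# Hypothetical estimate for the exposure-outcome pair of interest.
p_trad = traditional_p(0.4, 0.1)
p_cal = calibrated_p(0.4, 0.1, null_mean, null_sd)
```

Because the negative controls in this made-up example are shifted away from zero, the calibrated p-value is larger than the traditional one: part of the apparent effect is attributed to systematic error.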

Best practices (new-user cohort design)

  • Use propensity scores (PS)
  • Build the PS model using regularized regression and a large set of candidate covariates (as implemented in the CohortMethod package)
  • Use either variable-ratio matching or stratification on the PS
  • Compute covariate balance after matching for all covariates, and terminate the study if any covariate has a standardized difference > 0.2
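
The balance check in the last item can be sketched as follows. This is an illustrative Python example, not the CohortMethod implementation: it assumes the standardized difference is the difference in covariate means divided by the square root of the average of the two group variances, and it compares the absolute value against the 0.2 threshold. The function names and covariate data are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, comparator):
    """Standardized difference of means between two groups for one covariate."""
    diff = mean(treated) - mean(comparator)
    pooled_sd = math.sqrt((variance(treated) + variance(comparator)) / 2.0)
    if pooled_sd == 0.0:
        # No variation in either group: balanced only if the means coincide.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


def unbalanced_covariates(covariates, threshold=0.2):
    """Return the names of covariates whose absolute standardized
    difference exceeds the threshold."""
    return [
        name
        for name, (treated, comparator) in covariates.items()
        if abs(standardized_difference(treated, comparator)) > threshold
    ]


# Hypothetical binary covariates after matching: (treated, comparator) values.
covariates = {
    "diabetes": ([1, 0, 1, 1], [0, 0, 1, 0]),
    "prior_mi": ([1, 0, 1, 0], [0, 1, 0, 1]),
}

flagged = unbalanced_covariates(covariates)
if flagged:
    print("Imbalance detected, terminate the study:", flagged)
```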

Best practices (self-controlled case series)

  • Include a risk window just prior to start of exposure to detect time-varying confounding (e.g. contra-indications, protopathic bias)
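
As a rough illustration of the pre-exposure window, the sketch below partitions one person's observation time into baseline, pre-exposure, and risk windows. It is written in Python under stated assumptions: days are counted from the start of observation, there is a single exposure era with 0 <= exposure_start <= exposure_end and exposure_start <= obs_end, and the 30-day pre-exposure length is an arbitrary example value, not a recommendation. All names are hypothetical. An elevated event rate in the pre-exposure window relative to baseline would then hint at time-varying confounding.

```python
from dataclasses import dataclass


@dataclass
class Window:
    label: str
    start: int  # first day of the window (inclusive)
    end: int    # last day of the window (inclusive)


def split_observation(obs_end, exposure_start, exposure_end, pre_days=30):
    """Partition days 0..obs_end into baseline, pre-exposure, and risk windows."""
    pre_start = max(0, exposure_start - pre_days)
    windows = []
    if pre_start > 0:
        windows.append(Window("baseline", 0, pre_start - 1))
    if exposure_start > pre_start:
        windows.append(Window("pre-exposure", pre_start, exposure_start - 1))
    risk_end = min(exposure_end, obs_end)
    windows.append(Window("risk", exposure_start, risk_end))
    if risk_end < obs_end:
        windows.append(Window("baseline", risk_end + 1, obs_end))
    return windows


def window_of(day, windows):
    """Return the label of the window containing the given day, or None."""
    for w in windows:
        if w.start <= day <= w.end:
            return w.label
    return None


# Hypothetical person: observed for 365 days, exposed on days 100 through 129.
windows = split_observation(obs_end=364, exposure_start=100, exposure_end=129)
event_window = window_of(85, windows)  # an outcome on day 85
```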

Best practices ((nested) case-control)

  • Don't do a case-control study
development/best_practices_estimation.txt · Last modified: 2016/07/15 11:44 by schuemie