====== OHDSI Best Practices for Estimating Population-Level Effects ======

:!: //This document is under development. Changes can be proposed and discussed via the [[http://forums.ohdsi.org/t/population-level-estimation-workgroup-discussing-best-practices|OHDSI Forum]] and in the [[projects:workgroups:est-methods|Population-Level Estimation Workgroup]] meetings.//

===== General principles =====

  * **Transparency**: others should be able to reproduce your study in every detail using the information you provide.
  * **Prespecify** what you're going to estimate and how: this avoids hidden multiple testing (fishing expeditions, p-value hacking). Run your analysis only once.
  * **Validation of your analysis**: you should have evidence that your analysis does what you say it does, for example by showing that the statistics produced have nominal operating characteristics (e.g. p-value calibration), by showing that specific important assumptions are met (e.g. covariate balance), or by using unit tests to validate pieces of code.

===== Best practices (generic) =====

  * **Write a full protocol**, and make it public prior to running the study. This should include:
    * Research question and hypotheses to be tested
    * Which method(s), data, and cohort definitions will be used
    * Which analyses are primary and which are sensitivity analyses
    * Quality control
    * Amendments and updates
  * **Validate** all code used to produce estimates. The purpose of validation is to ensure the code does what we require it to do. Possible options are:
    * Unit testing
    * Simulation
    * Double coding
    * Code review
  * Include **negative controls** (exposure-outcome pairs where we believe there is no effect)
  * Produce **calibrated p-values**
  * Make all analysis code available as **open source** so others can easily replicate your study

===== Best practices (new-user cohort design) =====

  * Use **propensity scores** (PS)
  * Build the PS model using **regularized regression** and a **large set of candidate covariates** (as implemented in the CohortMethod package)
  * Use either **variable-ratio matching** or **stratification** on the PS
  * **Compute covariate balance** after matching for all covariates, and terminate the study if any covariate has a standardized difference > 0.1

===== Best practices (self-controlled case series) =====

  * Include a **risk window just prior to start of exposure** to detect time-varying confounding (e.g. contra-indications, protopathic bias)

===== Best practices ((nested) case-control) =====

  * **Don't** do a case-control study
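
As an illustration of the covariate balance check recommended for the new-user cohort design, the following is a minimal Python sketch, not the CohortMethod implementation. It assumes the common definition of the standardized difference (difference in means divided by the pooled standard deviation of the two groups) and compares its absolute value against the 0.1 threshold. The function names, the input layout, and the example data are all hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, comparator):
    """Standardized difference of means for one covariate.

    Assumes the pooled standard deviation sqrt((s_t^2 + s_c^2) / 2).
    """
    diff = mean(treated) - mean(comparator)
    pooled_sd = math.sqrt((variance(treated) + variance(comparator)) / 2)
    if pooled_sd == 0:
        # No variation in either group: balanced only if the means coincide.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / pooled_sd


def unbalanced_covariates(covariates, threshold=0.1):
    """Return {name: std_diff} for covariates with |std_diff| > threshold.

    `covariates` maps a covariate name to a pair
    (treated_values, comparator_values) measured after matching.
    """
    flagged = {}
    for name, (treated, comparator) in covariates.items():
        d = standardized_difference(treated, comparator)
        if abs(d) > threshold:
            flagged[name] = d
    return flagged


# Hypothetical matched cohorts (made-up values, for illustration only).
matched = {
    "age": ([50, 60, 70, 80], [52, 58, 72, 78]),
    "diabetes": ([1, 1, 1, 0], [0, 0, 1, 0]),
}
flagged = unbalanced_covariates(matched)
if flagged:
    print("Covariates out of balance:", flagged)
```

In a real study the check would run over every covariate in the PS model, and a non-empty result would be the signal to stop rather than proceed to outcome estimation.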
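
To make the link between negative controls and calibrated p-values (see the generic best practices) more concrete, here is a rough Python sketch. It assumes that the effect estimates for the negative controls, on the log scale, follow a normal "empirical null" distribution with mean ''mu'' and standard deviation ''sigma'', and that the estimate of interest is then tested against that null instead of against zero. The null is fitted here with a simple moment estimator, which is a deliberate simplification and may differ from the fitting approach used in OHDSI tooling. All function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def fit_empirical_null(log_rrs, ses):
    """Estimate (mu, sigma) of the empirical null from negative controls.

    Simplified moment estimator (an assumption of this sketch): the observed
    variance of the estimates is taken to be systematic-error variance plus
    the average sampling variance.
    """
    mu = mean(log_rrs)
    sampling_var = mean(se ** 2 for se in ses)
    sigma_sq = max(0.0, variance(log_rrs) - sampling_var)
    return mu, math.sqrt(sigma_sq)


def calibrated_p(log_rr, se, mu, sigma):
    """Two-sided p-value of an estimate under the null N(mu, sigma^2 + se^2)."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    z = (log_rr - mu) / math.sqrt(sigma ** 2 + se ** 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical negative-control estimates (log relative risks and standard errors).
nc_log_rrs = [0.1, 0.3, 0.2, 0.4, 0.0]
nc_ses = [0.1, 0.1, 0.1, 0.1, 0.1]
mu, sigma = fit_empirical_null(nc_log_rrs, nc_ses)

# Hypothetical estimate for the exposure-outcome pair of interest.
p_traditional = calibrated_p(0.2, 0.1, mu=0.0, sigma=0.0)  # null centered at zero
p_calibrated = calibrated_p(0.2, 0.1, mu, sigma)
print(f"traditional p = {p_traditional:.3f}, calibrated p = {p_calibrated:.3f}")
```

With these made-up numbers the negative controls are themselves shifted away from zero, so an estimate that looks significant against the theoretical null is unremarkable against the empirical one.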