
## Introduction

Study designs for estimating causal effects are numerous. Depending on the design, it is often necessary to address several sources of bias, such as baseline and time-varying confounding, informative censoring, and selection bias, among others. Designs like the treatment decision design [1], the new-user design [2], and the prevalent new-user design [3] each address these biases in different ways and require seemingly different analytic approaches to yield unbiased estimates from their resulting data.

Recently, the ‘clone-censor-weight’ approach [4–6] has become a popular way to estimate the effects of sustained or dynamic treatment regimens. However, this approach, and the way of thinking it entails (which involves conceptualizing a ‘target trial’ and adapting it to the observational setting [7]), is more general, and nearly all studies can be thought of in this way. Here, we show that a standard study of a point treatment can be thought of as a clone-censor-weight design, and we show how confounding and informative censoring can be addressed with a single nuisance model.

## The Setup

Consider a study of the effect of a binary baseline treatment, $$A$$, on a time-to-event outcome, $$T$$. Patients may be censored prior to experiencing the event, and the time of censoring is $$C$$. A patient’s observed follow-up time is $$\tilde{T}=\min(T,C)$$. In addition, a set of baseline covariates sufficient to control for confounding and informative censoring is collected, denoted $$W$$. Finally, we define $$\Delta=I(C>\tilde{T})$$, an indicator that a patient was not censored at their observed follow-up time (and therefore had the event). A subject’s observed data therefore consist of $$\{A, \tilde{T}, W, \Delta\}$$.

One estimator for the counterfactual cumulative incidence of the outcome under treatment level $$A=a$$ is [8]:

$\hat{Pr}(T(a)<t)=\frac{1}{n}\sum_{i=1}^n{\frac{\Delta_iI(\tilde{T}_i<t)I(A_i=a)}{\hat{Pr}(\Delta=1|W_i,A_i,T_i)\hat{Pr}(A=a|W_i)}},$

where $$T(a)$$ is the time of the event had, possibly counter to fact, the subject received treatment level $$A=a$$; $$n$$ is the total sample size; and each probability in the denominator is modeled appropriately, e.g., with a Cox proportional hazards model for the censoring process and logistic regression for the treatment.
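To make the estimator concrete, here is a minimal sketch of a direct implementation (this is not the causalRisk package's code). It assumes a data frame `dat` with columns `A`, `W`, `time` (the observed follow-up $$\tilde{T}$$), and `delta` (the event indicator $$\Delta$$); the function name `ipw_cuminc` and the use of `survival::coxph` for the censoring model are illustrative choices.

```r
library(survival)

# Sketch: inverse-probability-weighted estimate of Pr(T(a) < t).
# Assumes dat has columns A, W, time (observed follow-up), delta (event indicator).
ipw_cuminc <- function(dat, a, t) {
  # Treatment model: Pr(A = a | W) via logistic regression
  ps_fit <- glm(A ~ W, family = binomial, data = dat)
  pA1 <- predict(ps_fit, type = "response")
  pA  <- ifelse(dat$A == 1, pA1, 1 - pA1)

  # Censoring model: Cox model where "the event" is being censored,
  # used to estimate Pr(C > time_i | A_i, W_i) for each subject
  cens_fit <- coxph(Surv(time, 1 - delta) ~ A + W, data = dat)
  base <- basehaz(cens_fit, centered = FALSE)
  # cumulative baseline hazard evaluated at each subject's follow-up time
  H0 <- stepfun(base$time, c(0, base$hazard))(dat$time)
  pC <- exp(-H0 * exp(predict(cens_fit, type = "lp", reference = "zero")))

  # Weighted mean corresponding to the estimator displayed above
  mean(dat$delta * (dat$time < t) * (dat$A == a) / (pC * pA))
}
```

A contrast such as `ipw_cuminc(dat, 1, t) - ipw_cuminc(dat, 0, t)` then estimates the causal risk difference at horizon `t`.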

## Data Generation

Here, we generate a simple dataset for demonstration.

```r
library(tibble)

set.seed(1)  # for reproducibility

# inverse logit
expit <- function(p) {
  exp(p) / (1 + exp(p))
}

n <- 10000

dat <- tibble(
  id = 1:n,
  W  = runif(n),                            # baseline confounder
  A  = rbinom(n, 1, expit(W)),              # treatment depends on W
  T0 = rexp(n, rate = 0.5 + 2 * W),         # potential outcome under A = 0
  T1 = rexp(n, rate = 1 + 2 * W),           # potential outcome under A = 1
  T  = A * T1 + (1 - A) * T0,               # observed event time (consistency)
  C  = rexp(n, rate = 0.5 + 0.55 * A + 0.5 * W)  # censoring depends on A and W
)
```

Note that our true causal risk difference is 10.55%.
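Because the simulation generates both potential outcomes, the true risk difference at any horizon can be checked empirically. In this sketch, `t_star` is an illustrative horizon, not necessarily the one at which the 10.55% figure was computed:

```r
# Empirical check of the true risk difference using the simulated
# potential outcomes T1 and T0. t_star is an illustrative horizon.
t_star <- 2
rd_true <- mean(dat$T1 < t_star) - mean(dat$T0 < t_star)
rd_true
```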

## Typical Study Design and Analysis

Using the causalRisk package, we can easily implement the estimator described above to get the unadjusted and adjusted cumulative incidence curves: