Let $W_i$ be a binary variable indicating whether person $i$ received the treatment, so that the observed outcome is
$$
Y_i^{obs} = Y_i^{1} W_i + (1 - W_i) Y_i^{0}
$$
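To make the formula concrete, here is a small sketch with made-up potential outcomes for five people. The arrays `y1`, `y0`, and `w` are hypothetical. In real data only one of the two potential outcomes is ever observed per person.

```python
import numpy as np

# Hypothetical potential outcomes (both are known only because the data are made up).
y1 = np.array([3.0, 2.0, 5.0, 1.0, 4.0])  # outcome if treated, Y_i^1
y0 = np.array([1.0, 2.0, 3.0, 0.0, 4.0])  # outcome if not treated, Y_i^0
w = np.array([1, 0, 1, 0, 0])             # treatment indicator W_i

# Observed outcome: Y_i^obs = Y_i^1 * W_i + (1 - W_i) * Y_i^0
y_obs = y1 * w + (1 - w) * y0

# Individual treatment effects, available here only because both outcomes are simulated.
tau_i = y1 - y0

print(y_obs)
print(tau_i)
```

The treatment indicator simply selects which potential outcome is revealed for each person.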
Unconfoundedness Assumption
If we assume that the treatment assignment $W_i$ is independent of $Y_i^{1}$ and $Y_i^{0}$ conditional on $X_i$, then we can estimate the CATE from observational data by computing its empirical counterpart.
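For reference, the standard identity behind this statement can be written as follows. This is a sketch in the notation used above, not a formula recovered from the original text. The second equality is where the unconfoundedness assumption is used:

$$
\tau(x) = E\left[Y_i^{1} - Y_i^{0} \mid X_i = x\right] = E\left[Y_i^{obs} \mid X_i = x, W_i = 1\right] - E\left[Y_i^{obs} \mid X_i = x, W_i = 0\right]
$$

Both conditional expectations on the right involve only observed quantities, so each can be estimated with a regression or classification model.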
Remark: A common choice for the weighting function $g(X)$ is an estimator of the propensity score, which is defined as the probability of treatment given the covariates $X$, i.e. $P(W_i = 1 \mid X_i)$.
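A minimal sketch of estimating the propensity score, assuming simulated data and a logistic regression from scikit-learn. The names `x`, `w`, `e_hat` and the choice of model are illustrative assumptions, not part of the original material.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Simulated covariate and treatment assignment (illustrative assumption):
# the true propensity is sigmoid(2x), so treatment is more likely for large x.
n = 5000
x = rng.normal(size=(n, 1))
true_e = 1.0 / (1.0 + np.exp(-2.0 * x[:, 0]))
w = rng.binomial(1, true_e)

# Fit a classifier for W given X; the predicted probability of class 1
# is an estimate of P(W = 1 | X).
propensity_model = LogisticRegression()
propensity_model.fit(x, w)
e_hat = propensity_model.predict_proba(x)[:, 1]

print(e_hat[:5])
```

Any probabilistic classifier could play the same role; logistic regression is used here only because it is simple.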
Intuition behind the X-Learner
We study a simulated example where we know the uplift is exactly $\tau = 1$.
X-Learner Step 1 (same as T-Learner):
Model fit for control (red) and treatment (blue) groups.
T-Learner Estimation:
The solid line represents the difference between the model fits for the treatment and control groups.
The estimate is poor because the treatment group is very small.
Imputed Treatment Effects:
$$
\begin{aligned}
\tilde{D}^T &= Y^T - \hat{\mu}_C(X^T) \\
\tilde{D}^C &= \hat{\mu}_T(X^C) - Y^C
\end{aligned}
$$
X-Learner Estimation:
The dashed line represents the X-Learner estimation.
It combines the fits from the imputed effects by using an estimator of the propensity score as the weight, i.e. $g(x) = \hat{e}(x)$. In this example $\hat{e}(x)$ will be small, as we have many more observations in the control group. Hence the estimated uplift will be close to $\hat{\tau}^T$, the model fitted on the imputed effects of the treatment group.
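The steps above can be sketched end to end with NumPy. This is an illustration under stated assumptions, not the code behind the original plots: the baseline response `sin(3x)`, the group sizes, the polynomial base models, and the constant propensity estimate (the treated share, reasonable here because treatment is simulated independently of `x`) are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (assumed setup): constant uplift tau = 1 and very few treated units.
tau = 1.0
n_c, n_t = 2000, 20
x_c = rng.uniform(-1, 1, n_c)
x_t = rng.uniform(-1, 1, n_t)


def baseline(x):
    """Response without treatment, chosen only for illustration."""
    return np.sin(3 * x)


y_c = baseline(x_c) + rng.normal(0, 0.1, n_c)
y_t = baseline(x_t) + tau + rng.normal(0, 0.1, n_t)

# Step 1 (same as T-Learner): one outcome model per group.
# A flexible fit for the large control group, a linear fit for the small treated group.
mu_c_hat = np.poly1d(np.polyfit(x_c, y_c, deg=5))
mu_t_hat = np.poly1d(np.polyfit(x_t, y_t, deg=1))

grid = np.linspace(-1, 1, 201)
tau_t_learner = mu_t_hat(grid) - mu_c_hat(grid)

# Step 2: imputed treatment effects.
d_t = y_t - mu_c_hat(x_t)  # D~^T = Y^T - mu_C(X^T)
d_c = mu_t_hat(x_c) - y_c  # D~^C = mu_T(X^C) - Y^C

# Step 3: model the imputed effects in each group.
tau_t_hat = np.poly1d(np.polyfit(x_t, d_t, deg=1))
tau_c_hat = np.poly1d(np.polyfit(x_c, d_c, deg=1))

# Step 4: combine with g(x) = e_hat(x); here a constant, the treated share.
e_hat = n_t / (n_t + n_c)
tau_x_learner = e_hat * tau_c_hat(grid) + (1 - e_hat) * tau_t_hat(grid)

# Mean absolute error against the known uplift.
t_mae = np.mean(np.abs(tau_t_learner - tau))
x_mae = np.mean(np.abs(tau_x_learner - tau))
print(f"T-Learner MAE: {t_mae:.3f}, X-Learner MAE: {x_mae:.3f}")
```

Because `e_hat` is close to zero, the combined estimate is dominated by the model fitted on the treated group's imputed effects, which borrows the well-estimated control fit.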
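```python
from causalml.inference.meta import BaseTClassifier
from sklearn.ensemble import HistGradientBoostingClassifier

# x: covariates, w: binary treatment indicator, y: outcome
# (all three are assumed to be defined earlier)

# define ml model
learner = HistGradientBoostingClassifier()
# set meta-model
t_learner = BaseTClassifier(learner=learner)
# estimate the average treatment effect (ATE);
# estimate_ate returns the point estimate first, then the lower and upper bounds
t_ate, t_ate_lwr, t_ate_upr = t_learner.estimate_ate(X=x, treatment=w, y=y)
# predict treatment effects
t_learner.predict(X=x)
# access ml models (dictionaries keyed by treatment group, here group 1)
t_learner.models_c[1]
t_learner.models_t[1]
```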
Uplift Model Evaluation?
Uplift Evaluation: Uplift by Percentile
Sort uplift predictions in decreasing order.
Compute percentiles.
Predict uplift for both treated and control observations and compute the group averages within each percentile.
Take the difference between the treated and control averages for each percentile.
A well-performing model would have large values in the first percentiles and decreasing values for the later ones.
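A minimal sketch of the procedure with NumPy. It assumes that the per-group averages are taken over observed outcomes, which is one common reading of the steps. The function name `uplift_per_bin` and the toy data are hypothetical.

```python
import numpy as np


def uplift_per_bin(y, w, uplift_pred, n_bins=10):
    """Difference in mean observed outcome (treated minus control) per bin."""
    y, w, uplift_pred = (np.asarray(a) for a in (y, w, uplift_pred))
    # 1. Sort observations by decreasing predicted uplift.
    order = np.argsort(-uplift_pred, kind="stable")
    result = []
    # 2. Split the sorted observations into (nearly) equal-sized bins.
    for idx in np.array_split(order, n_bins):
        y_bin, w_bin = y[idx], w[idx]
        treated, control = y_bin[w_bin == 1], y_bin[w_bin == 0]
        if len(treated) == 0 or len(control) == 0:
            result.append(np.nan)  # undefined without both groups in the bin
            continue
        # 3. + 4. Group averages and their difference.
        result.append(treated.mean() - control.mean())
    return np.array(result)


# Toy data: eight observations, two bins.
uplift_pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
w = [1, 0, 1, 0, 1, 0, 1, 0]
y = [1, 0, 1, 1, 0, 0, 1, 1]
per_bin = uplift_per_bin(y, w, uplift_pred, n_bins=2)
print(per_bin)
```

In the toy data the first bin shows a larger difference than the second, which is the pattern expected from a model that ranks well.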
Uplift Evaluation: Cumulative Gain Chart
Predict uplift for both treated and control observations and compute the average prediction per decile (bin) in both groups. Then take the difference between those averages for each decile.
$$
\left(\frac{Y^T}{N^T} - \frac{Y^C}{N^C}\right)\left(N^T + N^C\right)
$$
$Y^T$ / $Y^C$: sum of the treated / control individual outcomes in the bin.
$N^T$ / $N^C$: number of treated / control observations in the bin.
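The formula can be sketched in NumPy as follows. This assumes that each "bin" is the cumulative top fraction of observations ranked by predicted uplift, as is usual for a gain chart. The function name `cumulative_gain` and the toy data are hypothetical.

```python
import numpy as np


def cumulative_gain(y, w, uplift_pred, n_bins=10):
    """(Y_T / N_T - Y_C / N_C) * (N_T + N_C) for each cumulative top fraction."""
    y, w, uplift_pred = (np.asarray(a) for a in (y, w, uplift_pred))
    # Rank observations by decreasing predicted uplift.
    order = np.argsort(-uplift_pred, kind="stable")
    y, w = y[order], w[order]
    n = len(y)
    gains = []
    for k in range(1, n_bins + 1):
        end = (k * n) // n_bins          # size of the top k / n_bins fraction
        y_top, w_top = y[:end], w[:end]
        n_t = int((w_top == 1).sum())    # N_T
        n_c = int((w_top == 0).sum())    # N_C
        if n_t == 0 or n_c == 0:
            gains.append(np.nan)         # undefined without both groups
            continue
        y_t = y_top[w_top == 1].sum()    # Y_T
        y_c = y_top[w_top == 0].sum()    # Y_C
        gains.append((y_t / n_t - y_c / n_c) * (n_t + n_c))
    return np.array(gains)


# Toy data: eight observations, two cumulative bins (top 50% and top 100%).
uplift_pred = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
w = [1, 0, 1, 0, 1, 0, 1, 0]
y = [1, 0, 1, 1, 0, 0, 1, 1]
gains = cumulative_gain(y, w, uplift_pred, n_bins=2)
print(gains)
```

Plotting `gains` against the targeted fraction gives the gain chart; a model that ranks well reaches a high gain early.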