Fix S-Learner's leakage #79

Merged
merged 4 commits into from
Aug 12, 2024

Conversation

@kklein (Collaborator) commented Aug 10, 2024

This PR addresses @ArseniyZvyagintsevQC's finding that the current implementation of the S-Learner's estimation of the conditional average outcomes can leak outcome information in the in-sample scenario.

Concretely, having observed $X_i, Y_i, W_i=k$, we currently consider $i$ to be unseen when estimating $\mathbb{E}[Y_i|X_i,W_i=k']$ if $k' \neq k$. Yet the estimator has already seen $Y_i$ during training, so these estimates may suffer from leakage.
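To make the concern concrete, here is a minimal sketch, not the code in this PR. It assumes a plain least-squares base model and a deterministic 5-fold split; all function names (`fit_linear`, `in_sample_cao_full_fit`, `in_sample_cao_crossfit`) are hypothetical and not part of the library. It perturbs $Y_0$ and checks which estimates for unit 0 react: with a single model fit on all units, even the estimate for the unobserved variant $k' \neq k$ shifts, whereas with cross-fitting across all variants it does not.

```python
import numpy as np


def fit_linear(features, y):
    """Least-squares fit with an intercept; stands in for any base model."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_linear(coef, features):
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def s_learner_features(X, w):
    """S-Learner input: covariates with the treatment appended as a column."""
    return np.column_stack([X, w])


def in_sample_cao_full_fit(X, y, w, n_variants=2):
    """Leaky variant: one model trained on all units predicts for all units."""
    coef = fit_linear(s_learner_features(X, w), y)
    return np.column_stack([
        predict_linear(coef, s_learner_features(X, np.full(len(y), k)))
        for k in range(n_variants)
    ])


def in_sample_cao_crossfit(X, y, w, n_variants=2, n_folds=5):
    """Estimates for every unit and variant from a model that never saw the unit."""
    n = len(y)
    fold = np.arange(n) % n_folds  # deterministic fold assignment (assumption)
    estimates = np.empty((n, n_variants))
    for f in range(n_folds):
        train, test = fold != f, fold == f
        coef = fit_linear(s_learner_features(X[train], w[train]), y[train])
        for k in range(n_variants):
            w_k = np.full(test.sum(), k)
            estimates[test, k] = predict_linear(
                coef, s_learner_features(X[test], w_k)
            )
    return estimates


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
w = rng.integers(0, 2, size=n)
y = X[:, 0] + 2.0 * w + rng.normal(size=n)

# Perturb the outcome of unit 0 and record how its own estimates move.
y_perturbed = y.copy()
y_perturbed[0] += 100.0

leaky_shift = (
    in_sample_cao_full_fit(X, y_perturbed, w)[0]
    - in_sample_cao_full_fit(X, y, w)[0]
)
crossfit_shift = (
    in_sample_cao_crossfit(X, y_perturbed, w)[0]
    - in_sample_cao_crossfit(X, y, w)[0]
)
```

Here `leaky_shift[1 - w[0]]` is the change in the estimate for the variant unit 0 did not receive; it is nonzero under the full fit, which is the leakage described above. `crossfit_shift` is zero for both variants because unit 0 is excluded from the training folds of the model that predicts for it.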

Checklist

  • Added a CHANGELOG.rst entry

@@ -32,12 +32,12 @@ on ground truth CATEs:

| S-learner | causalml_in_sample | causalml_oos | econml_in_sample | econml_oos | metalearners_in_sample | metalearners_oos |
| :------------------------------------------------------------ | -----------------: | -----------: | ---------------: | ---------: | ---------------------: | ---------------: |
| synthetic_data_continuous_outcome_binary_treatment_linear_te | 14.5706 | 14.6248 | 14.5706 | 14.6248 | 14.5729 | 14.6248 |
| synthetic_data_binary_outcome_binary_treatment_linear_te | 0.229101 | 0.228616 | nan | nan | 0.229231 | 0.2286 |
| twins_pandas | 0.314253 | 0.318554 | nan | nan | 0.371613 | 0.319028 |
@kklein (Collaborator, Author) commented:

Note that the benchmarks already hinted at this problem beforehand! Before this change, we were doing quite a bit worse than causalml in the in-sample scenario.

@kklein kklein marked this pull request as ready for review August 10, 2024 16:59

codecov bot commented Aug 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.41%. Comparing base (d00947a) to head (7124492).

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #79      +/-   ##
==========================================
- Coverage   94.43%   94.41%   -0.02%     
==========================================
  Files          15       15              
  Lines        1779     1774       -5     
==========================================
- Hits         1680     1675       -5     
  Misses         99       99              


@kklein kklein merged commit 4409cc5 into main Aug 12, 2024
16 checks passed