ForestDRLearner: binary outcome and discrete treatment (3 values) #908

Open
Xela06-mjt opened this issue Aug 7, 2024 · 2 comments

@Xela06-mjt

Xela06-mjt commented Aug 7, 2024

I'm building a model with ForestDRLearner. For each client, I would like to find the treatment that minimizes the outcome, and end up with a table like this:
client, best_treatment
1, 0
2, 1
3, 2
4, 0
etc.

How can I build this final dataset from the code below? What is the best approach? My current code is not quite what I need.

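import numpy as np
import pandas as pd
from econml.dr import ForestDRLearner
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# `sampling` is my input DataFrame: feature columns plus 'T' (treatment) and 'Y' (outcome)
X = sampling.drop(columns=['T', 'Y'])
Y = sampling['Y']
T = sampling['T']

X_train, X_test, T_train, T_test, Y_train, Y_test = train_test_split(
    X, T, Y, test_size=0.2, random_state=123
)

# Multiclass propensity model (3 treatments), binary classifier for the outcome
model = ForestDRLearner(
    model_propensity=XGBClassifier(learning_rate=0.1, max_depth=3, objective="multi:softprob"),
    model_regression=XGBClassifier(learning_rate=0.1, max_depth=3, objective="binary:logistic"),
    discrete_outcome=True,
    random_state=1,
)

model.fit(Y=Y_train, T=T_train, X=X_train, inference="auto")

cate_estimates = model.effect(X_test)
cate_estimates

# Pick, for each row, the position of the smallest estimated effect
best_treatment = np.argmin(cate_estimates, axis=1)

results = pd.DataFrame({'best_treatment': best_treatment})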

@kbattocchi
Collaborator

It's not clear from your description what's not working for you.

One thing to note is that all treatment effects are relative to the 'control' treatment, so you should add a column of zeros for the control to the effects before taking the argmin. Since you are minimizing the outcome, if every other treatment has a positive effect relative to the control, you should pick the control, whose effect relative to itself is 0. Without the zero column, the argmin can never select the control.
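A minimal sketch of that suggestion, using a made-up effect matrix rather than a fitted model. It assumes the effects are laid out with one row per unit and one column per non-control treatment, and that the control is treatment 0; the helper name `best_treatment_index` is hypothetical and not part of EconML.

```python
import numpy as np


def best_treatment_index(effects_vs_control):
    """Return, for each row, the index of the treatment with the lowest outcome.

    effects_vs_control: shape (n_units, n_treatments - 1), the effect of each
    non-control treatment relative to the control. Index 0 in the result
    means "control" (assumption: control is the first treatment).
    """
    effects = np.asarray(effects_vs_control, dtype=float)
    # The control's effect relative to itself is 0, so it gets a zero column.
    control = np.zeros((effects.shape[0], 1))
    with_control = np.hstack([control, effects])
    return np.argmin(with_control, axis=1)


# Made-up effects for three units (columns: treatment 1, treatment 2).
example = np.array([
    [0.10, 0.05],    # both treatments raise the outcome: keep control
    [-0.20, 0.03],   # treatment 1 lowers the outcome the most
    [-0.10, -0.30],  # treatment 2 lowers the outcome the most
])

print(best_treatment_index(example))  # → [0 1 2]
```

Placing the zero column first keeps the argmin index aligned with the treatment label when the treatments are coded 0, 1, 2.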

@Xela06-mjt
Author

Thank you for your answer. I discovered econml only recently and am not yet very experienced with it. My problem is that I am not sure what code I need to write, and I am open to other proposals. In the meantime, ForestDRLearner seemed like a good fit: my outcome is binary and my treatment is discrete (it takes 3 values). Using the CATE, I would like to find the best treatment for each client. I started with the code above, but maybe this is not the right way to do it. My question is: how do I determine the best treatment for each client?
client | best_treatment
client1 | 2
client2 | 1
client3 | 0
client4 | 2
etc..
I added this line:
cate_estimates_with_control = np.hstack([np.zeros((cate_estimates.shape[0], 1)), cate_estimates])
I don't know if this matches your suggestion.
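For reference, the client-to-treatment table asked for here could be assembled along the following lines. This is only a sketch under stated assumptions: the effect matrix is made up, the client ids and the helper `best_treatment_table` are hypothetical, and the treatments are assumed to be coded 0, 1, 2 with 0 as the control. With a fitted model, the matrix would presumably come from `model.const_marginal_effect(X_test)`, which as far as the EconML documentation indicates returns one column per non-control treatment, whereas `model.effect(X_test)` with default arguments compares only treatment 1 against treatment 0.

```python
import numpy as np
import pandas as pd


def best_treatment_table(client_ids, effects_vs_control, treatment_labels):
    """Build a client / best_treatment table for an outcome to be minimized.

    effects_vs_control: shape (n_clients, n_treatments - 1), effects of each
    non-control treatment relative to the control.
    treatment_labels: all treatment labels, control first (assumption).
    """
    effects = np.asarray(effects_vs_control, dtype=float)
    # Add the control as a zero column so it can be selected too.
    with_control = np.hstack([np.zeros((effects.shape[0], 1)), effects])
    labels = np.asarray(treatment_labels)
    best = labels[np.argmin(with_control, axis=1)]
    return pd.DataFrame({"client": list(client_ids), "best_treatment": best})


# Made-up effects (columns: treatment 1 vs control, treatment 2 vs control).
effects = np.array([
    [-0.10, -0.30],
    [-0.20, 0.03],
    [0.10, 0.05],
    [0.02, -0.15],
])
table = best_treatment_table(
    ["client1", "client2", "client3", "client4"],
    effects,
    treatment_labels=[0, 1, 2],
)
print(table)
```

In practice the client ids could be taken from `X_test.index`, so that each row of the table stays aligned with the row of features it was computed from.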
