Giving individual weights in the xgbsestackedweibull #65

Open
yanih opened this issue Feb 28, 2023 · 1 comment
Labels
bug Something isn't working


yanih commented Feb 28, 2023

Hi everyone,

Problem description

-I have case-cohort data, in which each case and non-case needs a corresponding weight so that the weighted sample matches the disease rate in the natural population.
-Normally, in an AFT model such as lifelines' WeibullAFTFitter, the 'weights_col' argument of fit() lets me supply weights.
-According to the XGBSEStackedWeibull documentation at https://loft-br.github.io/xgboost-survival-embeddings/modules/stacked_weibull.html, the weibull_params are the same as for lifelines' WeibullAFTFitter, but when I put 'weight_col' in weibull_params using the code below, I get an error. It seems that the weights column of WeibullAFTFitter cannot be used in XGBSE.
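
To make the weighting concrete, here is a minimal sketch of one common way to derive case-cohort weights: the inverse of each stratum's sampling fraction. It assumes that all cases are kept and that non-cases are a random subcohort. The cohort sizes and the helper name `inverse_sampling_weight` are hypothetical and are not taken from my data.

```python
# Hypothetical cohort sizes, for illustration only.
n_cases_full = 200        # cases in the full cohort
n_noncases_full = 9800    # non-cases in the full cohort
n_cases_sampled = 200     # assumption: every case is kept
n_noncases_sampled = 980  # assumption: a 10% random subcohort of non-cases


def inverse_sampling_weight(n_full, n_sampled):
    """Return 1 / sampling fraction for one stratum."""
    return n_full / n_sampled


w_case = inverse_sampling_weight(n_cases_full, n_cases_sampled)
w_noncase = inverse_sampling_weight(n_noncases_full, n_noncases_sampled)

# Toy event indicator: 1 = case, 0 = non-case.
events = [1, 0, 0, 1, 0]
weights = [w_case if e == 1 else w_noncase for e in events]

# The weighted case fraction in the sample recovers the full-cohort rate.
weighted_cases = n_cases_sampled * w_case
weighted_noncases = n_noncases_sampled * w_noncase
weighted_rate = weighted_cases / (weighted_cases + weighted_noncases)
```

With these made-up numbers, each sampled non-case stands in for ten cohort members, and the weighted case fraction equals the full-cohort fraction of 200 / 10000.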

Code sample

from xgbse import XGBSEStackedWeibull

# parameters
xgb_params = {
    "objective": "survival:aft",
    "eval_metric": "aft-nloglik",
    "aft_loss_distribution": "normal",
    "aft_loss_distribution_scale": 0.795,
    "tree_method": "hist",
    "learning_rate": 5e-2,
    "max_depth": 8,
    "booster": "dart",
    "subsample": 0.5,
    "min_child_weight": 50,
    "colsample_bynode": 0.5
}

weibull_params={ 'weight_col':'weight'}

# fitting XGBSE model
xgbse_model = XGBSEStackedWeibull(xgb_params=xgb_params, weibull_params=weibull_params)

Error

self.weibull_aft = WeibullAFTFitter(**self.weibull_params)

TypeError: __init__() got an unexpected keyword argument 'weight_col'
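
The traceback suggests that weibull_params is unpacked into the constructor, while in lifelines the weights column is named in the call to fit(). The sketch below reproduces that pattern with a stand-in class. `StandInFitter` is hypothetical and only mimics the constructor/fit split; it is not the lifelines implementation.

```python
class StandInFitter:
    """Hypothetical stand-in: options in __init__, column names in fit()."""

    def __init__(self, alpha=0.05, penalizer=0.0):
        self.alpha = alpha
        self.penalizer = penalizer

    def fit(self, rows, duration_col, event_col, weights_col=None):
        # Read one weight per row, defaulting to 1.0 when no column is named.
        self.weights_ = [
            row[weights_col] if weights_col is not None else 1.0 for row in rows
        ]
        return self


# Unpacking an unknown key into the constructor raises TypeError.
params = {"weight_col": "weight"}
try:
    StandInFitter(**params)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Naming the weights column at fit time works.
rows = [
    {"duration": 5.0, "event": 1, "weight": 1.0},
    {"duration": 8.0, "event": 0, "weight": 10.0},
]
fitter = StandInFitter().fit(rows, "duration", "event", weights_col="weight")
```

If this reading is right, no key in weibull_params can reach fit(), so the weights have to be passed some other way.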

Expected behavior

An XGBoost-AFT model fitted with individual weights.

Possible solutions

  1. Should I attach the weights to the data matrix first?
  2. Or can xgboost's 'scale_pos_weight' be used?
@yanih yanih added the bug Something isn't working label Feb 28, 2023

yanih commented Feb 28, 2023

Solutions

I checked the '_stacked_weibull.py' file and found that the call to WeibullAFTFitter.fit indeed passes no weights column:

self.weibull_aft.fit(df=weibull_train_df, duration_col="duration", event_col="event", ancillary=True)

So I added a weight term to this file, mainly changing the lines below (the default is weight=1):

    def __init__(
        self,
        xgb_params=None,
        weibull_params=None,
        weight=1,
    ):
        if xgb_params is None:
            xgb_params = DEFAULT_PARAMS
        if weibull_params is None:
            weibull_params = DEFAULT_PARAMS_WEIBULL

        self.xgb_params = xgb_params
        self.weibull_params = weibull_params
        self.persist_train = False
        self.feature_importances_ = None
        # new: per-row weights (or the scalar 1 for an unweighted fit)
        self.weight = weight
.
.
.
        # creating df to use lifelines API
        weibull_train_df = pd.DataFrame(
            {"risk": train_risk, "duration": T_train, "event": E_train, "weight": self.weight}
        )

        # fitting weibull aft
        self.weibull_aft = WeibullAFTFitter(**self.weibull_params)
        self.weibull_aft.fit(
            df=weibull_train_df,
            duration_col="duration",
            event_col="event",
            ancillary=True,
            weights_col="weight",
        )

Ask for help

I am new to Python, and I would appreciate it if anyone could check whether these changes are correct.
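
As a sanity check on what the weights should do, here is a minimal sketch of a weighted Weibull log-likelihood with right censoring. It assumes the usual convention that each row's contribution is multiplied by its weight; I have not confirmed that lifelines computes exactly this internally. The function name `weibull_loglik` and the parameter values are hypothetical.

```python
import math


def weibull_loglik(durations, events, weights, lam, rho):
    """Weighted Weibull log-likelihood (scale lam, shape rho) with right censoring."""
    total = 0.0
    for t, e, w in zip(durations, events, weights):
        # log hazard: log(rho / lam) + (rho - 1) * log(t / lam)
        log_hazard = math.log(rho / lam) + (rho - 1.0) * math.log(t / lam)
        # cumulative hazard: (t / lam) ** rho
        cum_hazard = (t / lam) ** rho
        # events contribute log h(t) - H(t); censored rows contribute -H(t)
        total += w * (e * log_hazard - cum_hazard)
    return total


# A weight of 3 on the censored row ...
weighted = weibull_loglik([5.0, 8.0], [1, 0], [1.0, 3.0], lam=10.0, rho=1.5)

# ... matches repeating that row three times with unit weights.
duplicated = weibull_loglik(
    [5.0, 8.0, 8.0, 8.0], [1, 0, 0, 0], [1.0, 1.0, 1.0, 1.0], lam=10.0, rho=1.5
)
```

Under this convention the weights must contain one value per training row, in the same order as T_train and E_train. With my change, the default weight=1 would then give an unweighted fit, and a per-row array would have to be passed in to get the weighted one.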
