This repository has been archived by the owner on Jun 22, 2024. It is now read-only.

Pipeline crashing due to wrong time index #39

Closed
olineumann opened this issue Mar 4, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@olineumann
Collaborator

Describe the bug
The pipeline crashes during training because the 'time' index is missing, caused by a bug in the _get_time_indeces function in utils. I think this happens because of the list(x.values())[0].indexes.items() conversion, which only inspects the first entry of the input dict.

Dirty fix:

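from typing import Dict, List, Union

import pandas as pd
import xarray as xr


def _get_time_indeces(x: Union[xr.DataArray, Dict[str, xr.DataArray]]) -> List[str]:
    indexes = []
    # A single DataArray: collect the names of all of its datetime indexes.
    if isinstance(x, xr.DataArray):
        for k, v in x.indexes.items():
            if isinstance(v, pd.DatetimeIndex):
                indexes.append(k)
        return indexes
    # A dict of DataArrays: only the first entry is inspected.
    for k, v in list(x.values())[0].indexes.items():
        if isinstance(v, pd.DatetimeIndex):
            # WARNING: Dirty fix! Report 'time' as 'index'.
            if k == 'time':
                indexes.append('index')
            else:
                indexes.append(k)
    return indexes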
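
To illustrate the suspected mechanism, here is a minimal sketch that is independent of the library. It assumes the inputs can carry the same timestamps under different dimension names ('time' vs. 'index'), which I have not verified against the actual pipeline data. The helper datetime_index_names and the sample arrays are hypothetical and are not part of utils.

```python
import numpy as np
import pandas as pd
import xarray as xr


def datetime_index_names(da: xr.DataArray) -> list:
    """Return the names of all indexes of `da` that are pandas DatetimeIndex objects."""
    return [name for name, idx in da.indexes.items()
            if isinstance(idx, pd.DatetimeIndex)]


timestamps = pd.date_range("2021-03-04", periods=4, freq="D")

# Hypothetical inputs: identical timestamps, but different dimension names.
by_time = xr.DataArray(np.arange(4.0), dims=["time"], coords={"time": timestamps})
by_index = xr.DataArray(np.arange(4.0), dims=["index"], coords={"index": timestamps})
inputs = {"f1": by_time, "target": by_index}

# Looking only at the first dict entry reports 'time',
# a name that the 'target' array does not have.
first_only = datetime_index_names(list(inputs.values())[0])

# Looking at every entry keeps the name that belongs to each array.
per_input = {key: datetime_index_names(da) for key, da in inputs.items()}

print(first_only)
print(per_input)
```

If this is what happens, a cleaner fix than the hard-coded rename would be to determine the time index per input instead of taking it from the first one.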

To Reproduce
I used the following pipeline to forecast a simple time series.

    pipeline = Pipeline(path=os.path.join('run', 'forecasting', hparams.name))

    #####
    # Filter and Scale Features
    ###
    scaler = SKLearnWrapper(module=StandardScaler(), name='forecast_y_hat')
    load_scale = scaler(x=pipeline['y'])

    #####
    # Create Features
    ###
    # Shift load to have a load feature to forecast load via a model.
    # NOTE: Shifting will add zeros to the end or the beginning of the data
    shift_1h = ClockShift(name='shift_1h', lag=1)(x=load_scale)
    shift_2h = ClockShift(name='shift_2h', lag=2)(x=load_scale)
    shift_3h = ClockShift(name='shift_3h', lag=3)(x=load_scale)
    shift_4h = ClockShift(name='shift_4h', lag=4)(x=load_scale)
    shift_5h = ClockShift(name='shift_5h', lag=5)(x=load_scale)
    shift_6h = ClockShift(name='shift_6h', lag=6)(x=load_scale)

    #####
    # Define Models
    ###
    model = torch.nn.Sequential(
        # Load, Lag_1h, ..., Lag_nh 
        torch.nn.Linear(6, 8),
        torch.nn.ReLU(),
        # Just load forecast output
        torch.nn.Linear(8, 1),
    )
    forecasting_module = PyTorchWrapper(
        model,
        fit_kwargs={"batch_size": 16, "epochs": 30},
        compile_kwargs={"loss": "mae", "optimizer": "AdamW", "metrics": ["mae"]}
    )
    forecast = forecasting_module(
        f1=shift_1h, f2=shift_2h, f3=shift_3h,
        f4=shift_4h, f5=shift_5h, f6=shift_6h,
        target_load=load_scale
    )

    #####
    # Inverse Scale Features
    ###
    # Rescale load values to calculate metrics on original data.
    # NOTE: As before... Can't pass multiple forecasts to one single scaler.
    # BUG/NOTE: Also, the naming of files and xarray variables is not transparent (imo). TODO: Has this been fixed in the meantime?
    y_hat = scaler(x=forecast, computation_mode=ComputationMode.Transform, use_inverse_transform=True)

    #####
    # RMSE Calculation
    ###
    _ = RmseCalculator(name='RMSE')(y=pipeline['y'], y_hat=y_hat)
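
For readers unfamiliar with the lag features above, the following sketch shows the kind of input the six shifted copies produce for torch.nn.Linear(6, 8). It only mirrors the NOTE in the pipeline code (shifting pads with zeros) and is not ClockShift itself: shift_with_zero_fill is a hypothetical helper, and the direction of a positive lag (values move to later positions) is an assumption.

```python
import numpy as np


def shift_with_zero_fill(values: np.ndarray, lag: int) -> np.ndarray:
    """Shift a 1-D array by `lag` positions and fill the vacated slots with zeros.

    Assumption: a positive lag moves values to later positions.
    """
    shifted = np.zeros_like(values)
    if lag > 0:
        shifted[lag:] = values[:-lag]
    elif lag < 0:
        shifted[:lag] = values[-lag:]
    else:
        shifted[:] = values
    return shifted


# Eight hypothetical (already scaled) load samples.
load = np.arange(1.0, 9.0)

# One column per lag from 1 to 6, i.e. six input features per time step.
features = np.stack([shift_with_zero_fill(load, lag) for lag in range(1, 7)], axis=1)

print(features.shape)
print(features[:, 0])
```

The first rows of each column contain padding zeros rather than real observations, so they may need to be dropped before training.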
@benHeid
Collaborator

benHeid commented Mar 17, 2021

Solved with #44

@benHeid benHeid closed this as completed Mar 17, 2021