Pipeline crashing due to wrong time index #39

olineumann · 2021-03-04T14:31:28Z

Describe the bug
The pipeline crashes during training because the 'time' index is missing, which is caused by a bug in the _get_time_indeces method in utils. I think this happens because of the list(x.values())[0].indexes.items() conversion, which reads the indexes from the first input only.

Dirty fix:

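from typing import Dict, List, Union

import pandas as pd
import xarray as xr


def _get_time_indeces(x: Union[xr.DataArray, Dict[str, xr.DataArray]]) -> List[str]:
    # Collect the names of all indexes that are backed by a pandas DatetimeIndex.
    indexes = []
    if isinstance(x, xr.DataArray):
        for k, v in x.indexes.items():
            if isinstance(v, pd.DatetimeIndex):
                indexes.append(k)
        return indexes
    # For a dict of inputs, only the first DataArray is inspected.
    for k, v in list(x.values())[0].indexes.items():
        if isinstance(v, pd.DatetimeIndex):
            # WARNING: Dirty fix! Report 'time' under the name 'index'.
            if k == 'time':
                indexes.append('index')
            else:
                indexes.append(k)
    return indexes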
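
To illustrate the kind of mismatch I suspect, here is a minimal standalone sketch using only xarray and pandas. It is not pyWATTS code: the helper datetime_index_names and the two toy arrays are made up for illustration, and it assumes that one input names its time dimension 'time' while another names it 'index'. Whether that is the actual cause inside pyWATTS is not confirmed here.

```python
import pandas as pd
import xarray as xr


def datetime_index_names(da: xr.DataArray) -> list:
    """Hypothetical helper: names of all indexes of `da` that are DatetimeIndex objects."""
    return [name for name, idx in da.indexes.items()
            if isinstance(idx, pd.DatetimeIndex)]


times = pd.date_range("2021-01-01", periods=4, freq="D")

# Two toy inputs that share the same timestamps but name the dimension differently.
a = xr.DataArray([1.0, 2.0, 3.0, 4.0], dims=["time"], coords={"time": times})
b = xr.DataArray([1.0, 2.0, 3.0, 4.0], dims=["index"], coords={"index": times})

names_a = datetime_index_names(a)
names_b = datetime_index_names(b)

# Taking the index names from the first input only (as the list(...)[0] lookup does)
# and applying them to another input can reference an index that does not exist there.
inputs = {"f1": a, "target": b}
first = next(iter(inputs.values()))
missing = [n for n in datetime_index_names(first)
           if n not in inputs["target"].indexes]

print(names_a, names_b, missing)
```

Here the name taken from the first input ('time') does not exist on the second input, which is the situation the dirty fix papers over by hard-coding 'index'.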

To Reproduce
I used the following pipeline to forecast a simple time series.

    pipeline = Pipeline(path=os.path.join('run', 'forecasting', hparams.name))

    #####
    # Filter and Scale Features
    ###
    scaler = SKLearnWrapper(module=StandardScaler(), name='forecast_y_hat')
    load_scale = scaler(x=pipeline['y'])

    #####
    # Create Features
    ###
    # Shift load to have a load feature to forecast load via a model.
    # NOTE: Shifting will add zeros to the end or the beginning of the data
    shift_1h = ClockShift(name='shift_1h', lag=1)(x=load_scale)
    shift_2h = ClockShift(name='shift_2h', lag=2)(x=load_scale)
    shift_3h = ClockShift(name='shift_3h', lag=3)(x=load_scale)
    shift_4h = ClockShift(name='shift_4h', lag=4)(x=load_scale)
    shift_5h = ClockShift(name='shift_5h', lag=5)(x=load_scale)
    shift_6h = ClockShift(name='shift_6h', lag=6)(x=load_scale)

    #####
    # Define Models
    ###
    model = torch.nn.Sequential(
        # Load, Lag_1h, ..., Lag_nh 
        torch.nn.Linear(6, 8),
        torch.nn.ReLU(),
        # Just load forecast output
        torch.nn.Linear(8, 1),
    )
    forecasting_module = PyTorchWrapper(
        model,
        fit_kwargs={"batch_size": 16, "epochs": 30},
        compile_kwargs={"loss": "mae", "optimizer": "AdamW", "metrics": ["mae"]}
    )
    forecast = forecasting_module(
        f1=shift_1h, f2=shift_2h, f3=shift_3h,
        f4=shift_4h, f5=shift_5h, f6=shift_6h,
        target_load=load_scale
    )

    #####
    # Inverse Scale Features
    ###
    # Rescale load values to calculate metrics on original data.
    # NOTE: As before... Can't pass multiple forecasts to one single scaler.
    # BUG/NOTE: The naming of files and xarray variables is also opaque (imo). TODO: Has this been fixed in the meantime?
    y_hat = scaler(x=forecast, computation_mode=ComputationMode.Transform, use_inverse_transform=True)

    #####
    # RMSE Calculation
    ###
    _ = RmseCalculator(name='RMSE')(y=pipeline['y'], y_hat=y_hat)
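
For readers unfamiliar with the lag features above, the following is a rough sketch of what shifting a series with zero fill looks like in plain xarray. It does not use ClockShift; the sign convention and fill behavior of ClockShift are assumptions here, and the toy values are invented for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2021-01-01", periods=4, freq="D")
load = xr.DataArray(np.array([10.0, 20.0, 30.0, 40.0]),
                    dims=["time"], coords={"time": times})

# Shift the values forward along 'time'; positions without data are filled with 0.0
# (xarray would fill with NaN if fill_value were not given).
lag_1 = load.shift(time=1, fill_value=0.0)
lag_2 = load.shift(time=2, fill_value=0.0)

# Stack the lagged series column-wise into a (samples, features) matrix,
# the layout a linear layer with two input features would expect.
features = np.stack([lag_1.values, lag_2.values], axis=1)
print(features)
```

The zero-filled rows at the start are artificial, so the first few samples carry no real lag information.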


benHeid · 2021-03-17T07:19:44Z

Solved with #44

olineumann added the "bug" label (Something isn't working) Mar 4, 2021

olineumann mentioned this issue Mar 16, 2021

Fixed pipeline crashing due to wrong time index in RMSE error. #44

Merged

benHeid closed this as completed Mar 17, 2021
