Does MTGP assume target data with zero mean? #3

@Alaya-in-Matrix

I used mtgp to perform regression with a single target, but I get very biased predictions. The prediction Ypred appears to be equal to test_y - mean(test_y): the correlation between the predicted values and the true target values is very high (r = 0.99), but the error remains large.

After standardizing the target values, I get reasonable predictions.
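
One way to confirm that the error is a constant offset rather than noise is to look at the residuals directly. The snippet below is a minimal sketch with made-up numbers: `test_y` and `Ypred` here are hypothetical stand-ins that mimic the symptom described above, not output from mtgp. If the residual spread is near zero and the offset matches the target mean, the predictions differ from the truth only by a shift.

```matlab
% Hypothetical stand-in values; replace with test_y and Ypred from a real run.
test_y = [10.2; 11.5; 9.8; 12.1];
Ypred  = test_y - mean(test_y);   % mimics the reported symptom

resid  = test_y - Ypred;          % per-point prediction error
offset = mean(resid);             % average shift between truth and prediction
spread = std(resid);              % close to zero if the error is a pure constant

fprintf('offset = %.4f, spread = %.4g, mean(test_y) = %.4f\n', ...
        offset, spread, mean(test_y));
```

A pure shift leaves the correlation untouched, which would be consistent with seeing r = 0.99 alongside a large error.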

Below is the code I used for regression; it is modified from toy_example.m.

% toy_example.m
% A toy example demonstrating how to use the mtgp package  for M=3 tasks 
%
% function toy_example()
% 
% 1. Generates sample from true MTGP model 
% 2. Assigns cell data for learning and prediction
% 3. Learns hyperparameters with minimize function
% 4. Makes predictions at all points on all tasks
% 5. Plots Data and predictions
%
% Edwin V. Bonilla (edwin.bonilla@nicta.com.au)
clear all; clc;
addpath('./mtgp');
addpath('./mtgp/scripts');
addpath('./mtgp/utils');
addpath('./gpml-matlab/gpml');
rand('state',18);
randn('state',20);

% load data from text files
train_x = importdata('train_x');
train_y = importdata('train_y');
test_x  = importdata('test_x');
test_y  = importdata('test_y');

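% standardize the targets before training (set to 0 to reproduce the biased predictions)
standardize = 1;
% currently only perform single-target regression: keep the first target column
train_y     = train_y(:, 1);
test_y      = test_y(:, 1);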

covfunc_x       = {'covSEard'};
M               = size(train_y, 2);    % Number of tasks
D               = size(train_x, 2);    % Dimensionality of input space
irank           = M;    % rank for Kf (1, ... M). irank=M -> Full rank

num_train = size(train_x, 1);
num_test  = size(test_x, 1);

if standardize
    [train_y, mu, sig] = zscore(train_y);
end
x                  = test_x;

xtrain       = repmat(train_x, M, 1);
ytrain       = train_y(:);
% Row i of xtrain corresponds to training input 1..num_train, repeated once per task
ind_kx_train = repmat((1:num_train)', M, 1);
% Task index: the i-th block of num_train rows belongs to task i
ind_kf_train = kron((1:M)', ones(num_train, 1));

nx = num_train;

% % %% 1. Generating samples from true Model
% % [x, Y, xtrain, ytrain, ind_kf_train, ind_kx_train , nx] = generate_data(covfunc_x, D, M);

%% 2. Assigns cell data for learning and prediction
data  = {covfunc_x, xtrain, ytrain, M, irank, nx, ind_kf_train, ind_kx_train};

%% 3. Hyper-parameter learning
[logtheta_all deriv_range] = init_mtgp_default(xtrain, covfunc_x, M, irank);
[logtheta_all nl]          = learn_mtgp(logtheta_all, deriv_range, data);


%% 4. Making predictions at all points on all tasks
[Ypred, Vpred] = predict_mtgp_all_tasks(logtheta_all, data, x );
if standardize
    pred_y         = (Ypred .* repmat(sig, num_test, 1)) + repmat(mu, num_test, 1);
    pred_s         = sqrt(Vpred) .* repmat(sig, num_test, 1);
else
    pred_y         = Ypred;
    pred_s         = sqrt(Vpred);
end
r     = corr(test_y, pred_y);
mnlps = mean(log(normpdf(test_y, pred_y, pred_s)));
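
To test whether a zero-mean assumption alone explains the bias, the targets could be centered without being rescaled. The following is a minimal sketch under that assumption: the data are synthetic, and `Ypred_c` and `Vpred` are made-up placeholders for what `predict_mtgp_all_tasks` would return when trained on the centered targets. It uses only `mean` and `bsxfun`, so it does not need `zscore` from the Statistics Toolbox.

```matlab
% Synthetic stand-in for a single-task target vector with a large offset.
train_y = [101.3; 99.7; 100.4; 102.1; 98.9];

% Center only: subtract the per-task training mean, leave the scale untouched.
mu        = mean(train_y, 1);              % 1 x M row vector of task means
train_y_c = bsxfun(@minus, train_y, mu);   % centered targets used for training

% Placeholder model output on the centered scale (hypothetical values).
% In the script above these would come from predict_mtgp_all_tasks.
Ypred_c = [0.5; -0.2; 1.1];
Vpred   = [0.04; 0.09; 0.01];

% Undo the centering: add the training mean back to the predictions.
pred_y = bsxfun(@plus, Ypred_c, mu);
% Centering does not change the scale, so the predictive std is not rescaled.
pred_s = sqrt(Vpred);
```

If centering by itself removes the bias, that would suggest the model is treating the targets as zero-mean; if it does not, the scale of the targets may also matter.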
