Sparse Gaussian Process Regression

This cookbook illustrates how to use sparse approximations to Gaussian processes. Sparse approximations reduce the computational cost of the full GP, which scales cubically in the number of training points \(n\), by introducing \(m \ll n\) additional latent variables, which may be chosen as a subset of the training points. This subset forms a pseudo data set with inputs \(\mathbf{X}'\) and targets \(\mathbf{f}'\).

Given the training data, the predictive distribution of \(y^*\) for a new input point \(\mathbf{x}^*\) is then:

\[p(y^*|\mathbf{x}^*, \mathbf{y}, \mathbf{X}, \mathbf{X}')=\int p(y^*|\mathbf{x}^*, \mathbf{X}',\mathbf{f}')\,p(\mathbf{f}'| \mathbf{y}, \mathbf{X}, \mathbf{X}')\,\mathrm{d}\mathbf{f}'\]

See [QuinoneroCR05] for a detailed overview of sparse approximate Gaussian process regression.

Example

Imagine we have files with training and test data. We create CDenseFeatures (here 64-bit floats, a.k.a. RealFeatures) and CRegressionLabels as:

Python:
features_train = RealFeatures(f_feats_train)
features_test = RealFeatures(f_feats_test)
labels_train = RegressionLabels(f_labels_train)
labels_test = RegressionLabels(f_labels_test)

Octave:
features_train = RealFeatures(f_feats_train);
features_test = RealFeatures(f_feats_test);
labels_train = RegressionLabels(f_labels_train);
labels_test = RegressionLabels(f_labels_test);

Java:
RealFeatures features_train = new RealFeatures(f_feats_train);
RealFeatures features_test = new RealFeatures(f_feats_test);
RegressionLabels labels_train = new RegressionLabels(f_labels_train);
RegressionLabels labels_test = new RegressionLabels(f_labels_test);

Ruby:
features_train = Modshogun::RealFeatures.new f_feats_train
features_test = Modshogun::RealFeatures.new f_feats_test
labels_train = Modshogun::RegressionLabels.new f_labels_train
labels_test = Modshogun::RegressionLabels.new f_labels_test

R:
features_train <- RealFeatures(f_feats_train)
features_test <- RealFeatures(f_feats_test)
labels_train <- RegressionLabels(f_labels_train)
labels_test <- RegressionLabels(f_labels_test)

Lua:
features_train = modshogun.RealFeatures(f_feats_train)
features_test = modshogun.RealFeatures(f_feats_test)
labels_train = modshogun.RegressionLabels(f_labels_train)
labels_test = modshogun.RegressionLabels(f_labels_test)

C#:
RealFeatures features_train = new RealFeatures(f_feats_train);
RealFeatures features_test = new RealFeatures(f_feats_test);
RegressionLabels labels_train = new RegressionLabels(f_labels_train);
RegressionLabels labels_test = new RegressionLabels(f_labels_test);

C++:
auto features_train = some<CDenseFeatures<float64_t>>(f_feats_train);
auto features_test = some<CDenseFeatures<float64_t>>(f_feats_test);
auto labels_train = some<CRegressionLabels>(f_labels_train);
auto labels_test = some<CRegressionLabels>(f_labels_test);
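
The file handles f_feats_train, f_feats_test, f_labels_train and f_labels_test above can be created from data files. A minimal Python sketch, assuming hypothetical CSV file names and Shogun's CSVFile reader:

import modshogun
from modshogun import CSVFile

# hypothetical file names; substitute the paths to your own data
f_feats_train = CSVFile("regression_features_train.csv")
f_feats_test = CSVFile("regression_features_test.csv")
f_labels_train = CSVFile("regression_labels_train.csv")
f_labels_test = CSVFile("regression_labels_test.csv")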

To fit the input (training) data \(\mathbf{X}\), we have to choose an appropriate CMeanFunction and CKernel and instantiate them. Here we use a basic CZeroMean and a CGaussianKernel with a chosen width parameter.

Python:
width = 1.0
kernel = GaussianKernel(features_train, features_train, width)
mean_function = ZeroMean()

Octave:
width = 1.0;
kernel = GaussianKernel(features_train, features_train, width);
mean_function = ZeroMean();

Java:
double width = 1.0;
GaussianKernel kernel = new GaussianKernel(features_train, features_train, width);
ZeroMean mean_function = new ZeroMean();

Ruby:
width = 1.0
kernel = Modshogun::GaussianKernel.new features_train, features_train, width
mean_function = Modshogun::ZeroMean.new

R:
width <- 1.0
kernel <- GaussianKernel(features_train, features_train, width)
mean_function <- ZeroMean()

Lua:
width = 1.0
kernel = modshogun.GaussianKernel(features_train, features_train, width)
mean_function = modshogun.ZeroMean()

C#:
double width = 1.0;
GaussianKernel kernel = new GaussianKernel(features_train, features_train, width);
ZeroMean mean_function = new ZeroMean();

C++:
auto width = 1.0;
auto kernel = some<CGaussianKernel>(features_train, features_train, width);
auto mean_function = some<CZeroMean>();
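
For reference, CGaussianKernel takes the width \(\tau\) directly as the denominator of the exponent, so the kernel constructed above is

\[k(\mathbf{x},\mathbf{x}')=\exp\left(-\frac{\lVert\mathbf{x}-\mathbf{x}'\rVert^2}{\tau}\right)\]

with \(\tau=1\); note that \(\tau\) plays the role of \(2\sigma^2\) in the common parameterization.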

We need to specify an inference method to find the posterior distribution of the function values \(\mathbf{f}\). Here we choose the fully independent training conditional (FITC) approximation, implemented by CFITCInferenceMethod. The inducing points are held in a separate feature instance, which for demonstration we restrict to a small subset via add_subset. The inference method is then created from the chosen kernel, the training features, the mean function, the labels, an instance of CGaussianLikelihood, and the inducing features.

Python:
import numpy as np

# indices of the inducing points: here simply the first three points
inducing_points = np.zeros(3, dtype='int32')
inducing_points[0] = 0
inducing_points[1] = 1
inducing_points[2] = 2
inducing_features = RealFeatures(f_feats_inducing)
inducing_features.add_subset(inducing_points)

gauss_likelihood = GaussianLikelihood()
inference_method = FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features)

Octave:
inducing_points = zeros(1, 3, 'int32');
inducing_points(1) = 0;
inducing_points(2) = 1;
inducing_points(3) = 2;
inducing_features = RealFeatures(f_feats_inducing);
inducing_features.add_subset(inducing_points);

gauss_likelihood = GaussianLikelihood();
inference_method = FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features);

Java:
DoubleMatrix inducing_points = new DoubleMatrix(1, 3);
inducing_points.put(0, 0);
inducing_points.put(1, 1);
inducing_points.put(2, 2);
RealFeatures inducing_features = new RealFeatures(f_feats_inducing);
inducing_features.add_subset(inducing_points);

GaussianLikelihood gauss_likelihood = new GaussianLikelihood();
FITCInferenceMethod inference_method = new FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features);

Ruby:
inducing_points = NArray.sint(3)
inducing_points[0] = 0
inducing_points[1] = 1
inducing_points[2] = 2
inducing_features = Modshogun::RealFeatures.new f_feats_inducing
inducing_features.add_subset inducing_points

gauss_likelihood = Modshogun::GaussianLikelihood.new
inference_method = Modshogun::FITCInferenceMethod.new kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features

R:
inducing_points <- IntVector(3)
inducing_points[1] = 0
inducing_points[2] = 1
inducing_points[3] = 2
inducing_features <- RealFeatures(f_feats_inducing)
inducing_features$add_subset(inducing_points)

gauss_likelihood <- GaussianLikelihood()
inference_method <- FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features)

Lua:
inducing_points = modshogun.IntVector(3)
inducing_points[1] = 0
inducing_points[2] = 1
inducing_points[3] = 2
inducing_features = modshogun.RealFeatures(f_feats_inducing)
inducing_features:add_subset(inducing_points)

gauss_likelihood = modshogun.GaussianLikelihood()
inference_method = modshogun.FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features)

C#:
var inducing_points = new int[3];
inducing_points[0] = 0;
inducing_points[1] = 1;
inducing_points[2] = 2;
RealFeatures inducing_features = new RealFeatures(f_feats_inducing);
inducing_features.add_subset(inducing_points);

GaussianLikelihood gauss_likelihood = new GaussianLikelihood();
FITCInferenceMethod inference_method = new FITCInferenceMethod(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features);

C++:
auto inducing_points = SGVector<int32_t>(3);
inducing_points[0] = 0;
inducing_points[1] = 1;
inducing_points[2] = 2;
auto inducing_features = some<CDenseFeatures<float64_t>>(f_feats_inducing);
inducing_features->add_subset(inducing_points);

auto gauss_likelihood = some<CGaussianLikelihood>();
auto inference_method = some<CFITCInferenceMethod>(kernel, features_train, mean_function, labels_train, gauss_likelihood, inducing_features);
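
For reference, and following [QuinoneroCR05], FITC keeps the exact prior variances on the diagonal but replaces the remaining covariances with a low-rank approximation through the inducing targets \(\mathbf{f}'\):

\[Q_{\mathbf{f}\mathbf{f}} = K_{\mathbf{f}\mathbf{f}'}K_{\mathbf{f}'\mathbf{f}'}^{-1}K_{\mathbf{f}'\mathbf{f}}, \qquad p_{\mathrm{FITC}}(\mathbf{f}) = \mathcal{N}\left(\mathbf{0},\; Q_{\mathbf{f}\mathbf{f}} + \mathrm{diag}\left[K_{\mathbf{f}\mathbf{f}} - Q_{\mathbf{f}\mathbf{f}}\right]\right),\]

which can be handled in \(\mathcal{O}(nm^2)\) rather than \(\mathcal{O}(n^3)\) time for \(m\) inducing points.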

Next we create a CGaussianProcessRegression instance, passing it the inference method.

Python:
gp_regression = GaussianProcessRegression(inference_method)

Octave:
gp_regression = GaussianProcessRegression(inference_method);

Java:
GaussianProcessRegression gp_regression = new GaussianProcessRegression(inference_method);

Ruby:
gp_regression = Modshogun::GaussianProcessRegression.new inference_method

R:
gp_regression <- GaussianProcessRegression(inference_method)

Lua:
gp_regression = modshogun.GaussianProcessRegression(inference_method)

C#:
GaussianProcessRegression gp_regression = new GaussianProcessRegression(inference_method);

C++:
auto gp_regression = some<CGaussianProcessRegression>(inference_method);

We can then train the model and apply it to the test features to obtain the predicted CRegressionLabels.

Python:
gp_regression.train()
labels_predict = gp_regression.apply_regression(features_test)

Octave:
gp_regression.train();
labels_predict = gp_regression.apply_regression(features_test);

Java:
gp_regression.train();
RegressionLabels labels_predict = gp_regression.apply_regression(features_test);

Ruby:
gp_regression.train
labels_predict = gp_regression.apply_regression features_test

R:
gp_regression$train()
labels_predict <- gp_regression$apply_regression(features_test)

Lua:
gp_regression:train()
labels_predict = gp_regression:apply_regression(features_test)

C#:
gp_regression.train();
RegressionLabels labels_predict = gp_regression.apply_regression(features_test);

C++:
gp_regression->train();
auto labels_predict = gp_regression->apply_regression(features_test);

We can compute the predictive variances as

Python:
variances = gp_regression.get_variance_vector(features_test)

Octave:
variances = gp_regression.get_variance_vector(features_test);

Java:
DoubleMatrix variances = gp_regression.get_variance_vector(features_test);

Ruby:
variances = gp_regression.get_variance_vector features_test

R:
variances <- gp_regression$get_variance_vector(features_test)

Lua:
variances = gp_regression:get_variance_vector(features_test)

C#:
double[] variances = gp_regression.get_variance_vector(features_test);

C++:
auto variances = gp_regression->get_variance_vector(features_test);
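
Since the predictive distribution is Gaussian, the mean and variance directly give pointwise credible intervals. A minimal Python sketch, assuming numpy and the labels_predict and variances computed above:

import numpy as np

# predictive means as a numpy array
means = labels_predict.get_labels()
# 95% pointwise credible interval under the Gaussian predictive distribution
lower = means - 1.96 * np.sqrt(variances)
upper = means + 1.96 * np.sqrt(variances)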

Finally, we compute the CMeanSquaredError of the predictions.

Python:
mserror = MeanSquaredError()
mse = mserror.evaluate(labels_predict, labels_test)

Octave:
mserror = MeanSquaredError();
mse = mserror.evaluate(labels_predict, labels_test);

Java:
MeanSquaredError mserror = new MeanSquaredError();
double mse = mserror.evaluate(labels_predict, labels_test);

Ruby:
mserror = Modshogun::MeanSquaredError.new
mse = mserror.evaluate labels_predict, labels_test

R:
mserror <- MeanSquaredError()
mse <- mserror$evaluate(labels_predict, labels_test)

Lua:
mserror = modshogun.MeanSquaredError()
mse = mserror:evaluate(labels_predict, labels_test)

C#:
MeanSquaredError mserror = new MeanSquaredError();
double mse = mserror.evaluate(labels_predict, labels_test);

C++:
auto mserror = some<CMeanSquaredError>();
auto mse = mserror->evaluate(labels_predict, labels_test);
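
For reference, CMeanSquaredError computes the standard mean squared error between the \(N\) predicted and ground-truth labels:

\[\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2\]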

References

Wikipedia: Gaussian process (https://en.wikipedia.org/wiki/Gaussian_process)

[QuinoneroCR05] J. Quiñonero-Candela and C. E. Rasmussen. A unifying view of sparse approximate Gaussian process regression. Journal of Machine Learning Research, 6:1939–1959, 2005.