diff --git a/doc/tuto_GP_regression.rst b/doc/tuto_GP_regression.rst
index e1c795ed..7b2af232 100644
--- a/doc/tuto_GP_regression.rst
+++ b/doc/tuto_GP_regression.rst
@@ -28,7 +28,7 @@ The first step is to define the covariance kernel we want to use for the model.
     noise = GPy.kern.white(D=1)
     kernel = Gaussian + noise
 
-The parameter D stands for the dimension of the input space. Note that many other kernels are implemented such as:
+The parameter ``D`` stands for the dimension of the input space. Note that many other kernels are implemented such as:
 
 * linear (``GPy.kern.linear``)
 * exponential kernel (``GPy.kern.exponential``)
@@ -41,11 +41,26 @@ The inputs required for building the model are the observations and the kernel::
 
     m = GPy.models.GP_regression(X,Y,kernel)
 
-The functions ``print`` and ``plot`` can help us understand the model we have just build::
+The functions ``print`` and ``plot`` give an insight of the model we have just build. The code::
 
     print m
     m.plot()
 
+gives the following output: ::
+
+    Marginal log-likelihood: -2.281e+01
+           Name        |  Value   |  Constraints  |  Ties  |  Prior
+    -----------------------------------------------------------------
+       rbf_variance    |  1.0000  |               |        |
+     rbf_lengthscale   |  1.0000  |               |        |
+      white_variance   |  1.0000  |               |        |
+
+.. figure:: Figures/tuto_GP_regression_m1.png
+    :align: center
+    :height: 350px
+
+    GP regression model before optimization of the parameters. The shaded region corresponds to 95% confidence intervals (ie +/- 2 standard deviation).
+
 The default values of the kernel parameters may not be relevant for the current data (for example, the confidence intervals seems too wide on the previous figure). A common approach is find the values of the parameters that maximize the likelihood of the data. There are two steps for doing that with GPy:
 
 * Constrain the parameters of the kernel to ensure the kernel will always be a valid covariance structure (For example, we don't want some variances to be negative!).
@@ -57,20 +72,34 @@ There are various ways to constrain the parameters of the kernel. The most basic
 
 but it is also possible to set a range on to constrain one parameter to be fixed. The parameter of ``m.constrain_positive`` is a regular expression that matches the name of the parameters to be constrained (as seen in ``print m``). For example, if we want the variance to be positive, the lengthscale to be in [1,10] and the noise variance to be fixed we can write::
 
-    #m.unconstrain('') # Required if the model has been previously constrained
+    m.unconstrain('') # Required to remove the previous constrains
     m.constrain_positive('rbf_variance')
     m.constrain_bounded('lengthscale',1.,10. )
     m.constrain_fixed('white',0.0025)
 
-Once the constrains have bee imposed, the model can be optimized::
+Once the constrains have been imposed, the model can be optimized::
 
     m.optimize()
 
 If we want to perform some restarts to try to improve the result of the optimization, we can use the optimize_restart function::
 
     m.optimize_restarts(Nrestarts = 10)
-    m.plot()
-    print(m)
+
+Once again, we can use ``print(m)`` and ``m.plot()`` to look at the resulting model resulting model::
+
+    Marginal log-likelihood: 2.001e+01
+           Name        |  Value   |  Constraints  |  Ties  |  Prior
+    -----------------------------------------------------------------
+       rbf_variance    |  0.8033  |     (+ve)     |        |
+     rbf_lengthscale   |  1.8033  |  (1.0, 10.0)  |        |
+      white_variance   |  0.0025  |     Fixed     |        |
+
+.. figure:: Figures/tuto_GP_regression_m2.png
+    :align: center
+    :height: 350px
+
+    GP regression model after optimization of the parameters.
+
 
 2 dimensional example
 =====================
@@ -102,4 +131,18 @@ Here is a 2 dimensional example::
     m.plot()
     print(m)
 
-The flag ``ARD=True`` in the definition of the Matern kernel specifies that we want one lengthscale parameter per dimension (ie the GP is not isotropic).
+The flag ``ARD=True`` in the definition of the Matern kernel specifies that we want one lengthscale parameter per dimension (ie the GP is not isotropic). The output of the last 2 lines is::
+
+    Marginal log-likelihood: 2.893e+01
+               Name            |  Value   |  Constraints  |  Ties  |  Prior
+    -------------------------------------------------------------------------
+        Mat52_ARD_variance     |  0.4094  |     (+ve)     |        |
+      Mat52_ARD_lengthscale_0  |  2.1060  |     (+ve)     |        |
+      Mat52_ARD_lengthscale_1  |  2.0546  |     (+ve)     |        |
+          white_variance       |  0.0012  |     (+ve)     |        |
+
+.. figure:: Figures/tuto_GP_regression_m3.png
+    :align: center
+    :height: 350px
+
+    Contour plot of the best predictor (posterior mean).