diff --git a/doc/tuto_GP_regression.rst b/doc/tuto_GP_regression.rst
index 24e10528..87744c85 100644
--- a/doc/tuto_GP_regression.rst
+++ b/doc/tuto_GP_regression.rst
@@ -23,9 +23,9 @@ Note that the observations Y include some noise.
 
 The first step is to define the covariance kernel we want to use for the model. We choose here a kernel based on Gaussian kernel (i.e. rbf or square exponential)::
 
-    kernel = GPy.kern.rbf(D=1, variance=1., lengthscale=1.)
+    kernel = GPy.kern.rbf(input_dim=1, variance=1., lengthscale=1.)
 
-The parameter ``D`` stands for the dimension of the input space. The parameters ``variance`` and ``lengthscale`` are optional. Note that many other kernels are implemented such as:
+The parameter ``input_dim`` stands for the dimension of the input space. The parameters ``variance`` and ``lengthscale`` are optional. Many other kernels are implemented, such as:
 
 * linear (``GPy.kern.linear``)
 * exponential kernel (``GPy.kern.exponential``)
@@ -50,7 +50,7 @@ gives the following output: ::
     -----------------------------------------------------------------
        rbf_variance    |  1.0000  |               |        |         
       rbf_lengthscale  |  1.0000  |               |        |         
-      noise variance   |  1.0000  |               |        |         
+      noise_variance   |  1.0000  |               |        |         
 
 .. figure::  Figures/tuto_GP_regression_m1.png
     :align:   center
@@ -65,14 +65,14 @@ The default values of the kernel parameters may not be relevant for the current
 
 There are various ways to constrain the parameters of the kernel. The most basic is to constrain all the parameters to be positive::
 
-    m.constrain_positive('')
+    m.ensure_default_constraints() # or similarly m.constrain_positive('')
 
 but it is also possible to set a range on to constrain one parameter to be fixed. The parameter of ``m.constrain_positive`` is a regular expression that matches the name of the parameters to be constrained (as seen in ``print m``). For example, if we want the variance to be positive, the lengthscale to be in [1,10] and the noise variance to be fixed we can write::
 
     m.unconstrain('')                            # Required to remove the previous constrains
-    m.constrain_positive('rbf_variance')
-    m.constrain_bounded('lengthscale',1.,10. )
-    m.constrain_fixed('noise',0.0025)
+    m.constrain_positive('.*rbf_variance')
+    m.constrain_bounded('.*lengthscale',1.,10. )
+    m.constrain_fixed('.*noise',0.0025)
 
 Once the constrains have been imposed, the model can be optimized::
 
@@ -80,7 +80,7 @@ Once the constrains have been imposed, the model can be optimized::
 
 If we want to perform some restarts to try to improve the result of the optimization, we can use the ``optimize_restart`` function::
 
-    m.optimize_restarts(Nrestarts = 10)
+    m.optimize_restarts(num_restarts = 10)
 
 Once again, we can use ``print(m)`` and ``m.plot()`` to look at the resulting model  resulting model::
 
@@ -89,7 +89,7 @@ Once again, we can use ``print(m)`` and ``m.plot()`` to look at the resulting mo
     -----------------------------------------------------------------
        rbf_variance    |  0.8151  |     (+ve)     |        |         
       rbf_lengthscale  |  1.8037  |  (1.0, 10.0)  |        |         
-      noise variance   |  0.0025  |     Fixed     |        |         
+      noise_variance   |  0.0025  |     Fixed     |        |         
 
 .. figure::  Figures/tuto_GP_regression_m2.png
     :align:   center
@@ -122,9 +122,7 @@ Here is a 2 dimensional example::
     m.constrain_positive('')
 
     # optimize and plot
-    pb.figure()
     m.optimize('tnc', max_f_eval = 1000)
-
     m.plot()
     print(m)
 
diff --git a/doc/tuto_creating_new_kernels.rst b/doc/tuto_creating_new_kernels.rst
index 24003ba2..6d30fe05 100644
--- a/doc/tuto_creating_new_kernels.rst
+++ b/doc/tuto_creating_new_kernels.rst
@@ -29,18 +29,18 @@ The header is similar to all kernels: ::
 
     class rational_quadratic(kernpart):
 
-**__init__(self,D, param1, param2, ...)**
+**__init__(self,input_dim, param1, param2, ...)**
 
 The implementation of this function in mandatory.
 
-For all kernparts the first parameter ``D`` corresponds to the dimension of the input space, and the following parameters stand for the parameterization of the kernel.
+For all kernparts the first parameter ``input_dim`` corresponds to the dimension of the input space, and the following parameters stand for the parameterization of the kernel.
 
-The following attributes are compulsory: ``self.D`` (the dimension, integer), ``self.name`` (name of the kernel, string), ``self.Nparam`` (number of parameters, integer). ::
+The following attributes are compulsory: ``self.input_dim`` (the dimension, integer), ``self.name`` (name of the kernel, string), ``self.num_params`` (number of parameters, integer). ::
 
-    def __init__(self,D,variance=1.,lengthscale=1.,power=1.):
-        assert D == 1, "For this kernel we assume D=1"
-        self.D = D
-        self.Nparam = 3
+    def __init__(self,input_dim,variance=1.,lengthscale=1.,power=1.):
+        assert input_dim == 1, "For this kernel we assume input_dim=1"
+        self.input_dim = input_dim
+        self.num_params = 3
         self.name = 'rat_quad'
         self.variance = variance
         self.lengthscale = lengthscale
@@ -50,7 +50,7 @@ The following attributes are compulsory: ``self.D`` (the dimension, integer), ``
 
 The implementation of this function in mandatory.
 
-This function returns a one dimensional array of length ``self.Nparam`` containing the value of the parameters. ::
+This function returns a one-dimensional array of length ``self.num_params`` containing the values of the parameters. ::
 
     def _get_params(self):
         return np.hstack((self.variance,self.lengthscale,self.power))
@@ -59,7 +59,7 @@ This function returns a one dimensional array of length ``self.Nparam`` containi
 
 The implementation of this function in mandatory.
 
-The input is a one dimensional array of length ``self.Nparam`` containing the value of the parameters. The function has no output but it updates the values of the attribute associated to the parameters (such as ``self.variance``, ``self.lengthscale``, ...). ::
+The input is a one-dimensional array of length ``self.num_params`` containing the values of the parameters. The function has no output, but it updates the values of the attributes associated with the parameters (such as ``self.variance``, ``self.lengthscale``, ...). ::
 
     def _set_params(self,x):
         self.variance = x[0]
@@ -70,7 +70,7 @@ The input is a one dimensional array of length ``self.Nparam`` containing the va
 
 The implementation of this function in mandatory.
 
-It returns a list of strings of length ``self.Nparam`` corresponding to the parameter names. ::
+It returns a list of strings of length ``self.num_params`` corresponding to the parameter names. ::
 
     def _get_param_names(self):
         return ['variance','lengthscale','power']
@@ -79,7 +79,7 @@ It returns a list of strings of length ``self.Nparam`` corresponding to the para
 
 The implementation of this function in mandatory.
 
-This function is used to compute the covariance matrix associated with the inputs X, X2 (np.arrays with arbitrary number of line (say :math:`n_1`, :math:`n_2`) and ``self.D`` columns). This function does not returns anything but it adds the :math:`n_1 \times n_2` covariance matrix to the kernpart to the object ``target`` (a :math:`n_1 \times n_2` np.array). This trick allows to compute the covariance matrix of a kernel containing many kernparts with a limited memory use. ::
+This function is used to compute the covariance matrix associated with the inputs X, X2 (np.arrays with an arbitrary number of rows, say :math:`n_1` and :math:`n_2`, and ``self.input_dim`` columns). This function does not return anything; instead it adds the :math:`n_1 \times n_2` covariance matrix of the kernpart to the object ``target`` (an :math:`n_1 \times n_2` np.array). This trick makes it possible to compute the covariance matrix of a kernel containing many kernparts with limited memory use. ::
 
     def K(self,X,X2,target):
         if X2 is None: X2 = X
@@ -100,7 +100,7 @@ This function is similar to ``K`` but it computes only the values of the kernel
 
 This function is required for the optimization of the parameters.
 
-Computes the derivative of the likelihood. As previously, the values are added to the object target which is a 1-dimensional np.array of length ``self.Nparam``. For example, if the kernel is parameterized by :math:`\sigma^2,\ \theta`, then :math:`\frac{dL}{d\sigma^2} = \frac{dL}{d K} \frac{dK}{d\sigma^2}` is added to the first element of target and :math:`\frac{dL}{d\theta} = \frac{dL}{d K} \frac{dK}{d\theta}` to the second. ::
+Computes the derivative of the likelihood with respect to the kernel parameters. As previously, the values are added to the object target, which is a 1-dimensional np.array of length ``self.num_params``. For example, if the kernel is parameterized by :math:`\sigma^2,\ \theta`, then :math:`\frac{dL}{d\sigma^2} = \frac{dL}{d K} \frac{dK}{d\sigma^2}` is added to the first element of target and :math:`\frac{dL}{d\theta} = \frac{dL}{d K} \frac{dK}{d\theta}` to the second. ::
 
     def dK_dtheta(self,dL_dK,X,X2,target):
         if X2 is None: X2 = X
@@ -119,7 +119,7 @@ Computes the derivative of the likelihood. As previously, the values are added t
 
 This function is required for BGPLVM, sparse models and uncertain inputs.
 
-As previously, target is an ``self.Nparam`` array and :math:`\frac{dL}{d Kdiag} \frac{dKdiag}{dparam}` is added to each element. ::
+As previously, target is an array of length ``self.num_params`` and :math:`\frac{dL}{d Kdiag} \frac{dKdiag}{dparam}` is added to each element. ::
 
     def dKdiag_dtheta(self,dL_dKdiag,X,target):
         target[0] += np.sum(dL_dKdiag)
@@ -129,7 +129,7 @@ As previously, target is an ``self.Nparam`` array and :math:`\frac{dL}{d Kdiag}
 
 This function is required for GPLVM, BGPLVM, sparse models and uncertain inputs.
 
-Computes the derivative of the likelihood with respect to the inputs ``X`` (a :math:`n \times D` np.array). The result is added to target which is a :math:`n \times D` np.array. ::
+Computes the derivative of the likelihood with respect to the inputs ``X`` (an :math:`n \times d` np.array). The result is added to target, which is an :math:`n \times d` np.array. ::
 
     def dK_dX(self,dL_dK,X,X2,target):
         """derivative of the covariance matrix with respect to X."""
@@ -169,9 +169,9 @@ The following line should be added in the preamble of the file::
 
 as well as the following block ::
 
-    def rational_quadratic(D,variance=1., lengthscale=1., power=1.):
-        part = rational_quadraticpart(D,variance, lengthscale, power)
-        return kern(D, [part])
+    def rational_quadratic(input_dim,variance=1., lengthscale=1., power=1.):
+        part = rational_quadraticpart(input_dim,variance, lengthscale, power)
+        return kern(input_dim, [part])
 
 
 Update initialization
diff --git a/doc/tuto_interacting_with_models.rst b/doc/tuto_interacting_with_models.rst
index 3031a5e1..3cea7fb7 100644
--- a/doc/tuto_interacting_with_models.rst
+++ b/doc/tuto_interacting_with_models.rst
@@ -18,6 +18,7 @@ All of the examples included in GPy return an instance
 of a model class, and therefore they can be called in 
 the following way: ::
 
+    import numpy as np
     import pylab as pb
     pb.ion()
     import GPy
@@ -91,19 +92,17 @@ we can define a new array of values and change the parameters as follows: ::
 If we call the function ``_get_params()`` again, we will obtain the new
 parameters we have just set.
 
-Parameters can be also set by name using the function ``_set()``. For example,
-lets change the lengthscale to .5: ::
+Parameters can also be set by name using dictionary notation. For example,
+let's change the lengthscale to .5: ::
 
-	m.set('rbf_lengthscale',.5)
+	m['rbf_lengthscale'] = .5
 
-``_set()`` function accepts regular expression as it first
-input, and therefore all parameters matching that regular 
-expression are set to the given value. In this case rather 
+Here, the key is treated as a regular expression, so all parameters whose names match it are set to the given value. In this case rather 
 than passing as second output a single value, we can also 
 use a list of arrays. For example, lets change the inducing 
 inputs: ::
 
-	m.set('iip',np.arange(-4,0))
+	m['iip'] = np.arange(-5,0)
 
 Getting the model's likelihood and gradients
 ===========================================
@@ -129,10 +128,9 @@ we have been changing the parameters, the gradients are far from zero now.
 Next we are going to show how to optimize the model setting different 
 restrictions on the parameters. 
 
-Once a constrain has been set on a parameter, it is not possible to
-define a new constraint for it unless we explicitly remove the previous
-one. The command to remove the constraints is ``unconstrain()``, and
-just as the ``set()`` command, it also accepts regular expression.
+Once a constraint has been set on a parameter, it is possible to remove it
+with the command ``unconstrain()``. Like the previous
+matching commands, it also accepts a regular expression.
 In this case we will remove all the constraints: ::
 
 	m.unconstrain('')
@@ -144,7 +142,7 @@ is to be positive. This is constraint is easily set
 with the function ``constrain_positive()``. Regular expressions
 are also accepted. ::
 
-    m.constrain_positive('var')
+    m.constrain_positive('.*var')
 
 For convenience, GPy also provides a catch all function 
 which ensures that anything which appears to require 
@@ -179,7 +177,7 @@ however for the sake of the example we will tie the white noise
 and the variance together. See `A kernel overview <tuto_kernel_overview.html>`_.
 for a proper use of the tying capabilities.::
 
-    m.tie_params('e_var')
+    m.tie_params('.*e_var')
 
 Optimizing the model
 ====================
diff --git a/doc/tuto_kernel_overview.rst b/doc/tuto_kernel_overview.rst
index 391881d8..6cc7b30d 100644
--- a/doc/tuto_kernel_overview.rst
+++ b/doc/tuto_kernel_overview.rst
@@ -13,8 +13,8 @@ First we import the libraries we will need ::
 
 For most kernels, the dimension is the only mandatory parameter to define a kernel object. However, it is also possible to specify the values of the parameters. For example, the three following commands are valid for defining a squared exponential kernel (ie rbf or Gaussian) ::
 
-    ker1 = GPy.kern.rbf(1)  # Equivalent to ker1 = GPy.kern.rbf(D=1, variance=1., lengthscale=1.)
-    ker2 = GPy.kern.rbf(D=1, variance = .75, lengthscale=2.)
+    ker1 = GPy.kern.rbf(1)  # Equivalent to ker1 = GPy.kern.rbf(input_dim=1, variance=1., lengthscale=1.)
+    ker2 = GPy.kern.rbf(input_dim=1, variance = .75, lengthscale=2.)
     ker3 = GPy.kern.rbf(1, .5, .5)
 
 A ``print`` and a ``plot`` functions are implemented to represent kernel objects. The commands ::
@@ -144,9 +144,9 @@ When calling one of these functions, the parameters to constrain can either by s
     k = k1 + k2 + k3
     print k
 
-    k.constrain_positive('var')
+    k.constrain_positive('.*var')
     k.constrain_fixed(np.array([1]),1.75)
-    k.tie_params('len')
+    k.tie_params('.*len')
     k.unconstrain('white')
     k.constrain_bounded('white',lower=1e-5,upper=.5)
     print k
@@ -212,7 +212,6 @@ Note the ties between the parameters of ``Kanova`` that reflect the links betwee
 
     # Create GP regression model
     m = GPy.models.GP_regression(X,Y,Kanova)
-    pb.figure(figsize=(5,5))
     m.plot()
 
 .. figure::  Figures/tuto_kern_overview_mANOVA.png