start of psi2 crossterms

James Hensman 2013-02-26 14:49:00 +00:00
parent d9b03044ac
commit 4d79c3c97d
2 changed files with 87 additions and 10 deletions


@ -378,6 +378,26 @@ class kern(parameterised):
        slices1, slices2 = self._process_slices(slices1,slices2)
        [p.psi2(Z[s2,i_s],mu[s1,i_s],S[s1,i_s],target[s1,s2,s2]) for p,i_s,s1,s2 in zip(self.parts,self.input_slices,slices1,slices2)]
        #compute the "cross" terms
        for p1, p2 in itertools.combinations(self.parts,2):
            #white doesn't combine with anything
            if p1.name=='white' or p2.name=='white':
                pass
            #rbf X bias
            elif p1.name=='bias' and p2.name=='rbf':
                target += p1.variance*(p2._psi1[:,:,None]+p2._psi1[:,None,:])
            elif p2.name=='bias' and p1.name=='rbf':
                target += p2.variance*(p1._psi1[:,:,None]+p1._psi1[:,None,:])
            #rbf X linear
            elif p1.name=='linear' and p2.name=='rbf':
                raise NotImplementedError #TODO
            elif p2.name=='linear' and p1.name=='rbf':
                raise NotImplementedError #TODO
            else:
                raise NotImplementedError("psi2 cannot be computed for this kernel")
        # "crossterms". Here we are recomputing psi1 for white (we don't need to), but it's
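Review note on the rbf X bias branch above. Assuming the bias part is a constant kernel with value c, expanding E[(c + k_rbf(z_m, x))(c + k_rbf(x, z_m'))] gives the two per-part psi2 terms plus c*(psi1[n,m] + psi1[n,m']), which is the broadcast expression added to `target`. The sketch below checks that identity on sampled inputs with plain NumPy. It is not the code from this commit: the helper `rbf`, the value of `c`, and the array shapes are assumptions chosen for illustration.

```python
import numpy as np

def rbf(X, Z, variance=1.0, lengthscale=1.0):
    """Plain RBF kernel between rows of X (S x Q) and Z (M x Q). Hypothetical helper."""
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq / lengthscale ** 2)

rng = np.random.RandomState(0)
N, M, Q, S = 3, 4, 2, 500      # data points, inducing points, input dims, samples
c = 0.7                        # bias variance (assumed value)
Z = rng.randn(M, Q)
mu = rng.randn(N, Q)
var = rng.rand(N, Q) + 0.1

# Samples from q(x_n) = N(mu_n, diag(var_n)), shape (N, S, Q)
X = mu[:, None, :] + np.sqrt(var)[:, None, :] * rng.randn(N, S, Q)

# Empirical psi statistics of the RBF part
R = np.stack([rbf(X[n], Z) for n in range(N)])              # (N, S, M)
psi1_rbf = R.mean(1)                                        # (N, M)
psi2_rbf = (R[:, :, :, None] * R[:, :, None, :]).mean(1)    # (N, M, M)

# psi2 of the sum kernel, computed directly from the same samples
K = c + R
psi2_sum = (K[:, :, :, None] * K[:, :, None, :]).mean(1)

# Per-part terms plus the cross term, using the same broadcast as the diff
cross = c * (psi1_rbf[:, :, None] + psi1_rbf[:, None, :])
psi2_parts = c ** 2 + psi2_rbf + cross

assert np.allclose(psi2_sum, psi2_parts)
```

Because both sides are averages over the same samples, the identity holds to floating-point precision rather than only in expectation. The white-noise branch can skip the cross term on the assumption that white noise contributes nothing between distinct inputs.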
@ -402,6 +422,31 @@ class kern(parameterised):
        target = np.zeros(self.Nparam)
        [p.dpsi2_dtheta(partial[s1,s2,s2],Z[s2,i_s],mu[s1,i_s],S[s1,i_s],target[ps]) for p,i_s,s1,s2,ps in zip(self.parts,self.input_slices,slices1,slices2,self.param_slices)]
        #compute the "cross" terms
        #TODO: better looping
        for i1, i2 in itertools.combinations(range(len(self.parts)),2):
            p1,p2 = self.parts[i1], self.parts[i2]
            ipsl1, ipsl2 = self.input_slices[i1], self.input_slices[i2]
            ps1, ps2 = self.param_slices[i1], self.param_slices[i2]
            #white doesn't combine with anything
            if p1.name=='white' or p2.name=='white':
                pass
            #rbf X bias
            elif p1.name=='bias' and p2.name=='rbf':
                p2.dpsi1_dtheta(partial.sum(1)*p1.variance,Z,mu,S,target[ps2])
                p1.dpsi1_dtheta(partial.sum(1)*p2._psi1,Z,mu,S,target[ps1])
            elif p2.name=='bias' and p1.name=='rbf':
                p1.dpsi1_dtheta(partial.sum(1)*p2.variance,Z,mu,S,target[ps1])
                p2.dpsi1_dtheta(partial.sum(1)*p1._psi1,Z,mu,S,target[ps2])
            #rbf X linear
            elif p1.name=='linear' and p2.name=='rbf':
                raise NotImplementedError #TODO
            elif p2.name=='linear' and p1.name=='rbf':
                raise NotImplementedError #TODO
            else:
                raise NotImplementedError("psi2 cannot be computed for this kernel")
        # # "crossterms"
        # # 1. get all the psi1 statistics
        # psi1_matrices = [np.zeros((mu.shape[0], Z.shape[0])) for p in self.parts]
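Review note on the gradient of the cross term. If the cross term is c*(psi1[n,m] + psi1[n,m']) and the objective contracts it with `partial` (shape N x M x M), then the chain rule gives an effective psi1 gradient of c*(partial.sum(1) + partial.sum(2)). The sketch below verifies that with plain NumPy. The names `cross_term`, `c`, and the shapes are assumptions for illustration, not GPy API.

```python
import numpy as np

def cross_term(psi1, c):
    """Bias x RBF cross term of psi2, shape (N, M, M). Hypothetical helper."""
    return c * (psi1[:, :, None] + psi1[:, None, :])

rng = np.random.RandomState(1)
N, M = 3, 4
c = 0.7                          # bias variance (assumed value)
psi1 = rng.rand(N, M)
partial = rng.randn(N, M, M)     # dL/dpsi2, deliberately not symmetric

# Analytic gradient of L = sum(partial * cross_term) with respect to psi1
dL_dpsi1 = c * (partial.sum(1) + partial.sum(2))

# L is linear in psi1, so a unit perturbation gives the exact derivative
base = (partial * cross_term(psi1, c)).sum()
numeric = np.zeros((N, M))
for n in range(N):
    for m in range(M):
        bumped = psi1.copy()
        bumped[n, m] += 1.0
        numeric[n, m] = (partial * cross_term(bumped, c)).sum() - base

assert np.allclose(dL_dpsi1, numeric)

# Gradient with respect to the bias variance c
dL_dc = ((partial.sum(1) + partial.sum(2)) * psi1).sum()
assert np.allclose(dL_dc, (partial * cross_term(psi1, 1.0)).sum())

# With a symmetric partial the two axis sums coincide
sym = partial + partial.transpose(0, 2, 1)
assert np.allclose(sym.sum(1), sym.sum(2))
```

When `partial` is symmetric in its last two axes, the contraction reduces to 2*c*partial.sum(1). Whether that factor of two belongs in the branches above depends on conventions this hunk does not show, so it may be worth checking against a numerical gradient.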
@ -429,6 +474,26 @@ class kern(parameterised):
        target = np.zeros_like(Z)
        [p.dpsi2_dZ(partial[s1,s2,s2],Z[s2,i_s],mu[s1,i_s],S[s1,i_s],target[s2,i_s]) for p,i_s,s1,s2 in zip(self.parts,self.input_slices,slices1,slices2)]
        #compute the "cross" terms
        #TODO: slices (need to iterate around the input slices also...)
        for p1, p2 in itertools.combinations(self.parts,2):
            #white doesn't combine with anything
            if p1.name=='white' or p2.name=='white':
                pass
            #rbf X bias
            elif p1.name=='bias' and p2.name=='rbf':
                target += p2.dpsi1_dZ(partial.sum(1)*p1.variance,Z,mu,S)
            elif p2.name=='bias' and p1.name=='rbf':
                target += p1.dpsi1_dZ(partial.sum(2)*p2.variance,Z,mu,S)
            #rbf X linear
            elif p1.name=='linear' and p2.name=='rbf':
                raise NotImplementedError #TODO
            elif p2.name=='linear' and p1.name=='rbf':
                raise NotImplementedError #TODO
            else:
                raise NotImplementedError("psi2 cannot be computed for this kernel")
        return target
    def dpsi2_dmuS(self,partial,Z,mu,S,slices1=None,slices2=None):


@ -1,12 +1,11 @@
Fails in weird ways if you pass an integer as the input instead of a double to the kernel.
The Matern kernels (at least the 52) still work in the ARD manner, which means they wouldn't run for very large input dimension. Needs to be fixed to match the RBF.
Implementing new covariances is too complicated at the moment. We need a barebones example of what to implement and where. Commenting in the covariance matrices needs to be improved. It's not clear to a user what all the psi parts are for. Maybe we need a cut down and simplified example to help with this (perhaps a cut down version of the RBF?). And then we should provide a simple list of what you need to do to get a new kernel going.
TODO
Missing kernels: polynomial, rational quadratic.
TODO
Kernel implementations are far too obscure. Need to be easily readable for a first time user.
Duplicate.
Need an implementation of scaled conjugate gradients for the optimizers.
@ -15,21 +14,30 @@ Need an implementation of gradient descent for the optimizers (works well with G
Need Carl Rasmussen's permission to add his conjugate gradients algorithm. In fact, we can just provide a hook for it, and post a separate python implementation of his algorithm.
Change get_param and set_param to get_params and set_params
FIXED
Get constrain param by default inside model creation.
Randomize doesn't seem to cover a wide enough range for restarts ... try it for a model where inputs are widely spaced apart and length scale is too short. Sampling from N(0,1) is too conservative. Dangerous for people who naively use restarts. Since we have the model we could maybe come up with some sensible heuristics for setting these things. Maybe we should also consider having '.initialize()'. If we can't do this well we should disable the restart method.
Tolerances for optimizers, do we need to introduce some standardization? At the moment does each have its own defaults?
Do all optimizers work only in terms of function evaluations? Do we need to check for one that uses iterations?
Upstream: Waiting for the new scipy, where the optimisers have been unified.
Tolerances for optimizers, do we need to introduce some standardization? At the moment does each have its own defaults?
Upstream, as above
Change Youter to YYT (Youter doesn't mean anything for matrices).
FIXED
Bug when running classification.crescent_data()
A dictionary for parameter storage? So we can go through names easily?
Wontfix. Dictionaries bring up all kinds of problems since they're not ordered.
When computing kernel.K for kernels like rbf, you can't compute a version with rbf.K(X) you have to do rbf.K(X, X)
FIXED
The predict method for GP_regression returns a covariance matrix, which is a bad idea: it is expensive to compute and confusing for first time users. Should only be returned if the user explicitly requests it.
FIXED
A flag on covariance functions that indicates when they are not associated with an underlying function (like white noise or a coregionalization matrix).
@ -37,6 +45,10 @@ Diagonal noise covariance function
Long term: automatic Lagrange multiplier calculation for optimizers: constrain two parameters in an unusual way and the model automatically does the Lagrangian. Also augment the parameters with new ones, so define data variance to be white noise plus RBF variance and optimize over that and signal to noise ratio ... for example constrain the sum of variances to equal the known variance of the data.
When computing kernel.K for kernels like rbf, you can't compute a version with rbf.K(X) you have to do rbf.K(X, X)
Randomize doesn't seem to cover a wide enough range for restarts ... try it for a model where inputs are widely spaced apart and length scale is too short. Sampling from N(0,1) is too conservative. Dangerous for people who naively use restarts. Since we have the model we could maybe come up with some sensible heuristics for setting these things. Maybe we should also consider having '.initialize()'. If we can't do this well we should disable the restart method.
The predict method for GP_regression returns a covariance matrix, which is a bad idea: it is expensive to compute and confusing for first time users. Should only be returned if the user explicitly requests it.
Fails in weird ways if you pass an integer as the input instead of a double to the kernel.
FIXED
The Matern kernels (at least the 52) still work in the ARD manner, which means they wouldn't run for very large input dimension. Needs to be fixed to match the RBF.
FIXED
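
Note on two of the items marked FIXED above (K(X) versus K(X, X) and integer inputs). A common way to handle both is to default the second argument to the first and cast inputs to floating point on entry. The sketch below shows that pattern with plain NumPy. `rbf_K` is a hypothetical stand-alone function, not the library's kernel class, and it may not match how the fixes were actually made.

```python
import numpy as np

def rbf_K(X, X2=None, variance=1.0, lengthscale=1.0):
    """RBF covariance between rows of X (N x Q) and X2 (N2 x Q).

    Hypothetical helper: X2 defaults to X, and inputs are cast to float64
    so that integer arrays behave the same as double arrays.
    """
    X = np.asarray(X, dtype=np.float64)
    X2 = X if X2 is None else np.asarray(X2, dtype=np.float64)
    sq = ((X[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq / lengthscale ** 2)

X_int = np.array([[0], [1], [3]])            # integer-typed inputs
K_one_arg = rbf_K(X_int)                     # K(X)
K_two_arg = rbf_K(X_int.astype(float), X_int.astype(float))  # K(X, X)

assert K_one_arg.dtype == np.float64
assert np.allclose(K_one_arg, K_two_arg)
```

Casting once at the boundary keeps the rest of the kernel code free of dtype checks, at the cost of a copy when the input is not already float64.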