Like bigback, Tpros is a program run from the Unix prompt that reads its information from a spec file and from the command line. Tpros is considerably more flexible than bigback: the user can arrange the spec file information in a logical hierarchy of files, with a master file that includes other sub-files, and minor changes to the operation of the program can be effected by adding command-line arguments rather than by creating new spec files.
As in bigback, data files are plain ASCII files arranged in columns. Training sets and test sets should be arranged in separate files.
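For instance, a toy training set with two inputs and one target could be written out as whitespace-separated columns, one file for the inputs (grid file) and one for the targets. This is only an illustrative sketch; the file names used here are not required by Tpros:

```python
# Write a toy training set in the plain-ASCII column format described above.
# File names ("train.grid", "train.targ") are illustrative only.
inputs = [(0.1, 0.2), (0.3, 0.4), (0.5, 0.6)]   # one row per data point
targets = [1.0, 2.0, 3.0]                        # one target per data point

with open("train.grid", "w") as gf:
    for x in inputs:
        gf.write(" ".join("%g" % v for v in x) + "\n")

with open("train.targ", "w") as tf:
    for t in targets:
        tf.write("%g\n" % t)
```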
Whereas bigback was accompanied by a second set of programs called generate**, which were used to create predictions from a trained neural network, Tpros is a single program which can be run in several modes, one of which creates predictions. The other two modes optimize the hyperparameters for a given training data set, and optimize the *inputs* so as to maximize the expectation of some function of the *output*.
The most important arguments which you may want to change on the command line are as follows:
   -tf      specify target file
   -gf      specify grid file
   -ntdat   specify number of data points
   -nv      specify initial noise variance
   -tol     specify macopt tolerance
   -opt     optimise hyperparameters
   -NOopt   don't optimise hyperparameters
   -hi      specify hyperparameter input file
   -ho      specify hyperparameter output file
   -ninter  specify number of interpolation points
   -int_tf  specify interpolation target file
   -int_gf  specify interpolation grid file
   -int_of  specify interpolation output file
   -int_eb  calculate error bars
   BIGBACK name:   TPROS name:
   targets         targets
   inputs          grid file (gf)
   sigma_w         hyperparameters
   test set        interpolation
   `generate'      interpolation
   output file     output file
   sigma_nu^2      noise variance (nv)
   linmintol       tol
   -tf      target file                    ] these options define the
   -gf      grid file                      ] training data
   -ntdat   number of data points          ]
   -opt     optimise hyperparameters
   -ho      hyperparameter output file
   -ninter  number of interpolation points ] these options
   -int_tf  interpolation target file      ] define the test
   -int_gf  interpolation grid file        ] set.
   -int_of  interpolation output file      ] these files will
   -int_eb  calculate error bars           ] contain the predictions on the test set

I would then visualize the predictions on the test set and inspect the hyperparameter values written in the hyperparameter output file. I would also look at the time-series of the hyperparameters during the optimization (written in hyp.dump, if requested), which would allow me to check that the optimization had converged, and check whether I was wasting computer time.
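Putting those options together, a training run might be launched with a command line like the one assembled below. This is only a sketch: the spec file name, data file names, and counts are made up for illustration, and I am assuming the usual pattern of a spec file followed by option flags:

```python
# Assemble an illustrative Tpros training command line from the options above.
# All file names and the data-point count are hypothetical.
args = [
    "Tpros", "spec.txt",          # spec file (name illustrative)
    "-tf", "train.targ",          # target file
    "-gf", "train.grid",          # grid file
    "-ntdat", "100",              # number of data points
    "-opt",                       # optimise hyperparameters
    "-ho", "hyp.out",             # hyperparameter output file
    "-ninter", "50",              # number of interpolation points
    "-int_tf", "test.targ",       # interpolation target file
    "-int_gf", "test.grid",       # interpolation grid file
    "-int_of", "pred.out",        # interpolation output file
    "-int_eb",                    # calculate error bars
]
command = " ".join(args)
print(command)
```

Building the argument list in a script like this also makes it easy to run Tpros repeatedly with small variations, which is exactly what the command-line-argument design is for.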
In the hyperparameter output file for a problem with 14 inputs you will find the following:
   # Length Scales for Gaussian no.1
   0.251116 0.501787 0.642249 0.580670 0.701428 0.569069 0.670451
   0.482904 0.643245 0.576890 0.569316 0.713589 0.684511 0.520552
   # Theta_1 parameter(s)
   0.040519
   # Theta_2 parameter
   0.000000
   # log(noise variance)
   -4.348184
   # Theta_0 parameter
   0.000000

The meanings of these hyperparameters are very similar indeed to the following bigback hyperparameters:
   1/sigma_w (1)
   1/sigma_w ....
   ....
   1/sigma_w (14)     for the inputs
   # 1/sigma_w(out)
   # ------ ignore
   # log(sigma_nu)
   # ----- ignore

So to make a plot of relevance versus input number, you could simply plot the first 14 hyperparameters.
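A small sketch of that idea in Python: parse the `# Length Scales` section of a hyperparameter output file and invert each length scale to get one relevance per input. The parsing assumes only the layout shown above; the function name and sample data are made up for this example:

```python
def relevances(hyp_text, n_inputs):
    """Extract the first n_inputs length scales from a Tpros
    hyperparameter output file (layout as shown above) and
    return their reciprocals, i.e. one relevance per input."""
    values = []
    lines = iter(hyp_text.splitlines())
    for line in lines:
        if line.startswith("# Length Scales"):
            # Collect numbers until the next "#" header.
            for line in lines:
                if line.startswith("#"):
                    break
                values.extend(float(tok) for tok in line.split())
            break
    return [1.0 / s for s in values[:n_inputs]]

# Tiny made-up example with 2 inputs:
sample = """# Length Scales for Gaussian no.1
0.5 0.25
# Theta_1 parameter(s)
0.040519
"""
print(relevances(sample, 2))   # -> [2.0, 4.0]
```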
   set logscale y
   set yrange [0.1:10]   # or something like that
   plot "h" thru 1/x u 1 w box
   -tf      target file                    ] these options define the
   -gf      grid file                      ] training data
   -ntdat   number of data points          ]
   -NOopt   don't optimise hyperparameters
   -hi      hyperparameter input file      <<<<< read in the optimized hyperparameters
   -ninter  number of interpolation points ] these now define the files
   -int_gf  interpolation grid file        ] that contain the new input values
   -int_of  interpolation output file      ] these files will
   -int_eb  calculate error bars           ] contain the predictions on the new inputs
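The corresponding prediction-mode invocation differs from the training run only in that it reads the already-optimized hyperparameters (-hi) instead of optimising them (-NOopt), and points the interpolation options at the new inputs. Again, all file names here are hypothetical:

```python
# Illustrative Tpros prediction-mode command line, reusing the
# hyperparameters written out by the earlier training run.
# File names and the data-point count are made up for this example.
args = [
    "Tpros", "spec.txt",
    "-tf", "train.targ", "-gf", "train.grid", "-ntdat", "100",
    "-NOopt",                     # don't optimise hyperparameters
    "-hi", "hyp.out",             # read in the optimized hyperparameters
    "-ninter", "50",              # number of interpolation points
    "-int_gf", "new.grid",        # the new input values
    "-int_of", "pred.out",        # will contain the predictions
    "-int_eb",                    # calculate error bars
]
print(" ".join(args))
```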