Study¶
Note
A study object can be created by calling jcmwave.optimizer.create_study().
- class jcmwave.client.Study(host, study_id, session)¶
  This class provides methods for controlling a numerical optimization study. Example:

      def objective(x1, x2):
          observation = study.new_observation()
          observation.add(x1**2 + x2**2)
          return observation

      study.set_parameters(max_iter=30, num_parallel=3)

      # Start optimization loop
      study.set_objective(objective)
      study.run()

      # Alternatively, one can explicitly define the optimization loop
      import threading

      def acquire(suggestion):
          try:
              observation = objective(**suggestion.kwargs)
          except Exception:
              study.clear_suggestion(suggestion.id, 'Objective failed')
          else:
              study.add_observation(observation, suggestion.id)

      while not study.is_done():
          suggestion = study.get_suggestion()
          t = threading.Thread(target=acquire, args=(suggestion,))
          t.start()
- add_many(samples, observations)¶
  Adds many observations to the study. Example:

      study.add_many(samples, observations)

  Parameters:
  - samples (list) – List of samples, e.g. [{'x1': 0.1, 'x2': 0.2}, {'x1': 0.3, 'x2': 0.4}]
  - observations (list) – List of Observation() objects, one for each sample (see new_observation()).
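As an illustration of how the two arguments line up, the sketch below pairs a list of samples with a matching list of objective values in plain Python. The objective f and all numbers are invented for this example; in a real session, each value would be wrapped in an Observation() created with study.new_observation() before calling study.add_many(samples, observations).

```python
# Hypothetical samples in the documented format
samples = [{'x1': 0.1, 'x2': 0.2}, {'x1': 0.3, 'x2': 0.4}]

# Made-up objective for illustration only
def f(sample):
    return sample['x1']**2 + sample['x2']**2

# One value per sample; in a real session each value would be added to
# an Observation() obtained from study.new_observation()
values = [f(s) for s in samples]
```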
- add_observation(observation, suggestion_id=None, sample=None)¶
  Adds an observation to the study. Example:

      study.add_observation(observation, suggestion.id)

  Parameters:
  - observation – Observation() object with added values (see new_observation()).
  - suggestion_id (int) – Id of the corresponding suggestion, if it exists.
  - sample (dict) – If the observation does not belong to an open suggestion, the corresponding sample must be provided, e.g. {'x1': 0.1, 'x2': 0.2}
- clear_suggestion(suggestion_id, message='')¶
  If the calculation of an objective value for a certain suggestion fails, the suggestion can be cleared from the study. Example:

      study.clear_suggestion(suggestion.id, 'Computation failed')

  Note
  The study creates at most num_parallel suggestions (see set_parameters()) before it waits for an observation to be added (see add_observation()) or a suggestion to be cleared.

  Parameters:
  - suggestion_id (int) – Id of the suggestion to be cleared.
  - message (str) – An optional message that is printed out.
- driver_info()¶
  Get driver-specific information. Example:

      data = study.driver_info()

  Returns: Dictionary with multiple entries. For a description of the entries, see the documentation of the driver.
- get_data_table()¶
  Get a table with data of the acquisitions. Example:

      data = study.get_data_table()

  Returns: Dictionary with entries
  - iteration: List of iteration numbers.
  - datetime: List of dates and times of the creation of the corresponding suggestion.
  - cummin: List of cumulative minima for each iteration.
  - objective_value: List of the objective values acquired at each iteration.
  - parameters: Dictionary containing a list of parameter values for each parameter name.
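The cummin column is simply the running minimum of objective_value. The following sketch recomputes it from a hypothetical data table of the documented shape (the numbers are invented, not output of a real study):

```python
# Hypothetical data table in the documented shape
data = {
    'iteration': [1, 2, 3, 4],
    'objective_value': [0.9, 0.5, 0.7, 0.2],
    'cummin': [0.9, 0.5, 0.5, 0.2],
}

# Recompute the cumulative minimum from the raw objective values
running, best = [], float('inf')
for value in data['objective_value']:
    best = min(best, value)
    running.append(best)
```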
- get_minima(**kwargs)¶
  Get a list of local minima and their sensitivities with respect to the parameters (i.e. their widths). The minima are found using the Gaussian process only. Example:

      study.get_minima(n_output=10)

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Parameters:
  - n_samples (int) – Number of initial samples for searching (default: automatic determination).
  - n_output (int) – Maximum number of minima that are returned (default: 10).
  - epsilon (float) – Parameter used for identifying identical minima (i.e. minima with distance < length scale * epsilon) and minima with non-vanishing gradient (e.g. minima at the boundary of the search space) (default: 0.2).
  - delta (float) – Parameter used for approximating second derivatives (default: 0.2).
  - min_dist (float) – To increase performance, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - n_observations (int) – Number of observations from the start used to build up the regression model. The model is configured with the corresponding hyperparameters at the last added observation. This parameter can be used to determine the convergence of the outcome with the number of observations. It cannot be used together with min_dist (default: None, meaning all observations are taken into account).

  Returns: A list of dictionaries with information about the local minima: the objective value, the uncertainty of the objective value, the parameter values, and the width in each parameter direction (i.e. the standard deviation after a fit to a Gaussian).
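A typical post-processing step is to sort the returned minima by objective value. The sketch below assumes the dictionary keys shown; the key names and all numbers are illustrative assumptions, not the driver's exact output format:

```python
# Hypothetical return value of get_minima() (key names assumed for illustration)
minima = [
    {'objective': 0.35, 'uncertainty': 0.02, 'parameters': {'x1': -1.1, 'x2': 0.4}},
    {'objective': 0.10, 'uncertainty': 0.01, 'parameters': {'x1': 0.5, 'x2': -0.2}},
]

# Pick the deepest minimum
best = min(minima, key=lambda m: m['objective'])
```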
- get_statistics(funcs=None, **params)¶
  Determines statistics of the objective function, which can optionally be weighted with the functions funcs. By default, the probability density of the parameters is a uniform distribution over the whole parameter domain. Other parameter distributions can be defined via study.set_parameters(distribution=dist).
  Example:

      study.set_parameters(distribution=[
          {'name': 'param1', 'dist': 'normal', 'mean': 1.0, 'variance': 2.0},
          {'name': 'param3', 'dist': 'uniform', 'domain': [0, 1]},
          {'name': 'param5', 'dist': 'beta', 'alpha': 1.5, 'beta': 0.8}])
      study.get_statistics(funcs=['1.0', 'x1', 'x1+x2'], abs_precision=0.001)

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Note
  For the Monte Carlo integration, only samples fulfilling the constraints of the parameters are used.

  Parameters:
  - funcs (list|function) – A function string or a list of function strings. For funcs=f, the value of g(r) = objective(r)*f(r) is analyzed. For funcs=[f_1, f_2, …], the list of functions [g_i(r) = objective(r)*f_i(r)] is analyzed.
  - abs_precision (float) – The Monte Carlo integration is stopped when the absolute precision of the mean value or of the uncertainty of the mean value is smaller than abs_precision (default: 1e-9).
  - rel_precision (float) – The Monte Carlo integration is stopped when the relative precision of the mean value or the relative uncertainty of the mean value is smaller than rel_precision (default: 1e-3).
  - max_time (float) – The Monte Carlo integration is stopped when the time max_time has passed (default: inf).
  - max_iter (int) – The Monte Carlo integration is stopped after max_iter samples (default: 1e5).
  - compute_uncertainty (bool) – Whether the uncertainty of the integral is computed based on the uncertainty of the Gaussian-process predictions (default: True).
  - min_dist (float) – To increase performance, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - n_observations (int) – Number of observations from the start used to build up the regression model. The model is configured with the corresponding hyperparameters at the last added observation. This parameter can be used to determine the convergence of the outcome with the number of observations. It cannot be used together with min_dist (default: None, meaning all observations are taken into account).

  Returns: A dictionary with the entries
  - mean: Expectation value <g> of the (weighted) objective under the parameter distribution.
  - variance: Variance <g^2> - <g>^2 of the (weighted) objective under the parameter distribution.
  - uncertainty_mean: Uncertainty of the mean value determined from the uncertainty of the Gaussian process regression.
  - lower_quantile: 16% quantile of the (weighted) objective values under the parameter distribution.
  - median: 50% quantile of the (weighted) objective values under the parameter distribution.
  - upper_quantile: 84% quantile of the (weighted) objective values under the parameter distribution.
  - num_sampling_points: Number of sampling points that were used in the Monte Carlo integration. The numerical uncertainty of the computed mean value is sqrt(variance/N).
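The stated relation between the last entries can be checked directly: the numerical Monte Carlo uncertainty of the mean is sqrt(variance/N). The dictionary below uses made-up numbers in the documented shape:

```python
import math

# Hypothetical result of get_statistics() with invented numbers
stats = {'mean': 1.25, 'variance': 0.16, 'num_sampling_points': 10000}

# Numerical uncertainty of the mean: sqrt(variance / N)
mc_uncertainty = math.sqrt(stats['variance'] / stats['num_sampling_points'])
```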
- get_suggestion()¶
  Get a new suggestion to be evaluated by the user. Example:

      suggestion = study.get_suggestion()

  Returns: Suggestion() object with properties
  - id: Id of the suggestion.
  - kwargs: Keyword arguments of the parameters, e.g. {'x1': 0.1, 'x2': 0.2}

  Warning
  The function has to wait until the number of open suggestions is smaller than num_parallel before receiving a new suggestion. This can cause a deadlock if no observation is added by an independent thread.
- info()¶
  Get information about the status of the study. Example:

      info = study.info()

  Returns: Dictionary with entries
  - num_parallel: Number of parallel observations set.
  - is_done: True if the study has finished (i.e. some stopping criterion was met).
  - status: Status message.
  - open_suggestions: List of open suggestions.
  - min_params: Parameters of the found minimum, e.g. {'x1': 0.1, 'x2': 0.7}
  - min_objective: Minimum value found.
  - num_dim: Number of variable dimensions of the search space.
- is_done()¶
  Checks if the study has finished. Example:

      if study.is_done(): break

  Returns: True if some stopping criterion set by set_parameters() was met.

  Note
  Before returning True, the function call waits until all open suggestions have been added to the study.
- k_space_info(sigma)¶
  Get information on a sample in k-space. Example:

      info = study.k_space_info(sigma=[0.1, 0.5])

  Note
  This function is only available for the driver "KSpaceRegression".

  Parameters: sigma (list) – List of length two with x and y sigma coordinates, i.e. sigma = [kx/k0, ky/k0] with k0 = sqrt(kx^2+ky^2+kz^2).

  Returns: Dictionary with entries
  - bloch_family: List of sigma values of other members of the same Bloch family.
  - max_entropy_bloch_family: Subset of the Bloch family with the largest differential entropy of the Gaussian process posterior distribution.
  - bloch_family_in_symmetry_cone: List of sigma values of other members of the same Bloch family that are inside the symmetry cone.
  - symmetry_family: List of all other sigma values obtained by performing symmetry operations on sigma (e.g. reflections on symmetry axes).
  - mapped_to_symmetry_cone: Sigma value in the symmetry cone obtained by performing symmetry operations on sigma (e.g. reflections on symmetry axes).
- new_observation()¶
  Create a new observation object. Example:

      observation = study.new_observation()
      observation.add(1.2)
      observation.add(0.1, derivative='x1')

  Returns: Observation() object with the method add(), which has the arguments
  - value: Observed value of the objective function.
  - derivative (optional): Name of the derivative parameter. E.g. for derivative='x1', the value is interpreted as the derivative of the objective with respect to x1.
- optimize_hyperparameters(**params)¶
  Optimize the hyperparameters of the Gaussian process manually. This is usually done automatically. See the documentation of the driver 'BayesOptimization' for parameters steering this process. Example:

      study.optimize_hyperparameters()

  Note
  This function is only available for specific drivers with a machine learning model, e.g. "BayesOptimization" (default driver).

  Parameters:
  - n_samples (int) – Number of initial start samples for the optimization (default: automatic determination).
  - n_observations (int) – Number of observations from the start used to build up the regression model. This can be used if the hyperparameter optimization over all observations takes too long (default: all observations are taken into account).

  Returns: A dictionary with entries
  - log-likelihood: Value of the maximized log-likelihood.
  - hyperparameters: A list with the values of all optimized hyperparameters.
- predict(samples, derivatives=False)¶
  Predict the value and the uncertainty of the objective function. Example:

      study.predict(samples=[[1, 0, 0], [2, 0, 1]])

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Parameters:
  - samples (list) – List of samples, i.e. lists with parameter values.
  - derivatives (bool) – Whether derivatives of the means and uncertainties are computed.
  - min_dist (float) – To increase the performance when making many predictions at once, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).

  Returns: A dictionary with the respective lists of predictions.
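The returned means and uncertainties can be combined into a confidence band. The sketch below does this for hypothetical prediction lists; the key names 'means' and 'uncertainties' and all numbers are assumptions for illustration, not the driver's exact output:

```python
# Hypothetical predict() output (key names assumed for illustration)
pred = {'means': [0.8, 0.3], 'uncertainties': [0.1, 0.05]}

# Two-sigma confidence band around the predicted means
lower = [m - 2 * s for m, s in zip(pred['means'], pred['uncertainties'])]
upper = [m + 2 * s for m, s in zip(pred['means'], pred['uncertainties'])]
```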
- run()¶
  Run the acquisition loop after the objective has been set (see set_objective()). The acquisition loop stops after a stopping criterion has been met (see set_parameters()). Example:

      study.run()
- run_mcmc(**kwargs)¶
  Runs a Markov chain Monte Carlo (MCMC) sampling over the posterior probability density of the parameters. This method should be run only after the minimization of the likelihood is completed. Example:

      study.run()
      dist = [{'name': 'param1', 'dist': 'normal', 'mean': 1.0, 'variance': 2.0},
              {'name': 'param2', 'dist': 'gamma', 'alpha': 1.2, 'beta': 0.5},
              {'name': 'param3', 'dist': 'uniform'}]
      study.set_parameters(distribution=dist)
      samples = study.run_mcmc()

  Note
  The function is only available for the BayesLeastSquare driver.

  Parameters:
  - num_walkers (int) – Number of walkers (default: automatically chosen).
  - max_iter (int) – Maximum absolute chain length (default: 1e4).
  - chain_length_tau (int) – Maximum chain length in units of the correlation time tau (default: 100).
  - multi_modal (bool) – If true, a more explorative sampling strategy is used (default: false).
  - append (bool) – If true, the samples are appended to the samples of the previous MCMC run (default: false).
  - min_dist (float) – To increase the performance of the sampling, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - max_sigma_dist (float) – If set, the sampling is restricted to a distance of max_sigma_dist * sigma from the maximum likelihood estimate. E.g. max_sigma_dist=3.0 means that only the 99.7% probability region of each parameter is sampled (default: inf).
  - marginalize_uncertainties (bool) – If true, the mean value of the likelihood is determined by marginalizing over the uncertainties of the Gaussian process regression. This is more reliable in parameter regions with fewer function acquisitions but also leads to a slower MCMC sampling (default: false).

  Returns: A dictionary with the following entries:
  - samples: The drawn samples, without "burn-in" samples and thinned by half of the correlation time.
  - medians: The medians of all random parameters.
  - lower_uncertainties: The distances between the medians and the 16% quantiles of all random parameters.
  - upper_uncertainties: The distances between the medians and the 84% quantiles of all random parameters.
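The medians with their asymmetric uncertainties are often reported as median (+upper / -lower). The sketch below formats a hypothetical result of the documented shape; the container types (dicts keyed by parameter name) and all numbers are assumptions for illustration:

```python
# Hypothetical run_mcmc() result (container shapes assumed for illustration)
result = {
    'medians': {'param1': 1.02},
    'lower_uncertainties': {'param1': 0.11},
    'upper_uncertainties': {'param1': 0.09},
}

# Report each parameter as median (+upper / -lower)
reports = {
    name: '%.2f (+%.2f / -%.2f)' % (
        median,
        result['upper_uncertainties'][name],
        result['lower_uncertainties'][name])
    for name, median in result['medians'].items()
}
```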
- set_objective(objective)¶
  Set the objective function to be minimized. Example:

      def objective(x1, x2):
          observation = study.new_observation()
          observation.add(x1**2 + x2**2)
          return observation

      study.set_objective(objective)

  Parameters: objective (func) – Function handle for a function of the variable parameters that returns a corresponding Observation() object.
- set_parameters(**kwargs)¶
  Sets parameters for the optimization run. Example:

      study.set_parameters(max_iter=100, num_parallel=5)

  Parameters:
  - num_parallel (int) – Number of parallel observations of the objective function (default: 1).
  - max_iter (int) – Maximum number of evaluations of the objective function (default: inf).
  - max_time (int) – Maximum optimization time in seconds (default: inf).

  Note
  The full list of parameters depends on the chosen driver. For a parameter description, see the documentation of the driver.
- start_clock()¶
  The optimization stops after the time max_time (see set_parameters()). This function resets the clock to zero. Example:

      study.start_clock()

  Note
  The clock is also set to zero by calling set_parameters().