Study¶
Note
A study object can be created by calling jcmwave.optimizer.create_study().
- class jcmwave.client.Study(host, study_id, session)¶
  This class provides methods for controlling a numerical optimization study. Example:

      def objective(x1, x2):
          observation = study.new_observation()
          observation.add(x1**2 + x2**2)
          return observation

      study.set_parameters(max_iter=30, num_parallel=3)

      # Start optimization loop
      study.set_objective(objective)
      study.run()

      # Alternatively, one can explicitly define the optimization loop
      import threading

      def acquire(suggestion):
          try:
              observation = objective(**suggestion.kwargs)
          except Exception:
              study.clear_suggestion(suggestion.id, 'Objective failed')
          else:
              study.add_observation(observation, suggestion.id)

      while not study.is_done():
          suggestion = study.get_suggestion()
          t = threading.Thread(target=acquire, args=(suggestion,))
          t.start()
- add_many(samples, observations)¶
  Adds many observations to the study. Example:

      study.add_many(samples, observations)

  Parameters:
  - samples (list) – List of samples, e.g. [{'x1': 0.1, 'x2': 0.2}, {'x1': 0.3, 'x2': 0.4}]
  - observations (list) – List of Observation() objects, one for each sample (see new_observation()).
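As an illustration of how the two arguments line up, the sketch below pairs a list of samples with a matching list of objective values in plain Python. The objective f and all numbers are invented for this example; in a real session, each value would be wrapped in an Observation() created with study.new_observation() before calling study.add_many(samples, observations).

```python
# Hypothetical samples in the documented format
samples = [{'x1': 0.1, 'x2': 0.2}, {'x1': 0.3, 'x2': 0.4}]

# Made-up objective for illustration only
def f(sample):
    return sample['x1']**2 + sample['x2']**2

# One value per sample; in a real session each value would be added to
# an Observation() obtained from study.new_observation()
values = [f(s) for s in samples]
```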
- add_observation(observation, suggestion_id=None, sample=None)¶
  Adds an observation to the study. Example:

      study.add_observation(observation, suggestion.id)

  Parameters:
  - observation – Observation() object with added values (see new_observation()).
  - suggestion_id (int) – Id of the corresponding suggestion, if it exists.
  - sample (dict) – If the observation does not belong to an open suggestion, the corresponding sample must be provided, e.g. {'x1': 0.1, 'x2': 0.2}
- clear_suggestion(suggestion_id, message='')¶
  If the calculation of an objective value for a certain suggestion fails, the suggestion can be cleared from the study. Example:

      study.clear_suggestion(suggestion.id, 'Computation failed')

  Note
  The study creates at most num_parallel suggestions (see set_parameters()) before it waits for an observation to be added (see add_observation()) or a suggestion to be cleared.

  Parameters:
  - suggestion_id (int) – Id of the suggestion to be cleared.
  - message (str) – An optional message that is printed out.
- driver_info()¶
  Get driver-specific information. Example:

      data = study.driver_info()

  Returns: Dictionary with multiple entries. For a description of the entries, see the documentation of the driver.
- get_data_table()¶
  Get a table with data of the acquisitions. Example:

      data = study.get_data_table()

  Returns: Dictionary with entries
  - iteration: List of iteration numbers.
  - datetime: List of dates and times of the creation of the corresponding suggestion.
  - cummin: List of cumulative minima for each iteration.
  - objective_value: List of the objective values acquired at each iteration.
  - parameters: Dictionary containing a list of parameter values for each parameter name.
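The cummin column is simply the running minimum of objective_value. The following sketch recomputes it from a hypothetical data table of the documented shape (the numbers are invented, not output of a real study):

```python
# Hypothetical data table in the documented shape
data = {
    'iteration': [1, 2, 3, 4],
    'objective_value': [0.9, 0.5, 0.7, 0.2],
    'cummin': [0.9, 0.5, 0.5, 0.2],
}

# Recompute the cumulative minimum from the raw objective values
running, best = [], float('inf')
for value in data['objective_value']:
    best = min(best, value)
    running.append(best)
```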
- get_minima(**kwargs)¶
  Get a list of local minima and their sensitivities with respect to the parameters (i.e. their widths). The minima are found using the Gaussian process only. Example:

      study.get_minima(n_output=10)

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Parameters:
  - n_samples (int) – Number of initial samples for searching (default: automatic determination).
  - n_output (int) – Maximum number of minima that are returned (default: 10).
  - epsilon (float) – Parameter used for identifying identical minima (i.e. minima with distance < length scale * epsilon) and minima with non-vanishing gradient (e.g. minima at the boundary of the search space) (default: 0.2).
  - delta (float) – Parameter used for approximating second derivatives (default: 0.2).
  - min_dist (float) – To increase performance, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - n_observations (int) – Number of observations from the start used to build up the regression model. The model is configured with the corresponding hyperparameters at the last added observation. This parameter can be used to determine the convergence of the outcome with the number of observations. It cannot be used together with min_dist (default: None, meaning all observations are taken into account).

  Returns: A list of dictionaries with information about the local minima: the objective value, the uncertainty of the objective value, the parameter values, and the width in each parameter direction (i.e. the standard deviation after a fit to a Gaussian).
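A typical post-processing step is to sort the returned minima by objective value. The sketch below assumes the dictionary keys shown; the key names and all numbers are illustrative assumptions, not the driver's exact output format:

```python
# Hypothetical return value of get_minima() (key names assumed for illustration)
minima = [
    {'objective': 0.35, 'uncertainty': 0.02, 'parameters': {'x1': -1.1, 'x2': 0.4}},
    {'objective': 0.10, 'uncertainty': 0.01, 'parameters': {'x1': 0.5, 'x2': -0.2}},
]

# Pick the deepest minimum
best = min(minima, key=lambda m: m['objective'])
```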
- get_statistics(funcs=None, **params)¶
  Determines statistics of the objective function, which can optionally be weighted with the functions funcs. By default, the probability density of the parameters is a uniform distribution over the whole parameter domain. Other parameter distributions can be defined via study.set_parameters(distribution=dist).
  Example:

      study.set_parameters(distribution=[
          {'name': 'param1', 'dist': 'normal', 'mean': 1.0, 'variance': 2.0},
          {'name': 'param3', 'dist': 'uniform', 'domain': [0, 1]},
          {'name': 'param5', 'dist': 'beta', 'alpha': 1.5, 'beta': 0.8}])
      study.get_statistics(funcs=['1.0', 'x1', 'x1+x2'], abs_precision=0.001)

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Note
  For the Monte Carlo integration, only samples fulfilling the constraints of the parameters are used.

  Parameters:
  - funcs (list|function) – A function string or a list of function strings. For funcs=f, the value of g(r) = objective(r)*f(r) is analyzed. For funcs=[f_1, f_2, …], the list of functions [g_i(r) = objective(r)*f_i(r)] is analyzed.
  - abs_precision (float) – The Monte Carlo integration is stopped when the absolute precision of the mean value or of the uncertainty of the mean value is smaller than abs_precision (default: 1e-9).
  - rel_precision (float) – The Monte Carlo integration is stopped when the relative precision of the mean value or the relative uncertainty of the mean value is smaller than rel_precision (default: 1e-3).
  - max_time (float) – The Monte Carlo integration is stopped when the time max_time has passed (default: inf).
  - max_iter (int) – The Monte Carlo integration is stopped after max_iter samples (default: 1e5).
  - compute_uncertainty (bool) – Whether the uncertainty of the integral is computed based on the uncertainty of the Gaussian-process predictions (default: True).
  - min_dist (float) – To increase performance, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - n_observations (int) – Number of observations from the start used to build up the regression model. The model is configured with the corresponding hyperparameters at the last added observation. This parameter can be used to determine the convergence of the outcome with the number of observations. It cannot be used together with min_dist (default: None, meaning all observations are taken into account).

  Returns: A dictionary with the entries
  - mean: Expectation value <g> of the (weighted) objective under the parameter distribution.
  - variance: Variance <g^2> - <g>^2 of the (weighted) objective under the parameter distribution.
  - uncertainty_mean: Uncertainty of the mean value determined from the uncertainty of the Gaussian process regression.
  - lower_quantile: 16% quantile of the (weighted) objective values under the parameter distribution.
  - median: 50% quantile of the (weighted) objective values under the parameter distribution.
  - upper_quantile: 84% quantile of the (weighted) objective values under the parameter distribution.
  - num_sampling_points: Number of sampling points that were used in the Monte Carlo integration. The numerical uncertainty of the computed mean value is sqrt(variance/N).
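The stated relation between the last entries can be checked directly: the numerical Monte Carlo uncertainty of the mean is sqrt(variance/N). The dictionary below uses made-up numbers in the documented shape:

```python
import math

# Hypothetical result of get_statistics() with invented numbers
stats = {'mean': 1.25, 'variance': 0.16, 'num_sampling_points': 10000}

# Numerical uncertainty of the mean: sqrt(variance / N)
mc_uncertainty = math.sqrt(stats['variance'] / stats['num_sampling_points'])
```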
- get_suggestion()¶
  Get a new suggestion to be evaluated by the user. Example:

      suggestion = study.get_suggestion()

  Returns: Suggestion() object with properties
  - id: Id of the suggestion.
  - kwargs: Keyword arguments of the parameters, e.g. {'x1': 0.1, 'x2': 0.2}

  Warning
  The function has to wait until the number of open suggestions is smaller than num_parallel before receiving a new suggestion. This can cause a deadlock if no observation is added by an independent thread.
- info()¶
  Get information about the status of the study. Example:

      info = study.info()

  Returns: Dictionary with entries
  - num_parallel: Number of parallel observations set.
  - is_done: True if the study has finished (i.e. some stopping criterion was met).
  - status: Status message.
  - open_suggestions: List of open suggestions.
  - min_params: Parameters of the found minimum, e.g. {'x1': 0.1, 'x2': 0.7}
  - min_objective: Minimum value found.
  - num_dim: Number of variable dimensions of the search space.
- is_done()¶
  Checks if the study has finished. Example:

      if study.is_done(): break

  Returns: True if some stopping criterion set by set_parameters() was met.

  Note
  Before returning True, the function call waits until all open suggestions have been added to the study.
- k_space_info(sigma)¶
  Get information on a sample in k-space. Example:

      info = study.k_space_info(sigma=[0.1, 0.5])

  Note
  This function is only available for the driver "KSpaceRegression".

  Parameters: sigma (list) – List of length two with x and y sigma coordinates, i.e. sigma = [kx/k0, ky/k0] with k0 = sqrt(kx^2+ky^2+kz^2).

  Returns: Dictionary with entries
  - bloch_family: List of sigma values of other members of the same Bloch family.
  - max_entropy_bloch_family: Subset of the Bloch family with the largest differential entropy of the Gaussian process posterior distribution.
  - bloch_family_in_symmetry_cone: List of sigma values of other members of the same Bloch family that are inside the symmetry cone.
  - symmetry_family: List of all other sigma values obtained by performing symmetry operations on sigma (e.g. reflections on symmetry axes).
  - mapped_to_symmetry_cone: Sigma value in the symmetry cone obtained by performing symmetry operations on sigma (e.g. reflections on symmetry axes).
- new_observation()¶
  Create a new observation object. Example:

      observation = study.new_observation()
      observation.add(1.2)
      observation.add(0.1, derivative='x1')

  Returns: Observation() object with the method add(), which has the arguments
  - value: Observed value of the objective function.
  - derivative (optional): Name of the derivative parameter. E.g. for derivative='x1', the value is interpreted as the derivative of the objective with respect to x1.
- optimize_hyperparameters(**params)¶
  Optimize the hyperparameters of the Gaussian process manually. This is usually done automatically. See the documentation of the driver 'BayesOptimization' for parameters steering this process. Example:

      study.optimize_hyperparameters()

  Note
  This function is only available for specific drivers with a machine learning model, e.g. "BayesOptimization" (default driver).

  Parameters:
  - n_samples (int) – Number of initial start samples for the optimization (default: automatic determination).
  - n_observations (int) – Number of observations from the start used to build up the regression model. This can be used if the hyperparameter optimization over all observations takes too long (default: all observations are taken into account).

  Returns: A dictionary with entries
  - log-likelihood: Value of the maximized log-likelihood.
  - hyperparameters: A list with the values of all optimized hyperparameters.
- predict(samples, derivatives=False)¶
  Predict the value and the uncertainty of the objective function. Example:

      study.predict(samples=[[1, 0, 0], [2, 0, 1]])

  Note
  This function is only available for studies using a Bayesian driver, e.g. "BayesOptimization" (default driver).

  Parameters:
  - samples (list) – List of samples, i.e. lists with parameter values.
  - derivatives (bool) – Whether derivatives of the means and uncertainties are computed.
  - min_dist (float) – To increase the performance when making many predictions at once, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).

  Returns: A dictionary with the respective lists of predictions.
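The returned means and uncertainties can be combined into a confidence band. The sketch below does this for hypothetical prediction lists; the key names 'means' and 'uncertainties' and all numbers are assumptions for illustration, not the driver's exact output:

```python
# Hypothetical predict() output (key names assumed for illustration)
pred = {'means': [0.8, 0.3], 'uncertainties': [0.1, 0.05]}

# Two-sigma confidence band around the predicted means
lower = [m - 2 * s for m, s in zip(pred['means'], pred['uncertainties'])]
upper = [m + 2 * s for m, s in zip(pred['means'], pred['uncertainties'])]
```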
- run()¶
  Run the acquisition loop after the objective has been set (see set_objective()). The acquisition loop stops after a stopping criterion has been met (see set_parameters()). Example:

      study.run()
- run_mcmc(**kwargs)¶
  Runs a Markov chain Monte Carlo (MCMC) sampling over the posterior probability density of the parameters. This method should be run only after the minimization of the likelihood is completed. Example:

      study.run()
      dist = [{'name': 'param1', 'dist': 'normal', 'mean': 1.0, 'variance': 2.0},
              {'name': 'param2', 'dist': 'gamma', 'alpha': 1.2, 'beta': 0.5},
              {'name': 'param3', 'dist': 'uniform'}]
      study.set_parameters(distribution=dist)
      samples = study.run_mcmc()

  Note
  The function is only available for the BayesLeastSquare driver.

  Parameters:
  - num_walkers (int) – Number of walkers (default: automatically chosen).
  - max_iter (int) – Maximum absolute chain length (default: 1e4).
  - chain_length_tau (int) – Maximum chain length in units of the correlation time tau (default: 100).
  - multi_modal (bool) – If true, a more explorative sampling strategy is used (default: false).
  - append (bool) – If true, the samples are appended to the samples of the previous MCMC run (default: false).
  - min_dist (float) – To increase the performance of the sampling, it is possible to use a sparsified Gaussian process. One can define the minimal distance between the data points in this Gaussian process in units of the length scale (default: 0.0).
  - max_sigma_dist (float) – If set, the sampling is restricted to a distance of max_sigma_dist * sigma from the maximum likelihood estimate. E.g. max_sigma_dist=3.0 means that only the 99.7% probability region of each parameter is sampled (default: inf).
  - marginalize_uncertainties (bool) – If true, the mean value of the likelihood is determined by marginalizing over the uncertainties of the Gaussian process regression. This is more reliable in parameter regions with fewer function acquisitions but also leads to a slower MCMC sampling (default: false).

  Returns: A dictionary with the following entries:
  - samples: The drawn samples, without "burn-in" samples and thinned by half of the correlation time.
  - medians: The medians of all random parameters.
  - lower_uncertainties: The distances between the medians and the 16% quantiles of all random parameters.
  - upper_uncertainties: The distances between the medians and the 84% quantiles of all random parameters.
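The medians with their asymmetric uncertainties are often reported as median (+upper / -lower). The sketch below formats a hypothetical result of the documented shape; the container types (dicts keyed by parameter name) and all numbers are assumptions for illustration:

```python
# Hypothetical run_mcmc() result (container shapes assumed for illustration)
result = {
    'medians': {'param1': 1.02},
    'lower_uncertainties': {'param1': 0.11},
    'upper_uncertainties': {'param1': 0.09},
}

# Report each parameter as median (+upper / -lower)
reports = {
    name: '%.2f (+%.2f / -%.2f)' % (
        median,
        result['upper_uncertainties'][name],
        result['lower_uncertainties'][name])
    for name, median in result['medians'].items()
}
```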
- set_objective(objective)¶
  Set the objective function to be minimized. Example:

      def objective(x1, x2):
          observation = study.new_observation()
          observation.add(x1**2 + x2**2)
          return observation

      study.set_objective(objective)

  Parameters: objective (func) – Function handle for a function of the variable parameters that returns a corresponding Observation() object.
- set_parameters(**kwargs)¶
  Sets parameters for the optimization run. Example:

      study.set_parameters(max_iter=100, num_parallel=5)

  Parameters:
  - num_parallel (int) – Number of parallel observations of the objective function (default: 1).
  - max_iter (int) – Maximum number of evaluations of the objective function (default: inf).
  - max_time (int) – Maximum optimization time in seconds (default: inf).

  Note
  The full list of parameters depends on the chosen driver. For a parameter description, see the documentation of the driver.
- start_clock()¶
  The optimization stops after the time max_time (see set_parameters()). This function resets the clock to zero. Example:

      study.start_clock()

  Note
  The clock is also set to zero by calling set_parameters().