Detailed API documentation

This page documents the different modules and functions in the perfectns package.

settings

Defines the base class which holds settings for performing perfect nested sampling. This includes likelihood and prior objects, as well as the parameters controlling how the calculation is performed: for example, the number of live points and whether or not dynamic nested sampling is used.

class perfectns.settings.PerfectNSSettings(**kwargs)[source]

Controls how Perfect Nested Sampling is performed.

For more details of the standard nested sampling and dynamic nested sampling algorithms see:

1) ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al. 2019).
2) ‘Sampling errors in nested sampling parameter estimation’ (Higson et al. 2018).

Parameters
n_dim: int

Dimension of the likelihood and prior.

prior: object

A prior object. See priors.py for more details.

likelihood: object

A likelihood object. See likelihoods.py for more details.

nlive_const: int

The number of live points for standard nested sampling.

termination_fraction: float

Standard nested sampling runs will terminate when the posterior mass remaining (estimated from the live points) is less than termination_fraction times the posterior mass (evidence) in the dead points.

dynamic_goal: float or None

Determines whether dynamic nested sampling is used and, if so, how improvements in evidence and parameter estimation accuracy are weighted.

n_samples_max: int or None

Number of samples after which dynamic nested sampling will terminate. If this is None the number of samples to use is chosen using nlive_const.

ninit: int

Number of live points used in dynamic nested sampling for the initial exploratory standard nested sampling run.

nbatch: int

Number of threads dynamic nested sampling adds to the calculation before each recalculation of points’ importances.

dynamic_fraction: float

Dynamic nested sampling adds more samples wherever points’ importances are greater than dynamic_fraction times the largest importance.

tuned_dynamic_p: bool

Determines if dynamic nested sampling is tuned for a specific parameter estimation problem.

Methods

get_settings_dict(self)

Returns a dictionary containing settings information.

logl_given_logx(self, logx)

Maps input logx values to loglikelihoods.

logl_given_r(self, r)

Maps input radial coordinates to loglikelihood values.

logx_given_logl(self, logl)

Maps input loglikelihoods to logx values.

logx_given_r(self, r)

Maps input radial coordinates to logx values.

logz_analytic(self)

If available gives an analytically calculated value for the log evidence for the likelihood and prior (useful for checking results).

r_given_logl(self, logl)

Maps input loglikelihood values to radial coordinates.

r_given_logx(self, logx)

Maps input logx values to radial coordinates.

save_name(self[, include_dg, …])

Make a standard save name format for a given settings configuration.

get_settings_dict(self)[source]

Returns a dictionary containing settings information. The names and parameters of the likelihoods and priors are stored instead of the objects themselves so they can be saved with pickle.

Returns
settings_dict: dict
logl_given_logx(self, logx)[source]

Maps input logx values to loglikelihoods.

Parameters
logx: float or numpy array
n_dim: int
Returns
logl: same type and size as logx
logl_given_r(self, r)[source]

Maps input radial coordinates to loglikelihood values.

Parameters
r: float or numpy array
n_dim: int
Returns
logl: same type and size as r
logx_given_logl(self, logl)[source]

Maps input loglikelihoods to logx values.

Parameters
logl: float or numpy array
n_dim: int
Returns
logx: same type and size as logl
logx_given_r(self, r)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
logz_analytic(self)[source]

If available gives an analytically calculated value for the log evidence for the likelihood and prior (useful for checking results).

This functionality is stored in the likelihood object. If it has not been set up then an error message is printed and nothing is returned.

Returns
float or None

True value of logz or None if it is not available.

r_given_logl(self, logl)[source]

Maps input loglikelihood values to radial coordinates.

Parameters
logl: float or numpy array
n_dim: int
Returns
r: same type and size as logl
r_given_logx(self, logx)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
save_name(self, include_dg=True, include_samples_max=False)[source]

Make a standard save name format for a given settings configuration.

Parameters
include_dg: bool, optional

Whether or not to include the dynamic_goal setting in save_name.

include_samples_max: bool, optional

Whether or not to include the n_samples_max setting in save_name.

Returns
save_name: str
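
For reference, a typical workflow starts by constructing a settings object and attaching likelihood and prior objects to it. The snippet below is a minimal sketch based on the attributes documented above; the particular likelihood_scale, prior_scale, n_dim and nlive_const values are arbitrary illustrations, not recommendations.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)  # see the likelihoods module below
settings.prior = priors.Gaussian(prior_scale=10)                # see the priors module below
settings.n_dim = 10
settings.nlive_const = 200
settings.dynamic_goal = None  # None -> standard nested sampling; a float in [0, 1] -> dynamic nested sampling
print(settings.save_name())   # standard save name for this configuration
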

nested_sampling

Functions which perform standard and dynamic nested sampling runs and generate samples for use in evidence calculations and parameter estimation.

Nested sampling runs are stored in a format compatible with the nestcheck package.

perfectns.nested_sampling.dict_given_samples_array(samples, thread_min_max)[source]

Converts an array of information about samples back into a dictionary.

Parameters
samples: numpy array

Numpy array containing columns [logl, r, logx, thread label, change in nlive at sample, (thetas)] with each row representing a single sample.

thread_min_max: numpy array, optional

2d array with a row for each thread containing the likelihoods at which it begins and ends. Needed to calculate nlive_array (otherwise this is set to None).

Returns
ns_run: dict

Nested sampling run dictionary corresponding to the samples array. Contains keys: ‘logl’, ‘r’, ‘logx’, ‘thread_label’, ‘nlive_array’, ‘theta’. N.B. this does not contain a record of the run’s settings.

perfectns.nested_sampling.generate_dynamic_run(settings)[source]

Generate a dynamic nested sampling run. For details of the dynamic nested sampling algorithm, see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

The run terminates when the number of samples reaches some limit settings.n_samples_max. If this is not set, the function will estimate the number of samples that a standard nested sampling run with settings.nlive_const would use from the number of samples in the initial exploratory run.

Parameters
settings: PerfectNSSettings object

settings.dynamic_goal controls whether the algorithm aims to increase parameter estimation accuracy (dynamic_goal=1), evidence accuracy (dynamic_goal=0) or places some weight on both.

Returns
dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. See docstring for generate_ns_run for more details.

perfectns.nested_sampling.generate_ns_run(settings, random_seed=None)[source]

Performs perfect nested sampling calculation and returns a nested sampling run in the form of a dictionary.

This function is a wrapper around generate_standard_run (which performs standard nested sampling) and generate_dynamic_run (which performs dynamic nested sampling); which one is used depends on the input settings.

Parameters
settings: PerfectNSSettings object
random_seed: None, bool or int, optional

Set numpy random seed. Default is to use None (so a random seed is chosen from the computer’s internal state) to ensure reliable results when multiprocessing. Can set to an integer or to False to not edit the seed.

Returns
dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. These are stored as separate arrays giving values for points in order of increasing likelihood. Keys:

‘settings’: dict recording the settings used.
‘logl’: 1d array of log likelihoods.
‘r’: 1d array of radial coordinates.
‘logx’: 1d array of logx coordinates.
‘theta’: 2d array; each row is a sample’s coordinates. The number of coordinates saved is controlled by settings.dims_to_sample.
‘nlive_array’: 1d array of the local number of live points at each sample.
‘thread_min_max’: 2d array containing the likelihoods at which each nested sampling thread begins and ends.
‘thread_labels’: 1d array listing the thread each sample belongs to.
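
As an illustration of this output format, the sketch below generates a run and inspects its arrays. The settings values are arbitrary illustrations; the comment about nestcheck compatibility assumes nestcheck.ns_run_utils.get_logw, the function referenced elsewhere on this page.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.nested_sampling as ns

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

run = ns.generate_ns_run(settings, random_seed=0)
print(sorted(run.keys()))                   # 'logl', 'logx', 'r', 'settings', 'theta', ...
print(run['logl'].shape, run['theta'].shape)
# The run dictionary is compatible with nestcheck, e.g.
# nestcheck.ns_run_utils.get_logw(run) returns the sample log weights.
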

perfectns.nested_sampling.generate_single_thread(settings, logx_end, thread_label, logx_start=0, keep_final_point=True)[source]

Generates a samples array for a thread (single live point run) between logx_start and logx_end. The settings argument specifies how the calculation is done.

Parameters
settings: PerfectNSSettings object
logx_end: float
thread_label: int

Index labelling the thread.

logx_start: float, optional
keep_final_point: bool, optional

See generate_thread_logx docstring.

perfectns.nested_sampling.generate_standard_run(settings, is_dynamic_initial_run=False)[source]

Performs standard nested sampling using the likelihood and prior specified in settings.

The run terminates when the estimated posterior mass contained in the live points is less than settings.termination_fraction times the posterior mass (evidence) contained in the dead points. The evidence in the remaining live points is estimated as

Z_{live} = average likelihood of live points * prior volume remaining

Parameters
settings: PerfectNSSettings object
is_dynamic_initial_run: bool, optional

Set to True if this is the initial exploratory run in dynamic nested sampling.

Returns
run: dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. See docstring for generate_ns_run for more details.
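
The termination criterion described above can be written schematically as follows. This is an illustrative sketch only (the helper name and arguments are hypothetical, not perfectns internals), computing Z_live in log space for numerical stability.

import numpy as np
import scipy.special

def should_terminate(logl_live, logx_remaining, logz_dead, termination_fraction):
    """Hypothetical helper: is Z_live < termination_fraction * Z_dead?"""
    # Z_live ~= (average likelihood of live points) * (prior volume remaining)
    logz_live = (scipy.special.logsumexp(logl_live) - np.log(logl_live.size)
                 + logx_remaining)
    return logz_live < np.log(termination_fraction) + logz_dead
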

perfectns.nested_sampling.generate_thread_logx(logx_end, logx_start=0, keep_final_point=True)[source]

Generate logx co-ordinates of a new nested sampling thread (single live point run).

Parameters
logx_end: float

Logx value at which run terminates.

logx_start: float, optional.

Logx value at which run starts. 0 corresponds to sampling from the whole prior.

keep_final_point: bool, optional

If False, the final point with logx less than logx_end is removed.

Returns
logx_list: list of floats
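
The logx shrinkage for a single live point can be simulated exactly: at each step the remaining prior volume shrinks by a factor t ~ Uniform(0, 1), so logx decreases by independent Exponential(1) increments. The sketch below illustrates this behaviour; it is not the packaged implementation.

import numpy as np

def thread_logx_sketch(logx_end, logx_start=0, keep_final_point=True, rng=None):
    """Illustrative single-live-point shrinkage, mirroring generate_thread_logx."""
    rng = np.random.default_rng() if rng is None else rng
    logx_list = []
    logx = logx_start
    while logx > logx_end:
        logx -= rng.exponential()   # equivalent to adding log(t) with t ~ Uniform(0, 1)
        logx_list.append(logx)
    if not keep_final_point and logx_list:
        logx_list.pop()             # drop the point which overshot logx_end
    return logx_list
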
perfectns.nested_sampling.get_run_data(settings, n_repeat, **kwargs)[source]

Tests if runs with the specified settings are already cached. If not the runs are generated and saved.

Parameters
settings: PerfectNSSettings object
n_repeat: int

Number of nested sampling runs to generate.

parallel: bool, optional

Should runs be generated in parallel?

max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

load: bool, optional

Should previously saved runs be loaded? If False, new runs are generated.

save: bool, optional

Should any new runs generated be saved?

cache_dir: str, optional

Directory for caching

overwrite_existing: bool, optional

if a file exists already but we generate new run data, should we overwrite the existing file when saved?

check_loaded_settings: bool, optional

if we load a cached file, should we check if the loaded file’s settings match the current settings (and generate fresh runs if they do not)?

random_seeds: list, optional

random_seed arguments for each call of generate_ns_run.

Returns
run_list

list of n_repeat nested sampling runs.
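
A usage sketch based on the parameters documented above; the settings configuration and the cache directory name are arbitrary illustrations.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.nested_sampling as ns

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

run_list = ns.get_run_data(settings, 10, parallel=True, save=True, load=True,
                           cache_dir='cache')  # illustrative cache directory
print(len(run_list))  # 10 nested sampling run dictionaries
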

perfectns.nested_sampling.min_max_importance(importance, samples, settings)[source]

Find the logl and logx values at which to start and end additional dynamic nested sampling threads.

Parameters
importance: 1d numpy array

Relative importances of samples.

samples: 2d numpy array

See dict_given_samples_array docstring for details of columns.

settings: PerfectNSSettings object
Returns
list of two floats

Contains logl_min and logl_max defining the start and end of the region from which new points should be sampled.

list of two floats

Logx values corresponding to logl_min and logl_max.

perfectns.nested_sampling.p_importance(theta, w_relative, tuned_dynamic_p=False, tuning_type='theta1')[source]

Calculate the relative importance of each point for parameter estimation.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
theta: 2d numpy array

Each row gives parameter values of samples.

w_relative: 1d numpy array

Relative point weights.

tuned_dynamic_p: bool, optional

Whether or not to tune for a specific parameter estimation problem. See the dynamic nested sampling paper for more details.

tuning_type: str, optional

Which parameter estimation problem to tune for. Only used if tuned_dynamic_p is True. So far only set up to tune for the mean of the first parameter.

Returns
importance: 1d numpy array

Relative point importances. Normalised so the biggest value in the array is equal to 1.

perfectns.nested_sampling.point_importance(samples, thread_min_max, settings, simulate=False)[source]

Calculate the relative importance of each point for use in the dynamic nested sampling algorithm.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
samples: 2d numpy array

See dict_given_samples_array docstring for details of columns.

thread_min_max: 2d numpy array

First column is starting logl of each thread and second column is ending logl.

settings: PerfectNSSettings object
simulate: bool, optional

Passed to nestcheck.ns_run_utils.get_logw.

Returns
importance: 1d numpy array

Relative point importances of the rows of the input samples array. Normalised so the biggest value in the array is equal to 1.

perfectns.nested_sampling.samples_array_given_run(ns_run)[source]

Converts information on samples in a nested sampling run dictionary into a numpy array representation. This allows fast addition of more samples and recalculation of nlive.

Parameters
ns_run: dict

Nested sampling run dictionary. Contains keys: ‘logl’, ‘r’, ‘logx’, ‘thread_label’, ‘nlive_array’, ‘theta’

Returns
samples: numpy array

Numpy array containing columns [logl, r, logx, thread label, change in nlive at sample, (thetas)] with each row representing a single sample.

perfectns.nested_sampling.z_importance(w_relative, nlive)[source]

Calculate the relative importance of each point for evidence calculation.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
w_relative: 1d numpy array

Relative point weights.

nlive: 1d numpy array

Number of live points.

Returns
importance: 1d numpy array

Relative point importances. Normalised so the biggest value in the array is equal to 1.

likelihoods

Classes representing spherically symmetric likelihoods.

Each likelihood class must contain a member function giving the log likelihood as a function of the radial coordinate r = |theta| and the number of dimensions:

def logl_given_r(self, r, n_dim):

The number of dimensions is not stored in this class but in the PerfectNSSettings object to ensure it is the same for the likelihood and the prior.

Likelihood classes may also optionally contain the inverse function

def r_given_logl(self, logl, n_dim):

(although this is not needed to generate nested sampling runs). Another optional function is the analytic value of the log evidence for some given prior and dimension, which is useful for checking results:

def logz_analytic(self, prior, n_dim):
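
As an illustration of this interface, the sketch below defines a hypothetical spherically symmetric likelihood (not part of perfectns) with the mandatory logl_given_r and the optional inverse r_given_logl. The normalisation constant is omitted since it only shifts logl by a constant.

import numpy as np

class Laplacian:
    """Hypothetical spherically symmetric likelihood with logl = -|r| / scale."""

    def __init__(self, likelihood_scale=1.0):
        self.likelihood_scale = likelihood_scale

    def logl_given_r(self, r, n_dim):
        # Mandatory: log likelihood as a function of the radial coordinate.
        return -np.abs(r) / self.likelihood_scale

    def r_given_logl(self, logl, n_dim):
        # Optional inverse mapping.
        return -np.asarray(logl) * self.likelihood_scale
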

class perfectns.likelihoods.Cauchy(likelihood_scale=1)[source]

Spherically symmetric Cauchy likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

class perfectns.likelihoods.ExpPower(likelihood_scale=1, power=2)[source]

Spherically symmetric exponential power likelihood. When power=1, this is the same as the Gaussian likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

class perfectns.likelihoods.Gaussian(likelihood_scale=1.0)[source]

Spherically symmetric Gaussian likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

logz_analytic(self, prior, n_dim)

Returns analytic value of the log evidence for the input prior and dimension if it is available.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

logz_analytic(self, prior, n_dim)[source]

Returns analytic value of the log evidence for the input prior and dimension if it is available.

If not set up for this prior an AssertionError is thrown (this is caught in functions which check analytical values where they are available).

Parameters
prior: object
n_dim: int

Number of dimensions.

Returns
float

Analytic value of log Z for this likelihood given the prior and number of dimensions.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

priors

Classes representing spherically symmetric priors.

Each prior class must contain a member function giving the radial coordinate r = |theta| as a function of the log fraction of the prior volume remaining and the dimension:

def r_given_logx(self, logx, n_dim):

The number of dimensions is not stored in this class but in the PerfectNSSettings object to ensure it is the same for the likelihood and the prior.

Prior classes may also optionally contain the inverse function

def logx_given_r(self, r, n_dim):

(although this is not needed to generate nested sampling runs).
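
For illustration, the sketch below re-derives a uniform n-sphere prior satisfying this interface (perfectns already provides an equivalent Uniform class): the fraction of prior volume within radius r is X = (r / prior_scale)**n_dim, which can be inverted analytically.

import numpy as np

class UniformSketch:
    """Illustrative uniform prior within an n-sphere of radius prior_scale."""

    def __init__(self, prior_scale):
        self.prior_scale = prior_scale

    def r_given_logx(self, logx, n_dim):
        # Mandatory: invert X = (r / prior_scale) ** n_dim.
        return self.prior_scale * np.exp(np.asarray(logx) / n_dim)

    def logx_given_r(self, r, n_dim):
        # Optional inverse mapping.
        return n_dim * np.log(np.asarray(r) / self.prior_scale)
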

class perfectns.priors.Gaussian(prior_scale)[source]

Spherically symmetric Gaussian prior.

Methods

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
class perfectns.priors.GaussianCached(prior_scale, **kwargs)[source]

Spherically symmetric Gaussian prior. The scipy inverse gamma function is not numerically stable, so r_given_logx is cached using the mpmath logx_given_r and linear interpolation.

Methods

check_cache(self, n_dim)

Helper function which checks that the input dimension matches that of the cached interpolation function, and if needed recalculates it.

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

check_cache(self, n_dim)[source]

Helper function which checks that the input dimension matches that of the cached interpolation function, and if needed recalculates it.

Parameters
n_dim: int

Number of dimensions

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
class perfectns.priors.Uniform(prior_scale)[source]

Spherically symmetric uniform prior.

Methods

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx

estimators

Contains classes representing quantities which can be calculated from a nested sampling run.

Each estimator class should contain a mandatory member function returning the value of the estimator for a nested sampling run:

def __call__(self, ns_run, logw=None, simulate=False):

This allows logw to be provided if many estimators are being calculated from the same run, so that logw is only calculated once. Otherwise logw is calculated from the run if required.

They may also optionally contain a function giving its analytical value for some given set of calculation settings (for use in checking results):

def analytical(self, settings):

as well as helper functions. Estimators should also contain class variables:

name: str

used for results tables.

latex_name: str

used for plotting results diagrams.
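
The sketch below shows a hypothetical estimator following this interface (the class and the quantity it estimates are illustrative, not part of perfectns). It assumes nestcheck.ns_run_utils.get_logw for computing sample log weights, the nestcheck function referenced elsewhere on this page.

import numpy as np
import nestcheck.ns_run_utils

class SecondParamMean:
    """Hypothetical estimator: posterior mean of the second parameter."""

    name = 'theta2_mean'                   # used for results tables
    latex_name = r'$\overline{\theta_2}$'  # used for plotting results diagrams

    def __call__(self, ns_run, logw=None, simulate=False):
        if logw is None:
            logw = nestcheck.ns_run_utils.get_logw(ns_run, simulate=simulate)
        w_rel = np.exp(logw - logw.max())  # relative posterior weights
        return np.sum(w_rel * ns_run['theta'][:, 1]) / np.sum(w_rel)
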

class perfectns.estimators.CountSamples[source]

Number of samples in run.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

class perfectns.estimators.EstimatorBase(func, **kwargs)[source]

Base class for estimators.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

class perfectns.estimators.LogZ[source]

Natural log of Bayesian evidence.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

class perfectns.estimators.ParamCred(probability, param_ind=0)[source]

One-tailed credible interval on the value of a single parameter (component of theta). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(self, settings)

Returns analytical value of estimator given settings.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

class perfectns.estimators.ParamMean(param_ind=0)[source]

Mean of a single parameter (single component of theta). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.ParamSquaredMean(param_ind=0)[source]

Mean of the square of a single parameter (the second moment of its posterior distribution). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(self, settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.RCred(probability, from_theta=False)[source]

One-tailed credible interval on the value of |theta|.

Methods

__call__(self, ns_run[, logw, simulate])

Returns estimator value for run.

class perfectns.estimators.RMean(from_theta=False)[source]

Mean of |theta| (the radial distance from the centre).

Methods

__call__(self, ns_run[, logw, simulate])

Overrides __call__ from nestcheck’s r_mean to allow use of perfectns’ ‘r’ key if from_theta=False, and to check that all the dimensions have been sampled if from_theta=True.

analytical(self, settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.Z[source]

Bayesian evidence.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

perfectns.estimators.check_by_integrating(ftilde, settings)[source]

Return the true value of the estimator using numerical integration.

Chopin and Robert (2010) show that the expectation of some function f(theta) is given by the integral

int L(X) X ftilde(X) dX / Z,

where ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

Parameters
ftilde: function
settings: PerfectNSSettings object
Returns
float

The estimator’s true value.

perfectns.estimators.check_integrand(logx, ftilde, settings)[source]

Helper function to return integrand L(X) X ftilde(X) for checking estimator values by numerical integration. Note that the integral must be normalised by multiplying by a factor of (1/Z).

Parameters
logx: 1d numpy array

Values on which to evaluate integrand.

ftilde: function

See check_by_integrating docstring for more details.

settings: PerfectNSSettings object
Returns
1d numpy array

Integrand evaluated at input logx coordinates.

perfectns.estimators.get_true_estimator_values(estimators, settings)[source]

Calculates analytic values for estimators given the likelihood and prior in settings. If there is no method for calculating the values then np.nan is returned.

Parameters
estimators: estimator object or list of estimator objects
settings: PerfectNSSettings object
Returns
output: np.array of size (len(estimators),) if estimators is a list, otherwise a float.
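
A usage sketch comparing analytic values with values estimated from a single run. The settings values are arbitrary illustrations, and the final step assumes nestcheck's run_estimators utility (nestcheck.ns_run_utils.run_estimators) for applying a list of estimators to a run.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.estimators as e
import perfectns.nested_sampling as ns
import nestcheck.ns_run_utils

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

estimator_list = [e.LogZ(), e.ParamMean(), e.RMean()]
true_values = e.get_true_estimator_values(estimator_list, settings)  # np.nan where unavailable
run = ns.generate_ns_run(settings, random_seed=0)
estimates = nestcheck.ns_run_utils.run_estimators(run, estimator_list)
print(true_values, estimates)
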

plots

Plotting functions.

perfectns.plots.cdf_given_logx(estimator, value, logx, settings)[source]

Calculate CDF values, where each column represents the CDF of the distribution of ftheta values on some iso-likelihood contour L = L(X).

Parameters
estimator: estimator object

Function whose values we are getting the CDF of.

value: numpy array

Function values at which to evaluate the CDF.

logx: numpy array of same size and shape as value.

Logx values specifying contours - we calculate the CDF on each contour.

settings: PerfectNSSettings object
Returns
cdf: numpy array of same size and shape as value and logx
perfectns.plots.plot_dynamic_nlive(dynamic_goals, settings_in, **kwargs)[source]

Plot the allocations of live points as a function of logX for different dynamic_goal settings. Plots also include analytically calculated distributions of relative posterior mass and relative posterior mass remaining.

Parameters
dynamic_goals: list of ints or None

dynamic_goal setting values to plot.

settings_in: PerfectNSSettings object
tuned_dynamic_ps: list of bools, optional

tuned_dynamic_p settings corresponding to each dynamic goal. Defaults to False for all dynamic goals.

logx_min: float, optional

Lower limit of logx axis. If not specified this is set to the lowest logx reached by any of the runs.

load: bool, optional

Should the nested sampling runs be loaded from cache if available?

save: bool, optional

Should the nested sampling runs be cached?

ymax: int or float, optional

Maximum value for plot’s nlive axis (yaxis).

n_run: int, optional

How many runs to plot for each dynamic goal.

npoints: int, optional

How many points to have in the logx array used to calculate and plot analytical weights.

figsize: tuple, optional

Size of figure in inches.

Returns
fig: matplotlib figure
perfectns.plots.plot_parameter_logx_diagram(settings, ftheta, **kwargs)[source]

Plots parameter estimation diagram of the type described in Section 3.1 and shown in Figure 3 of “Sampling errors in nested sampling parameter estimation” (Higson et al., 2018).

Parameters
settings: PerfectNSSettings object
ftheta: estimator object

function of parameters to plot

ymin: float, optional

y axis (ftheta) min.

ymax: float, optional

y axis (ftheta) max.

ylabel: str, optional

y axis (ftheta) label.

logx_min: float, optional

Lower limit of logx axis.

x_points: int, optional

How many logx points to use in the plots.

y_points: int, optional
figsize: tuple, optional

Size of figure in inches.

Returns
fig: matplotlib figure
perfectns.plots.plot_rel_posterior_mass(likelihood_list, prior, dim_list, logx, **kwargs)[source]

Plot analytic distributions of the relative posterior mass for different likelihoods and dimensionalities as a function of logx (this is proportional to L(X)X, where L(X) is the likelihood).

Parameters
logx: 1d numpy array

Logx values at which to calculate posterior mass.

likelihood_list: list of perfectns likelihood objects

Likelihoods to plot.

prior: perfectns prior object
dim_list: list of ints

Dimensions to plot for each likelihood.

figsize: tuple, optional

Size of figure in inches.

linestyles: list, optional

List of different linestyles to use for each likelihood.

Returns
fig: matplotlib figure

Figure showing relative posterior masses of input likelihoods
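
A usage sketch based on the signature above; the particular likelihoods, prior scale, dimensions, logx range and output path are arbitrary illustrations.

import numpy as np
import perfectns.plots
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors

logx = np.linspace(-40, 0, 400)
fig = perfectns.plots.plot_rel_posterior_mass(
    [likelihoods.Gaussian(1), likelihoods.ExpPower(1, 2)],  # likelihoods to plot
    priors.Gaussian(10),                                    # prior
    [2, 10],                                                # dimensions for each likelihood
    logx)
fig.savefig('rel_posterior_mass.pdf')  # illustrative output path
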

perfectns.plots.posterior_cdf(estimator, values, logx, settings)[source]

Calculates the 1d posterior cumulative distribution function (CDF) for some estimator given the likelihood and prior settings.

Parameters
estimator: estimator object

Function whose values we are getting the CDF of.

values: 1d numpy array

Function values at which to evaluate the CDF.

logx: 1d numpy array

Logx values over which to numerically marginalise the probability distribution.

settings: PerfectNSSettings object
Returns
cdf: numpy array of same size and shape as values
perfectns.plots.sigma_given_cdf(cdf)[source]

Maps cdf values in [0, 1] to the number of standard deviations from the median. scipy.special.erfinv is defined on [-1, +1], mapping to [-inf, +inf]. We want to map a CDF to the number of sigma from the median, i.e. from [0, +1] to [0, +inf], so we use the argument (2 * cdf - 1) and take its absolute value to get a positive answer. erf is defined in terms of int e^(-t^2), which corresponds to a Gaussian with sigma = 1/sqrt(2), so the result must be multiplied by sqrt(2).

Parameters
cdf: float or numpy array
Returns
float or numpy array

Same type and shape as input cdf.
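
The mapping described above amounts to sigma = sqrt(2) * |erfinv(2 * cdf - 1)|; a minimal illustrative re-implementation (not the packaged one):

import numpy as np
import scipy.special

def sigma_given_cdf_sketch(cdf):
    # Number of standard deviations from the median for a CDF value in [0, 1].
    return np.sqrt(2) * np.abs(scipy.special.erfinv(2 * np.asarray(cdf) - 1))

print(sigma_given_cdf_sketch(0.5))       # 0.0  (the median)
print(sigma_given_cdf_sketch(0.841345))  # ~1.0 (one sigma above the median)
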

results_tables

Functions used to generate results tables.

Used for results in ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

perfectns.results_tables.get_bootstrap_results(n_run, n_simulate, estimator_list, settings, **kwargs)[source]

Generate data frame showing the standard deviations of the results of repeated calculations and estimated sampling errors from bootstrap resampling.

This function was used for Table 5 in ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019). See the paper for more details.

Parameters
n_run: int

how many runs to use

n_simulate: int

how many times to resample the nested sampling run in each bootstrap standard deviation estimate.

estimator_list: list of estimator objects
settings: PerfectNSSettings object
load: bool, optional

should run data and results be loaded if available?

save: bool, optional

should run data and results be saved?

parallel: bool, optional
cache_dir: str, optional

Directory to use for caching.

add_sim_method: bool, optional

should we also calculate standard deviations using the simulated weights method for comparison with bootstrap resampling? This method is inaccurate for parameter estimation.

n_simulate_ci: int, optional

how many times to resample the nested sampling run in each bootstrap credible interval estimate. These may require more simulations than the standard deviation estimate.

run_random_seeds: list, optional

list of random seeds to use for generating runs.

n_run_ci: int, optional

how many runs to use for each credible interval estimate. You may want to set this to lower than n_run if n_simulate_ci is large as otherwise the credible interval estimate may take a long time.

cred_int: float, optional

one-tailed credible interval to calculate

max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

Returns
results: pandas data frame

results data frame. Contains two columns for each estimator - the second column (with ‘_unc’ appended to the title) shows the numerical uncertainty in the first column. Contains rows:

true values: analytical values of estimators for this likelihood and posterior, if available.
repeats mean: mean calculation result.
repeats std: standard deviation of calculation results.
bs std / repeats std: mean bootstrap standard deviation estimate as a fraction of the standard deviation of repeated results.
bs estimate % variation: standard deviation of bootstrap estimates as a percentage of the mean estimate.
[only if add_sim_method is True]:
sim std / repeats std: as for ‘bs std / repeats std’ but with simulation method standard deviation estimates.
sim estimate % variation: as for ‘bs estimate % variation’ but with simulation method standard deviation estimates.
bs [cred_int] CI: mean bootstrap credible interval estimate.
bs +-1std % coverage: % of calculation results falling within +-1 mean bootstrap standard deviation estimate of the mean.
bs [cred_int] CI % coverage: % of calculation results which are less than the mean bootstrap credible interval estimate.

perfectns.results_tables.get_dynamic_results(n_run, dynamic_goals_in, estimator_list_in, settings_in, **kwargs)[source]

Generate data frame showing the standard deviations of the results of repeated calculations and efficiency gains (ratios of variances of results calculations) from different dynamic goals. To make the comparison fair, for dynamic nested sampling settings.n_samples_max is set to slightly below the mean number of samples used by standard nested sampling.

This function was used for Tables 1, 2, 3 and 4, as well as to generate the results shown in figures 6 and 7 of ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019). See the paper for a more detailed description.

Parameters
n_run: int

how many runs to use

dynamic_goals_in: list of floats

which dynamic goals to test

estimator_list_in: list of estimator objects
settings_in: PerfectNSSettings object
load: bool, optional

should run data and results be loaded if available?

save: bool, optional

should run data and results be saved?

overwrite_existing: bool, optional

if a file exists already but we generate new run data, should we overwrite the existing file when saved?

run_random_seeds: list, optional

list of random seeds to use for generating runs.

parallel: bool, optional
cache_dir: str, optional

Directory to use for caching.

tuned_dynamic_ps: list of bools, same length as dynamic_goals_in, optional
max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

Returns
results: pandas data frame

results data frame. Contains rows:

mean [dynamic goal]: mean calculation result for standard nested sampling and dynamic nested sampling with each input dynamic goal.
std [dynamic goal]: standard deviation of results for standard nested sampling and dynamic nested sampling with each input dynamic goal.
gain [dynamic goal]: the efficiency gain (computational speedup) from dynamic nested sampling compared to standard nested sampling. This equals (variance of standard results) / (variance of dynamic results); see the dynamic nested sampling paper for more details.
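
A usage sketch based on the parameters documented above; the settings configuration, dynamic goals and number of runs are arbitrary illustrations, and the caching keywords are optional.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.estimators as e
import perfectns.results_tables as rt

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 10

results = rt.get_dynamic_results(
    100,                        # n_run
    [0, 1],                     # dynamic goals: evidence- and parameter-focused
    [e.LogZ(), e.ParamMean()],  # estimators to compare
    settings,
    load=True, save=True)
print(results)
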

perfectns.results_tables.merged_dynamic_results(dim_scale_list, likelihood_list, settings, estimator_list, **kwargs)[source]

Wrapper for running get_dynamic_results for many different likelihoods, dimensions and prior scales, and merging the output into a single data frame. See the get_dynamic_results documentation for more details.

Parameters
dim_scale_list: list of tuples

(dim, prior_scale) pairs to run

likelihood_list: list of likelihood objects
settings: PerfectNSSettings object
estimator_list: list of estimator objects
n_run: int, optional

number of runs for use with each setting.

dynamic_goals_in: list of floats, optional

which dynamic goals to test

(remaining kwargs passed to get_dynamic_results)
Returns
results: pandas data frame

maths_functions

Mathematical functions.

perfectns.maths_functions.analytic_logx_terminate(settings)[source]

Find logx_terminate analytically by assuming that the likelihood at very low X approximately equals the maximum likelihood. This approximation breaks down in very high dimensions.

Parameters
settings: PerfectNSSettings object
perfectns.maths_functions.gaussian_logx_given_r(r, sigma, n_dim)[source]

Returns logx coordinates corresponding to r values for a Gaussian prior with the specified standard deviation and dimension.

Uses the mpmath package for arbitrary precision.

Parameters
r: float or numpy array

Radial coordinates at which to evaluate logx.

sigma: float

Standard deviation of Gaussian.

n_dim: int

Number of dimensions.

Returns
logx: float or numpy array

Logx coordinates corresponding to input radial coordinates.

perfectns.maths_functions.gaussian_r_given_logx(logx, sigma, n_dim)[source]

Returns r coordinates corresponding to logx values for a Gaussian prior with the specified standard deviation and dimension.

This uses scipy.special.gammaincinv and requires exponentiating logx, so numerical errors occur with very low logx values.

Parameters
logx: float or numpy array

Logx coordinates at which to work out r.

sigma: float

Standard deviation of Gaussian.

n_dim: int

Number of dimensions.

Returns
float or numpy array

Radial coordinates corresponding to input log X values.
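
The two functions above are inverse mappings, so (within numerical precision and away from very low logx) they should round-trip; a quick sketch with arbitrary values:

import numpy as np
import perfectns.maths_functions as mf

r = np.array([0.5, 1.0, 2.0])
logx = mf.gaussian_logx_given_r(r, 10.0, 5)      # sigma=10, n_dim=5
r_back = mf.gaussian_r_given_logx(logx, 10.0, 5)
assert np.allclose(r, r_back)
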

perfectns.maths_functions.log_cauchy_given_r(r, sigma, n_dim)[source]

Returns the natural log of a normalised, uncorrelated Cauchy distribution with 1 degree of freedom.

Parameters
r: float or numpy array
sigma: float
n_dim: int
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.log_exp_power_given_r(r, sigma, n_dim, b=0.5)[source]

Returns the natural log of an exponential power distribution. This equals a Gaussian distribution when b=1.

Parameters
r: float or numpy array
sigma: float
n_dim: int
b: float, optional
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.log_gaussian_given_r(r, sigma, n_dim)[source]

Returns the natural log of a normalised, uncorrelated Gaussian likelihood with equal variance in all n_dim dimensions.

Parameters
r: float or numpy array
sigma: float
n_dim: int
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.logx_terminate_bound(logl_max, termination_fraction, logz_analytic)[source]

Find a lower bound logx_terminate analytically by assuming that the likelihood at very low X approximately equals the maximum likelihood. This approximation breaks down in very high dimensions, and the true logx_terminate required will be larger.

We want

Z_term = termination_fraction * Z_analytic = int_0^X_term L(X) dX approx= X_term * L_max,

so

logx_term = log(termination_fraction) + log(Z_analytic) - logl_max.

Parameters
logl_max: float

Maximum loglikelihood.

termination_fraction: float

Fraction of posterior mass remaining at termination.

logz_analytic: float

True value of log evidence.

Returns
float
perfectns.maths_functions.nsphere_logvol(dim, radius=1.0)[source]

Returns the natural log of the hypervolume of an n-sphere of the specified dimension and radius (default radius=1.0). Useful for very high dimensions. The formula is from https://en.wikipedia.org/wiki/N-sphere#Volume_and_surface_area

Parameters
dim: int
radius: float, optional
Returns
float
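
The underlying formula is log V_n(R) = (n/2) log(pi) - log Gamma(n/2 + 1) + n log(R); an illustrative stand-alone implementation (not the packaged one), kept stable in high dimensions via gammaln:

import numpy as np
from scipy.special import gammaln

def nsphere_logvol_sketch(dim, radius=1.0):
    # Log hypervolume of an n-ball of the given radius.
    return (dim / 2.0) * np.log(np.pi) - gammaln(dim / 2.0 + 1) + dim * np.log(radius)

print(nsphere_logvol_sketch(3))          # log(4/3 * pi) ~= 1.4324
print(nsphere_logvol_sketch(1000, 2.0))  # remains finite in high dimensions
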
perfectns.maths_functions.nsphere_logx_given_r(r, r_max, n_dim)[source]

Finds logx assuming the prior is an n-dimensional sphere co-centred with a spherically symmetric likelihood. This will return an answer of the same type (float or numpy array) as the input {r}.

Parameters
r: float or numpy array
r_max: float
n_dim: int
Returns
logx: float or numpy array.

Logx coordinates corresponding to input radial coordinates.

perfectns.maths_functions.nsphere_r_given_logx(logx, r_max, n_dim)[source]

Finds r coordinates given input logx values for a uniform prior within an n-dimensional sphere co-centred with a spherically symmetric likelihood. This will return an answer of the same type (float or numpy array) as the input {logx}.

Parameters
logx: float or numpy array
r_max: float

Boundary of the spherically symmetric prior.

n_dim: int

Number of dimensions

Returns
r: float or numpy array

Radial coordinates corresponding to logx.

perfectns.maths_functions.r_given_log_cauchy(logl, sigma, n_dim)[source]

Returns the radius for a given logl of a normalised, uncorrelated Cauchy distribution with 1 degree of freedom.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.r_given_log_exp_power(logl, sigma, n_dim, b=0.5)[source]

Returns the radius for a given logl of an exponential power distribution. This equals a Gaussian distribution when b=1.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
b: float, optional
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.r_given_log_gaussian(logl, sigma, n_dim)[source]

Returns the radius of a given logl for a normalised, uncorrelated Gaussian with equal variance in all n_dim dimensions.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.sample_nsphere_shells(r, n_dim, n_sample)[source]

Wrapper calling either sample_nsphere_shells_normal or sample_nsphere_shells_beta depending on the dimension and number of samples required.

See the docstrings of sample_nsphere_shells_normal and sample_nsphere_shells_beta for more information.

perfectns.maths_functions.sample_nsphere_shells_beta(r, n_dim, n_sample=None)[source]

Given some 1d numpy array of radial coordinates r, return a numpy array of sample coordinates on the hyperspherical shells with each radial coordinate.

Sample single parameters on an n_dim-dimensional sphere independently, as described in Section 3.2 of ‘Sampling Errors in nested sampling parameter estimation’ (Higson et al., 2018).

NB if each parameter is sampled independently and combined into a vector, that vector will not have a magnitude r.

The n_sample argument can be used to set the number of parameters for which samples are returned (by symmetry they all have the same distribution). This is useful for saving memory in high dimensions.

This function is much quicker than sample_nsphere_shells_normal when n_dim is high and n_sample is low.

Parameters
r: 1d numpy array
n_dim: int
n_sample: int or None, optional

Number of parameters to include in output for each sample.

Returns
thetas: 2d numpy array

Each row is a random sample from the shells defined by the corresponding input r coordinate.

perfectns.maths_functions.sample_nsphere_shells_normal(r, n_dim, n_sample=None)[source]

Given some 1d numpy array of radial coordinates r, return a numpy array of sample coordinates on the hyperspherical shells with each radial coordinate.

This works by using the symmetry of the normal distribution to sample isotropically, then normalising each set of samples to lie on a spherical shell of the correct radius.

The n_sample argument can be used to set the number of parameters for which samples are returned (by symmetry they all have the same distribution). This is useful for saving memory in high dimensions.

Parameters
r: 1d numpy array
n_dim: int
n_sample: int or None, optional

Number of parameters to include in output for each sample.

Returns
thetas: 2d numpy array

Each row is a random sample from the shells defined by the corresponding input r coordinate.
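
An illustrative sketch of the normal-deviate approach described above (not the packaged implementation): draw isotropic directions from a standard normal, then rescale each sample so its radius matches the corresponding entry of r.

import numpy as np

def sample_shells_normal_sketch(r, n_dim, rng=None):
    """Sample points isotropically on hyperspherical shells of radii r."""
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(r)
    thetas = rng.standard_normal((r.size, n_dim))             # isotropic directions
    thetas *= (r / np.linalg.norm(thetas, axis=1))[:, None]   # rescale to radius r
    return thetas
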

cached_gaussian_prior

Contains helper functions for the ‘gaussian_cached’ prior.

This is needed as the ‘gaussian’ prior suffers from overflow errors and numerical instability for very low values of X, which are reached in high dimensional problems.

perfectns.cached_gaussian_prior.interp_r_logx_dict(n_dim, prior_scale, **kwargs)[source]

Generate a dictionary containing arrays of logx and r values for use in interpolation, as well as a record of the settings used.

Parameters
n_dim: int

Number of dimensions.

prior_scale: float

Gaussian prior’s standard deviation.

logx_min: float

Values for interpolation are generated between logx_min and logx_max.

save_dict: bool, optional

Whether or not to cache the output dictionary.

cache_dir: str, optional

Directory in which to cache output if save_dict is True.

interp_density: float

Number of points to include in interpolation arrays per unit of logx (the number of points is cast to int so this can be a fraction).

logx_max: float, optional

Values for interpolation are generated between logx_min and logx_max.

Returns
interp_dict: dict