Detailed API documentation

This page documents the different modules and functions in the perfectns package.

settings

Defines the base class which holds settings for performing perfect nested sampling. This includes likelihood and prior objects, as well as the parameters controlling how the calculation is performed: for example, the number of live points and whether or not dynamic nested sampling is used.

class perfectns.settings.PerfectNSSettings(**kwargs)[source]

Controls how Perfect Nested Sampling is performed.

For more details of the standard nested sampling and dynamic nested sampling algorithms see:

1) ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al. 2019).
2) ‘Sampling errors in nested sampling parameter estimation’ (Higson et al. 2018).

Parameters
n_dim: int

Dimension of the likelihood and prior.

prior: object

A prior object. See priors.py for more details.

likelihood: object

A likelihood object. See likelihoods.py for more details.

nlive_const: int

The number of live points for standard nested sampling.

termination_fraction: float

Standard nested sampling runs will terminate when the posterior mass remaining (estimated from the live points) is less than termination_fraction times the posterior mass (evidence) in the dead points.

dynamic_goal: float or None

Determines whether dynamic nested sampling is used and, if so, how improvements in evidence and parameter estimation accuracy are weighted.

n_samples_max: int or None

Number of samples after which dynamic nested sampling will terminate. If this is None the number of samples to use is chosen using nlive_const.

ninit: int

Number of live points used in dynamic nested sampling for the initial exploratory standard nested sampling run.

nbatch: int

Number of threads dynamic nested sampling adds to the calculation before each recalculation of points’ importances.

dynamic_fraction: float

Dynamic nested sampling adds more samples wherever points’ importances are greater than dynamic_fraction times the largest importance.

tuned_dynamic_p: bool

Determines if dynamic nested sampling is tuned for a specific parameter estimation problem.

Methods

get_settings_dict(self)

Returns a dictionary containing settings information.

logl_given_logx(self, logx)

Maps input logx values to loglikelihoods.

logl_given_r(self, r)

Maps input radial coordinates to loglikelihood values.

logx_given_logl(self, logl)

Maps input loglikelihoods to logx values.

logx_given_r(self, r)

Maps input radial coordinates to logx values.

logz_analytic(self)

If available gives an analytically calculated value for the log evidence for the likelihood and prior (useful for checking results).

r_given_logl(self, logl)

Maps input loglikelihood values to radial coordinates.

r_given_logx(self, logx)

Maps input logx values to radial coordinates.

save_name(self[, include_dg, …])

Make a standard save name format for a given settings configuration.

get_settings_dict(self)[source]

Returns a dictionary containing settings information. The names and parameters of the likelihoods and priors are stored instead of the objects themselves so they can be saved with pickle.

Returns
settings_dict: dict
logl_given_logx(self, logx)[source]

Maps input logx values to loglikelihoods.

Parameters
logx: float or numpy array
n_dim: int
Returns
logl: same type and size as logx
logl_given_r(self, r)[source]

Maps input radial coordinates to loglikelihood values.

Parameters
r: float or numpy array
n_dim: int
Returns
logl: same type and size as r
logx_given_logl(self, logl)[source]

Maps input loglikelihoods to logx values.

Parameters
logl: float or numpy array
n_dim: int
Returns
logx: same type and size as logl
logx_given_r(self, r)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
logz_analytic(self)[source]

If available gives an analytically calculated value for the log evidence for the likelihood and prior (useful for checking results).

This functionality is stored in the likelihood object. If it has not been set up then an error message is printed and nothing is returned.

Returns
float or None

True value of logz or None if it is not available.

r_given_logl(self, logl)[source]

Maps input loglikelihood values to radial coordinates.

Parameters
logl: float or numpy array
n_dim: int
Returns
r: same type and size as logl
r_given_logx(self, logx)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
save_name(self, include_dg=True, include_samples_max=False)[source]

Make a standard save name format for a given settings configuration.

Parameters
include_dg: bool, optional

Whether or not to include the dynamic_goal setting in save_name.

include_samples_max: bool, optional

Whether or not to include the n_samples_max setting in save_name.

Returns
save_name: str
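
For reference, a typical workflow starts by constructing a settings object and attaching likelihood and prior objects to it. The snippet below is a minimal sketch based on the attributes documented above; the particular likelihood_scale, prior_scale, n_dim and nlive_const values are arbitrary illustrations, not recommendations.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)  # see the likelihoods module below
settings.prior = priors.Gaussian(prior_scale=10)                # see the priors module below
settings.n_dim = 10
settings.nlive_const = 200
settings.dynamic_goal = None  # None -> standard nested sampling; a float in [0, 1] -> dynamic nested sampling
print(settings.save_name())   # standard save name for this configuration
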

nested_sampling

Functions which perform standard and dynamic nested sampling runs and generate samples for use in evidence calculations and parameter estimation.

Nested sampling runs are stored in a format compatible with the nestcheck package.

perfectns.nested_sampling.dict_given_samples_array(samples, thread_min_max)[source]

Converts an array of information about samples back into a dictionary.

Parameters
samples: numpy array

Numpy array containing columns [logl, r, logx, thread label, change in nlive at sample, (thetas)] with each row representing a single sample.

thread_min_max: numpy array, optional

2d array with a row for each thread containing the likelihoods at which it begins and ends. Needed to calculate nlive_array (otherwise this is set to None).

Returns
ns_run: dict

Nested sampling run dictionary corresponding to the samples array. Contains keys: ‘logl’, ‘r’, ‘logx’, ‘thread_label’, ‘nlive_array’, ‘theta’. N.B. this does not contain a record of the run’s settings.

perfectns.nested_sampling.generate_dynamic_run(settings)[source]

Generate a dynamic nested sampling run. For details of the dynamic nested sampling algorithm, see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

The run terminates when the number of samples reaches some limit settings.n_samples_max. If this is not set, the function will estimate the number of samples that a standard nested sampling run with settings.nlive_const would use from the number of samples in the initial exploratory run.

Parameters
settings: PerfectNSSettings object

settings.dynamic_goal controls whether the algorithm aims to increase parameter estimation accuracy (dynamic_goal=1), evidence accuracy (dynamic_goal=0) or places some weight on both.

Returns
dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. See docstring for generate_ns_run for more details.

perfectns.nested_sampling.generate_ns_run(settings, random_seed=None)[source]

Performs perfect nested sampling calculation and returns a nested sampling run in the form of a dictionary.

This function is a wrapper around generate_standard_run (which performs standard nested sampling) and generate_dynamic_run (which performs dynamic nested sampling); which one is used depends on the input settings.

Parameters
settings: PerfectNSSettings object
random_seed: None, bool or int, optional

Set numpy random seed. Default is to use None (so a random seed is chosen from the computer’s internal state) to ensure reliable results when multiprocessing. Can set to an integer or to False to not edit the seed.

Returns
dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. These are stored as separate arrays giving values for points in order of increasing likelihood. Keys:

‘settings’: dict recording the settings used.
‘logl’: 1d array of log likelihoods.
‘r’: 1d array of radial coordinates.
‘logx’: 1d array of logx coordinates.
‘theta’: 2d array; each row is a sample’s coordinates. The number of coordinates saved is controlled by settings.dims_to_sample.
‘nlive_array’: 1d array of the local number of live points at each sample.
‘thread_min_max’: 2d array containing the likelihoods at which each nested sampling thread begins and ends.
‘thread_labels’: 1d array listing the thread each sample belongs to.
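
As an illustration of this output format, the sketch below generates a run and inspects its arrays. The settings values are arbitrary illustrations; the comment about nestcheck compatibility assumes nestcheck.ns_run_utils.get_logw, the function referenced elsewhere on this page.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.nested_sampling as ns

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

run = ns.generate_ns_run(settings, random_seed=0)
print(sorted(run.keys()))                   # 'logl', 'logx', 'r', 'settings', 'theta', ...
print(run['logl'].shape, run['theta'].shape)
# The run dictionary is compatible with nestcheck, e.g.
# nestcheck.ns_run_utils.get_logw(run) returns the sample log weights.
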

perfectns.nested_sampling.generate_single_thread(settings, logx_end, thread_label, logx_start=0, keep_final_point=True)[source]

Generates a samples array for a thread (single live point run) between logx_start and logx_end. The settings argument specifies how the calculation is done.

Parameters
settings: PerfectNSSettings object
logx_end: float
thread_label: int

Index labelling the thread.

logx_start: float, optional
keep_final_point: bool, optional

See generate_thread_logx docstring.

perfectns.nested_sampling.generate_standard_run(settings, is_dynamic_initial_run=False)[source]

Performs standard nested sampling using the likelihood and prior specified in settings.

The run terminates when the estimated posterior mass contained in the live points is less than settings.termination_fraction times the posterior mass (evidence) contained in the dead points. The evidence in the remaining live points is estimated as

Z_{live} = average likelihood of live points * prior volume remaining

Parameters
settings: PerfectNSSettings object
is_dynamic_initial_run: bool, optional

Set to True if this is the initial exploratory run in dynamic nested sampling.

Returns
run: dict

Nested sampling run dictionary containing information about the run’s posterior samples and a record of the settings used. See docstring for generate_ns_run for more details.
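
The termination criterion described above can be written schematically as follows. This is an illustrative sketch only (the helper name and arguments are hypothetical, not perfectns internals), computing Z_live in log space for numerical stability.

import numpy as np
import scipy.special

def should_terminate(logl_live, logx_remaining, logz_dead, termination_fraction):
    """Hypothetical helper: is Z_live < termination_fraction * Z_dead?"""
    # Z_live ~= (average likelihood of live points) * (prior volume remaining)
    logz_live = (scipy.special.logsumexp(logl_live) - np.log(logl_live.size)
                 + logx_remaining)
    return logz_live < np.log(termination_fraction) + logz_dead
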

perfectns.nested_sampling.generate_thread_logx(logx_end, logx_start=0, keep_final_point=True)[source]

Generate logx co-ordinates of a new nested sampling thread (single live point run).

Parameters
logx_end: float

Logx value at which run terminates.

logx_start: float, optional.

Logx value at which run starts. 0 corresponds to sampling from the whole prior.

keep_final_point: bool, optional

If False, the final point with logx less than logx_end is removed.

Returns
logx_list: list of floats
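
The logx shrinkage for a single live point can be simulated exactly: at each step the remaining prior volume shrinks by a factor t ~ Uniform(0, 1), so logx decreases by independent Exponential(1) increments. The sketch below illustrates this behaviour; it is not the packaged implementation.

import numpy as np

def thread_logx_sketch(logx_end, logx_start=0, keep_final_point=True, rng=None):
    """Illustrative single-live-point shrinkage, mirroring generate_thread_logx."""
    rng = np.random.default_rng() if rng is None else rng
    logx_list = []
    logx = logx_start
    while logx > logx_end:
        logx -= rng.exponential()   # equivalent to adding log(t) with t ~ Uniform(0, 1)
        logx_list.append(logx)
    if not keep_final_point and logx_list:
        logx_list.pop()             # drop the point which overshot logx_end
    return logx_list
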
perfectns.nested_sampling.get_run_data(settings, n_repeat, **kwargs)[source]

Tests if runs with the specified settings are already cached. If not the runs are generated and saved.

Parameters
settings: PerfectNSSettings object
n_repeat: int

Number of nested sampling runs to generate.

parallel: bool, optional

Should runs be generated in parallel?

max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

load: bool, optional

Should previously saved runs be loaded? If False, new runs are generated.

save: bool, optional

Should any new runs generated be saved?

cache_dir: str, optional

Directory for caching

overwrite_existing: bool, optional

if a file exists already but we generate new run data, should we overwrite the existing file when saved?

check_loaded_settings: bool, optional

if we load a cached file, should we check if the loaded file’s settings match the current settings (and generate fresh runs if they do not)?

random_seeds: list, optional

random_seed arguments for each call of generate_ns_run.

Returns
run_list

list of n_repeat nested sampling runs.
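
A usage sketch based on the parameters documented above; the settings configuration and the cache directory name are arbitrary illustrations.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.nested_sampling as ns

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

run_list = ns.get_run_data(settings, 10, parallel=True, save=True, load=True,
                           cache_dir='cache')  # illustrative cache directory
print(len(run_list))  # 10 nested sampling run dictionaries
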

perfectns.nested_sampling.min_max_importance(importance, samples, settings)[source]

Find the logl and logx values at which to start and end additional dynamic nested sampling threads.

Parameters
importance: 1d numpy array

Relative importances of samples.

samples: 2d numpy array

See dict_given_samples_array docstring for details of columns.

settings: PerfectNSSettings object
Returns
list of two floats

Contains logl_min and logl_max defining the start and end of the region from which new points should be sampled.

list of two floats

Logx values corresponding to logl_min and logl_max.

perfectns.nested_sampling.p_importance(theta, w_relative, tuned_dynamic_p=False, tuning_type='theta1')[source]

Calculate the relative importance of each point for parameter estimation.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
theta: 2d numpy array

Each row gives parameter values of samples.

w_relative: 1d numpy array

Relative point weights.

tuned_dynamic_p: bool, optional

Whether or not to tune for a specific parameter estimation problem. See the dynamic nested sampling paper for more details.

tuning_type: str, optional

Which parameter estimation problem to tune for. Only used if tuned_dynamic_p is True. So far only set up to tune for the mean of the first parameter.

Returns
importance: 1d numpy array

Relative point importances. Normalised so the biggest value in the array is equal to 1.

perfectns.nested_sampling.point_importance(samples, thread_min_max, settings, simulate=False)[source]

Calculate the relative importance of each point for use in the dynamic nested sampling algorithm.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
samples: 2d numpy array

See dict_given_samples_array docstring for details of columns.

thread_min_max: 2d numpy array

First column is starting logl of each thread and second column is ending logl.

settings: PerfectNSSettings object
simulate: bool, optional

Passed to nestcheck.ns_run_utils.get_logw.

Returns
importance: 1d numpy array

Relative point importances of the rows of the input samples array. Normalised so the biggest value in the array is equal to 1.

perfectns.nested_sampling.samples_array_given_run(ns_run)[source]

Converts information on samples in a nested sampling run dictionary into a numpy array representation. This allows fast addition of more samples and recalculation of nlive.

Parameters
ns_run: dict

Nested sampling run dictionary. Contains keys: ‘logl’, ‘r’, ‘logx’, ‘thread_label’, ‘nlive_array’, ‘theta’

Returns
samples: numpy array

Numpy array containing columns [logl, r, logx, thread label, change in nlive at sample, (thetas)] with each row representing a single sample.

perfectns.nested_sampling.z_importance(w_relative, nlive)[source]

Calculate the relative importance of each point for evidence calculation.

For more details see ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

Parameters
w_relative: 1d numpy array

Relative point weights.

nlive: 1d numpy array

Number of live points.

Returns
importance: 1d numpy array

Relative point importances. Normalised so the biggest value in the array is equal to 1.

likelihoods

Classes representing spherically symmetric likelihoods.

Each likelihood class must contain a member function giving the log likelihood as a function of the radial coordinate r = |theta| and the number of dimensions:

def logl_given_r(self, r, n_dim):

The number of dimensions is not stored in this class but in the PerfectNSSettings object to ensure it is the same for the likelihood and the prior.

Likelihood classes may also optionally contain the inverse function

def r_given_logl(self, logl, n_dim):

(although this is not needed to generate nested sampling runs). Another optional function is the analytic value of the log evidence for some given prior and dimension, which is useful for checking results:

def logz_analytic(self, prior, n_dim):
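
As an illustration of this interface, the sketch below defines a hypothetical spherically symmetric likelihood (not part of perfectns) with the mandatory logl_given_r and the optional inverse r_given_logl. The normalisation constant is omitted since it only shifts logl by a constant.

import numpy as np

class Laplacian:
    """Hypothetical spherically symmetric likelihood with logl = -|r| / scale."""

    def __init__(self, likelihood_scale=1.0):
        self.likelihood_scale = likelihood_scale

    def logl_given_r(self, r, n_dim):
        # Mandatory: log likelihood as a function of the radial coordinate.
        return -np.abs(r) / self.likelihood_scale

    def r_given_logl(self, logl, n_dim):
        # Optional inverse mapping.
        return -np.asarray(logl) * self.likelihood_scale
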

class perfectns.likelihoods.Cauchy(likelihood_scale=1)[source]

Spherically symmetric Cauchy likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

class perfectns.likelihoods.ExpPower(likelihood_scale=1, power=2)[source]

Spherically symmetric exponential power likelihood. When power=1, this is the same as the Gaussian likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

class perfectns.likelihoods.Gaussian(likelihood_scale=1.0)[source]

Spherically symmetric Gaussian likelihood.

Methods

logl_given_r(self, r, n_dim)

Get loglikelihood values for input radial coordinates.

logz_analytic(self, prior, n_dim)

Returns analytic value of the log evidence for the input prior and dimension if it is available.

r_given_logl(self, logl, n_dim)

Get the radial coordinates corresponding to the input loglikelihood values.

logl_given_r(self, r, n_dim)[source]

Get loglikelihood values for input radial coordinates.

Parameters
r: float or numpy array

Radial coordinates.

n_dim: int

Number of dimensions.

Returns
logl: same type and size as r

Loglikelihood values.

logz_analytic(self, prior, n_dim)[source]

Returns analytic value of the log evidence for the input prior and dimension if it is available.

If not set up for this prior an AssertionError is thrown (this is caught in functions which check analytical values where they are available).

Parameters
prior: object
n_dim: int

Number of dimensions.

Returns
float

Analytic value of log Z for this likelihood given the prior and number of dimensions.

r_given_logl(self, logl, n_dim)[source]

Get the radial coordinates corresponding to the input loglikelihood values.

Parameters
logl: float or numpy array

Loglikelihood values.

n_dim: int

Number of dimensions.

Returns
r: same type and size as logl

Radial coordinates.

priors

Classes representing spherically symmetric priors.

Each prior class must contain a member function giving the radial coordinate r = |theta| as a function of the log fraction of the prior volume remaining and the dimension:

def r_given_logx(self, logx, n_dim):

The number of dimensions is not stored in this class but in the PerfectNSSettings object to ensure it is the same for the likelihood and the prior.

Prior classes may also optionally contain the inverse function

def logx_given_r(self, r, n_dim):

(although this is not needed to generate nested sampling runs).
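
For illustration, the sketch below re-derives a uniform n-sphere prior satisfying this interface (perfectns already provides an equivalent Uniform class): the fraction of prior volume within radius r is X = (r / prior_scale)**n_dim, which can be inverted analytically.

import numpy as np

class UniformSketch:
    """Illustrative uniform prior within an n-sphere of radius prior_scale."""

    def __init__(self, prior_scale):
        self.prior_scale = prior_scale

    def r_given_logx(self, logx, n_dim):
        # Mandatory: invert X = (r / prior_scale) ** n_dim.
        return self.prior_scale * np.exp(np.asarray(logx) / n_dim)

    def logx_given_r(self, r, n_dim):
        # Optional inverse mapping.
        return n_dim * np.log(np.asarray(r) / self.prior_scale)
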

class perfectns.priors.Gaussian(prior_scale)[source]

Spherically symmetric Gaussian prior.

Methods

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
class perfectns.priors.GaussianCached(prior_scale, **kwargs)[source]

Spherically symmetric Gaussian prior. The scipy inverse gamma function is not numerically stable, so r_given_logx is cached using the mpmath logx_given_r and linear interpolation.

Methods

check_cache(self, n_dim)

Helper function which checks that the input dimension matches that of the cached interpolation function, and if needed recalculates it.

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

check_cache(self, n_dim)[source]

Helper function which checks that the input dimension matches that of the cached interpolation function, and if needed recalculates it.

Parameters
n_dim: int

Number of dimensions

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx
class perfectns.priors.Uniform(prior_scale)[source]

Spherically symmetric uniform prior.

Methods

logx_given_r(self, r, n_dim)

Maps input radial coordinates to logx values.

r_given_logx(self, logx, n_dim)

Maps input logx values to radial coordinates.

logx_given_r(self, r, n_dim)[source]

Maps input radial coordinates to logx values.

Parameters
r: float or numpy array
n_dim: int
Returns
logx: same type and size as r
r_given_logx(self, logx, n_dim)[source]

Maps input logx values to radial coordinates.

Parameters
logx: float or numpy array
n_dim: int
Returns
r: same type and size as logx

estimators

Contains classes representing quantities which can be calculated from a nested sampling run.

Each estimator class should contain a mandatory member function returning the value of the estimator for a nested sampling run:

def __call__(self, ns_run, logw=None, simulate=False):

This allows logw to be provided if many estimators are being calculated from the same run, so that logw is only calculated once. Otherwise logw is calculated from the run if required.

They may also optionally contain a function giving its analytical value for some given set of calculation settings (for use in checking results):

def analytical(self, settings):

as well as helper functions. Estimators should also contain class variables:

name: str

used for results tables.

latex_name: str

used for plotting results diagrams.
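
The sketch below shows a hypothetical estimator following this interface (the class and the quantity it estimates are illustrative, not part of perfectns). It assumes nestcheck.ns_run_utils.get_logw for computing sample log weights, the nestcheck function referenced elsewhere on this page.

import numpy as np
import nestcheck.ns_run_utils

class SecondParamMean:
    """Hypothetical estimator: posterior mean of the second parameter."""

    name = 'theta2_mean'                   # used for results tables
    latex_name = r'$\overline{\theta_2}$'  # used for plotting results diagrams

    def __call__(self, ns_run, logw=None, simulate=False):
        if logw is None:
            logw = nestcheck.ns_run_utils.get_logw(ns_run, simulate=simulate)
        w_rel = np.exp(logw - logw.max())  # relative posterior weights
        return np.sum(w_rel * ns_run['theta'][:, 1]) / np.sum(w_rel)
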

class perfectns.estimators.CountSamples[source]

Number of samples in run.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

class perfectns.estimators.EstimatorBase(func, **kwargs)[source]

Base class for estimators.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

class perfectns.estimators.LogZ[source]

Natural log of Bayesian evidence.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

class perfectns.estimators.ParamCred(probability, param_ind=0)[source]

One-tailed credible interval on the value of a single parameter (component of theta). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(self, settings)

Returns analytical value of estimator given settings.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

class perfectns.estimators.ParamMean(param_ind=0)[source]

Mean of a single parameter (single component of theta). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.ParamSquaredMean(param_ind=0)[source]

Mean of the square of a single parameter (the second moment of its posterior distribution). By symmetry all parameters have the same distribution.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(self, settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.RCred(probability, from_theta=False)[source]

One-tailed credible interval on the value of |theta|.

Methods

__call__(self, ns_run[, logw, simulate])

Returns estimator value for run.

class perfectns.estimators.RMean(from_theta=False)[source]

Mean of |theta| (the radial distance from the centre).

Methods

__call__(self, ns_run[, logw, simulate])

Overrides __call__ from nestcheck’s r_mean to allow use of perfectns’ ‘r’ key if from_theta=False, and to check that all the dimensions have been sampled if from_theta=True.

analytical(self, settings)

Returns analytical value of estimator given settings.

ftilde(logx, settings)

Helper function for finding the analytic value of the estimator.

analytical(self, settings)[source]

Returns analytical value of estimator given settings.

static ftilde(logx, settings)[source]

Helper function for finding the analytic value of the estimator. See “check_by_integrating” docstring for more details.

ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

class perfectns.estimators.Z[source]

Bayesian evidence.

Methods

__call__(self, *args, **kwargs)

Returns estimator value for run.

analytical(settings)

Returns analytical value of estimator given settings.

static analytical(settings)[source]

Returns analytical value of estimator given settings.

perfectns.estimators.check_by_integrating(ftilde, settings)[source]

Return the true value of the estimator using numerical integration.

Chopin and Robert (2010) show that the expectation of some function f(theta) is given by the integral

int L(X) X ftilde(X) dX / Z,

where ftilde(X) is mean of f(theta) on the iso-likelihood contour L(theta) = L(X).

Parameters
ftilde: function
settings: PerfectNSSettings object
Returns
float

The estimator’s true value.

perfectns.estimators.check_integrand(logx, ftilde, settings)[source]

Helper function to return integrand L(X) X ftilde(X) for checking estimator values by numerical integration. Note that the integral must be normalised by multiplying by a factor of (1/Z).

Parameters
logx: 1d numpy array

Values on which to evaluate integrand.

ftilde: function

See check_by_integrating docstring for more details.

settings: PerfectNSSettings object
Returns
1d numpy array

Integrand evaluated at input logx coordinates.

perfectns.estimators.get_true_estimator_values(estimators, settings)[source]

Calculates analytic values for estimators given the likelihood and prior in settings. If there is no method for calculating the values then np.nan is returned.

Parameters
estimators: estimator object or list of estimator objects
settings: PerfectNSSettings object
Returns
output: np.array of size (len(estimators),) if estimators is a list, otherwise a float.
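
A usage sketch comparing analytic values with values estimated from a single run. The settings values are arbitrary illustrations, and the final step assumes nestcheck's run_estimators utility (nestcheck.ns_run_utils.run_estimators) for applying a list of estimators to a run.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.estimators as e
import perfectns.nested_sampling as ns
import nestcheck.ns_run_utils

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 5

estimator_list = [e.LogZ(), e.ParamMean(), e.RMean()]
true_values = e.get_true_estimator_values(estimator_list, settings)  # np.nan where unavailable
run = ns.generate_ns_run(settings, random_seed=0)
estimates = nestcheck.ns_run_utils.run_estimators(run, estimator_list)
print(true_values, estimates)
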

plots

Plotting functions.

perfectns.plots.cdf_given_logx(estimator, value, logx, settings)[source]

Calculate CDF values, where each column represents the CDF of the distribution of ftheta values on some iso-likelihood contour L = L(X).

Parameters
estimator: estimator object

Function whose values we are getting the CDF of.

value: numpy array

Function values at which to evaluate the CDF.

logx: numpy array of same size and shape as value.

Logx values specifying contours - we calculate the CDF on each contour.

settings: PerfectNSSettings object
Returns
cdf: numpy array of same size and shape as value and logx
perfectns.plots.plot_dynamic_nlive(dynamic_goals, settings_in, **kwargs)[source]

Plot the allocations of live points as a function of logX for different dynamic_goal settings. Plots also include analytically calculated distributions of relative posterior mass and relative posterior mass remaining.

Parameters
dynamic_goals: list of ints or None

dynamic_goal setting values to plot.

settings_in: PerfectNSSettings object
tuned_dynamic_ps: list of bools, optional

tuned_dynamic_p settings corresponding to each dynamic goal. Defaults to False for all dynamic goals.

logx_min: float, optional

Lower limit of logx axis. If not specified this is set to the lowest logx reached by any of the runs.

load: bool, optional

Should the nested sampling runs be loaded from cache if available?

save: bool, optional

Should the nested sampling runs be cached?

ymax: int or float, optional

Maximum value for plot’s nlive axis (yaxis).

n_run: int, optional

How many runs to plot for each dynamic goal.

npoints: int, optional

How many points to have in the logx array used to calculate and plot analytical weights.

figsize: tuple, optional

Size of figure in inches.

Returns
fig: matplotlib figure
perfectns.plots.plot_parameter_logx_diagram(settings, ftheta, **kwargs)[source]

Plots parameter estimation diagram of the type described in Section 3.1 and shown in Figure 3 of “Sampling errors in nested sampling parameter estimation” (Higson et al., 2018).

Parameters
settings: PerfectNSSettings object
ftheta: estimator object

function of parameters to plot

ymin: float, optional

y axis (ftheta) min.

ymax: float, optional

y axis (ftheta) max.

ylabel: str, optional

y axis (ftheta) label.

logx_min: float, optional

Lower limit of logx axis.

x_points: int, optional

How many logx points to use in the plots.

y_points: int, optional
figsize: tuple, optional

Size of figure in inches.

Returns
fig: matplotlib figure
perfectns.plots.plot_rel_posterior_mass(likelihood_list, prior, dim_list, logx, **kwargs)[source]

Plot analytic distributions of the relative posterior mass for different likelihoods and dimensionalities as a function of logx (this is proportional to L(X)X, where L(X) is the likelihood).

Parameters
logx: 1d numpy array

Logx values at which to calculate posterior mass.

likelihood_list: list of perfectns likelihood objects

Likelihoods to plot.

prior: perfectns prior object
dim_list: list of ints

Dimensions to plot for each likelihood.

figsize: tuple, optional

Size of figure in inches.

linestyles: list, optional

List of different linestyles to use for each likelihood.

Returns
fig: matplotlib figure

Figure showing relative posterior masses of input likelihoods
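
A usage sketch based on the signature above; the particular likelihoods, prior scale, dimensions, logx range and output path are arbitrary illustrations.

import numpy as np
import perfectns.plots
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors

logx = np.linspace(-40, 0, 400)
fig = perfectns.plots.plot_rel_posterior_mass(
    [likelihoods.Gaussian(1), likelihoods.ExpPower(1, 2)],  # likelihoods to plot
    priors.Gaussian(10),                                    # prior
    [2, 10],                                                # dimensions for each likelihood
    logx)
fig.savefig('rel_posterior_mass.pdf')  # illustrative output path
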

perfectns.plots.posterior_cdf(estimator, values, logx, settings)[source]

Calculates the 1d posterior cumulative distribution function (CDF) for some estimator given the likelihood and prior settings.

Parameters
estimator: estimator object

Function whose values we are getting the CDF of.

values: 1d numpy array

Function values at which to evaluate the CDF.

logx: 1d numpy array

Logx values over which to numerically marginalise the probability distribution.

settings: PerfectNSSettings object
Returns
cdf: numpy array of same size and shape as values
perfectns.plots.sigma_given_cdf(cdf)[source]

Maps cdf values in [0, 1] to the number of standard deviations from the median. scipy.special.erfinv is defined on [-1, +1], mapping to [-inf, +inf]. We want to map a CDF to the number of sigma from the median, i.e. from [0, +1] to [0, +inf], so we use the argument (2 * cdf - 1) and take its absolute value to get a positive answer. erf is defined in terms of int e^(-t^2), which corresponds to a Gaussian with sigma = 1/sqrt(2), so the result must be multiplied by sqrt(2).

Parameters
cdf: float or numpy array
Returns
float or numpy array

Same type and shape as input cdf.
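
The mapping described above amounts to sigma = sqrt(2) * |erfinv(2 * cdf - 1)|; a minimal illustrative re-implementation (not the packaged one):

import numpy as np
import scipy.special

def sigma_given_cdf_sketch(cdf):
    # Number of standard deviations from the median for a CDF value in [0, 1].
    return np.sqrt(2) * np.abs(scipy.special.erfinv(2 * np.asarray(cdf) - 1))

print(sigma_given_cdf_sketch(0.5))       # 0.0  (the median)
print(sigma_given_cdf_sketch(0.841345))  # ~1.0 (one sigma above the median)
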

results_tables

Functions used to generate results tables.

Used for results in ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019).

perfectns.results_tables.get_bootstrap_results(n_run, n_simulate, estimator_list, settings, **kwargs)[source]

Generate data frame showing the standard deviations of the results of repeated calculations and estimated sampling errors from bootstrap resampling.

This function was used for Table 5 in ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019). See the paper for more details.

Parameters
n_run: int

how many runs to use

n_simulate: int

how many times to resample the nested sampling run in each bootstrap standard deviation estimate.

estimator_list: list of estimator objects
settings: PerfectNSSettings object
load: bool, optional

should run data and results be loaded if available?

save: bool, optional

should run data and results be saved?

parallel: bool, optional
cache_dir: str, optional

Directory to use for caching.

add_sim_method: bool, optional

should we also calculate standard deviations using the simulated weights method for comparison with bootstrap resampling? This method is inaccurate for parameter estimation.

n_simulate_ci: int, optional

how many times to resample the nested sampling run in each bootstrap credible interval estimate. These may require more simulations than the standard deviation estimate.

run_random_seeds: list, optional

list of random seeds to use for generating runs.

n_run_ci: int, optional

how many runs to use for each credible interval estimate. You may want to set this to lower than n_run if n_simulate_ci is large as otherwise the credible interval estimate may take a long time.

cred_int: float, optional

one-tailed credible interval to calculate

max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

Returns
results: pandas data frame

results data frame. Contains two columns for each estimator - the second column (with ‘_unc’ appended to the title) shows the numerical uncertainty in the first column. Contains rows:

true values: analytical values of estimators for this likelihood and posterior, if available.
repeats mean: mean calculation result.
repeats std: standard deviation of calculation results.
bs std / repeats std: mean bootstrap standard deviation estimate as a fraction of the standard deviation of repeated results.
bs estimate % variation: standard deviation of bootstrap estimates as a percentage of the mean estimate.
[only if add_sim_method is True]:
sim std / repeats std: as for ‘bs std / repeats std’ but with simulation method standard deviation estimates.
sim estimate % variation: as for ‘bs estimate % variation’ but with simulation method standard deviation estimates.
bs [cred_int] CI: mean bootstrap credible interval estimate.
bs +-1std % coverage: % of calculation results falling within +-1 mean bootstrap standard deviation estimate of the mean.
bs [cred_int] CI % coverage: % of calculation results which are less than the mean bootstrap credible interval estimate.

perfectns.results_tables.get_dynamic_results(n_run, dynamic_goals_in, estimator_list_in, settings_in, **kwargs)[source]

Generate data frame showing the standard deviations of the results of repeated calculations and efficiency gains (ratios of variances of results calculations) from different dynamic goals. To make the comparison fair, for dynamic nested sampling settings.n_samples_max is set to slightly below the mean number of samples used by standard nested sampling.

This function was used for Tables 1, 2, 3 and 4, as well as to generate the results shown in figures 6 and 7 of ‘Dynamic nested sampling: an improved algorithm for nested sampling parameter estimation and evidence calculation’ (Higson et al., 2019). See the paper for a more detailed description.

Parameters
n_run: int

how many runs to use

dynamic_goals_in: list of floats

which dynamic goals to test

estimator_list_in: list of estimator objects
settings_in: PerfectNSSettings object
load: bool, optional

should run data and results be loaded if available?

save: bool, optional

should run data and results be saved?

overwrite_existing: bool, optional

if a file exists already but we generate new run data, should we overwrite the existing file when saved?

run_random_seeds: list, optional

list of random seeds to use for generating runs.

parallel: bool, optional
cache_dir: str, optional

Directory to use for caching.

tuned_dynamic_ps: list of bools, same length as dynamic_goals_in, optional
max_workers: int or None, optional

Number of processes. If max_workers is None then concurrent.futures.ProcessPoolExecutor defaults to using the number of processors of the machine. N.B. If max_workers=None and running on supercomputer clusters with multiple nodes, this may default to the number of processors on a single node and therefore there will be no speedup from multiple nodes (must specify manually in this case).

Returns
results: pandas data frame

results data frame. Contains rows:

mean [dynamic goal]: mean calculation result for standard nested sampling and dynamic nested sampling with each input dynamic goal.
std [dynamic goal]: standard deviation of results for standard nested sampling and dynamic nested sampling with each input dynamic goal.
gain [dynamic goal]: the efficiency gain (computational speedup) from dynamic nested sampling compared to standard nested sampling. This equals (variance of standard results) / (variance of dynamic results); see the dynamic nested sampling paper for more details.
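
A usage sketch based on the parameters documented above; the settings configuration, dynamic goals and number of runs are arbitrary illustrations, and the caching keywords are optional.

import perfectns.settings
import perfectns.likelihoods as likelihoods
import perfectns.priors as priors
import perfectns.estimators as e
import perfectns.results_tables as rt

settings = perfectns.settings.PerfectNSSettings()
settings.likelihood = likelihoods.Gaussian(likelihood_scale=1)
settings.prior = priors.Gaussian(prior_scale=10)
settings.n_dim = 10

results = rt.get_dynamic_results(
    100,                        # n_run
    [0, 1],                     # dynamic goals: evidence- and parameter-focused
    [e.LogZ(), e.ParamMean()],  # estimators to compare
    settings,
    load=True, save=True)
print(results)
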

perfectns.results_tables.merged_dynamic_results(dim_scale_list, likelihood_list, settings, estimator_list, **kwargs)[source]

Wrapper for running get_dynamic_results for many different likelihoods, dimensions and prior scales, and merging the output into a single data frame. See the get_dynamic_results documentation for more details.

Parameters
dim_scale_list: list of tuples

(dim, prior_scale) pairs to run

likelihood_list: list of likelihood objects
settings: PerfectNSSettings object
estimator_list: list of estimator objects
n_run: int, optional

number of runs for use with each setting.

dynamic_goals_in: list of floats, optional

which dynamic goals to test

(remaining kwargs passed to get_dynamic_results)
Returns
results: pandas data frame

maths_functions

Mathematical functions.

perfectns.maths_functions.analytic_logx_terminate(settings)[source]

Find logx_terminate analytically by assuming that the likelihood at very low X approximately equals the maximum likelihood. This approximation breaks down in very high dimensions.

Parameters
settings: PerfectNSSettings object
perfectns.maths_functions.gaussian_logx_given_r(r, sigma, n_dim)[source]

Returns logx coordinates corresponding to r values for a Gaussian prior with the specified standard deviation and dimension.

Uses the mpmath package for arbitrary precision.

Parameters
r: float or numpy array

Radial coordinates at which to evaluate logx.

sigma: float

Standard deviation of Gaussian.

n_dim: int

Number of dimensions.

Returns
logx: float or numpy array

Logx coordinates corresponding to input radial coordinates.

perfectns.maths_functions.gaussian_r_given_logx(logx, sigma, n_dim)[source]

Returns r coordinates corresponding to logx values for a Gaussian prior with the specified standard deviation and dimension.

This uses scipy.special.gammaincinv and requires exponentiating logx, so numerical errors occur with very low logx values.

Parameters
logx: float or numpy array

Logx coordinates at which to work out r.

sigma: float

Standard deviation of Gaussian.

n_dim: int

Number of dimensions.

Returns
float or numpy array

Radial coordinates corresponding to input log X values.
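
The two functions above are inverse mappings, so (within numerical precision and away from very low logx) they should round-trip; a quick sketch with arbitrary values:

import numpy as np
import perfectns.maths_functions as mf

r = np.array([0.5, 1.0, 2.0])
logx = mf.gaussian_logx_given_r(r, 10.0, 5)      # sigma=10, n_dim=5
r_back = mf.gaussian_r_given_logx(logx, 10.0, 5)
assert np.allclose(r, r_back)
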

perfectns.maths_functions.log_cauchy_given_r(r, sigma, n_dim)[source]

Returns the natural log of a normalised, uncorrelated Cauchy distribution with 1 degree of freedom.

Parameters
r: float or numpy array
sigma: float
n_dim: int
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.log_exp_power_given_r(r, sigma, n_dim, b=0.5)[source]

Returns the natural log of an exponential power distribution. This equals a Gaussian distribution when b=1.

Parameters
r: float or numpy array
sigma: float
n_dim: int
b: float, optional
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.log_gaussian_given_r(r, sigma, n_dim)[source]

Returns the natural log of a normalised, uncorrelated Gaussian likelihood with equal variance in all n_dim dimensions.

Parameters
r: float or numpy array
sigma: float
n_dim: int
Returns
logl: float or numpy array

Loglikelihood values corresponding to input radial coordinates.

perfectns.maths_functions.logx_terminate_bound(logl_max, termination_fraction, logz_analytic)[source]

Find a lower bound logx_terminate analytically by assuming that the likelihood at very low X approximately equals the maximum likelihood. This approximation breaks down in very high dimensions, and the true logx_terminate required will be larger.

We want

Z_term = termination_fraction * Z_analytic = int_0^X_term L(X) dX approx= X_term * L_max,

so

logx_term = log(termination_fraction) + log(Z_analytic) - logl_max.

Parameters
logl_max: float

Maximum loglikelihood.

termination_fraction: float

Fraction of posterior mass remaining at termination.

logz_analytic: float

True value of log evidence.

Returns
float
perfectns.maths_functions.nsphere_logvol(dim, radius=1.0)[source]

Returns the natural log of the hypervolume of an n-sphere of the specified dimension and radius (default radius=1.0). Useful for very high dimensions. The formula is from https://en.wikipedia.org/wiki/N-sphere#Volume_and_surface_area

Parameters
dim: int
radius: float, optional
Returns
float
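
The underlying formula is log V_n(R) = (n/2) log(pi) - log Gamma(n/2 + 1) + n log(R); an illustrative stand-alone implementation (not the packaged one), kept stable in high dimensions via gammaln:

import numpy as np
from scipy.special import gammaln

def nsphere_logvol_sketch(dim, radius=1.0):
    # Log hypervolume of an n-ball of the given radius.
    return (dim / 2.0) * np.log(np.pi) - gammaln(dim / 2.0 + 1) + dim * np.log(radius)

print(nsphere_logvol_sketch(3))          # log(4/3 * pi) ~= 1.4324
print(nsphere_logvol_sketch(1000, 2.0))  # remains finite in high dimensions
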
perfectns.maths_functions.nsphere_logx_given_r(r, r_max, n_dim)[source]

Finds logx assuming the prior is an n-dimensional sphere co-centred with a spherically symmetric likelihood. This will return an answer of the same type (float or numpy array) as the input {r}.

Parameters
r: float or numpy array
r_max: float
n_dim: int
Returns
logx: float or numpy array.

Logx coordinates corresponding to input radial coordinates.

perfectns.maths_functions.nsphere_r_given_logx(logx, r_max, n_dim)[source]

Finds r coordinates given input logx values for a uniform prior within an n-dimensional sphere co-centred with a spherically symmetric likelihood. This will return an answer of the same type (float or numpy array) as the input {logx}.

Parameters
logx: float or numpy array
r_max: float

Boundary of the spherically symmetric prior.

n_dim: int

Number of dimensions

Returns
r: float or numpy array

Radial coordinates corresponding to logx.

perfectns.maths_functions.r_given_log_cauchy(logl, sigma, n_dim)[source]

Returns the radius for a given logl of a normalised, uncorrelated Cauchy distribution with 1 degree of freedom.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.r_given_log_exp_power(logl, sigma, n_dim, b=0.5)[source]

Returns the radius for a given logl of an exponential power distribution. This equals a Gaussian distribution when b=1.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
b: float, optional
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.r_given_log_gaussian(logl, sigma, n_dim)[source]

Returns the radius of a given logl for a normalised, uncorrelated Gaussian with equal variance in all n_dim dimensions.

Parameters
logl: float or numpy array
sigma: float
n_dim: int
Returns
r: float or numpy array

Radial coordinates corresponding to input logl values.

perfectns.maths_functions.sample_nsphere_shells(r, n_dim, n_sample)[source]

Wrapper calling either sample_nsphere_shells_normal or sample_nsphere_shells_beta depending on the dimension and number of samples required.

See the docstrings of sample_nsphere_shells_normal and sample_nsphere_shells_beta for more information.

perfectns.maths_functions.sample_nsphere_shells_beta(r, n_dim, n_sample=None)[source]

Given some 1d numpy array of radial coordinates r, return a numpy array of sample coordinates on the hyperspherical shells with each radial coordinate.

Sample single parameters on an n_dim-dimensional sphere independently, as described in Section 3.2 of ‘Sampling Errors in nested sampling parameter estimation’ (Higson et al., 2018).

NB if each parameter is sampled independently and combined into a vector, that vector will not have a magnitude r.

The n_sample argument can be used to set the number of parameters for which samples are returned (by symmetry they all have the same distribution). This is useful for saving memory in high dimensions.

This function is much quicker than sample_nsphere_shells_normal when n_dim is high and n_sample is low.

Parameters
r: 1d numpy array
n_dim: int
n_sample: int or None, optional

Number of parameters to include in output for each sample.

Returns
thetas: 2d numpy array

Each row is a random sample from the shells defined by the corresponding input r coordinate.

perfectns.maths_functions.sample_nsphere_shells_normal(r, n_dim, n_sample=None)[source]

Given some 1d numpy array of radial coordinates r, return a numpy array of sample coordinates on the hyperspherical shells with each radial coordinate.

This works by using the symmetry of the normal distribution to sample isotropically, then normalising each set of samples to lie on a spherical shell of the correct radius.

The n_sample argument can be used to set the number of parameters for which samples are returned (by symmetry they all have the same distribution). This is useful for saving memory in high dimensions.

Parameters
r: 1d numpy array
n_dim: int
n_sample: int or None, optional

Number of parameters to include in output for each sample.

Returns
thetas: 2d numpy array

Each row is a random sample from the shells defined by the corresponding input r coordinate.
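
An illustrative sketch of the normal-deviate approach described above (not the packaged implementation): draw isotropic directions from a standard normal, then rescale each sample so its radius matches the corresponding entry of r.

import numpy as np

def sample_shells_normal_sketch(r, n_dim, rng=None):
    """Sample points isotropically on hyperspherical shells of radii r."""
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(r)
    thetas = rng.standard_normal((r.size, n_dim))             # isotropic directions
    thetas *= (r / np.linalg.norm(thetas, axis=1))[:, None]   # rescale to radius r
    return thetas
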

cached_gaussian_prior

Contains helper functions for the ‘gaussian_cached’ prior.

This is needed as the ‘gaussian’ prior suffers from overflow errors and numerical instability for very low values of X, which are reached in high dimensional problems.

perfectns.cached_gaussian_prior.interp_r_logx_dict(n_dim, prior_scale, **kwargs)[source]

Generate a dictionary containing arrays of logx and r values for use in interpolation, as well as a record of the settings used.

Parameters
n_dim: int

Number of dimensions.

prior_scale: float

Gaussian prior’s standard deviation.

logx_min: float

Values for interpolation are generated between logx_min and logx_max.

save_dict: bool, optional

Whether or not to cache the output dictionary.

cache_dir: str, optional

Directory in which to cache output if save_dict is True.

interp_density: float

Number of points to include in interpolation arrays per unit of logx (the number of points is cast to int so this can be a fraction).

logx_max: float, optional

Values for interpolation are generated between logx_min and logx_max.

Returns
interp_dict: dict