mystic.math module documentation

approx module

tools for measuring equality

almostEqual(x, y, tol=1e-18, rel=1e-07)

Returns True if two arrays are element-wise equal within a tolerance.

The tolerance values are positive, typically very small numbers. The relative difference (rel * abs(y)) and the absolute difference tol are added together and compared against the absolute difference between x and y.

Parameters
  • x, y (array_like) – Input arrays to compare.

  • tol (float) – The absolute tolerance parameter (see Notes).

  • rel (float) – The relative tolerance parameter (see Notes).

Returns

True if the two arrays are equal within the given tolerance; False otherwise. If either array contains NaN, then False is returned.

Return type

bool

Notes

If the following equation is element-wise True, then almostEqual returns True.

absolute(x - y) <= (tol + rel * absolute(y))

Examples

>>> almostEqual([1e10,1.2345678], [1e10,1.2345677])
True
>>> almostEqual([1e10,1.234], [1e10,1.235])
False
approx_equal(x, y, *args, **kwargs)

Return True if x and y are approximately equal, otherwise False.

Parameters
  • x (object) – first object to compare

  • y (object) – second object to compare

  • tol (float, default=1e-18) – absolute error

  • rel (float, default=1e-7) – relative error

Returns

True if x and y are equal within tolerance, otherwise returns False.

Notes

If x and y are floats, return True if y is within either absolute error tol or relative error rel of x. You can disable either the absolute or relative check by passing None as tol or rel (but not both).

For any other objects, x and y are checked in that order for a method __approx_equal__, and the result of that is returned as a bool. Any optional arguments are passed to the __approx_equal__ method.

__approx_equal__ can return NotImplemented to signal it doesn’t know how to perform the specific comparison, in which case the other object is checked instead. If neither object has the method, or both defer by returning NotImplemented, then fall back on the same numeric comparison that is used for floats.

Examples

>>> approx_equal(1.2345678, 1.2345677)
True
>>> approx_equal(1.234, 1.235)
False
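
As a sketch of the __approx_equal__ protocol described in the Notes (the Fuzzy class here is hypothetical, not part of mystic):

>>> class Fuzzy(object):
...     def __init__(self, value, tol=0.1):
...         self.value, self.tol = value, tol
...     def __approx_equal__(self, other, *args, **kwds):
...         try:
...             return abs(self.value - float(other)) <= self.tol
...         except (TypeError, ValueError):
...             return NotImplemented
...
>>> approx_equal(Fuzzy(1.0), 1.05)
True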
tolerance(x, tol=1e-15, rel=1e-15)

relative plus absolute difference for x, i.e. tol + rel * abs(x)

compressed module

helpers for compressed format for measures

binary(n)

converts an int to binary (returned as a string); hence, int(binary(n), base=2) == n
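
For example, using the documented invariant (the exact string format may vary):

>>> int(binary(42), base=2) == 42
True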

binary2coords(binaries, positions, **kwds)

convert a list of binary strings to product measure coordinates

differs_by_one(ith, binaries, all=True, index=True)

get the binary string that differs by exactly one index

Inputs:

ith = the target index
binaries = a list of binary strings
all = if False, return only the results for indices < i
index = if True, return the index of the results (not the results themselves)

index2binary(index, npts=None)

convert a list of integers to a list of binary strings

discrete module

Classes for discrete measure data objects. Includes point_mass, measure, product_measure, and scenario classes.

bounded_mean(mean_x, samples, xmin, xmax, wts=None)
compose(samples, weights=None)

Generate a product_measure object from a nested list of N x 1D discrete measure positions and a nested list of N x 1D weights. If weights are not provided, a uniform distribution with norm = 1.0 will be used.

decompose(c)

Decomposes a product_measure object into a nested list of N x 1D discrete measure positions and a nested list of N x 1D weights.
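
A minimal sketch of the compose/decompose round trip, assuming decompose returns the positions and then the weights:

>>> c = compose([[0.0, 1.0], [2.0]], [[0.5, 0.5], [1.0]])
>>> positions, weights = decompose(c)   # the nested N x 1D lists given to compose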

flatten(c)

Flattens a product_measure object into a list.

impose_feasible(cutoff, data, guess=None, **kwds)

impose shortness on a given scenario

This function attempts to minimize the infeasibility between observed data and a scenario of synthetic data by performing an optimization on w,x,y over the given bounds.

Parameters
  • cutoff (float) – maximum acceptable deviation from shortness

  • data (mystic.math.discrete.scenario) – a dataset of observed points

  • guess (mystic.math.discrete.scenario, default=None) – the synthetic points

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(infeasibility).

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

Returns

a scenario with desired shortness

Notes

Here, tol is used to set the optimization termination for minimizing the sum(infeasibility), while cutoff is used in defining the deviation from shortness for observed x,y and synthetic x',y'.

guess can be either a scenario providing initial guess at feasibility, or a tuple of the dimensions of the desired scenario, where initial values will be chosen at random.

impose_valid(cutoff, model, guess=None, **kwds)

impose model validity on a given scenario

This function attempts to minimize the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x'), by performing an optimization on w,x,y over the given bounds.

Parameters
  • cutoff (float) – maximum acceptable model invalidity |y - F(x')|.

  • model (func) – the model function, y' = F(x').

  • guess (scenario, default=None) – a scenario, defines y = G(x).

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm

  • xtol (float, default=0.0) – maximum acceptable pointwise graphical distance between model and reality.

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(graphical distances).

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

Returns

a scenario with the desired model validity

Notes

xtol defines the n-dimensional base of a pillar of height cutoff, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization to find a set of points where the model is valid. Here, tol is used to set the optimization termination for minimizing the sum(graphical_distances), while cutoff is used in defining the graphical distance between x,y and x',F(x').

guess can be either a scenario providing initial guess at validity, or a tuple of the dimensions of the desired scenario, where initial values will be chosen at random.

mean_y_norm_wts_constraintsFactory(target, pts)

factory for a constraints function that:

  • imposes a mean on scenario values

  • normalizes weights

norm_wts_constraintsFactory(pts)

factory for a constraints function that:

  • normalizes weights

unflatten(params, npts)

Map a list of random variables to N x 1D discrete measures in a product_measure object.
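
A minimal sketch of the flatten/unflatten round trip, assuming npts accepts the pts tuple of the product_measure (see product_measure.flatten below for the parameter layout):

>>> c = compose([[0.0, 1.0], [2.0]])   # uniform weights
>>> params = flatten(c)                # [wx1, wx2, x1, x2, wy1, y1]
>>> c2 = unflatten(params, c.pts)      # rebuild the product_measure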

class point_mass(position, weight=1.0)

Bases: object

a point mass object with weight and position

Parameters
  • position (tuple(float)) – position of the point mass

  • weight (float, default=1.0) – weight of the point mass

Variables
  • position (tuple(float)) – position of the point mass

  • weight (float) – weight of the point mass

property rms

square root of the sum of squared positions

Type

readonly

class measure(iterable=(), /)

Bases: list

a 1-d collection of point masses forming a ‘discrete measure’

Parameters

iterable (list) – a list of mystic.math.discrete.point_mass objects

Notes

  • assumes it contains only mystic.math.discrete.point_mass objects

  • assumes measure.n = len(measure.positions) == len(measure.weights)

  • relies on constraints to impose notions such as sum(weights) == 1.0

property center_mass

sum of weights * positions

ess_maximum(f, tol=0.0)

calculate the maximum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the maximum value of f over all measure positions with support

ess_minimum(f, tol=0.0)

calculate the minimum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the minimum value of f over all measure positions with support

ess_ptp(f, tol=0.0)

calculate the spread for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the spread of the values of f over all measure positions with support

expect(f)

calculate the expectation for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expectation of f over all measure positions

expect_var(f)

calculate the expected variance for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expected variance of f over all measure positions

property mass

the sum of the weights

Type

readonly

maximum(f)

calculate the maximum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the maximum value of f over all measure positions

minimum(f)

calculate the minimum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the minimum value of f over all measure positions

normalize()

normalize the weights to 1.0

Parameters

None

Returns

None
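
A minimal sketch, assuming scalar positions for the point masses of a 1-D measure:

>>> m = measure([point_mass(0.0, 1.0), point_mass(1.0, 3.0)])
>>> m.normalize()   # rescale the weights to sum to 1.0
>>> print(m.mass)
1.0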

property npts

the number of point masses in the measure

Type

readonly

property positions

a list of positions for all point masses in the measure

ptp(f)

calculate the spread for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the spread of the values of f over all measure positions

property range

|max - min| for the positions

set_expect(expected, f, bounds=None, constraints=None, **kwds)

impose an expectation on the measure by adjusting the positions

Parameters
  • expected (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expectation E is calculated by minimizing mean(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target mean expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_mean_and_var(expected, f, bounds=None, constraints=None, **kwds)

impose expected mean and var on the measure by adjusting the positions

Parameters
  • expected (tuple(float)) – (expected mean, expected var)

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_var(expected, f, bounds=None, constraints=None, **kwds)

impose an expected variance on the measure by adjusting the positions

Parameters
  • expected (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected var E is calculated by minimizing var(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target variance expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

support(tol=0)

get the positions with non-zero weight (i.e. support)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of positions with support

support_index(tol=0)

get the indices where there is support (i.e. non-zero weight)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of indices where there is support

property var

mean(|positions - mean(positions)|**2)

property weights

a list of weights for all point masses in the measure

class product_measure(iterable=(), /)

Bases: list

a N-d measure-theoretic product of discrete measures

Parameters

iterable (list) – a list of mystic.math.discrete.measure objects

Notes

  • all measures are treated as if they are orthogonal

  • assumes it contains only mystic.math.discrete.measure objects

  • assumes len(product_measure.positions) == len(product_measure.weights)

  • relies on constraints to impose notions such as sum(weights) == 1.0

  • relies on constraints to impose expectation (within acceptable deviation)

  • positions are (xi,yi,zi) with weights (wxi,wyi,wzi), where weight wxi at xi should be the same for each (yj,zk). Similarly for each wyi and wzi.

property center_mass

sum of weights * positions

differs_by_one(ith, all=True, index=True)

get the coordinates where the associated binary string differs by exactly one index

Parameters
  • ith (int) – the target index

  • all (bool, default=True) – if False, only return results for indices < i

  • index (bool, default=True) – if True, return the indices of the results instead of the results themselves

Returns

the coordinates where the associated binary string differs by one, or if index is True, return the corresponding indices

ess_maximum(f, tol=0.0)

calculate the maximum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the maximum value of f over all measure positions with support

ess_minimum(f, tol=0.0)

calculate the minimum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the minimum value of f over all measure positions with support

ess_ptp(f, tol=0.0)

calculate the spread for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the spread of values of f over all measure positions with support

expect(f)

calculate the expectation for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expectation of f over all measure positions

expect_var(f)

calculate the expected variance for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expected variance of f over all measure positions

flatten()

convert a product measure to a single list of parameters

Parameters

None

Returns

a list of parameters

Notes

Given product_measure.pts = (M, N, ...), then the returned list is params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params will have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.

load(params, pts)

load the product measure from a list of parameters

Parameters
  • params (list(float)) – parameters corresponding to N 1D discrete measures

  • pts (tuple(int)) – number of point masses in each of the discrete measures

Returns

the product measure itself

Return type

self (product_measure)

Notes

To append len(pts) new discrete measures to the product measure, it is assumed params either corresponds to the correct number of weights and positions specified by pts, or params has additional values (typically output values) which will be ignored. It is assumed that len(params) >= 2 * sum(product_measure.pts).

Given the value of pts = (M, N, ...), it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.
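
For example, following the layout above with pts = (2, 1) (the values here are illustrative):

>>> params = [0.5, 0.5,   # wx1, wx2
...           0.0, 1.0,   # x1, x2
...           1.0, 3.0]   # wy1, y1
>>> c = product_measure().load(params, (2, 1))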

property mass

a list of weight norms

Type

readonly

maximum(f)

calculate the maximum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the maximum value of f over all measure positions

minimum(f)

calculate the minimum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the minimum value of f over all measure positions

property npts

the total number of point masses in the product measure

Type

readonly

pof(f)

calculate probability of failure for a given function

Parameters

f (func) – a function returning True for ‘success’ and False for ‘failure’

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0 or False)
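
A minimal sketch: with uniform weights from compose, three of the four equally-weighted positions fail the test below, so the expected result is 0.75:

>>> c = compose([[0.0, 1.0], [0.0, 1.0]])
>>> print(c.pof(lambda x: sum(x) > 1.5))   # only (1.0, 1.0) succeeds
0.75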

property pos

a list of positions for each discrete measure

Type

readonly

property positions

a list of positions for all point masses in the product measure

ptp(f)

calculate the spread for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the spread for values of f over all measure positions

property pts

the number of point masses for each discrete measure

Type

readonly

sampled_expect(f, npts=10000)

use sampling to calculate expected value for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the expected value, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_maximum(f, npts=10000)

use sampling to calculate ess_maximum for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the ess_maximum, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_minimum(f, npts=10000)

use sampling to calculate ess_minimum for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the sampled ess_minimum, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_pof(f, npts=10000)

use sampling to calculate probability of failure for a given function

Parameters
  • f (func) – a function returning True for ‘success’ and False for ‘failure’

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0 or False)

sampled_ptp(f, npts=10000)

use sampling to calculate ess_|maximum - minimum| for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the sampled |ess_maximum - ess_minimum|, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_support(npts=10000)

randomly select support points from the underlying discrete measures

Parameters

npts (int, default=10000) – the number of sampled points

Returns

a list of len(product_measure) lists, each of length npts

select(*index, **kwds)

generate product measure positions for the selected position indices

Parameters

index (tuple(int)) – tuple of position indices

Returns

a list of product measure positions for the selected indices

Examples

>>> r
[[9, 8], [1, 3], [4, 2]]
>>> r.select(*range(r.npts))
[(9, 1, 4), (8, 1, 4), (9, 3, 4), (8, 3, 4), (9, 1, 2), (8, 1, 2), (9, 3, 2), (8, 3, 2)]
>>>
>>> _pack(r)
[(9, 1, 4), (8, 1, 4), (9, 3, 4), (8, 3, 4), (9, 1, 2), (8, 1, 2), (9, 3, 2), (8, 3, 2)]

Notes

This only works for product measures of dimension 2^K

set_expect(expected, f, bounds=None, constraints=None, **kwds)

impose an expectation on the measure by adjusting the positions

Parameters
  • expected (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expectation E is calculated by minimizing mean(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target mean expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_mean_and_var(expected, f, bounds=None, constraints=None, **kwds)

impose expected mean and var on the measure by adjusting the positions

Parameters
  • expected (tuple(float)) – (expected mean, expected var)

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_var(expected, f, bounds=None, constraints=None, **kwds)

impose an expected variance on the measure by adjusting the positions

Parameters
  • expected (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected var E is calculated by minimizing var(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target variance expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

support(tol=0)

get the positions with non-zero weight (i.e. support)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of positions with support

support_index(tol=0)

get the indices where there is support (i.e. non-zero weight)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of indices where there is support

update(params)

update the product measure from a list of parameters

Parameters

params (list(float)) – parameters corresponding to N 1D discrete measures

Returns

the product measure itself

Return type

self (product_measure)

Notes

The dimensions of the product measure will not change upon update, and it is assumed params either corresponds to the correct number of weights and positions for the existing product_measure, or params has additional values (typically output values) which will be ignored. It is assumed that len(params) >= 2 * sum(product_measure.pts).

If product_measure.pts = (M, N, ...), then it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.

property weights

a list of weights for all point masses in the product measure

property wts

a list of weights for each discrete measure

Type

readonly

class scenario(pm=None, values=None)

Bases: mystic.math.discrete.product_measure

a N-d product measure with associated data values

A scenario is a measure-theoretic product of discrete measures that also includes a list of associated values, with the values corresponding to measured or synthetic data for each measure position. Each point mass in the product measure is paired with a value, and thus, essentially, a scenario is equivalent to a mystic.math.legacydata.dataset stored in a product_measure representation.

Parameters
  • pm (mystic.math.discrete.product_measure, default=None) – a product measure

  • values (list(float), default=None) – values associated with each position

Notes

  • all measures are treated as if they are orthogonal

  • relies on constraints to impose notions such as sum(weights) == 1.0

  • relies on constraints to impose expectation (within acceptable deviation)

  • positions are (xi,yi,zi) with weights (wxi,wyi,wzi), where weight wxi at xi should be the same for each (yj,zk). Similarly for each wyi and wzi.

flatten(all=True)

convert a scenario to a single list of parameters

Parameters

all (bool, default=True) – if True, append the scenario values

Returns

a list of parameters

Notes

Given scenario.pts = (M, N, ...), then the returned list is params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params will have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the scenario. If all is True, then the scenario.values will be appended to the list of parameters.

load(params, pts)

load the scenario from a list of parameters

Parameters
  • params (list(float)) – parameters corresponding to N 1D discrete measures

  • pts (tuple(int)) – number of point masses in each of the discrete measures

Returns

the scenario itself

Return type

self (scenario)

Notes

To append len(pts) new discrete measures to the scenario, it is assumed params either corresponds to the correct number of weights and positions specified by pts, or params has additional values which will be saved as the scenario.values. It is assumed that len(params) >= 2 * sum(scenario.pts).

Given the value of pts = (M, N, ...), it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired scenario. Any remaining parameters will be treated as scenario.values.

mean_value()

calculate the mean of the associated values for a scenario

Parameters

None

Returns

the weighted mean of the scenario values

pof_value(f)

calculate probability of failure for a given function

Parameters

f (func) – a function returning True for ‘success’ and False for ‘failure’

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of values (for example, scenario.values) and return a single value (e.g. 0.0 or False)

set_feasible(data, cutoff=0.0, bounds=None, constraints=None, with_self=True, **kwds)

impose shortness with respect to the given data points

This function attempts to minimize the infeasibility between observed data and the scenario of synthetic data by performing an optimization on w,x,y over the given bounds.

Parameters
  • data (mystic.math.discrete.scenario) – a dataset of observed points

  • cutoff (float, default=0.0) – maximum acceptable deviation from shortness

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

  • with_self (bool, default=True) – if True, shortness is also self-consistent

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(infeasibility).

Returns

None

Notes

  • both scenario.positions and scenario.values may be adjusted.

  • if with_self is True, shortness will be measured not only from the scenario to the given data, but also between scenario datapoints.

set_mean_value(m)

set the mean for the associated values of a scenario

Parameters

m (float) – the target weighted mean of the scenario values

Returns

None

set_valid(model, cutoff=0.0, bounds=None, constraints=None, **kwds)

impose model validity on a scenario by adjusting positions and values

This function attempts to minimize the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x'), by performing an optimization on w,x,y over the given bounds.

Parameters
  • model (func) – a model y' = F(x') that approximates reality y = G(x)

  • cutoff (float, default=0.0) – acceptable model invalidity |y - F(x')|

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm

  • xtol (float, default=0.0) – maximum acceptable pointwise graphical distance between model and reality.

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(graphical distances).

Returns

None

Notes

xtol defines the n-dimensional base of a pillar of height cutoff, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization to find a set of points where the model is valid. Here, tol is used to set the optimization termination for minimizing the sum(graphical_distances), while cutoff is used in defining the graphical distance between x,y and x',F(x').

short_wrt_data(data, L=None, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to the given data

Parameters
  • data (list) – a list of data points or dataset to compare against.

  • L (float, default=None) – the lipschitz constant, if different than in data.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

short_wrt_self(L, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to the scenario itself

Parameters
  • L (float) – the lipschitz constant.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

update(params)

update the scenario from a list of parameters

Parameters

params (list(float)) – parameters corresponding to N 1D discrete measures

Returns

the scenario itself

Return type

self (scenario)

Notes

The dimensions of the scenario will not change upon update, and it is assumed params either corresponds to the correct number of weights and positions for the existing scenario, or params has additional values which will be saved as the scenario.values. It is assumed that len(params) >= 2 * sum(scenario.pts).

If scenario.pts = (M, N, ...), then it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired scenario.

valid_wrt_model(model, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for scenario validity with respect to the model

Parameters
  • model (func) – the model function, y' = F(x').

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Notes

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

property values

a list of values corresponding to output data for all point masses in the underlying product measure

distance module

distances and norms for the legacy data module

Lnorm(weights, p=1, axis=None)

calculate L-p norm of weights

Parameters
  • weights (array(float)) – an array of weights

  • p (int, default=1) – the power of the p-norm, where p in [0,inf]

  • axis (int, default=None) – axis used to take the norm along

Returns

a float distance norm for the weights
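
For example (a simple L-2 norm):

>>> print(Lnorm([3.0, 4.0], p=2))   # (3**2 + 4**2)**0.5
5.0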

absolute_distance(x, xp=None, pair=False, dmin=0)

pointwise (or pairwise) absolute distance

pointwise = |x.T[:,newaxis] - x'.T|  or  pairwise = |x.T - x'.T|.T

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

Returns

an array of absolute distances between points

Notes

  • x'==x is symmetric with zeros on the diagonal

  • use dmin=2 for the forced upconversion of 1-D arrays

chebyshev(x, xp=None, pair=False, dmin=0, axis=None)

infinity norm distance between points in euclidean space

d(inf) =  max(|x[0] - x[0]'|, |x[1] - x[1]'|, ..., |x[n] - x[n]'|)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

euclidean(x, xp=None, pair=False, dmin=0, axis=None)

L-2 norm distance between points in euclidean space

d(2) = sqrt(sum(|x[0] - x[0]'|^2, |x[1] - x[1]'|^2, ..., |x[n] - x[n]'|^2))

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1
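
A minimal sketch of the most common usage noted above (pair=False, axis=0):

>>> d = euclidean([0.0, 0.0], [3.0, 4.0], axis=0)   # sqrt(3**2 + 4**2); expect 5.0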

graphical_distance(model, points, **kwds)

find the radius(x') that minimizes the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x').

Parameters
  • model (func) – a model y' = F(x') that approximates reality y = G(x)

  • points (mystic.math.legacydata.dataset) – a dataset, defines y = G(x)

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Returns

the radius (the minimum distance x,G(x) to x',F(x') for each x)

Notes

points can be a mystic.math.legacydata.dataset or a list of mystic.math.legacydata.datapoint objects.

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

hamming(x, xp=None, pair=False, dmin=0, axis=None)

zero ‘norm’ distance between points in euclidean space

d(0) =  sum(x[0] != x[0]', x[1] != x[1]', ..., x[n] != x[n]')

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1
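
A minimal sketch, counting the components that differ (usage assumptions as for euclidean above):

>>> d = hamming([1, 2, 3], [1, 0, 3], axis=0)   # one component differs; expect 1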

infeasibility(distance, cutoff=0.0)

amount by which the distance exceeds the given cutoff distance

Parameters
  • distance (array) – the measure of feasibility for each point

  • cutoff (float, default=0.0) – maximum acceptable distance

Returns

an array of distances by which each point is infeasible

is_feasible(distance, cutoff=0.0)

determine if the distance exceeds the given cutoff distance

Parameters
  • distance (array) – the measure of feasibility for each point

  • cutoff (float, default=0.0) – maximum acceptable distance

Returns

bool array, with True where the distance is less than cutoff

lipschitz_distance(L, points1, points2, **kwds)

calculate the lipschitz distance between two sets of datapoints

Parameters
  • L (list) – a list of lipschitz constants

  • points1 (mystic.math.legacydata.dataset) – a dataset

  • points2 (mystic.math.legacydata.dataset) – a second dataset

  • tol (float, default=0.0) – maximum acceptable deviation from shortness

  • cutoff (float, default=tol) – zero out distances less than cutoff

Returns

a list of lipschitz distances

Notes

Both points1 and points2 can be a mystic.math.legacydata.dataset, or a list of mystic.math.legacydata.datapoint objects, or a list of lipschitzcone.vertex objects (from mystic.math.legacydata). cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is tol, 0.0, or None.

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

lipschitz_metric(L, x, xp=None)

sum of lipschitz-weighted distance between points

d = sum(L[i] * |x[i] - x'[i]|)

Parameters
  • L (array) – an array of Lipschitz constants, L

  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

Returns

an array of absolute distances between points

manhattan(x, xp=None, pair=False, dmin=0, axis=None)

L-1 norm distance between points in euclidean space

d(1) = sum(|x[0] - x[0]'|, |x[1] - x[1]'|, ..., |x[n] - x[n]'|)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

minkowski(x, xp=None, pair=False, dmin=0, p=3, axis=None)

p-norm distance between points in euclidean space

d(p) = sum(|x[0] - x[0]'|^p, |x[1] - x[1]'|^p, ..., |x[n] - x[n]'|^p)^(1/p)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • p (int, default=3) – value of p for the p-norm

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

grid module

tools for generating points on a grid

fillpts(lb, ub, npts, data=None, rtol=None, dist=None)

takes lower and upper bounds (e.g. lb = [0,3], ub = [2,4]), finds npts points that are at least rtol away from legacy data, and produces a list of sample points s = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points
data – a list of legacy sample points
rtol – target radial distance from each point
dist – a mystic.math.Distribution instance (or list of Distributions)

Notes: if rtol is None, use max rtol; if rtol < 0, use quick-n-dirty method

gridpts(q, dist=None)

takes a list of lists of arbitrary length q = [[1,2],[3,4]] and produces a list of gridpoints g = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

q – a list of lists of integers denoting grid points
dist – a mystic.math.Distribution instance (or list of Distributions)

Notes

if a mystic.math.Distribution is provided, use it to inject randomness
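
Using the example above (output order as documented):

>>> gridpts([[1, 2], [3, 4]])
[[1, 3], [1, 4], [2, 3], [2, 4]]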

randomly_bin(N, ndim=None, ones=True, exact=True)

generate N bins randomly gridded across ndim dimensions

Inputs:

N – integer number of bins, where N = prod(bins)
ndim – integer length of bins, thus ndim = len(bins)
ones – if False, prevent bins from containing “1s”, wherever possible
exact – if False, find N-1 bins for prime numbers

samplepts(lb, ub, npts, dist=None)

takes lower and upper bounds (e.g. lb = [0,3], ub = [2,4]) and produces a list of sample points s = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points
dist – a mystic.math.Distribution instance (or list of Distributions)

integrate module

math tools related to integration

integrate(f, lb, ub)

Returns the integral of an n-dimensional function f from lb to ub

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

If scipy is installed and the number of dimensions is 3 or less, scipy.integrate is used; otherwise, mystic’s n-dimensional Monte Carlo integrator is used.
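
For example (a 1-D case, so scipy.integrate is used here if scipy is installed):

>>> a = integrate(lambda x: x[0]**2, [0.0], [1.0])   # expect a ~= 1/3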

integrated_mean(f, lb, ub)

calculate the integrated mean of a function f

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

integrated_variance(f, lb, ub)

calculate the integrated variance of a function f

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

monte_carlo_integrate(f, lb, ub, n=10000)

Returns the integral of an m-dimensional function f from lb to ub using a Monte Carlo integration of n points

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
n – the number of points to sample [Default is n=10000]
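
A minimal sketch (the result is stochastic, so only an approximate value is expected):

>>> a = monte_carlo_integrate(lambda x: x[0]**2, [0.0], [1.0], n=100000)   # a ~= 1/3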

References

  1. “A Primer on Scientific Programming with Python”, by Hans Petter Langtangen, pages 443-445, 2014.

  2. http://en.wikipedia.org/wiki/Monte_Carlo_integration

  3. http://math.fullerton.edu/mathews/n2003/MonteCarloMod.html

interpolate module

functions related to interpolation: 1-D and n-D interpolation, building a grid of points, calculating gradients

axes(mins, maxs, npts=None)

generate a tuple of arrays defining axes on a grid, given bounds

Input:

mins: a tuple of lower bounds for coordinates on the grid
maxs: a tuple of upper bounds for coordinates on the grid
npts: a tuple of grid shape, or integer number of points on the grid

Output:

a tuple of arrays defining the axes of the coordinate grid

Note

If npts is not provided, a default of 50 points in each direction is used.

extrapolate(x, z=None, method=None, mins=None, maxs=None, **kwds)

extrapolate a bounding-box for the given points

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts,)
method: a function f(x) to generate new points; if None, use f(x) = nan
mins: a list of floats defining minima of the bounding box, or None
maxs: a list of floats defining maxima of the bounding box, or None
all: if True, include the initial (x,z) points in the return

Output:

a tuple of (x,z) containing the requested boundbox points

Note

if z is None, return a tuple (x,) with the requested boundbox points.

Note

if mins is None, then use the minima found in x. Similarly for maxs.

Note

method can take a function, None, a boolean, or a string. If method is True, interpolate a cost function using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). If method is False, return the original input. Alternately, method can be any one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

gradient(x, fx, method=None, approx=True, extrap=False, **kwds)

find gradient of fx at x, where fx is a function z=fx(*x) or an array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of dimensions x.shape, gradient of the points at (x,fx)

Note

if approx is True, use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

grid(*axes)

generate tuple of (irregular) coordinate grids, given coordinate axes

Input:

axes: a tuple of arrays defining the axes of the coordinate grid

Output:

the resulting coordinate grid

hessian(x, fx, method=None, approx=True, extrap=False, **kwds)

find hessian of fx at x, where fx is a function z=fx(*x) or an array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of shape indicated in NOTE, hessian of the points at (x,fx)

Note

output will be of the form x.shape+(dim,), where hess[:,i,j] corresponds to the second derivative z_{i,j} with i,j in range(dim). For a 1D array x, output will be a 1D array of the same length.

Note

if approx is True, first use interpolation to build gradient functions in each direction, then use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

method string can provide either one or two methods (i.e. ‘rbf’ or ‘rbf, cubic’), where if two methods are provided, the first will be used to interpolate f(x) and the second will be used to interpolate the gradient of f(x).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

hessian_diagonal(x, fx, method=None, approx=True, extrap=False, **kwds)

find hessian diagonal of fx at x, with fx a function z=fx(*x) or array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of dimensions x.shape, hessian diagonal of the points at (x,fx)

Note

if approx is True, first use interpolation to build gradient functions in each direction, then use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method, which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

interpf(x, z, method=None, extrap=False, arrays=False, **kwds)

interpolate (x,z) to generate f, where z=f(*x)

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts, N)
method: string for kind of interpolator
extrap: if True, extrapolate a bounding box (can reduce # of nans)
arrays: if True, z = f(*x) is a numpy array; otherwise don’t use arrays
axis: int in [0,N], the axis of z to interpolate (all, by default)
map: map function, used to parallelize interpf for each axis

Output:

interpolated function f, where z=f(*x)

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

Note

additional keyword arguments (epsilon, smooth, norm) are available for use with a Rbf interpolator. See mystic.math.interpolate.Rbf for more details.
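
As a point of reference for the 1D fallback noted above, numpy’s interp performs simple piecewise-linear interpolation:

>>> import numpy as np
>>> float(np.interp(2.5, [1, 2, 3], [1, 4, 9]))    # linear between (2,4) and (3,9)
6.5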

interpolate(x, z, xgrid, method=None, extrap=False, arrays=True, **kwds)

interpolate to find z = f(x) sampled at points defined by xgrid

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts,)
xgrid: (irregular) coordinate grid on which to sample z = f(x)
method: string for kind of interpolator
extrap: if True, extrapolate a bounding box (can reduce # of nans)
arrays: if True, z = f(x) is a numpy array; otherwise don’t return arrays

Output:

interpolated points on a grid, where z = f(x) has been sampled on xgrid

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

Note

additional keyword arguments (epsilon, smooth, norm) are available for use with a Rbf interpolator. See mystic.math.interpolate.Rbf for more details.
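
For comparison, grid-sampling an interpolant with scipy (when installed) looks like the following sketch, using scipy.interpolate.griddata:

>>> import numpy as np
>>> from scipy.interpolate import griddata
>>> x = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
>>> z = x[:, 0] + x[:, 1]
>>> griddata(x, z, np.array([[0.5, 0.5]]), method='linear')
array([1.])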

sort_axes(*axes, **kwds)

sort axes along the selected primary axis

Input:

axes: a tuple of arrays of points along each axis
axis: an integer corresponding to the axis upon which to sort

Output:

a tuple of arrays sorted with regard to the selected axis
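
The effect is that of applying a single argsort, taken along the selected axis, to every axis; a sketch (not mystic’s code):

>>> import numpy as np
>>> x = np.array([3, 1, 2]); y = np.array([30, 10, 20])
>>> i = np.argsort(x)    # sort on the primary axis (axis=0)
>>> x[i], y[i]
(array([1, 2, 3]), array([10, 20, 30]))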

legacydata module

data structures for legacy data observations of lipschitz functions

class datapoint(position, value=None, id=None, lipschitz=None)

Bases: object

n-d data point with position and value

queries:

p.value – returns value
p.position – returns position

settings:

p.value = v1 – set the value
p.position = (x1, x2, …, xn) – set the position

Notes

a datapoint can have an assigned id and cone; also has utilities for comparison against other datapoints (intended for use in a dataset)

collisions(pts)

return True where a point exists with same ‘position’ and different ‘value’

conflicts(pts)

return True where a point exists with same ‘id’ but different ‘raw’

duplicates(pts)

return True where a point exists with same ‘raw’ and ‘id’

property position
repeats(pts)

return True where a point exists with same ‘raw’ but different ‘id’

property value
class dataset(iterable=(), /)

Bases: list

a collection of data points

s = dataset([point1, point2, …, pointN])

queries:

s.values – returns list of values
s.coords – returns list of positions
s.ids – returns list of ids
s.raw – returns list of points
s.npts – returns the number of points
s.lipschitz – returns list of lipschitz constants

settings:

s.lipschitz = [s1, s2, …, sn] – sets lipschitz constants

short  -- check for shortness with respect to given data (or self)
valid  -- check for validity with respect to given model
update  -- update the positions and values in the dataset
load  -- load a list of positions and a list of values to the dataset
fetch  -- fetch the list of positions and the list of values in the dataset
intersection  -- return the set intersection between self and query
filter  -- return dataset entries where mask array is True
has_id  -- return True where dataset ids are in query
has_position  -- return True where dataset coords are in query
has_point  -- return True where dataset points are in query
has_datapoint  -- return True where dataset entries are in query

Notes

  • datapoints should not be edited, except possibly for id

  • assumes that s.n = len(s.coords) == len(s.values)

  • all datapoints in a dataset should have the same cone.slopes

property collisions
property conflicts
property coords
property duplicates
fetch()

fetch the list of positions and the list of values in the dataset

filter(mask)

return dataset entries where mask array is True

Inputs:

mask – a boolean array of the same length as dataset

has_datapoint(query)

return True where dataset entries are in query

Notes

query must be iterable

has_id(query)

return True where dataset ids are in query

Notes

query must be iterable

has_point(query)

return True where dataset points are in query

Notes

query must be iterable

has_position(query)

return True where dataset coords are in query

Notes

query must be iterable

property ids
intersection(query)

return the set intersection between self and query

property lipschitz
load(positions, values, ids=[])

load a list of positions and a list of values to the dataset

Returns

the dataset itself

Return type

self (dataset)

Notes

positions and values provided must be iterable

property npts
property raw
property repeats
short(data=None, L=None, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to given data (or self)

Parameters
  • data (list, default=None) – a list of data points, or the dataset itself.

  • L (float, default=None) – the lipschitz constant, or the dataset’s constant.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. Some error in shortness can be allowed with a short tolerance tol, for which the point x,y may be some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.
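
In 1D (and ignoring tol and cutoff), the cone test reduces to the familiar lipschitz inequality; a rough sketch:

>>> # the pair is 'short' when neither point is inside the other's cone,
>>> # i.e. roughly when |y1 - y2| <= L * |x1 - x2|
>>> L = 2.0
>>> x1, y1 = 0.0, 0.0
>>> x2, y2 = 1.0, 1.5
>>> abs(y1 - y2) <= L * abs(x1 - x2)
True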

update(positions, values)

update the positions and values in the dataset

Returns

the dataset itself

Return type

self (dataset)

Notes

positions and values provided must be iterable

valid(model, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for validity with respect to given model

Parameters
  • model (func) – the model function, y' = F(x').

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Notes

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

property values
class lipschitzcone(datapoint, slopes=None)

Bases: list

Lipschitz double cone around a data point, with vertex and slope

queries:

vertex – coordinates of lipschitz cone vertex
slopes – lipschitz slopes for the cone (should be same dimension as ‘vertex’)

contains -- return True if a given point is within the cone
distance -- sum of lipschitz-weighted distance between a point and the vertex
contains(point)

return True if a given point is within the cone

distance(point)

sum of lipschitz-weighted distance between a point and the vertex
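
For example, assuming a simple slope-weighted L1 form for the distance (a sketch, not mystic’s code):

>>> vertex = [0.0, 0.0]; slopes = [2.0, 1.0]
>>> point = [1.0, 3.0]
>>> sum(L*abs(p - v) for (L, p, v) in zip(slopes, point, vertex))
5.0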

load_dataset(filename, filter=None)

read dataset from selected file

filename – string name of dataset file
filter – tuple of points to select (‘False’ to ignore filter stored in file)

class point(position, value)

Bases: object

n-d data point with position and value but no id (i.e. ‘raw’)

queries:

p.value – returns value
p.position – returns position
p.rms – returns the square root of sum of squared position

settings:

p.value = v1 – set the value
p.position = (x1, x2, …, xn) – set the position

property rms
save_dataset(data, filename='dataset.txt', filter=None, new=True)

save dataset to selected file

data – data set
filename – string name of dataset file
filter – tuple, filter to apply to dataset upon reading
new – boolean, False if appending to existing file

measures module

Methods to support discrete measures

ess_maximum(f, samples, weights=None, tol=0.0)

calculate the max of function for support on the given list of points

ess_maximum(f,x,w) = max(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the maximum output value for a function at the given support points
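
Conceptually (a sketch, with tol = 0.0):

>>> f = lambda x: x[0]**2
>>> samples = [[1.], [2.], [3.]]
>>> weights = [0.5, 0.5, 0.0]
>>> max(f(x) for (x, w) in zip(samples, weights) if w > 0.0)
4.0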

ess_minimum(f, samples, weights=None, tol=0.0)

calculate the min of function for support on the given list of points

ess_minimum(f,x,w) = min(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the minimum output value for a function at the given support points

ess_ptp(f, samples, weights=None, tol=0.0)

calculate the spread of function for support on the given list of points

ess_ptp(f,x,w) = max(f(support(x,w))) - min(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the spread in output value for a function at the given support points

expectation(f, samples, weights=None, tol=0.0)

calculate the (weighted) expectation of a function for a list of points

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expectation for a list of sample points
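
Conceptually, with weights normalized to 1.0 (a sketch, with tol = 0.0):

>>> f = lambda x: x[0]
>>> samples = [[1.0], [3.0]]
>>> weights = [0.25, 0.75]
>>> sum(w*f(x) for (x, w) in zip(samples, weights))
2.5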

expected_std(f, samples, weights=None, tol=0.0)

calculate the (weighted) expected standard deviation of a function

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expected standard deviation of f on a list of sample points

expected_variance(f, samples, weights=None, tol=0.0)

calculate the (weighted) expected variance of a function

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expected variance of f on a list of sample points

impose_collapse(pairs, samples, weights)

collapse the weight and position of each pair (i,j) in pairs

Collapse is defined as weights[i] += weights[j] and weights[j] = 0, with samples[j] = samples[i] (as in the examples below).

Inputs:

samples – a list of sample points
weights – a list of sample weights
pairs – set of tuples of indices (i,j) where collapse occurs

For example:
>>> impose_collapse({(0,1),(0,2)},[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.5999999999999996, 1.5999999999999996, 1.5999999999999996, 4.5999999999999996, 5.5999999999999996], [0.6000000000000001, 0.0, 0.0, 0.2, 0.2])
>>> impose_collapse({(0,1),(3,4)},[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.3999999999999999, 1.3999999999999999, 3.3999999999999999, 4.4000000000000004, 4.4000000000000004], [0.4, 0.0, 0.2, 0.4, 0.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_expectation(m, f, npts, bounds=None, weights=None, **kwds)

impose a given expectation value E on a given function f, where E = m +/- tol and E = mean(f(x)) for x in bounds

Parameters
  • m (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from m

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expectation E

Notes

Expectation value E is calculated by minimizing the absolute value of mean(f(x)) - m over the given bounds, and the optimization will terminate when E is found within deviation tol of the target mean m. If tol is not provided, then a relative deviation of 1% of m will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with mean +/- dev imposed
>>> mean = 2.0;  dev = 0.01
>>> samples = impose_expectation(mean, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expectation value for the samples
>>> expectation(f, samples)
2.000010010122465
impose_expected_mean_and_variance(param, f, npts, bounds=None, weights=None, **kwds)

impose a given expected mean E on a given function f, where E = m +/- tol and E = mean(f(x)) for x in bounds. Additionally, impose a given expected variance R on f, where R = v +/- tol and R = variance(f(x)) for x in bounds.

Parameters
  • param (tuple(float)) – target parameters, (mean, variance)

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from m and v

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected mean E and variance R

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with mean and variance imposed
>>> mean = 5.0;  var = 2.0;  tol = 0.01
>>> samples = impose_expected_mean_and_variance((mean,var), f, (nx,ny,nz),
...                                             (lb,ub), tol=tol)
>>>
>>> # test the results by calculating the expected mean for the samples
>>> expected_mean(f, samples)
>>>
>>> # test the results by calculating the expected variance for the samples
>>> expected_variance(f, samples)
2.000010010122465
impose_expected_std(s, f, npts, bounds=None, weights=None, **kwds)

impose a given expected std E on a given function f, where E = s +/- tol and E = std(f(x)) for x in bounds

Parameters
  • s (float) – target expected standard deviation

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from s

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected standard deviation E

Notes

Expected std E is calculated by minimizing the absolute value of std(f(x)) - s over the given bounds, and the optimization will terminate when E is found within deviation tol of the target std s. If tol is not provided, then a relative deviation of 1% of s will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with std +/- dev imposed
>>> std = 2.0;  dev = 0.01
>>> samples = impose_expected_std(std, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expected std for the samples
>>> expected_std(f, samples)
2.000010010122465
impose_expected_variance(v, f, npts, bounds=None, weights=None, **kwds)

impose a given expected variance E on a given function f, where E = v +/- tol and E = variance(f(x)) for x in bounds

Parameters
  • v (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from v

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected variance E

Notes

Expected variance E is calculated by minimizing the absolute value of variance(f(x)) - v over the given bounds, and the optimization will terminate when E is found within deviation tol of the target variance v. If tol is not provided, then a relative deviation of 1% of v will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with variance +/- dev imposed
>>> var = 2.0;  dev = 0.01
>>> samples = impose_expected_variance(var, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expected variance for the samples
>>> expected_variance(f, samples)
2.000010010122465
impose_mad(s, samples, weights=None)

impose a median absolute deviation on a list of (weighted) points

(this function is ‘median-preserving’)

Inputs:

s – the target median absolute deviation
samples – a list of sample points
weights – a list of sample weights

impose_mean(m, samples, weights=None)

impose a mean on a list of (weighted) points

Parameters
  • m (float) – the target mean

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted mean

Notes

this function does not alter the weighted range or the weighted variance
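
The underlying idea is a uniform shift of the samples, which leaves the range and variance untouched; a sketch (not mystic’s code):

>>> samples = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> shift = 5.0 - sum(samples)/len(samples)    # target mean minus current mean
>>> [x + shift for x in samples]
[3.0, 4.0, 5.0, 6.0, 7.0]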

impose_median(m, samples, weights=None)

impose a median on a list of (weighted) points

(this function is ‘range-preserving’ and ‘mad-preserving’)

Inputs:

m – the target median
samples – a list of sample points
weights – a list of sample weights

impose_moment(m, samples, weights=None, order=1, tol=0, skew=None)

impose the selected moment on a list of (weighted) points

Parameters
  • m (float) – the target moment

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

  • skew (bool, default=None) – if True, allow skew in the samples

Returns

a list of sample points with the desired weighted moment

Notes

this function does not alter the weighted mean

if skew is None, then allow skew when order is odd

impose_product(mass, weights, zsum=False, zmass=1.0)

impose a product on a list of points

Inputs:

mass – target product of weights
weights – a list of sample weights
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

impose_reweighted_mean(m, samples, weights=None, solver=None)

impose a mean on a list of points by reweighting weights

impose_reweighted_std(s, samples, weights=None, solver=None)

impose a standard deviation on a list of points by reweighting weights

impose_reweighted_variance(v, samples, weights=None, solver=None)

impose a variance on a list of points by reweighting weights

impose_spread(r, samples, weights=None)

impose a range on a list of (weighted) points

Parameters
  • r (float) – the target range

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted range

Notes

this function does not alter the weighted mean

impose_std(s, samples, weights=None)

impose a standard deviation on a list of (weighted) points

Parameters
  • s (float) – the target standard deviation

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted standard deviation

Notes

this function does not alter the weighted mean
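
The underlying idea is a rescaling of the samples about the mean, which leaves the mean untouched; a sketch (not mystic’s code):

>>> import numpy as np
>>> x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> m, s = x.mean(), 1.0                  # current mean, target std
>>> y = m + (x - m) * (s / x.std())
>>> print(round(y.mean(), 12), round(y.std(), 12))
3.0 1.0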

impose_sum(mass, weights, zsum=False, zmass=1.0)

impose a sum on a list of points

Inputs:

mass – target sum of weights
weights – a list of sample weights
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

impose_support(index, samples, weights)

set all weights not appearing in ‘index’ to zero

Inputs:

samples – a list of sample points
weights – a list of sample weights
index – a list of desired support indices (weights will be non-zero)

For example:
>>> impose_support([0,1],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([2.5, 3.5, 4.5, 5.5, 6.5], [0.5, 0.5, 0.0, 0.0, 0.0])
>>> impose_support([0,1,2,3],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.5, 2.5, 3.5, 4.5, 5.5], [0.25, 0.25, 0.25, 0.25, 0.0])
>>> impose_support([4],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([-1.0, 0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0, 1.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_tmean(m, samples, weights=None, k=0, clip=False)

impose a trimmed mean (at k%) on a list of (weighted) points

(this function is ‘range-preserving’ and ‘tvariance-preserving’)

Inputs:

m – the target trimmed mean
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_tstd(s, samples, weights=None, k=0, clip=False)

impose a trimmed std (at k%) on a list of (weighted) points

(this function is ‘tmean-preserving’)

Inputs:

s – the target trimmed standard deviation
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_tvariance(v, samples, weights=None, k=0, clip=False)

impose a trimmed variance (at k%) on a list of (weighted) points

(this function is ‘tmean-preserving’)

Inputs:

v – the target trimmed variance
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_unweighted(index, samples, weights, nullable=True)

set all weights appearing in ‘index’ to zero

Inputs:

samples – a list of sample points
weights – a list of sample weights
index – a list of indices where weight is to be zero
nullable – if False, avoid null weights by reweighting non-index weights

For example:
>>> impose_unweighted([0,1,2],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([-0.5, 0.5, 1.5, 2.5, 3.5], [0.0, 0.0, 0.0, 0.5, 0.5])
>>> impose_unweighted([3,4],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([2.0, 3.0, 4.0, 5.0, 6.0], [0.33333333333333331, 0.33333333333333331, 0.33333333333333331, 0.0, 0.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_variance(v, samples, weights=None)

impose a variance on a list of (weighted) points

Parameters
  • v (float) – the target variance

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted variance

Notes

this function does not alter the weighted mean

impose_weight_norm(samples, weights, mass=1.0)

normalize the weights for a list of (weighted) points

(this function is ‘mean-preserving’)

Inputs:

samples – a list of sample points
weights – a list of sample weights
mass – float target of normalized weights
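
Conceptually, each weight is scaled by mass/sum(weights), leaving the sample positions (and hence the mean) untouched; a sketch:

>>> weights = [0.25, 0.25, 0.5, 1.0]
>>> scale = 1.0 / sum(weights)    # mass = 1.0
>>> [w * scale for w in weights]
[0.125, 0.125, 0.25, 0.5]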

kurtosis(samples, weights=None)

calculate the (weighted) kurtosis for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted kurtosis for a list of sample points

mad(samples, weights=None)

calculate the (weighted) median absolute deviation for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights

maximum(f, samples)

calculate the max of function for the given list of points

maximum(f,x) = max(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the maximum output value for a function at the given inputs

mean(samples, weights=None, tol=0)

calculate the (weighted) mean for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted mean for a list of sample points

median(samples, weights=None)

calculate the (weighted) median for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights

minimum(f, samples)

calculate the min of function for the given list of points

minimum(f,x) = min(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the minimum output value for a function at the given inputs

moment(samples, weights=None, order=1, tol=0)

calculate the (weighted) nth-order moment for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted nth-order moment for a list of sample points
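
Conceptually, the nth-order moment is the weighted average of (x - mean)**order; a sketch for order=2:

>>> x = [1.0, 2.0, 3.0]; w = [0.25, 0.5, 0.25]
>>> m = sum(wi*xi for (xi, wi) in zip(x, w))         # weighted mean = 2.0
>>> sum(wi*(xi - m)**2 for (xi, wi) in zip(x, w))    # 2nd-order moment
0.5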

norm(weights)

calculate the norm of a list of points

norm(x) = mean(x)

Parameters

weights (list) – a list of sample weights

Returns

the mean of the weights

normalize(weights, mass='l2', zsum=False, zmass=1.0)

normalize a list of points (e.g. normalize to 1.0)

Inputs:

weights – a list of sample weights
mass – float target of normalized weights (or string for Ln norm)
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

Notes: if mass=’l1’, will use the L1-norm; if mass=’l2’, will use the L2-norm; etc.

ptp(f, samples)

calculate the spread of function for the given list of points

ptp(f,x) = max(f(x)) - min(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the spread in output value for a function at the given inputs

skewness(samples, weights=None)

calculate the (weighted) skewness for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted skewness for a list of sample points

split_param(params, npts)

splits a flat parameter list into a flat list of weights and a flat list of positions; weights and positions are expected to have the same dimensions (given by npts)

Inputs:

params – a flat list of weights and positions (formatted as noted below)
npts – a tuple describing the shape of the target lists

For example:
>>> nx = 3;  ny = 2;  nz = 1
>>> par = ['wx']*nx + ['x']*nx + ['wy']*ny + ['y']*ny + ['wz']*nz + ['z']*nz
>>> weights, positions = split_param(par, (nx,ny,nz))
>>> weights
['wx','wx','wx','wy','wy','wz']
>>> positions
['x','x','x','y','y','z']

spread(samples)

calculate the range for a list of points

spread(x) = max(x) - min(x)

Parameters

samples (list) – a list of sample points

Returns

the range of the samples

standard_moment(samples, weights=None, order=1, tol=0)

calculate the (weighted) nth-order standard moment for a list of points

standard_moment(x,w,order) = moment(x,w,order)/std(x,w)^order

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted nth-order standard moment for a list of sample points

std(samples, weights=None)

calculate the (weighted) standard deviation for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted standard deviation for a list of sample points

support(samples, weights, tol=0)

get the positions which have non-zero weight

Parameters
  • samples (list) – a list of sample points

  • weights (list) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

a list of positions with non-zero weight

support_index(weights, tol=0)

get the indices of the positions which have non-zero weight

Parameters
  • weights (list) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

a list of indices of positions with non-zero weight

tmean(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed mean for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

tstd(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed standard deviation for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

tvariance(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed variance for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

variance(samples, weights=None)

calculate the (weighted) variance for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted variance for a list of sample points

weighted_select(samples, weights, mass=1.0)

randomly select a sample from weighted set of samples

Parameters
  • samples (list) – a list of sample points

  • weights (list) – a list of sample weights

  • mass (float, default=1.0) – sum of normalized weights

Returns

a randomly selected sample point
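
The behavior is akin to a weighted random draw; a sketch using the standard library (not mystic’s code):

>>> import random
>>> random.seed(0)
>>> random.choices([1, 2, 3], weights=[0.1, 0.1, 0.8], k=1)[0]
3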

poly module

tools for polynomial functions

poly1d(coeff)

generates a 1-D polynomial instance from a list of coefficients using numpy.poly1d(coeff)

polyeval(coeffs, x)

takes a list of coefficients and evaluation points, and returns f(x); thus, [a3, a2, a1, a0] yields a3 x^3 + a2 x^2 + a1 x + a0
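
For example, with coefficients ordered from highest to lowest degree (the same convention as numpy.polyval, shown here for illustration):

>>> import numpy as np
>>> int(np.polyval([3, 0, 1], 5))    # 3*5**2 + 0*5 + 1
76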

samples module

tools related to sampling

alpha(n, diameter, epsilon=0.01)
random_samples(lb, ub, npts=10000, dist=None, clip=False)

generate npts samples from the given distribution between given lb & ub

Inputs:

dist – a mystic.tools.Distribution instance (or list of Distributions)
lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points [default = 10000]
clip – if True, clip at bounds, else resample [default = False]

sample(f, lb, ub, npts=10000)

return number of failures and successes for some boolean function f

Inputs:

f – a function that returns True for ‘success’ and False for ‘failure’
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

sampled_mean(f, lb, ub, npts=10000)

use random sampling to calculate the mean of a function

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

sampled_pof(f, lb, ub, npts=10000)

use random sampling to calculate probability of failure for a function

Inputs:

f – a function that returns True for ‘success’ and False for ‘failure’
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]
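
Conceptually, the probability of failure is the fraction of sampled points for which f fails; a deterministic sketch (real use draws random points within the bounds):

>>> f = lambda x: x[0] <= 0.75                 # True is 'success'
>>> pts = [[i/10000] for i in range(10000)]    # stand-in for random samples in [0,1]
>>> sum(1 for p in pts if not f(p)) / len(pts)
0.2499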

sampled_prob(pts, lb, ub)

use sampling to calculate the probability that points are inside the given bounds

Inputs:

pts – a list of sample points
lb – a list of lower bounds
ub – a list of upper bounds

sampled_pts(pts, lb, ub)

determine the number of sample points inside the given bounds

Inputs:

pts – a list of sample points
lb – a list of lower bounds
ub – a list of upper bounds

sampled_variance(f, lb, ub, npts=10000)

use random sampling to calculate the variance of a function

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

stats module

shortcut math tools related to statistics, and math tools related to Gaussian distributions

cdf_factory(mean, variance)

Returns cumulative distribution function (as a Python function) for a Gaussian, given the mean and variance

erf(x)

evaluate the error function at x

gamma(x)

evaluate the gamma function at x

lgamma(x)

evaluate the natural log of the absolute value of the gamma function at x

mcdiarmid_bound(mean, diameter)

calculates McDiarmid bound given mean and McDiarmid diameter

mean(expectation, volume)

calculates mean given expectation and volume

meanconf(std, npts, percent=95)

mean confidence interval: returns conf, where interval = mean +/- conf
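
Under the usual normal approximation (an assumption here; the exact constant may differ), conf is roughly z * std/sqrt(npts), with z ≈ 1.96 at 95%:

>>> import math
>>> round(1.96 * 2.0 / math.sqrt(100), 3)    # std=2.0, npts=100
0.392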

pdf_factory(mean, variance)

Returns a probability density function (as a Python function) for a Gaussian, given the mean and variance

prob_mass(volume, norm)

calculates probability mass given volume and norm

sampvar(var, npts)

sample variance from variance

stderr(std, npts)

standard error

varconf(var, npts, percent=95, tight=False)

var confidence interval: returns max interval distance from var

volume(lb, ub)

calculates volume for a uniform distribution in n-dimensions
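
For a uniform distribution on an n-dimensional box, this is just the product of the side lengths; a sketch:

>>> import math
>>> lb, ub = [0.0, 0.0], [2.0, 3.0]
>>> math.prod(u - l for (l, u) in zip(lb, ub))
6.0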