mystic.math module documentation

approx module

tools for measuring equality

almostEqual(x, y, tol=1e-18, rel=1e-07)

Returns True if two arrays are element-wise equal within a tolerance.

The tolerance values are positive, typically very small numbers. The relative difference (rel * abs(y)) and the absolute difference tol are added together and compared against the absolute difference between x and y.

Parameters
  • x, y (array_like) – Input arrays to compare.

  • tol (float) – The absolute tolerance parameter (see Notes).

  • rel (float) – The relative tolerance parameter (see Notes).

Returns

True if the two arrays are equal within the given tolerance; False otherwise. If either array contains NaN, then False is returned.

Return type

bool

Notes

If the following equation is element-wise True, then almostEqual returns True.

absolute(x - y) <= (tol + rel * absolute(y))

Examples

>>> almostEqual([1e10,1.2345678], [1e10,1.2345677])
True
>>> almostEqual([1e10,1.234], [1e10,1.235])
False
approx_equal(x, y, *args, **kwargs)

Return True if x and y are approximately equal, otherwise False.

Parameters
  • x (object) – first object to compare

  • y (object) – second object to compare

  • tol (float, default=1e-18) – absolute error

  • rel (float, default=1e-7) – relative error

Returns

True if x and y are equal within tolerance, otherwise returns False.

Notes

If x and y are floats, return True if y is within either absolute error tol or relative error rel of x. You can disable either the absolute or relative check by passing None as tol or rel (but not both).

For any other objects, x and y are checked in that order for a method __approx_equal__, and the result of that is returned as a bool. Any optional arguments are passed to the __approx_equal__ method.

__approx_equal__ can return NotImplemented to signal it doesn’t know how to perform the specific comparison, in which case the other object is checked instead. If neither object has the method, or both defer by returning NotImplemented, then fall back on the same numeric comparison that is used for floats.

Examples

>>> approx_equal(1.2345678, 1.2345677)
True
>>> approx_equal(1.234, 1.235)
False
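
As a sketch of the __approx_equal__ protocol described in the Notes (the Fuzzy class here is hypothetical, not part of mystic):

>>> class Fuzzy(object):
...     def __init__(self, value, tol=0.1):
...         self.value, self.tol = value, tol
...     def __approx_equal__(self, other, *args, **kwds):
...         try:
...             return abs(self.value - float(other)) <= self.tol
...         except (TypeError, ValueError):
...             return NotImplemented
...
>>> approx_equal(Fuzzy(1.0), 1.05)
True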
tolerance(x, tol=1e-15, rel=1e-15)

relative plus absolute difference for x, i.e. tol + rel * abs(x)

compressed module

helpers for compressed format for measures

binary(n)

converts an int to binary (returned as a string); hence, int(binary(n), base=2) == n
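
For example, using the documented invariant (the exact string format may vary):

>>> int(binary(42), base=2) == 42
True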

binary2coords(binaries, positions, **kwds)

convert a list of binary strings to product measure coordinates

differs_by_one(ith, binaries, all=True, index=True)

get the binary string that differs by exactly one index

Inputs:

ith = the target index
binaries = a list of binary strings
all = if False, return only the results for indices < i
index = if True, return the index of the results (not the results themselves)

index2binary(index, npts=None)

convert a list of integers to a list of binary strings

discrete module

Classes for discrete measure data objects. Includes point_mass, measure, product_measure, and scenario classes.

bounded_mean(mean_x, samples, xmin, xmax, wts=None)
compose(samples, weights=None)

Generate a product_measure object from a nested list of N x 1D discrete measure positions and a nested list of N x 1D weights. If weights are not provided, a uniform distribution with norm = 1.0 will be used.

decompose(c)

Decomposes a product_measure object into a nested list of N x 1D discrete measure positions and a nested list of N x 1D weights.
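
A minimal sketch of the compose/decompose round trip, assuming decompose returns the positions and then the weights:

>>> c = compose([[0.0, 1.0], [2.0]], [[0.5, 0.5], [1.0]])
>>> positions, weights = decompose(c)   # the nested N x 1D lists given to compose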

flatten(c)

Flattens a product_measure object into a list.

impose_feasible(cutoff, data, guess=None, **kwds)

impose shortness on a given scenario

This function attempts to minimize the infeasibility between observed data and a scenario of synthetic data by performing an optimization on w,x,y over the given bounds.

Parameters
  • cutoff (float) – maximum acceptable deviation from shortness

  • data (mystic.math.discrete.scenario) – a dataset of observed points

  • guess (mystic.math.discrete.scenario, default=None) – the synthetic points

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(infeasibility).

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

Returns

a scenario with desired shortness

Notes

Here, tol is used to set the optimization termination for minimizing the sum(infeasibility), while cutoff is used in defining the deviation from shortness for observed x,y and synthetic x',y'.

guess can be either a scenario providing initial guess at feasibility, or a tuple of the dimensions of the desired scenario, where initial values will be chosen at random.

impose_valid(cutoff, model, guess=None, **kwds)

impose model validity on a given scenario

This function attempts to minimize the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x'), by performing an optimization on w,x,y over the given bounds.

Parameters
  • cutoff (float) – maximum acceptable model invalidity |y - F(x')|.

  • model (func) – the model function, y' = F(x').

  • guess (scenario, default=None) – a scenario, defines y = G(x).

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm

  • xtol (float, default=0.0) – maximum acceptable pointwise graphical distance between model and reality.

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(graphical distances).

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

Returns

a scenario with the desired model validity

Notes

xtol defines the n-dimensional base of a pillar of height cutoff, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization to find a set of points where the model is valid. Here, tol is used to set the optimization termination for minimizing the sum(graphical_distances), while cutoff is used in defining the graphical distance between x,y and x',F(x').

guess can be either a scenario providing initial guess at validity, or a tuple of the dimensions of the desired scenario, where initial values will be chosen at random.

mean_y_norm_wts_constraintsFactory(target, pts)

factory for a constraints function that:

  • imposes a mean on scenario values

  • normalizes weights

norm_wts_constraintsFactory(pts)

factory for a constraints function that:

  • normalizes weights

unflatten(params, npts)

Map a list of random variables to N x 1D discrete measures in a product_measure object.
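
A minimal sketch of the flatten/unflatten round trip, assuming npts accepts the pts tuple of the product_measure (see product_measure.flatten below for the parameter layout):

>>> c = compose([[0.0, 1.0], [2.0]])   # uniform weights
>>> params = flatten(c)                # [wx1, wx2, x1, x2, wy1, y1]
>>> c2 = unflatten(params, c.pts)      # rebuild the product_measure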

class point_mass(position, weight=1.0)

Bases: object

a point mass object with weight and position

Parameters
  • position (tuple(float)) – position of the point mass

  • weight (float, default=1.0) – weight of the point mass

Variables
  • position (tuple(float)) – position of the point mass

  • weight (float) – weight of the point mass

property rms

square root of the sum of squared positions

Type

readonly

class measure(iterable=(), /)

Bases: list

a 1-d collection of point masses forming a ‘discrete measure’

Parameters

iterable (list) – a list of mystic.math.discrete.point_mass objects

Notes

  • assumes it contains only mystic.math.discrete.point_mass objects

  • assumes measure.n = len(measure.positions) == len(measure.weights)

  • relies on constraints to impose notions such as sum(weights) == 1.0

property center_mass

sum of weights * positions

ess_maximum(f, tol=0.0)

calculate the maximum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the maximum value of f over all measure positions with support

ess_minimum(f, tol=0.0)

calculate the minimum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the minimum value of f over all measure positions with support

ess_ptp(f, tol=0.0)

calculate the spread for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the spread of the values of f over all measure positions with support

expect(f)

calculate the expectation for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expectation of f over all measure positions

expect_var(f)

calculate the expected variance for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expected variance of f over all measure positions

property mass

the sum of the weights

Type

readonly

maximum(f)

calculate the maximum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the maximum value of f over all measure positions

minimum(f)

calculate the minimum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the minimum value of f over all measure positions

normalize()

normalize the weights to 1.0

Parameters

None

Returns

None
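
A minimal sketch, assuming scalar positions for the point masses of a 1-D measure:

>>> m = measure([point_mass(0.0, 1.0), point_mass(1.0, 3.0)])
>>> m.normalize()   # rescale the weights to sum to 1.0
>>> print(m.mass)
1.0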

property npts

the number of point masses in the measure

Type

readonly

property positions

a list of positions for all point masses in the measure

ptp(f)

calculate the spread for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the spread of the values of f over all measure positions

property range

|max - min| for the positions

set_expect(expected, f, bounds=None, constraints=None, **kwds)

impose an expectation on the measure by adjusting the positions

Parameters
  • expected (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expectation E is calculated by minimizing mean(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target mean expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_mean_and_var(expected, f, bounds=None, constraints=None, **kwds)

impose expected mean and var on the measure by adjusting the positions

Parameters
  • expected (tuple(float)) – (expected mean, expected var)

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_var(expected, f, bounds=None, constraints=None, **kwds)

impose an expected variance on the measure by adjusting the positions

Parameters
  • expected (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected var E is calculated by minimizing var(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target variance expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

support(tol=0)

get the positions with non-zero weight (i.e. support)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of positions with support

support_index(tol=0)

get the indices where there is support (i.e. non-zero weight)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of indices where there is support

property var

mean(|positions - mean(positions)|**2)

property weights

a list of weights for all point masses in the measure

class product_measure(iterable=(), /)

Bases: list

a N-d measure-theoretic product of discrete measures

Parameters

iterable (list) – a list of mystic.math.discrete.measure objects

Notes

  • all measures are treated as if they are orthogonal

  • assumes it contains only mystic.math.discrete.measure objects

  • assumes len(product_measure.positions) == len(product_measure.weights)

  • relies on constraints to impose notions such as sum(weights) == 1.0

  • relies on constraints to impose expectation (within acceptable deviation)

  • positions are (xi,yi,zi) with weights (wxi,wyi,wzi), where weight wxi at xi should be the same for each (yj,zk). Similarly for each wyi and wzi.

property center_mass

sum of weights * positions

differs_by_one(ith, all=True, index=True)

get the coordinates where the associated binary string differs by exactly one index

Parameters
  • ith (int) – the target index

  • all (bool, default=True) – if False, only return results for indices < i

  • index (bool, default=True) – if True, return the indices of the results instead of the results themselves

Returns

the coordinates where the associated binary string differs by one, or if index is True, return the corresponding indices

ess_maximum(f, tol=0.0)

calculate the maximum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the maximum value of f over all measure positions with support

ess_minimum(f, tol=0.0)

calculate the minimum for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the minimum value of f over all measure positions with support

ess_ptp(f, tol=0.0)

calculate the spread for the support of a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the spread of values of f over all measure positions with support

expect(f)

calculate the expectation for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expectation of f over all measure positions

expect_var(f)

calculate the expected variance for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the expected variance of f over all measure positions

flatten()

convert a product measure to a single list of parameters

Parameters

None

Returns

a list of parameters

Notes

Given product_measure.pts = (M, N, ...), then the returned list is params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params will have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.

load(params, pts)

load the product measure from a list of parameters

Parameters
  • params (list(float)) – parameters corresponding to N 1D discrete measures

  • pts (tuple(int)) – number of point masses in each of the discrete measures

Returns

the product measure itself

Return type

self (product_measure)

Notes

To append len(pts) new discrete measures to the product measure, it is assumed params either corresponds to the correct number of weights and positions specified by pts, or params has additional values (typically output values) which will be ignored. It is assumed that len(params) >= 2 * sum(product_measure.pts).

Given the value of pts = (M, N, ...), it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.
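
For example, following the layout above with pts = (2, 1) (the values here are illustrative):

>>> params = [0.5, 0.5,   # wx1, wx2
...           0.0, 1.0,   # x1, x2
...           1.0, 3.0]   # wy1, y1
>>> c = product_measure().load(params, (2, 1))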

property mass

a list of weight norms

Type

readonly

maximum(f)

calculate the maximum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the maximum value of f over all measure positions

minimum(f)

calculate the minimum for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the minimum value of f over all measure positions

property npts

the total number of point masses in the product measure

Type

readonly

pof(f)

calculate probability of failure for a given function

Parameters

f (func) – a function returning True for ‘success’ and False for ‘failure’

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0 or False)
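
A minimal sketch: with uniform weights from compose, three of the four equally-weighted positions fail the test below, so the expected result is 0.75:

>>> c = compose([[0.0, 1.0], [0.0, 1.0]])
>>> print(c.pof(lambda x: sum(x) > 1.5))   # only (1.0, 1.0) succeeds
0.75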

property pos

a list of positions for each discrete measure

Type

readonly

property positions

a list of positions for all point masses in the product measure

ptp(f)

calculate the spread for a given function

Parameters

f (func) – a function that takes a list and returns a number

Returns

the spread for values of f over all measure positions

property pts

the number of point masses for each discrete measure

Type

readonly

sampled_expect(f, npts=10000)

use sampling to calculate expected value for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the expected value, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_maximum(f, npts=10000)

use sampling to calculate ess_maximum for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the ess_maximum, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_minimum(f, npts=10000)

use sampling to calculate ess_minimum for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the sampled ess_minimum, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_pof(f, npts=10000)

use sampling to calculate probability of failure for a given function

Parameters
  • f (func) – a function returning True for ‘success’ and False for ‘failure’

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0 or False)

sampled_ptp(f, npts=10000)

use sampling to calculate ess_|maximum - minimum| for a given function

Parameters
  • f (func) – a function that takes a list and returns a number

  • npts (int, default=10000) – the number of point masses sampled from the underlying discrete measures

Returns

the sampled |ess_maximum - ess_minimum|, a float

Notes

  • the function f should take a list of positions (for example, scenario.positions or product_measure.positions) and return a single value (e.g. 0.0)

sampled_support(npts=10000)

randomly select support points from the underlying discrete measures

Parameters

npts (int, default=10000) – the number of sampled points

Returns

a list of len(product_measure) lists, each of length npts

select(*index, **kwds)

generate product measure positions for the selected position indices

Parameters

index (tuple(int)) – tuple of position indices

Returns

a list of product measure positions for the selected indices

Examples

>>> r
[[9, 8], [1, 3], [4, 2]]
>>> r.select(*range(r.npts))
[(9, 1, 4), (8, 1, 4), (9, 3, 4), (8, 3, 4), (9, 1, 2), (8, 1, 2), (9, 3, 2), (8, 3, 2)]
>>>
>>> _pack(r)
[(9, 1, 4), (8, 1, 4), (9, 3, 4), (8, 3, 4), (9, 1, 2), (8, 1, 2), (9, 3, 2), (8, 3, 2)]

Notes

This only works for product measures of dimension 2^K

set_expect(expected, f, bounds=None, constraints=None, **kwds)

impose an expectation on the measure by adjusting the positions

Parameters
  • expected (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expectation E is calculated by minimizing mean(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target mean expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_mean_and_var(expected, f, bounds=None, constraints=None, **kwds)

impose expected mean and var on the measure by adjusting the positions

Parameters
  • expected (tuple(float)) – (expected mean, expected var)

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

set_expect_var(expected, f, bounds=None, constraints=None, **kwds)

impose an expected variance on the measure by adjusting the positions

Parameters
  • expected (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function c' = constraints(c), where c is a product measure, and c' is a product measure where the encoded constraints are satisfied.

  • tol (float, default=None) – maximum allowable deviation from expected

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

None

Notes

Expected var E is calculated by minimizing var(f(x)) - expected, over the given bounds, and will terminate when E is found within deviation tol of the target variance expected. If tol is not provided, then a relative deviation of 1% of expected will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds, then all the upper bounds, for each parameter.

support(tol=0)

get the positions with non-zero weight (i.e. support)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of positions with support

support_index(tol=0)

get the indices where there is support (i.e. non-zero weight)

Parameters

tol (float, default=0.0) – tolerance, where any weight <= tol is zero

Returns

the list of indices where there is support

update(params)

update the product measure from a list of parameters

Parameters

params (list(float)) – parameters corresponding to N 1D discrete measures

Returns

the product measure itself

Return type

self (product_measure)

Notes

The dimensions of the product measure will not change upon update, and it is assumed params either corresponds to the correct number of weights and positions for the existing product_measure, or params has additional values (typically output values) which will be ignored. It is assumed that len(params) >= 2 * sum(product_measure.pts).

If product_measure.pts = (M, N, ...), then it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired product measure.

property weights

a list of weights for all point masses in the product measure

property wts

a list of weights for each discrete measure

Type

readonly

class scenario(pm=None, values=None)

Bases: mystic.math.discrete.product_measure

a N-d product measure with associated data values

A scenario is a measure-theoretic product of discrete measures that also includes a list of associated values, with the values corresponding to measured or synthetic data for each measure position. Each point mass in the product measure is paired with a value, and thus, essentially, a scenario is equivalent to a mystic.math.legacydata.dataset stored in a product_measure representation.

Parameters
  • pm (mystic.math.discrete.product_measure, default=None) – a product measure

  • values (list(float), default=None) – values associated with each position

Notes

  • all measures are treated as if they are orthogonal

  • relies on constraints to impose notions such as sum(weights) == 1.0

  • relies on constraints to impose expectation (within acceptable deviation)

  • positions are (xi,yi,zi) with weights (wxi,wyi,wzi), where weight wxi at xi should be the same for each (yj,zk). Similarly for each wyi and wzi.

flatten(all=True)

convert a scenario to a single list of parameters

Parameters

all (bool, default=True) – if True, append the scenario values

Returns

a list of parameters

Notes

Given scenario.pts = (M, N, ...), then the returned list is params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params will have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the scenario. If all is True, then the scenario.values will be appended to the list of parameters.

load(params, pts)

load the scenario from a list of parameters

Parameters
  • params (list(float)) – parameters corresponding to N 1D discrete measures

  • pts (tuple(int)) – number of point masses in each of the discrete measures

Returns

the scenario itself

Return type

self (scenario)

Notes

To append len(pts) new discrete measures to the scenario, it is assumed params either corresponds to the correct number of weights and positions specified by pts, or params has additional values which will be saved as the scenario.values. It is assumed that len(params) >= 2 * sum(scenario.pts).

Given the value of pts = (M, N, ...), it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired scenario. Any remaining parameters will be treated as scenario.values.

mean_value()

calculate the mean of the associated values for a scenario

Parameters

None

Returns

the weighted mean of the scenario values

pof_value(f)

calculate probability of failure for a given function

Parameters

f (func) – a function returning True for ‘success’ and False for ‘failure’

Returns

the probability of failure, a float in [0.0,1.0]

Notes

  • the function f should take a list of values (for example, scenario.values) and return a single value (e.g. 0.0 or False)

set_feasible(data, cutoff=0.0, bounds=None, constraints=None, with_self=True, **kwds)

impose shortness with respect to the given data points

This function attempts to minimize the infeasibility between observed data and the scenario of synthetic data by performing an optimization on w,x,y over the given bounds.

Parameters
  • data (mystic.math.discrete.scenario) – a dataset of observed points

  • cutoff (float, default=0.0) – maximum acceptable deviation from shortness

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

  • with_self (bool, default=True) – if True, shortness is also self-consistent

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(infeasibility).

Returns

None

Notes

  • both scenario.positions and scenario.values may be adjusted.

  • if with_self is True, shortness will be measured not only from the scenario to the given data, but also between scenario datapoints.

set_mean_value(m)

set the mean for the associated values of a scenario

Parameters

m (float) – the target weighted mean of the scenario values

Returns

None

set_valid(model, cutoff=0.0, bounds=None, constraints=None, **kwds)

impose model validity on a scenario by adjusting positions and values

This function attempts to minimize the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x'), by performing an optimization on w,x,y over the given bounds.

Parameters
  • model (func) – a model y' = F(x') that approximates reality y = G(x)

  • cutoff (float, default=0.0) – acceptable model invalidity |y - F(x')|

  • bounds (tuple, default=None) – (all lower bounds, all upper bounds)

  • constraints (func, default=None) – a function x' = constraints(x), where x is a scenario that has been converted into a list of parameters (e.g. with scenario.flatten), and x' is the list of parameters after the encoded constraints have been satisfied.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm

  • xtol (float, default=0.0) – maximum acceptable pointwise graphical distance between model and reality.

  • tol (float, default=0.0) – maximum acceptable optimizer termination for sum(graphical distances).

Returns

None

Notes

xtol defines the n-dimensional base of a pillar of height cutoff, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization to find a set of points where the model is valid. Here, tol is used to set the optimization termination for minimizing the sum(graphical_distances), while cutoff is used in defining the graphical distance between x,y and x',F(x').

short_wrt_data(data, L=None, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to the given data

Parameters
  • data (list) – a list of data points or dataset to compare against.

  • L (float, default=None) – the lipschitz constant, if different than in data.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

short_wrt_self(L, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to the scenario itself

Parameters
  • L (float) – the lipschitz constant.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

update(params)

update the scenario from a list of parameters

Parameters

params (list(float)) – parameters corresponding to N 1D discrete measures

Returns

the scenario itself

Return type

self (scenario)

Notes

The dimensions of the scenario will not change upon update, and it is assumed params either corresponds to the correct number of weights and positions for the existing scenario, or params has additional values which will be saved as the scenario.values. It is assumed that len(params) >= 2 * sum(scenario.pts).

If scenario.pts = (M, N, ...), then it is assumed that params = [wx1, ..., wxM, x1, ..., xM, wy1, ..., wyN, y1, ..., yN, ...]. Thus, params should have M weights and M corresponding positions, followed by N weights and N corresponding positions, with this pattern followed for each new dimension of the desired scenario.

valid_wrt_model(model, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for scenario validity with respect to the model

Parameters
  • model (func) – the model function, y' = F(x').

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Notes

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

property values

a list of values corresponding to output data for all point masses in the underlying product measure

distance module

distances and norms for the legacy data module

Lnorm(weights, p=1, axis=None)

calculate L-p norm of weights

Parameters
  • weights (array(float)) – an array of weights

  • p (int, default=1) – the power of the p-norm, where p in [0,inf]

  • axis (int, default=None) – axis used to take the norm along

Returns

a float distance norm for the weights
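
For example (a simple L-2 norm):

>>> print(Lnorm([3.0, 4.0], p=2))   # (3**2 + 4**2)**0.5
5.0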

absolute_distance(x, xp=None, pair=False, dmin=0)

pointwise (or pairwise) absolute distance

pointwise = |x.T[:,newaxis] - x'.T|  or  pairwise = |x.T - x'.T|.T

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

Returns

an array of absolute distances between points

Notes

  • x'==x is symmetric with zeros on the diagonal

  • use dmin=2 for the forced upconversion of 1-D arrays

chebyshev(x, xp=None, pair=False, dmin=0, axis=None)

infinity norm distance between points in euclidean space

d(inf) =  max(|x[0] - x[0]'|, |x[1] - x[1]'|, ..., |x[n] - x[n]'|)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

euclidean(x, xp=None, pair=False, dmin=0, axis=None)

L-2 norm distance between points in euclidean space

d(2) = sqrt(sum(|x[0] - x[0]'|^2, |x[1] - x[1]'|^2, ..., |x[n] - x[n]'|^2))

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1
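
A minimal sketch of the most common usage noted above (pair=False, axis=0):

>>> d = euclidean([0.0, 0.0], [3.0, 4.0], axis=0)   # sqrt(3**2 + 4**2); expect 5.0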

graphical_distance(model, points, **kwds)

find the radius(x') that minimizes the graph distance between reality (data), y = G(x), and an approximating function, y' = F(x').

Parameters
  • model (func) – a model y' = F(x') that approximates reality y = G(x)

  • points (mystic.math.legacydata.dataset) – a dataset, defines y = G(x)

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Returns

the radius (the minimum distance x,G(x) to x',F(x') for each x)

Notes

points can be a mystic.math.legacydata.dataset or a list of mystic.math.legacydata.datapoint objects.

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

hamming(x, xp=None, pair=False, dmin=0, axis=None)

zero ‘norm’ distance between points in euclidean space

d(0) =  sum(x[0] != x[0]', x[1] != x[1]', ..., x[n] != x[n]')

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1
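
A minimal sketch, counting the components that differ (usage assumptions as for euclidean above):

>>> d = hamming([1, 2, 3], [1, 0, 3], axis=0)   # one component differs; expect 1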

infeasibility(distance, cutoff=0.0)

amount by which the distance exceeds the given cutoff distance

Parameters
  • distance (array) – the measure of feasibility for each point

  • cutoff (float, default=0.0) – maximum acceptable distance

Returns

an array of distances by which each point is infeasible

is_feasible(distance, cutoff=0.0)

determine if the distance exceeds the given cutoff distance

Parameters
  • distance (array) – the measure of feasibility for each point

  • cutoff (float, default=0.0) – maximum acceptable distance

Returns

bool array, with True where the distance is less than cutoff

lipschitz_distance(L, points1, points2, **kwds)

calculate the lipschitz distance between two sets of datapoints

Parameters
  • L (list) – a list of lipschitz constants

  • points1 (mystic.math.legacydata.dataset) – a dataset

  • points2 (mystic.math.legacydata.dataset) – a second dataset

  • tol (float, default=0.0) – maximum acceptable deviation from shortness

  • cutoff (float, default=tol) – zero out distances less than cutoff

Returns

a list of lipschitz distances

Notes

Both points1 and points2 can be a mystic.math.legacydata.dataset, or a list of mystic.math.legacydata.datapoint objects, or a list of lipschitzcone.vertex objects (from mystic.math.legacydata). cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is tol, 0.0, or None.

Each point x,y can be thought of as having an associated double-cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. We can allow for some error in shortness, a short tolerance tol, for which the point x,y is some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.

lipschitz_metric(L, x, xp=None)

sum of lipschitz-weighted distance between points

d = sum(L[i] * |x[i] - x'[i]|)

Parameters
  • L (array) – an array of Lipschitz constants, L

  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

Returns

an array of absolute distances between points

manhattan(x, xp=None, pair=False, dmin=0, axis=None)

L-1 norm distance between points in euclidean space

d(1) = sum(|x[0] - x[0]'|, |x[1] - x[1]'|, ..., |x[n] - x[n]'|)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

minkowski(x, xp=None, pair=False, dmin=0, p=3, axis=None)

p-norm distance between points in euclidean space

d(p) = sum(|x[0] - x[0]'|^p, |x[1] - x[1]'|^p, ..., |x[n] - x[n]'|^p)^(1/p)

Parameters
  • x (array) – an array of points, x

  • xp (array, default=None) – a second array of points, x'

  • pair (bool, default=False) – if True, return the pairwise distances

  • dmin (int, default=0) – upconvert x,x' to dimension >= dmin

  • p (int, default=3) – value of p for the p-norm

  • axis (int, default=None) – if not None, reduce across the given axis

Returns

an array of absolute distances between points

Notes

most common usage has pair=False and axis=0, or pairwise distance with pair=True and axis=1

grid module

tools for generating points on a grid

fillpts(lb, ub, npts, data=None, rtol=None, dist=None)

takes lower and upper bounds (e.g. lb = [0,3], ub = [2,4]), finds npts points that are at least rtol away from legacy data, and produces a list of sample points s = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points
data – a list of legacy sample points
rtol – target radial distance from each point
dist – a mystic.math.Distribution instance (or list of Distributions)

Notes: if rtol is None, use max rtol; if rtol < 0, use quick-n-dirty method

gridpts(q, dist=None)

takes a list of lists of arbitrary length q = [[1,2],[3,4]] and produces a list of gridpoints g = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

q – a list of lists of integers denoting grid points
dist – a mystic.math.Distribution instance (or list of Distributions)

Notes

if a mystic.math.Distribution is provided, use it to inject randomness
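
Using the example above (output order as documented):

>>> gridpts([[1, 2], [3, 4]])
[[1, 3], [1, 4], [2, 3], [2, 4]]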

randomly_bin(N, ndim=None, ones=True, exact=True)

generate N bins randomly gridded across ndim dimensions

Inputs:

N – integer number of bins, where N = prod(bins)
ndim – integer length of bins, thus ndim = len(bins)
ones – if False, prevent bins from containing “1s”, wherever possible
exact – if False, find N-1 bins for prime numbers

samplepts(lb, ub, npts, dist=None)

takes lower and upper bounds (e.g. lb = [0,3], ub = [2,4]) and produces a list of sample points s = [[1,3],[1,4],[2,3],[2,4]]

Inputs:

lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points
dist – a mystic.math.Distribution instance (or list of Distributions)

integrate module

math tools related to integration

integrate(f, lb, ub)

Returns the integral of an n-dimensional function f from lb to ub

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

If scipy is installed and the number of dimensions is 3 or less, scipy.integrate is used; otherwise, mystic’s n-dimensional Monte Carlo integrator is used.
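
For example (a 1-D case, so scipy.integrate is used here if scipy is installed):

>>> a = integrate(lambda x: x[0]**2, [0.0], [1.0])   # expect a ~= 1/3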

integrated_mean(f, lb, ub)

calculate the integrated mean of a function f

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

integrated_variance(f, lb, ub)

calculate the integrated variance of a function f

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds

monte_carlo_integrate(f, lb, ub, n=10000)

Returns the integral of an m-dimensional function f from lb to ub using a Monte Carlo integration of n points

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
n – the number of points to sample [Default is n=10000]
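
A minimal sketch (the result is stochastic, so only an approximate value is expected):

>>> a = monte_carlo_integrate(lambda x: x[0]**2, [0.0], [1.0], n=100000)   # a ~= 1/3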

References

  1. “A Primer on Scientific Programming with Python”, by Hans Petter Langtangen, pages 443-445, 2014.

  2. http://en.wikipedia.org/wiki/Monte_Carlo_integration

  3. http://math.fullerton.edu/mathews/n2003/MonteCarloMod.html

interpolate module

functions related to interpolation: 1-D and n-D interpolation, building a grid of points, calculating gradients

axes(mins, maxs, npts=None)

generate a tuple of arrays defining axes on a grid, given bounds

Input:

mins: a tuple of lower bounds for coordinates on the grid
maxs: a tuple of upper bounds for coordinates on the grid
npts: a tuple of grid shape, or integer number of points on the grid

Output:

a tuple of arrays defining the axes of the coordinate grid

Note

If npts is not provided, a default of 50 points in each direction is used.

extrapolate(x, z=None, method=None, mins=None, maxs=None, **kwds)

extrapolate a bounding-box for the given points

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts,)
method: a function f(x) to generate new points; if None, use f(x) = nan
mins: a list of floats defining minima of the bounding box, or None
maxs: a list of floats defining maxima of the bounding box, or None
all: if True, include the initial (x,z) points in the return

Output:

a tuple of (x,z) containing the requested boundbox points

Note

if z is None, return a tuple (x,) with the requested boundbox points.

Note

if mins is None, then use the minima found in x. Similarly for maxs.

Note

method can take a function, None, a boolean, or a string. If method is True, interpolate a cost function using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). If method is False, return the original input. Alternately, method can be any one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

gradient(x, fx, method=None, approx=True, extrap=False, **kwds)

find gradient of fx at x, where fx is a function z=fx(*x) or an array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of dimensions x.shape, gradient of the points at (x,fx)

Note

if approx is True, use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

grid(*axes)

generate tuple of (irregular) coordinate grids, given coordinate axes

Input:

axes: a tuple of arrays defining the axes of the coordinate grid

Output:

the resulting coordinate grid

hessian(x, fx, method=None, approx=True, extrap=False, **kwds)

find hessian of fx at x, where fx is a function z=fx(*x) or an array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of shape indicated in NOTE, hessian of the points at (x,fx)

Note

output will be of the form x.shape+(dim,), where hess[:,i,j] corresponds to the second derivative z_{i,j} with i,j in range(dim). For a 1D array x, output will be a 1D array of the same length.

Note

if approx is True, first use interpolation to build gradient functions in each direction, then use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

method string can provide either one or two methods (i.e. ‘rbf’ or ‘rbf, cubic’), where if two methods are provided, the first will be used to interpolate f(x) and the second will be used to interpolate the gradient of f(x).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

hessian_diagonal(x, fx, method=None, approx=True, extrap=False, **kwds)

find hessian diagonal of fx at x, with fx a function z=fx(*x) or array z

Input:

x: an array of shape (npts, dim) or (npts,)
fx: an array of shape (npts,) or a function, z = fx(*x)
method: string for kind of interpolator
approx: if True, use local approximation method
extrap: if True, extrapolate a bounding box (can reduce # of nans)
new: if True, include the extrapolated points in the output

Output:

array of dimensions x.shape, hessian diagonal of the points at (x,fx)

Note

if approx is True, first use interpolation to build gradient functions in each direction, then use mystic’s approx_fprime, which uses a local gradient approximation; otherwise use numpy’s gradient method, which performs a more memory-intensive calculation on a grid.

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

interpf(x, z, method=None, extrap=False, arrays=False, **kwds)

interpolate (x,z) to generate f, where z=f(*x)

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts, N)
method: string for kind of interpolator
extrap: if True, extrapolate a bounding box (can reduce # of nans)
arrays: if True, z = f(*x) is a numpy array; otherwise don’t use arrays
axis: int in [0,N], the axis of z to interpolate (all, by default)
map: map function, used to parallelize interpf for each axis

Output:

interpolated function f, where z=f(*x)

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

Note

additional keyword arguments (epsilon, smooth, norm) are available for use with a Rbf interpolator. See mystic.math.interpolate.Rbf for more details.
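
As a point of reference for the 1D fallback noted above, numpy’s interp performs simple piecewise-linear interpolation:

>>> import numpy as np
>>> float(np.interp(2.5, [1, 2, 3], [1, 4, 9]))    # linear between (2,4) and (3,9)
6.5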

interpolate(x, z, xgrid, method=None, extrap=False, arrays=True, **kwds)

interpolate to find z = f(x) sampled at points defined by xgrid

Input:

x: an array of shape (npts, dim) or (npts,)
z: an array of shape (npts,)
xgrid: (irregular) coordinate grid on which to sample z = f(x)
method: string for kind of interpolator
extrap: if True, extrapolate a bounding box (can reduce # of nans)
arrays: if True, z = f(x) is a numpy array; otherwise don’t return arrays

Output:

interpolated points on a grid, where z = f(x) has been sampled on xgrid

Note

if scipy is not installed, will use np.interp for 1D (non-rbf), or mystic’s rbf otherwise. default method is ‘nearest’ for 1D and ‘linear’ otherwise. method can be one of (‘rbf’,’linear’,’cubic’, ‘nearest’,’inverse’,’gaussian’,’multiquadric’,’quintic’,’thin_plate’).

Note

if extrap is True, extrapolate using interpf with method=’thin_plate’ (or ‘rbf’ if scipy is not found). Alternately, any one of (‘rbf’, ‘linear’,’cubic’,’nearest’,’inverse’,’gaussian’,’multiquadric’, ‘quintic’,’thin_plate’) can be used. If extrap is a cost function z = f(x), then directly use it in the extrapolation.

Note

additional keyword arguments (epsilon, smooth, norm) are available for use with a Rbf interpolator. See mystic.math.interpolate.Rbf for more details.
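
For comparison, grid-sampling an interpolant with scipy (when installed) looks like the following sketch, using scipy.interpolate.griddata:

>>> import numpy as np
>>> from scipy.interpolate import griddata
>>> x = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
>>> z = x[:, 0] + x[:, 1]
>>> griddata(x, z, np.array([[0.5, 0.5]]), method='linear')
array([1.])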

sort_axes(*axes, **kwds)

sort axes along the selected primary axis

Input:

axes: a tuple of arrays of points along each axis
axis: an integer corresponding to the axis upon which to sort

Output:

a tuple of arrays sorted with regard to the selected axis
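
The effect is that of applying a single argsort, taken along the selected axis, to every axis; a sketch (not mystic’s code):

>>> import numpy as np
>>> x = np.array([3, 1, 2]); y = np.array([30, 10, 20])
>>> i = np.argsort(x)    # sort on the primary axis (axis=0)
>>> x[i], y[i]
(array([1, 2, 3]), array([10, 20, 30]))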

legacydata module

data structures for legacy data observations of lipschitz functions

class datapoint(position, value=None, id=None, lipschitz=None)

Bases: object

n-d data point with position and value

queries:

p.value – returns value
p.position – returns position

settings:

p.value = v1 – set the value
p.position = (x1, x2, …, xn) – set the position

Notes

a datapoint can have an assigned id and cone; also has utilities for comparison against other datapoints (intended for use in a dataset)

collisions(pts)

return True where a point exists with same ‘position’ and different ‘value’

conflicts(pts)

return True where a point exists with same ‘id’ but different ‘raw’

duplicates(pts)

return True where a point exists with same ‘raw’ and ‘id’

property position
repeats(pts)

return True where a point exists with same ‘raw’ but different ‘id’

property value
class dataset(iterable=(), /)

Bases: list

a collection of data points

s = dataset([point1, point2, …, pointN])

queries:

s.values – returns list of values
s.coords – returns list of positions
s.ids – returns list of ids
s.raw – returns list of points
s.npts – returns the number of points
s.lipschitz – returns list of lipschitz constants

settings:

s.lipschitz = [s1, s2, …, sn] – sets lipschitz constants

short  -- check for shortness with respect to given data (or self)
valid  -- check for validity with respect to given model
update  -- update the positions and values in the dataset
load  -- load a list of positions and a list of values to the dataset
fetch  -- fetch the list of positions and the list of values in the dataset
intersection  -- return the set intersection between self and query
filter  -- return dataset entries where mask array is True
has_id  -- return True where dataset ids are in query
has_position  -- return True where dataset coords are in query
has_point  -- return True where dataset points are in query
has_datapoint  -- return True where dataset entries are in query

Notes

  • datapoints should not be edited, except possibly for id

  • assumes that s.n = len(s.coords) == len(s.values)

  • all datapoints in a dataset should have the same cone.slopes

property collisions
property conflicts
property coords
property duplicates
fetch()

fetch the list of positions and the list of values in the dataset

filter(mask)

return dataset entries where mask array is True

Inputs:

mask – a boolean array of the same length as dataset

has_datapoint(query)

return True where dataset entries are in query

Notes

query must be iterable

has_id(query)

return True where dataset ids are in query

Notes

query must be iterable

has_point(query)

return True where dataset points are in query

Notes

query must be iterable

has_position(query)

return True where dataset coords are in query

Notes

query must be iterable

property ids
intersection(query)

return the set intersection between self and query

property lipschitz
load(positions, values, ids=[])

load a list of positions and a list of values to the dataset

Returns

the dataset itself

Return type

self (dataset)

Notes

positions and values provided must be iterable

property npts
property raw
property repeats
short(data=None, L=None, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for shortness with respect to given data (or self)

Parameters
  • data (list, default=None) – a list of data points, or the dataset itself.

  • L (float, default=None) – the lipschitz constant, or the dataset’s constant.

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • tol (float, default=0.0) – maximum acceptable deviation from shortness.

  • cutoff (float, default=tol) – zero out distances less than cutoff.

Notes

Each point x,y can be thought of as having an associated double cone with slope equal to the lipschitz constant. Shortness with respect to another point is defined by the first point not being inside the cone of the second. Some error in shortness can be allowed with a short tolerance tol, for which the point x,y may be some acceptable y-distance inside the cone. While very tightly related, cutoff and tol play distinct roles; tol is subtracted in the calculation of the lipschitz_distance, while cutoff zeros out the value of any element less than the cutoff.
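
In 1D (and ignoring tol and cutoff), the cone test reduces to the familiar lipschitz inequality; a rough sketch:

>>> # the pair is 'short' when neither point is inside the other's cone,
>>> # i.e. roughly when |y1 - y2| <= L * |x1 - x2|
>>> L = 2.0
>>> x1, y1 = 0.0, 0.0
>>> x2, y2 = 1.0, 1.5
>>> abs(y1 - y2) <= L * abs(x1 - x2)
True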

update(positions, values)

update the positions and values in the dataset

Returns

the dataset itself

Return type

self (dataset)

Notes

positions and values provided must be iterable

valid(model, blamelist=False, pairs=True, all=False, raw=False, **kwds)

check for validity with respect to given model

Parameters
  • model (func) – the model function, y' = F(x').

  • blamelist (bool, default=False) – if True, indicate the infeasible points.

  • pairs (bool, default=True) – if True, indicate indices of infeasible points.

  • all (bool, default=False) – if True, get results for each individual point.

  • raw (bool, default=False) – if False, get boolean results (i.e. non-float).

  • ytol (float, default=0.0) – maximum acceptable difference |y - F(x')|.

  • xtol (float, default=0.0) – maximum acceptable difference |x - x'|.

  • cutoff (float, default=ytol) – zero out distances less than cutoff.

  • hausdorff (bool, default=False) – hausdorff norm, where if given, then ytol = |y - F(x')| + |x - x'|/norm.

Notes

xtol defines the n-dimensional base of a pillar of height ytol, centered at each point. The region inside the pillar defines the space where a “valid” model must intersect. If xtol is not specified, then the base of the pillar will be a Dirac at x' = x. This function performs an optimization for each x to find an appropriate x'.

ytol is a single value, while xtol is a single value or an iterable. cutoff takes a float or a boolean, where cutoff=True will set the value of cutoff to the default. Typically, the value of cutoff is ytol, 0.0, or None. hausdorff can be False (e.g. norm = 1.0), True (e.g. norm = spread(x)), or a list of points of len(x).

While cutoff and ytol are very tightly related, they play a distinct role; ytol is used to set the optimization termination for an acceptable |y - F(x')|, while cutoff is applied post-optimization.

If we are using the hausdorff norm, then ytol will set the optimization termination for an acceptable |y - F(x')| + |x - x'|/norm, where the x values are normalized by norm = hausdorff.

property values
class lipschitzcone(datapoint, slopes=None)

Bases: list

Lipschitz double cone around a data point, with vertex and slope

queries:

vertex – coordinates of lipschitz cone vertex
slopes – lipschitz slopes for the cone (should be same dimension as ‘vertex’)

contains -- return True if a given point is within the cone
distance -- sum of lipschitz-weighted distance between a point and the vertex
contains(point)

return True if a given point is within the cone

distance(point)

sum of lipschitz-weighted distance between a point and the vertex
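
For example, assuming a simple slope-weighted L1 form for the distance (a sketch, not mystic’s code):

>>> vertex = [0.0, 0.0]; slopes = [2.0, 1.0]
>>> point = [1.0, 3.0]
>>> sum(L*abs(p - v) for (L, p, v) in zip(slopes, point, vertex))
5.0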

load_dataset(filename, filter=None)

read dataset from selected file

filename – string name of dataset file
filter – tuple of points to select (‘False’ to ignore filter stored in file)

class point(position, value)

Bases: object

n-d data point with position and value but no id (i.e. ‘raw’)

queries:

p.value – returns value
p.position – returns position
p.rms – returns the square root of sum of squared position

settings:

p.value = v1 – set the value
p.position = (x1, x2, …, xn) – set the position

property rms
save_dataset(data, filename='dataset.txt', filter=None, new=True)

save dataset to selected file

data – data set
filename – string name of dataset file
filter – tuple, filter to apply to dataset upon reading
new – boolean, False if appending to existing file

measures module

Methods to support discrete measures

ess_maximum(f, samples, weights=None, tol=0.0)

calculate the max of function for support on the given list of points

ess_maximum(f,x,w) = max(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the maximum output value for a function at the given support points
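
Conceptually (a sketch, with tol = 0.0):

>>> f = lambda x: x[0]**2
>>> samples = [[1.], [2.], [3.]]
>>> weights = [0.5, 0.5, 0.0]
>>> max(f(x) for (x, w) in zip(samples, weights) if w > 0.0)
4.0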

ess_minimum(f, samples, weights=None, tol=0.0)

calculate the min of function for support on the given list of points

ess_minimum(f,x,w) = min(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the minimum output value for a function at the given support points

ess_ptp(f, samples, weights=None, tol=0.0)

calculate the spread of function for support on the given list of points

ess_ptp(f,x,w) = max(f(support(x,w))) - min(f(support(x,w)))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the spread in output value for a function at the given support points

expectation(f, samples, weights=None, tol=0.0)

calculate the (weighted) expectation of a function for a list of points

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expectation for a list of sample points
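
Conceptually, with weights normalized to 1.0 (a sketch, with tol = 0.0):

>>> f = lambda x: x[0]
>>> samples = [[1.0], [3.0]]
>>> weights = [0.25, 0.75]
>>> sum(w*f(x) for (x, w) in zip(samples, weights))
2.5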

expected_std(f, samples, weights=None, tol=0.0)

calculate the (weighted) expected standard deviation of a function

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expected standard deviation of f on a list of sample points

expected_variance(f, samples, weights=None, tol=0.0)

calculate the (weighted) expected variance of a function

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

the weighted expected variance of f on a list of sample points

impose_collapse(pairs, samples, weights)

collapse the weight and position of each pair (i,j) in pairs

Collapse is defined as weights[i] += weights[j] and weights[j] = 0, with samples[j] = samples[i] (as in the examples below).

Inputs:

samples – a list of sample points
weights – a list of sample weights
pairs – set of tuples of indices (i,j) where collapse occurs

For example:
>>> impose_collapse({(0,1),(0,2)},[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.5999999999999996, 1.5999999999999996, 1.5999999999999996, 4.5999999999999996, 5.5999999999999996], [0.6000000000000001, 0.0, 0.0, 0.2, 0.2])
>>> impose_collapse({(0,1),(3,4)},[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.3999999999999999, 1.3999999999999999, 3.3999999999999999, 4.4000000000000004, 4.4000000000000004], [0.4, 0.0, 0.2, 0.4, 0.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_expectation(m, f, npts, bounds=None, weights=None, **kwds)

impose a given expectation value E on a given function f, where E = m +/- tol and E = mean(f(x)) for x in bounds

Parameters
  • m (float) – target expected mean

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from m

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expectation E

Notes

Expectation value E is calculated by minimizing the absolute value of mean(f(x)) - m over the given bounds, and the optimization will terminate when E is found within deviation tol of the target mean m. If tol is not provided, then a relative deviation of 1% of m will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with mean +/- dev imposed
>>> mean = 2.0;  dev = 0.01
>>> samples = impose_expectation(mean, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expectation value for the samples
>>> expectation(f, samples)
2.000010010122465
impose_expected_mean_and_variance(param, f, npts, bounds=None, weights=None, **kwds)

impose a given expected mean E on a given function f, where E = m +/- tol and E = mean(f(x)) for x in bounds. Additionally, impose a given expected variance R on f, where R = v +/- tol and R = variance(f(x)) for x in bounds.

Parameters
  • param (tuple(float)) – target parameters, (mean, variance)

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from m and v

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected mean E and variance R

Notes

Expected mean E and expected variance R are calculated by minimizing the sum of the absolute values of mean(f(x)) - m and variance(f(x)) - v over the given bounds, and will terminate when E and R are found within tolerance tol of the target mean m and variance v, respectively. If tol is not provided, then a relative deviation of 1% of max(m,v) will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with mean and variance imposed
>>> mean = 5.0;  var = 2.0;  tol = 0.01
>>> samples = impose_expected_mean_and_variance((mean,var), f, (nx,ny,nz),
...                                             (lb,ub), tol=tol)
>>>
>>> # test the results by calculating the expected mean for the samples
>>> expected_mean(f, samples)
>>>
>>> # test the results by calculating the expected variance for the samples
>>> expected_variance(f, samples)
2.000010010122465
impose_expected_std(s, f, npts, bounds=None, weights=None, **kwds)

impose a given expected std E on a given function f, where E = s +/- tol and E = std(f(x)) for x in bounds

Parameters
  • s (float) – target expected standard deviation

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from s

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected standard deviation E

Notes

Expected std E is calculated by minimizing the absolute value of std(f(x)) - s over the given bounds, and the optimization will terminate when E is found within deviation tol of the target std s. If tol is not provided, then a relative deviation of 1% of s will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with std +/- dev imposed
>>> std = 2.0;  dev = 0.01
>>> samples = impose_expected_std(std, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expected std for the samples
>>> expected_std(f, samples)
2.000010010122465
impose_expected_variance(v, f, npts, bounds=None, weights=None, **kwds)

impose a given expected variance E on a given function f, where E = v +/- tol and E = variance(f(x)) for x in bounds

Parameters
  • v (float) – target expected variance

  • f (func) – a function that takes a list and returns a number

  • npts (tuple(int)) – a tuple of dimensions of the target product measure

  • bounds (tuple, default=None) – tuple is (lower_bounds, upper_bounds)

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=None) – maximum allowable deviation from v

  • constraints (func, default=None) – a function that takes a nested list of N x 1D discrete measure positions and weights, with the intended purpose of kernel-transforming x,w as x' = constraints(x, w)

  • npop (int, default=200) – size of the trial solution population

  • maxiter (int, default=1000) – the maximum number of iterations to perform

  • maxfun (int, default=1e+6) – the maximum number of function evaluations

Returns

a list of sample positions, with expected variance E

Notes

Expected variance E is calculated by minimizing the absolute value of variance(f(x)) - v over the given bounds, and the optimization will terminate when E is found within deviation tol of the target variance v. If tol is not provided, then a relative deviation of 1% of v will be used.

This function does not preserve the mean, variance, or range, as there is no initial list of samples from which to draw the mean, variance, etc.

bounds is a tuple with len(bounds) == 2, composed of all the lower bounds followed by all the upper bounds, for each parameter

Examples

>>> # provide the dimensions and bounds
>>> nx = 3;  ny = 2;  nz = 1
>>> x_lb = [10.0];  y_lb = [0.0];  z_lb = [10.0]
>>> x_ub = [50.0];  y_ub = [9.0];  z_ub = [90.0]
>>>
>>> # prepare the bounds
>>> lb = (nx * x_lb) + (ny * y_lb) + (nz * z_lb)
>>> ub = (nx * x_ub) + (ny * y_ub) + (nz * z_ub)
>>>
>>> # generate a list of samples with variance +/- dev imposed
>>> var = 2.0;  dev = 0.01
>>> samples = impose_expected_variance(var, f, (nx,ny,nz), (lb,ub), tol=dev)
>>>
>>> # test the results by calculating the expected variance for the samples
>>> expected_variance(f, samples)
2.000010010122465
impose_mad(s, samples, weights=None)

impose a median absolute deviation on a list of (weighted) points

(this function is ‘median-preserving’)

Inputs:

s – the target median absolute deviation
samples – a list of sample points
weights – a list of sample weights

impose_mean(m, samples, weights=None)

impose a mean on a list of (weighted) points

Parameters
  • m (float) – the target mean

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted mean

Notes

this function does not alter the weighted range or the weighted variance
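
The underlying idea is a uniform shift of the samples, which leaves the range and variance untouched; a sketch (not mystic’s code):

>>> samples = [1.0, 2.0, 3.0, 4.0, 5.0]
>>> shift = 5.0 - sum(samples)/len(samples)    # target mean minus current mean
>>> [x + shift for x in samples]
[3.0, 4.0, 5.0, 6.0, 7.0]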

impose_median(m, samples, weights=None)

impose a median on a list of (weighted) points

(this function is ‘range-preserving’ and ‘mad-preserving’)

Inputs:

m – the target median
samples – a list of sample points
weights – a list of sample weights

impose_moment(m, samples, weights=None, order=1, tol=0, skew=None)

impose the selected moment on a list of (weighted) points

Parameters
  • m (float) – the target moment

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

  • skew (bool, default=None) – if True, allow skew in the samples

Returns

a list of sample points with the desired weighted moment

Notes

this function does not alter the weighted mean

if skew is None, then allow skew when order is odd

impose_product(mass, weights, zsum=False, zmass=1.0)

impose a product on a list of points

Inputs:

mass – target product of weights
weights – a list of sample weights
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

impose_reweighted_mean(m, samples, weights=None, solver=None)

impose a mean on a list of points by reweighting weights

impose_reweighted_std(s, samples, weights=None, solver=None)

impose a standard deviation on a list of points by reweighting weights

impose_reweighted_variance(v, samples, weights=None, solver=None)

impose a variance on a list of points by reweighting weights

impose_spread(r, samples, weights=None)

impose a range on a list of (weighted) points

Parameters
  • r (float) – the target range

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted range

Notes

this function does not alter the weighted mean

impose_std(s, samples, weights=None)

impose a standard deviation on a list of (weighted) points

Parameters
  • s (float) – the target standard deviation

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted standard deviation

Notes

this function does not alter the weighted mean
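
The underlying idea is a rescaling of the samples about the mean, which leaves the mean untouched; a sketch (not mystic’s code):

>>> import numpy as np
>>> x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> m, s = x.mean(), 1.0                  # current mean, target std
>>> y = m + (x - m) * (s / x.std())
>>> print(round(y.mean(), 12), round(y.std(), 12))
3.0 1.0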

impose_sum(mass, weights, zsum=False, zmass=1.0)

impose a sum on a list of points

Inputs:

mass – target sum of weights
weights – a list of sample weights
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

impose_support(index, samples, weights)

set all weights not appearing in ‘index’ to zero

Inputs:

samples – a list of sample points
weights – a list of sample weights
index – a list of desired support indices (weights will be non-zero)

For example:
>>> impose_support([0,1],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([2.5, 3.5, 4.5, 5.5, 6.5], [0.5, 0.5, 0.0, 0.0, 0.0])
>>> impose_support([0,1,2,3],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([1.5, 2.5, 3.5, 4.5, 5.5], [0.25, 0.25, 0.25, 0.25, 0.0])
>>> impose_support([4],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([-1.0, 0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 0.0, 0.0, 1.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_tmean(m, samples, weights=None, k=0, clip=False)

impose a trimmed mean (at k%) on a list of (weighted) points

(this function is ‘range-preserving’ and ‘tvariance-preserving’)

Inputs:

m – the target trimmed mean
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_tstd(s, samples, weights=None, k=0, clip=False)

impose a trimmed std (at k%) on a list of (weighted) points

(this function is ‘tmean-preserving’)

Inputs:

s – the target trimmed standard deviation
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_tvariance(v, samples, weights=None, k=0, clip=False)

impose a trimmed variance (at k%) on a list of (weighted) points

(this function is ‘tmean-preserving’)

Inputs:

v – the target trimmed variance
samples – a list of sample points
weights – a list of sample weights
k – percent samples to be trimmed (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

impose_unweighted(index, samples, weights, nullable=True)

set all weights appearing in ‘index’ to zero

Inputs:

samples – a list of sample points
weights – a list of sample weights
index – a list of indices where weight is to be zero
nullable – if False, avoid null weights by reweighting non-index weights

For example:
>>> impose_unweighted([0,1,2],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([-0.5, 0.5, 1.5, 2.5, 3.5], [0.0, 0.0, 0.0, 0.5, 0.5])
>>> impose_unweighted([3,4],[1,2,3,4,5],[.2,.2,.2,.2,.2])
([2.0, 3.0, 4.0, 5.0, 6.0], [0.33333333333333331, 0.33333333333333331, 0.33333333333333331, 0.0, 0.0])

Notes: is ‘mean-preserving’ for samples and ‘norm-preserving’ for weights

impose_variance(v, samples, weights=None)

impose a variance on a list of (weighted) points

Parameters
  • v (float) – the target variance

  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

a list of sample points with the desired weighted variance

Notes

this function does not alter the weighted mean

impose_weight_norm(samples, weights, mass=1.0)

normalize the weights for a list of (weighted) points

(this function is ‘mean-preserving’)

Inputs:

samples – a list of sample points
weights – a list of sample weights
mass – float target of normalized weights
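
Conceptually, each weight is scaled by mass/sum(weights), leaving the sample positions (and hence the mean) untouched; a sketch:

>>> weights = [0.25, 0.25, 0.5, 1.0]
>>> scale = 1.0 / sum(weights)    # mass = 1.0
>>> [w * scale for w in weights]
[0.125, 0.125, 0.25, 0.5]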

kurtosis(samples, weights=None)

calculate the (weighted) kurtosis for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted kurtosis for a list of sample points

mad(samples, weights=None)

calculate the (weighted) median absolute deviation for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights

maximum(f, samples)

calculate the max of function for the given list of points

maximum(f,x) = max(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the maximum output value for a function at the given inputs

mean(samples, weights=None, tol=0)

calculate the (weighted) mean for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted mean for a list of sample points

median(samples, weights=None)

calculate the (weighted) median for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights

minimum(f, samples)

calculate the min of function for the given list of points

minimum(f,x) = min(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the minimum output value for a function at the given inputs

moment(samples, weights=None, order=1, tol=0)

calculate the (weighted) nth-order moment for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted nth-order moment for a list of sample points
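
Conceptually, the nth-order moment is the weighted average of (x - mean)**order; a sketch for order=2:

>>> x = [1.0, 2.0, 3.0]; w = [0.25, 0.5, 0.25]
>>> m = sum(wi*xi for (xi, wi) in zip(x, w))         # weighted mean = 2.0
>>> sum(wi*(xi - m)**2 for (xi, wi) in zip(x, w))    # 2nd-order moment
0.5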

norm(weights)

calculate the norm of a list of points

norm(x) = mean(x)

Parameters

weights (list) – a list of sample weights

Returns

the mean of the weights

normalize(weights, mass='l2', zsum=False, zmass=1.0)

normalize a list of points (e.g. normalize to 1.0)

Inputs:

weights – a list of sample weights
mass – float target of normalized weights (or string for Ln norm)
zsum – use counterbalance when mass = 0.0
zmass – member scaling when mass = 0.0

Notes: if mass=’l1’, will use the L1-norm; if mass=’l2’, will use the L2-norm; etc.

ptp(f, samples)

calculate the spread of function for the given list of points

ptp(f,x) = max(f(x)) - min(f(x))

Parameters
  • f (func) – a function that takes a list and returns a number

  • samples (list) – a list of sample points

Returns

the spread in output value for a function at the given inputs

skewness(samples, weights=None)

calculate the (weighted) skewness for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted skewness for a list of sample points

split_param(params, npts)

splits a flat parameter list into a flat list of weights and a flat list of positions; weights and positions are expected to have the same dimensions (given by npts)

Inputs:

params – a flat list of weights and positions (formatted as noted below)
npts – a tuple describing the shape of the target lists

For example:
>>> nx = 3;  ny = 2;  nz = 1
>>> par = ['wx']*nx + ['x']*nx + ['wy']*ny + ['y']*ny + ['wz']*nz + ['z']*nz
>>> weights, positions = split_param(par, (nx,ny,nz))
>>> weights
['wx','wx','wx','wy','wy','wz']
>>> positions
['x','x','x','y','y','z']

spread(samples)

calculate the range for a list of points

spread(x) = max(x) - min(x)

Parameters

samples (list) – a list of sample points

Returns

the range of the samples

standard_moment(samples, weights=None, order=1, tol=0)

calculate the (weighted) nth-order standard moment for a list of points

standard_moment(x,w,order) = moment(x,w,order)/std(x,w)^order

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

  • order (int, default=1) – the degree, a positive integer

  • tol (float, default=0.0) – a tolerance, where any mean <= tol is zero

Returns

the weighted nth-order standard moment for a list of sample points

std(samples, weights=None)

calculate the (weighted) standard deviation for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted standard deviation for a list of sample points

support(samples, weights, tol=0)

get the positions which have non-zero weight

Parameters
  • samples (list) – a list of sample points

  • weights (list) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

a list of positions with non-zero weight

support_index(weights, tol=0)

get the indices of the positions which have non-zero weight

Parameters
  • weights (list) – a list of sample weights

  • tol (float, default=0.0) – a tolerance, where any weight <= tol is zero

Returns

a list of indices of positions with non-zero weight

tmean(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed mean for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

tstd(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed standard deviation for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

tvariance(samples, weights=None, k=0, clip=False)

calculate the (weighted) trimmed variance for a list of points

Inputs:

samples – a list of sample points
weights – a list of sample weights
k – percent samples to trim (k%) [tuple (lo,hi) or float if lo=hi]
clip – if True, winsorize instead of trimming k% of samples

NOTE: if all samples are excluded, will return nan

variance(samples, weights=None)

calculate the (weighted) variance for a list of points

Parameters
  • samples (list) – a list of sample points

  • weights (list, default=None) – a list of sample weights

Returns

the weighted variance for a list of sample points

weighted_select(samples, weights, mass=1.0)

randomly select a sample from weighted set of samples

Parameters
  • samples (list) – a list of sample points

  • weights (list) – a list of sample weights

  • mass (float, default=1.0) – sum of normalized weights

Returns

a randomly selected sample point
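
The behavior is akin to a weighted random draw; a sketch using the standard library (not mystic’s code):

>>> import random
>>> random.seed(0)
>>> random.choices([1, 2, 3], weights=[0.1, 0.1, 0.8], k=1)[0]
3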

poly module

tools for polynomial functions

poly1d(coeff)

generates a 1-D polynomial instance from a list of coefficients using numpy.poly1d(coeff)

polyeval(coeffs, x)

takes a list of coefficients and evaluation points, and returns f(x); thus, [a3, a2, a1, a0] yields a3 x^3 + a2 x^2 + a1 x + a0
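
For example, with coefficients ordered from highest to lowest degree (the same convention as numpy.polyval, shown here for illustration):

>>> import numpy as np
>>> int(np.polyval([3, 0, 1], 5))    # 3*5**2 + 0*5 + 1
76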

samples module

tools related to sampling

alpha(n, diameter, epsilon=0.01)
random_samples(lb, ub, npts=10000, dist=None, clip=False)

generate npts samples from the given distribution between given lb & ub

Inputs:

dist – a mystic.tools.Distribution instance (or list of Distributions)
lb – a list of the lower bounds
ub – a list of the upper bounds
npts – number of sample points [default = 10000]
clip – if True, clip at bounds, else resample [default = False]

sample(f, lb, ub, npts=10000)

return number of failures and successes for some boolean function f

Inputs:

f – a function that returns True for ‘success’ and False for ‘failure’
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

sampled_mean(f, lb, ub, npts=10000)

use random sampling to calculate the mean of a function

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

sampled_pof(f, lb, ub, npts=10000)

use random sampling to calculate probability of failure for a function

Inputs:

f – a function that returns True for ‘success’ and False for ‘failure’
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]
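
Conceptually, the probability of failure is the fraction of sampled points for which f fails; a deterministic sketch (real use draws random points within the bounds):

>>> f = lambda x: x[0] <= 0.75                 # True is 'success'
>>> pts = [[i/10000] for i in range(10000)]    # stand-in for random samples in [0,1]
>>> sum(1 for p in pts if not f(p)) / len(pts)
0.2499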

sampled_prob(pts, lb, ub)

use sampling to calculate the probability that points are inside the given bounds

Inputs:

pts – a list of sample points
lb – a list of lower bounds
ub – a list of upper bounds

sampled_pts(pts, lb, ub)

determine the number of sample points inside the given bounds

Inputs:

pts – a list of sample points
lb – a list of lower bounds
ub – a list of upper bounds

sampled_variance(f, lb, ub, npts=10000)

use random sampling to calculate the variance of a function

Inputs:

f – a function that takes a list and returns a number
lb – a list of lower bounds
ub – a list of upper bounds
npts – the number of points to sample [Default is npts=10000]

stats module

shortcut math tools related to statistics, and math tools related to Gaussian distributions

cdf_factory(mean, variance)

Returns cumulative distribution function (as a Python function) for a Gaussian, given the mean and variance

erf(x)

evaluate the error function at x

gamma(x)

evaluate the gamma function at x

lgamma(x)

evaluate the natural log of the absolute value of the gamma function at x

mcdiarmid_bound(mean, diameter)

calculates McDiarmid bound given mean and McDiarmid diameter

mean(expectation, volume)

calculates mean given expectation and volume

meanconf(std, npts, percent=95)

mean confidence interval: returns conf, where interval = mean +/- conf
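
Under the usual normal approximation (an assumption here; the exact constant may differ), conf is roughly z * std/sqrt(npts), with z ≈ 1.96 at 95%:

>>> import math
>>> round(1.96 * 2.0 / math.sqrt(100), 3)    # std=2.0, npts=100
0.392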

pdf_factory(mean, variance)

Returns a probability density function (as a Python function) for a Gaussian, given the mean and variance

prob_mass(volume, norm)

calculates probability mass given volume and norm

sampvar(var, npts)

sample variance from variance

stderr(std, npts)

standard error

varconf(var, npts, percent=95, tight=False)

var confidence interval: returns max interval distance from var

volume(lb, ub)

calculates volume for a uniform distribution in n-dimensions
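
For a uniform distribution on an n-dimensional box, this is just the product of the side lengths; a sketch:

>>> import math
>>> lb, ub = [0.0, 0.0], [2.0, 3.0]
>>> math.prod(u - l for (l, u) in zip(lb, ub))
6.0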