Bend and Mix Models

Core utilities

bmi.samplers._tfp._core.JointDistribution dataclass

The main object of this package, representing a Bend and Mix Model (BMM), i.e., a joint distribution \(P_{XY}\) together with the marginal distributions \(P_X\) and \(P_Y\).

Attributes:

| Name | Type | Description |
|---|---|---|
| dist | | \(P_{XY}\) |
| dist_x | Distribution | \(P_X\) |
| dist_y | Distribution | \(P_Y\) |
| dim_x | int | dimension of the support of \(X\) |
| dim_y | int | dimension of the support of \(Y\) |
| analytic_mi | Optional[float] | analytical mutual information. Use None if unknown (in most cases) |

pmi(self, x, y)

Calculates pointwise mutual information at specified points.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | Array | points in the X space, shape (n_points, dim_x) | required |
| y | Array | points in the Y space, shape (n_points, dim_y) | required |

Returns:

| Type | Description |
|---|---|
| Array | pointwise mutual information evaluated at (x, y) points, shape (n_points,) |

Note

This function is vectorized, i.e. it can calculate PMI for multiple points at once.
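To make the quantity concrete: PMI at a point is \(\log p_{XY}(x, y) - \log p_X(x) - \log p_Y(y)\). The sketch below evaluates it for a toy bivariate normal with standard normal marginals and correlation rho, using only the Python standard library. The helper names (`log_normal_pdf`, `log_bivariate_normal_pdf`, `pmi_bivariate_normal`) are hypothetical and not part of this package; the package's own `pmi` method works on arrays rather than Python lists.

```python
import math


def log_normal_pdf(z: float) -> float:
    """Log-density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def log_bivariate_normal_pdf(x: float, y: float, rho: float) -> float:
    """Log-density of a bivariate normal with unit variances and correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
    return -0.5 * quad - math.log(2.0 * math.pi) - 0.5 * math.log(one_minus_r2)


def pmi_bivariate_normal(xs, ys, rho):
    """PMI at paired points: log p_XY(x, y) - log p_X(x) - log p_Y(y)."""
    return [
        log_bivariate_normal_pdf(x, y, rho) - log_normal_pdf(x) - log_normal_pdf(y)
        for x, y in zip(xs, ys)
    ]


# Three paired points in, three PMI values out (mirrors the (n_points,) shape).
xs = [0.0, 1.0, 1.0]
ys = [0.0, 1.0, -1.0]
values = pmi_bivariate_normal(xs, ys, rho=0.8)
independent = pmi_bivariate_normal(xs, ys, rho=0.0)
print(values)
print(independent)
```

With a positive correlation, points where x and y agree in sign get positive PMI and points where they disagree get negative PMI; with rho = 0 the PMI vanishes everywhere.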

sample(self, n_points, key)

Sample from the joint distribution \(P_{XY}\).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_points | int | number of samples to draw | required |
| key | Array | JAX random key | required |

bmi.samplers._tfp._core.monte_carlo_mi_estimate(key, dist, n)

Estimates the mutual information \(I(X; Y)\) using Monte Carlo sampling.

Returns:

| Type | Description |
|---|---|
| tuple[float, float] | mutual information estimate, followed by the standard error estimate |

Note

It is worth running this procedure multiple times to check whether the standard error estimate is accurate.
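The check suggested in the note can be sketched as follows. This is a standard-library illustration on a toy bivariate normal, not the package's implementation: it assumes the estimate is the sample mean of PMI values and the standard error is the sample standard deviation divided by the square root of the sample size. The names `sample_pmi` and `mc_mi_estimate` are hypothetical.

```python
import math
import random
import statistics


def sample_pmi(rng: random.Random, rho: float) -> float:
    """Draw (x, y) from a bivariate normal with correlation rho; return PMI there."""
    x = rng.gauss(0.0, 1.0)
    y = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    one_minus_r2 = 1.0 - rho * rho
    # log p_XY(x, y) - log p_X(x) - log p_Y(y); the log(2*pi) terms cancel.
    return (
        -0.5 * (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
        + 0.5 * (x * x + y * y)
        - 0.5 * math.log(one_minus_r2)
    )


def mc_mi_estimate(seed: int, rho: float, n: int) -> tuple[float, float]:
    """Return (MI estimate, standard error estimate) from n PMI samples."""
    rng = random.Random(seed)
    pmis = [sample_pmi(rng, rho) for _ in range(n)]
    mi = statistics.fmean(pmis)
    se = statistics.stdev(pmis) / math.sqrt(n)
    return mi, se


rho, n = 0.8, 20_000
true_mi = -0.5 * math.log(1.0 - rho * rho)  # closed form for this toy model

# Repeat with different seeds: the spread of the estimates should be
# comparable to the standard errors reported by the individual runs.
runs = [mc_mi_estimate(seed, rho, n) for seed in range(5)]
estimates = [mi for mi, _ in runs]
spread = statistics.stdev(estimates)
mean_se = statistics.fmean(se for _, se in runs)
print(f"true MI = {true_mi:.4f}")
print(f"spread across runs = {spread:.4f}, mean reported SE = {mean_se:.4f}")
```

If the spread across runs is much larger than the reported standard errors, the PMI distribution is probably heavy-tailed and a larger sample is needed.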

bmi.samplers._tfp._core.pmi_profile(key, dist, n)

Draws a Monte Carlo sample of size n from the PMI distribution, i.e., from the distribution of the PMI evaluated at points \((X, Y)\) drawn from \(P_{XY}\).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| key | Array | JAX random key, used to generate the sample | required |
| dist | JointDistribution | distribution | required |
| n | int | number of points to sample | required |

Returns:

| Type | Description |
|---|---|
| Array | PMI profile, shape (n,) |
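A profile carries more information than its mean. The sketch below, again a standard-library toy model with a hypothetical helper `pmi_profile_sketch`, draws a profile for a bivariate normal and summarizes it by quartiles and by the share of negative values. It assumes that a profile is obtained by sampling points from the joint distribution and evaluating PMI at them.

```python
import math
import random
import statistics


def pmi_profile_sketch(seed: int, rho: float, n: int) -> list[float]:
    """Sample n points from the joint and return the PMI at each of them."""
    rng = random.Random(seed)
    one_minus_r2 = 1.0 - rho * rho
    log_term = -0.5 * math.log(one_minus_r2)
    profile = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(one_minus_r2) * rng.gauss(0.0, 1.0)
        quad = (x * x - 2.0 * rho * x * y + y * y) / one_minus_r2
        profile.append(log_term - 0.5 * quad + 0.5 * (x * x + y * y))
    return profile


profile = pmi_profile_sketch(seed=0, rho=0.5, n=10_000)
quartiles = statistics.quantiles(profile, n=4)
negative_share = sum(v < 0 for v in profile) / len(profile)
print("quartiles:", [round(q, 3) for q in quartiles])
print("share of negative PMI values:", round(negative_share, 3))
```

Individual PMI values can be negative even though their expectation, the mutual information, never is.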

bmi.samplers._tfp._core.transform(dist, x_transform=None, y_transform=None)

Given diffeomorphisms \(f\) and \(g\), transforms the joint distribution \(P_{XY}\) into \(P_{f(X)g(Y)}\).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dist | JointDistribution | distribution to be transformed | required |
| x_transform | Optional[tensorflow_probability.substrates.jax.bijectors.bijector.Bijector] | diffeomorphism \(f\) to transform \(X\). Defaults to identity. | None |
| y_transform | Optional[tensorflow_probability.substrates.jax.bijectors.bijector.Bijector] | diffeomorphism \(g\) to transform \(Y\). Defaults to identity. | None |

Returns:

| Type | Description |
|---|---|
| JointDistribution | transformed distribution |
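Transforming each variable by its own diffeomorphism leaves PMI unchanged at corresponding points, because the Jacobian factors of the joint density and of the marginals cancel. The sketch below checks this numerically for a toy bivariate normal with \(f = \exp\) and \(g = \sinh\). It uses hand-written inverses and log-derivatives in place of a Bijector, and all function names are hypothetical.

```python
import math

RHO = 0.6  # correlation of the toy bivariate normal (an arbitrary choice)


def log_joint(x: float, y: float) -> float:
    """Log-density of a bivariate normal with unit variances and correlation RHO."""
    q = (x * x - 2.0 * RHO * x * y + y * y) / (1.0 - RHO * RHO)
    return -0.5 * q - math.log(2.0 * math.pi) - 0.5 * math.log(1.0 - RHO * RHO)


def log_marginal(z: float) -> float:
    """Log-density of a standard normal marginal."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def pmi(x: float, y: float) -> float:
    """PMI of (X, Y) at (x, y)."""
    return log_joint(x, y) - log_marginal(x) - log_marginal(y)


def pmi_transformed(u: float, v: float) -> float:
    """PMI of (f(X), g(Y)) at (u, v) for f = exp and g = sinh."""
    x, y = math.log(u), math.asinh(v)      # f^{-1}(u), g^{-1}(v)
    log_jac_f = -math.log(u)               # log |d f^{-1}/du| = log(1/u)
    log_jac_g = -0.5 * math.log1p(v * v)   # log |d g^{-1}/dv| = log(1/sqrt(1 + v^2))
    # Change of variables: each density picks up the relevant Jacobian terms.
    log_joint_t = log_joint(x, y) + log_jac_f + log_jac_g
    log_marg_u = log_marginal(x) + log_jac_f
    log_marg_v = log_marginal(y) + log_jac_g
    return log_joint_t - log_marg_u - log_marg_v


x, y = 0.3, -1.2
u, v = math.exp(x), math.sinh(y)
original = pmi(x, y)
transformed = pmi_transformed(u, v)
print(original, transformed)  # the two values agree up to rounding
```

Since PMI agrees pointwise, its expectation agrees as well, so \(I(f(X); g(Y)) = I(X; Y)\).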

bmi.samplers._tfp._product.ProductDistribution (JointDistribution)

Creates the product distribution \(P_{XY} = P_X \otimes P_Y\) from distributions \(P_X\) and \(P_Y\), so that \(X\) and \(Y\) are independent.

In particular, \(I(X; Y) = 0\).
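This follows directly from the definition of mutual information. As a one-line check, writing \(p_X\), \(p_Y\) and \(p_{XY}\) for the densities (notation assumed here, and assuming the densities exist):

```latex
I(X; Y)
  = \mathbb{E}_{(x, y) \sim P_{XY}}\left[ \log \frac{p_{XY}(x, y)}{p_X(x)\, p_Y(y)} \right]
  = \mathbb{E}_{(x, y) \sim P_{XY}}\left[ \log \frac{p_X(x)\, p_Y(y)}{p_X(x)\, p_Y(y)} \right]
  = 0 .
```

The integrand, which is the pointwise mutual information, is zero at every point, not just on average.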

__init__(self, dist_x, dist_y) special

Creates a product distribution.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dist_x | Distribution | distribution \(P_X\) | required |
| dist_y | Distribution | distribution \(P_Y\) | required |

bmi.samplers._tfp._wrapper.BMMSampler (BaseSampler)

Wraps a given Bend and Mix Model (BMM) into a sampler.

__init__(self, dist, mi=None, mi_estimate_seed=0, mi_estimate_sample=200000) special

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dist | JointDistribution | distribution represented by a BMM to be wrapped | required |
| mi | Optional[float] | mutual information of the distribution, if already calculated. If not provided, it will be estimated via Monte Carlo sampling. | None |
| mi_estimate_seed | Union[Any, int] | seed for the Monte Carlo sampling | 0 |
| mi_estimate_sample | int | number of samples for the Monte Carlo sampling | 200000 |

mutual_information(self)

Returns the mutual information \(I(X; Y)\).

sample(self, n_points, rng)

Returns a sample from the joint distribution P(X, Y).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| n_points | int | sample size | required |
| rng | Union[int, Any] | pseudorandom number generator | required |

Returns:

| Type | Description |
|---|---|
| tuple[jax.Array, jax.Array] | X samples, shape (n_points, dim_x), followed by Y samples, shape (n_points, dim_y). The Y samples are paired with the X samples: row i of Y was drawn jointly with row i of X. |
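The pairing matters: the dependence between X and Y lives in the row-wise correspondence, so reordering one of the arrays destroys it. The sketch below illustrates this with a standard-library toy sampler. The names `sample_pairs` and `correlation` are hypothetical and the sampler is a stand-in for illustration, not the BMMSampler itself.

```python
import math
import random


def sample_pairs(n_points: int, rng: random.Random, rho: float = 0.9):
    """Return (xs, ys) where ys[i] was drawn jointly with xs[i]."""
    xs, ys = [], []
    for _ in range(n_points):
        x = rng.gauss(0.0, 1.0)
        xs.append(x)
        ys.append(rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0))
    return xs, ys


def correlation(a, b) -> float:
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
xs, ys = sample_pairs(5_000, rng)
shuffled_ys = ys[:]
rng.shuffle(shuffled_ys)  # breaks the pairing between X and Y

paired_corr = correlation(xs, ys)
shuffled_corr = correlation(xs, shuffled_ys)
print("paired:", round(paired_corr, 3))
print("shuffled:", round(shuffled_corr, 3))
```

Any downstream estimator should therefore receive the two arrays in the order returned, without independent shuffling or subsampling.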

Basic distributions

bmi.samplers._tfp._normal.construct_multivariate_normal_distribution(mean, covariance)

Constructs a multivariate normal distribution.

bmi.samplers._tfp._normal.MultivariateNormalDistribution (JointDistribution)

Multivariate normal distribution \(P_{XY}\), such that \(P_X\) is a multivariate normal distribution on the space of dimension dim_x and \(P_Y\) is a multivariate normal distribution on the space of dimension dim_y.

__init__(self, *, dim_x, dim_y, covariance, mean=None) special

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dim_x | int | dimension of the \(X\) support | required |
| dim_y | int | dimension of the \(Y\) support | required |
| mean | numpy.typing.ArrayLike | mean vector, shape (n,) where n = dim_x + dim_y. Default: zero vector | None |
| covariance | numpy.typing.ArrayLike | covariance matrix, shape (n, n) | required |
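For a multivariate normal, mutual information has the closed form \(I(X; Y) = \tfrac{1}{2}\left(\log\det\Sigma_{XX} + \log\det\Sigma_{YY} - \log\det\Sigma\right)\), where \(\Sigma_{XX}\) and \(\Sigma_{YY}\) are the diagonal blocks of the covariance matrix \(\Sigma\). The sketch below evaluates it with the standard library only. The helpers `log_det_spd` and `gaussian_mi` are hypothetical, and whether the package computes its analytic_mi value this way is an assumption.

```python
import math


def log_det_spd(matrix) -> float:
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                chol[i][j] = (matrix[i][j] - s) / chol[j][j]
    # det(A) = prod(diag(L))^2 for A = L L^T.
    return 2.0 * sum(math.log(chol[i][i]) for i in range(n))


def gaussian_mi(covariance, dim_x: int) -> float:
    """I(X; Y) for a joint normal whose first dim_x coordinates form X."""
    cov_x = [row[:dim_x] for row in covariance[:dim_x]]
    cov_y = [row[dim_x:] for row in covariance[dim_x:]]
    return 0.5 * (log_det_spd(cov_x) + log_det_spd(cov_y) - log_det_spd(covariance))


# dim_x = 2, dim_y = 1: only the first X coordinate is correlated with Y.
covariance = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
]
mi = gaussian_mi(covariance, dim_x=2)
print(mi)
```

In this example the result reduces to the bivariate formula \(-\tfrac{1}{2}\log(1 - \rho^2)\) with \(\rho = 0.5\), since the second X coordinate is independent of everything else.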

bmi.samplers._tfp._student.construct_multivariate_student_distribution(mean, dispersion, df)

Constructs a multivariate Student distribution.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mean | Array | location vector, shape (dim,) | required |
| dispersion | Array | dispersion matrix, shape (dim, dim) | required |
| df | Union[int, float] | degrees of freedom | required |

bmi.samplers._tfp._student.MultivariateStudentDistribution (JointDistribution)

Multivariate Student distribution \(P_{XY}\), such that \(P_X\) is a multivariate Student distribution on the space of dimension dim_x and \(P_Y\) is a multivariate Student distribution on the space of dimension dim_y.

Note that the degrees of freedom df are the same for all distributions.

__init__(self, *, dim_x, dim_y, df, dispersion, mean=None) special

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dim_x | int | dimension of the \(X\) support | required |
| dim_y | int | dimension of the \(Y\) support | required |
| df | int | degrees of freedom | required |
| mean | numpy.typing.ArrayLike | mean vector, shape (n,) where n = dim_x + dim_y. Default: zero vector | None |
| dispersion | numpy.typing.ArrayLike | dispersion matrix, shape (n, n) | required |