Estimators

The package implements a number of estimators sharing a common API. For general usage instructions and examples, consult the general instructions. Although this API reference is organized alphabetically, the estimators are grouped by type, together with relevant literature, in the list of estimators.

bmi.estimators.correlation.CCAMutualInformationEstimator (IMutualInformationPointEstimator)

__init__(self, scale=True) special

Initializes the estimator.

estimate(self, x, y)

A point estimate of MI(X; Y) from an i.i.d. sample from the \(P(X, Y)\) distribution.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `x` | `ArrayLike` | shape `(n_samples, dim_x)` | required |
| `y` | `ArrayLike` | shape `(n_samples, dim_y)` | required |

Returns:

| Type | Description |
|------|-------------|
| `float` | mutual information estimate |

parameters(self)

Returns the parameters of the estimator.
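For jointly Gaussian variables, mutual information depends only on the canonical correlations \(\rho_i\) between \(X\) and \(Y\): \(I(X; Y) = -\frac{1}{2} \sum_i \log(1 - \rho_i^2)\), which is the quantity a CCA-based estimator computes. A minimal sketch for the one-dimensional case, where the single canonical correlation is the Pearson correlation (an independent illustration in plain numpy, not the package's implementation):

```python
import numpy as np

def gaussian_mi_1d(x, y):
    """MI estimate for jointly Gaussian 1-D variables from the Pearson correlation."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log1p(-rho**2)  # log1p(-r^2) = log(1 - r^2)

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)  # population correlation is exactly 0.8

mi = gaussian_mi_1d(x, y)  # analytic value: -0.5 * log(1 - 0.64) ≈ 0.511 nats
```

Note that this estimator is exact only under joint Gaussianity; for other distributions it captures only the linear dependence.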

bmi.estimators.neural._estimators.DonskerVaradhanEstimator (NeuralEstimatorBase)

bmi.estimators._histogram.HistogramEstimator (IMutualInformationPointEstimator)

__init__(self, n_bins_x=5, n_bins_y=None, standardize=True) special

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `n_bins_x` | `int` | number of bins per each X dimension | `5` |
| `n_bins_y` | `Optional[int]` | number of bins per each Y dimension; leave as `None` to use `n_bins_x` | `None` |
| `standardize` | `bool` | whether to standardize the data set | `True` |

estimate(self, x, y)

MI estimate.

parameters(self)

Returns the parameters of the estimator.
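The histogram estimator discretizes the data and computes the plug-in mutual information of the resulting discrete joint distribution. A self-contained sketch for one-dimensional `x` and `y` (the estimator in the package bins every dimension of each variable; the function below is an independent illustration):

```python
import numpy as np

def histogram_mi(x, y, n_bins_x=5, n_bins_y=None):
    """Plug-in MI estimate from a 2-D histogram (1-D x and y only)."""
    n_bins_y = n_bins_x if n_bins_y is None else n_bins_y
    counts, _, _ = np.histogram2d(x, y, bins=(n_bins_x, n_bins_y))
    p_xy = counts / counts.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal distribution of x
    p_y = p_xy.sum(axis=0, keepdims=True)  # marginal distribution of y
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x * p_y)[mask])))

rng = np.random.default_rng(0)
n = 50_000
x = rng.normal(size=n)
y_dep = 0.8 * x + 0.6 * rng.normal(size=n)  # dependent on x
y_ind = rng.normal(size=n)                  # independent of x

mi_dep = histogram_mi(x, y_dep)
mi_ind = histogram_mi(x, y_ind)
```

Discretization loses information, so the plug-in value underestimates the continuous MI; with too many bins relative to the sample size, the estimate is instead biased upward.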

bmi.estimators.neural._estimators.InfoNCEEstimator (NeuralEstimatorBase)
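`InfoNCEEstimator` trains a neural critic \(f\) to maximize the InfoNCE bound \(I(X; Y) \ge \mathbb{E}\left[\frac{1}{K}\sum_{i} \log \frac{e^{f(x_i, y_i)}}{\frac{1}{K}\sum_{j} e^{f(x_i, y_j)}}\right]\), which cannot exceed \(\log K\) for batch size \(K\). The sketch below skips the training and evaluates the bound with the analytically optimal critic (the pointwise log-density ratio) for a correlated bivariate Gaussian; all names are illustrative, not the package's API:

```python
import numpy as np

rho = 0.6  # population correlation; true MI = -0.5 * log(1 - rho**2) ≈ 0.223 nats

def log_ratio(x, y):
    """Optimal critic: log p(x, y) - log p(x) - log p(y) for the bivariate Gaussian."""
    return (-0.5 * np.log(1 - rho**2)
            - (rho**2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * (1 - rho**2)))

rng = np.random.default_rng(0)
K, n_batches = 128, 200
x = rng.normal(size=(n_batches, K))
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=(n_batches, K))

# critic values f(x_i, y_j) for every pair within each batch
scores = log_ratio(x[:, :, None], y[:, None, :])    # shape (n_batches, K, K)
diag = np.einsum('bii->bi', scores)                 # positive pairs f(x_i, y_i)
log_norm = np.log(np.mean(np.exp(scores), axis=2))  # log (1/K) sum_j e^{f(x_i, y_j)}
infonce = float(np.mean(diag - log_norm))
```

Because of the \(\log K\) ceiling, InfoNCE-style estimators are most informative when the true MI is well below \(\log K\), as in this example.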

bmi.estimators._kde.KDEMutualInformationEstimator (IMutualInformationPointEstimator)

The kernel density mutual information estimator based on

\(I(X; Y) = h(X) + h(Y) - h(X, Y)\),

where \(h(X)\) is the differential entropy \(h(X) = -\mathbb{E}[ \log p(X) ]\).

The logarithm of the probability density function \(\log p(X)\) is estimated via a kernel density estimator (KDE) using SciKit-Learn.

Note

This estimator is very sensitive to the choice of the bandwidth and the kernel. We suggest treating it with caution.

__init__(self, kernel_xy='tophat', kernel_x=None, kernel_y=None, bandwidth_xy='scott', bandwidth_x=None, bandwidth_y=None, standardize=True) special

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `kernel_xy` | `Literal['gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine']` | kernel used to estimate the joint PDF \(p_{XY}\); see SciKit-Learn's `KernelDensity` object for more information | `'tophat'` |
| `kernel_x` | `Optional[Literal['gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine']]` | kernel used for the \(p_X\) estimation; if `None` (default), `kernel_xy` is used | `None` |
| `kernel_y` | `Optional[Literal['gaussian', 'tophat', 'epanechnikov', 'exponential', 'linear', 'cosine']]` | analogous to `kernel_x` | `None` |
| `bandwidth_xy` | `Union[float, Literal['scott', 'silverman']]` | kernel bandwidth used for the joint distribution estimation | `'scott'` |
| `bandwidth_x` | `Optional[Union[float, Literal['scott', 'silverman']]]` | kernel bandwidth used for the \(p_X\) estimation; if `None` (default), `bandwidth_xy` is used | `None` |
| `bandwidth_y` | `Optional[Union[float, Literal['scott', 'silverman']]]` | analogous to `bandwidth_x` | `None` |
| `standardize` | `bool` | whether to standardize the data points | `True` |

estimate(self, x, y)

A point estimate of MI(X; Y) from an i.i.d. sample from the \(P(X, Y)\) distribution.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `x` | `ArrayLike` | shape `(n_samples, dim_x)` | required |
| `y` | `ArrayLike` | shape `(n_samples, dim_y)` | required |

Returns:

| Type | Description |
|------|-------------|
| `float` | mutual information estimate |

estimate_entropies(self, x, y)

Calculates differential entropies.

Note

Differential entropy is not invariant to standardization. In particular, if you want to estimate differential entropy of the original variables, you should use standardize=False.

parameters(self)

Returns the parameters of the estimator.
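The decomposition \(I(X; Y) = h(X) + h(Y) - h(X, Y)\) can be sketched directly with SciKit-Learn's `KernelDensity`: fit a KDE, then estimate each differential entropy as the negative mean log-density over the sample. A fixed float bandwidth is used below for simplicity; this is an independent illustration, not the package's implementation:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def kde_entropy(samples, kernel="gaussian", bandwidth=0.2):
    """h(X) ≈ -mean log p̂(x), with p̂ a kernel density estimate."""
    samples = np.asarray(samples).reshape(len(samples), -1)
    kde = KernelDensity(kernel=kernel, bandwidth=bandwidth).fit(samples)
    return -float(np.mean(kde.score_samples(samples)))

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

h_x = kde_entropy(x)    # true value: 0.5 * log(2*pi*e) ≈ 1.419 nats
h_y = kde_entropy(y)
h_xy = kde_entropy(np.column_stack([x, y]))
mi = h_x + h_y - h_xy   # true value ≈ 0.511 nats for correlation 0.8
```

The bandwidth sensitivity mentioned in the note above is visible here: the kernel smoothing inflates each entropy estimate slightly, and the inflation only partially cancels in the sum.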

bmi.estimators.ksg.KSGEnsembleFirstEstimator (IMutualInformationPointEstimator)

An implementation of the neighborhood-based Kraskov–Stögbauer–Grassberger (KSG) estimator.

We use the first approximation (i.e., equation (8) in Kraskov et al., 2004) and allow for using different neighborhood sizes. The final estimate is the average of the estimates obtained with the different neighborhood sizes.

__init__(self, neighborhoods=(5, 10), standardize=True, metric_x='euclidean', metric_y=None, n_jobs=1, chunk_size=10) special

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `neighborhoods` | `Sequence[int]` | sequence of positive integers, specifying the sizes of neighborhood for MI calculation | `(5, 10)` |
| `standardize` | `bool` | whether to standardize the data before MI calculation | `True` |
| `metric_x` | `Literal['euclidean', 'manhattan', 'chebyshev']` | metric on the X space | `'euclidean'` |
| `metric_y` | `Optional[Literal['euclidean', 'manhattan', 'chebyshev']]` | metric on the Y space; if `None`, `metric_x` is used | `None` |
| `n_jobs` | `int` | number of jobs to be launched to compute distances; use `-1` to use all processors | `1` |
| `chunk_size` | `int` | internal batch size, used to speed up the computations while fitting into memory | `10` |

Note

If you use the Chebyshev (\(\ell_\infty\)) distance for both the \(X\) and \(Y\) spaces, KSGChebyshevEstimator may be faster.

estimate(self, x, y)

A point estimate of MI(X; Y) from an i.i.d. sample from the \(P(X, Y)\) distribution.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `x` | `ArrayLike` | shape `(n_samples, dim_x)` | required |
| `y` | `ArrayLike` | shape `(n_samples, dim_y)` | required |

Returns:

| Type | Description |
|------|-------------|
| `float` | mutual information estimate |

parameters(self)

Returns the parameters of the estimator.
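The first KSG approximation can be sketched with SciPy's k-d trees and the digamma function \(\psi\): \(\hat I = \psi(k) + \psi(N) - \langle \psi(n_x + 1) + \psi(n_y + 1) \rangle\), where \(n_x\) and \(n_y\) count the neighbours lying strictly within the distance to the \(k\)-th nearest neighbour in the joint (max-norm) space. An independent illustration, not the package's implementation:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=5):
    """First KSG approximation; x and y are arrays of shape (n,) or (n, dim)."""
    x = np.asarray(x).reshape(len(x), -1)
    y = np.asarray(y).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack([x, y])
    # distance to the k-th nearest neighbour in the joint space (max-norm)
    eps = cKDTree(joint).query(joint, k=k + 1, p=np.inf)[0][:, -1]
    tree_x, tree_y = cKDTree(x), cKDTree(y)
    # count neighbours strictly closer than eps in each marginal space
    n_x = np.array([len(tree_x.query_ball_point(pt, r - 1e-12, p=np.inf)) - 1
                    for pt, r in zip(x, eps)])
    n_y = np.array([len(tree_y.query_ball_point(pt, r - 1e-12, p=np.inf)) - 1
                    for pt, r in zip(y, eps)])
    return float(digamma(k) + digamma(n)
                 - np.mean(digamma(n_x + 1) + digamma(n_y + 1)))

rng = np.random.default_rng(0)
n = 3_000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)
mi = ksg_mi(x, y, k=5)  # analytic value for this Gaussian pair ≈ 0.511 nats
```

The ensemble estimator in the package averages this quantity over the neighborhood sizes listed in `neighborhoods`.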

bmi.estimators.neural._mine_estimator.MINEEstimator (IMutualInformationPointEstimator)

trained_critic: Optional[equinox._module.Module] property readonly

Returns the critic function from the end of the training.

Note:

1. The model must be trained first, by estimating mutual information; otherwise `None` is returned.
2. The critic can have a different meaning depending on the objective function used.

__init__(self, batch_size=256, max_n_steps=10000, train_test_split=0.5, test_every_n_steps=250, learning_rate=0.1, hidden_layers=(16, 8), smoothing_alpha=0.9, standardize=True, verbose=True, seed=42) special

Initializes the estimator.

estimate(self, x, y)

A point estimate of MI(X; Y) from an i.i.d. sample from the \(P(X, Y)\) distribution.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| `x` | `ArrayLike` | shape `(n_samples, dim_x)` | required |
| `y` | `ArrayLike` | shape `(n_samples, dim_y)` | required |

Returns:

| Type | Description |
|------|-------------|
| `float` | mutual information estimate |

estimate_with_info(self, x, y)

Allows for reporting additional information about the run.

parameters(self)

Returns the parameters of the estimator.
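MINE maximizes the Donsker–Varadhan bound \(I(X; Y) \ge \mathbb{E}_{P_{XY}}[T] - \log \mathbb{E}_{P_X \otimes P_Y}\left[e^{T}\right]\) over a neural critic \(T\) (in the MINE paper, gradients of the second term are stabilized with an exponential moving average, which is what a smoothing parameter such as `smoothing_alpha` typically controls). The sketch below skips the training loop and evaluates the bound with the analytically optimal critic, the pointwise log-density ratio, for a correlated bivariate Gaussian; all names are illustrative:

```python
import numpy as np

rho = 0.6  # population correlation; true MI = -0.5 * log(1 - rho**2) ≈ 0.223 nats

def optimal_critic(x, y):
    """log p(x, y) - log p(x) - log p(y) for the standard bivariate Gaussian."""
    return (-0.5 * np.log(1 - rho**2)
            - (rho**2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * (1 - rho**2)))

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(size=n)
y_shuffled = rng.permutation(y)  # samples from the product of marginals

dv_bound = float(np.mean(optimal_critic(x, y))
                 - np.log(np.mean(np.exp(optimal_critic(x, y_shuffled)))))
```

With the optimal critic the bound is tight, so `dv_bound` matches the true MI up to Monte Carlo error; a trained critic can only do worse, which is why the estimator reports a lower bound.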

bmi.estimators.neural._estimators.NWJEstimator (NeuralEstimatorBase)