Markov Sampler

class tsbootstrap.markov_sampler.BlockCompressor(method: Literal['first', 'middle', 'last', 'mean', 'mode', 'median', 'kmeans', 'kmedians', 'kmedoids'] = 'middle', apply_pca_flag: bool = False, pca: PCA | None = None, random_seed: Integral | None = None)[source]

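BlockCompressor compresses each block of data into a single summary using the selected method (first, middle, or last element, mean, mode, median, or a clustering-based summary), optionally combined with PCA.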

__init__(method: BlockCompressorTypes = 'middle', apply_pca_flag: bool = False, pca: PCA | None = None, random_seed: Integral | None = None) → None[source]: Initialize the BlockCompressor instance.

_pca_compression(block: np.ndarray, summary: np.ndarray) → np.ndarray[source]: Summarize a block of data using PCA.

_summarize_block(block: np.ndarray) → np.ndarray[source]: Summarize a block using a specified method.

summarize_blocks(blocks) → np.ndarray[source]: Summarize each block in the input list of blocks using the specified method.

property apply_pca_flag: bool: Getter for apply_pca_flag.

classmethod get_test_params(parameter_set='default')[source]

Return testing parameter settings for the estimator.

Parameters:: parameter_set (str, default="default") – Name of the set of test parameters to return, for use in tests. If no special parameters are defined for a value, the “default” set is returned.
Returns:: params – Parameters to create testing instances of the class. Each dict contains the parameters to construct an “interesting” test instance, i.e., MyClass(**params) or MyClass(**params[i]) creates a valid test instance. create_test_instance uses the first (or only) dictionary in params.
Return type:: dict or list of dict, default = {}
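
The convention described here can be illustrated with a small standalone class. `ToyCompressor` and `build_test_instances` below are hypothetical names, not part of tsbootstrap; the sketch only shows how a test harness might consume a return value that is either a single dict or a list of dicts.

```python
class ToyCompressor:
    """Hypothetical stand-in for an estimator; not a tsbootstrap class."""

    def __init__(self, method="middle", apply_pca_flag=False):
        self.method = method
        self.apply_pca_flag = apply_pca_flag

    @classmethod
    def get_test_params(cls, parameter_set="default"):
        # Each dict holds constructor keyword arguments for one test instance.
        return [
            {"method": "first"},
            {"method": "mean", "apply_pca_flag": False},
        ]


def build_test_instances(cls, parameter_set="default"):
    """Create one instance per parameter dict returned by get_test_params."""
    params = cls.get_test_params(parameter_set)
    if isinstance(params, dict):
        # A single dict is treated as a one-element list.
        params = [params]
    return [cls(**p) for p in params]


instances = build_test_instances(ToyCompressor)
print([inst.method for inst in instances])
```

Accepting both shapes in one helper mirrors the documented return type, "dict or list of dict".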

property method: str: Getter for method.

property pca: PCA: Getter for pca.

summarize_blocks(blocks) → ndarray[source]

Summarize each block in the input list of blocks using the specified method.

Parameters:: blocks (List[np.ndarray]) – List of numpy arrays representing the blocks to be summarized.
Returns:: Numpy array containing the summarized blocks.
Return type:: np.ndarray

Example

>>> compressor = BlockCompressor(method='middle')
>>> blocks = [np.array([1, 2, 3]), np.array([4, 5, 6])]
>>> summarized_blocks = compressor.summarize_blocks(blocks)
>>> summarized_blocks
array([2, 5])
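
For two-dimensional blocks of shape (num_timestamps, num_features), the simpler summarization methods can be pictured with the NumPy-only sketch below. It is an illustration under stated assumptions, not the library's implementation: `summarize_block` is a hypothetical helper, the choice of the "middle" row for even-length blocks is an assumption, and the mode and clustering-based methods are left out.

```python
import numpy as np


def summarize_block(block, method="middle"):
    """Reduce a (num_timestamps, num_features) block to a single row."""
    block = np.asarray(block, dtype=float)
    if method == "first":
        return block[0]
    if method == "middle":
        # Assumption: integer division picks the upper middle row for even lengths.
        return block[len(block) // 2]
    if method == "last":
        return block[-1]
    if method == "mean":
        return block.mean(axis=0)
    if method == "median":
        return np.median(block, axis=0)
    raise ValueError(f"Unsupported method: {method!r}")


blocks = [
    np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]),
    np.array([[4.0, 40.0], [5.0, 50.0], [9.0, 90.0]]),
]
# Stack the per-block summaries so that each row corresponds to one block.
summary = np.vstack([summarize_block(b, "mean") for b in blocks])
print(summary.shape)
```

Each block collapses to one row, so the stacked result has one row per block and one column per feature.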

class tsbootstrap.markov_sampler.MarkovSampler(method: Literal['first', 'middle', 'last', 'mean', 'mode', 'median', 'kmeans', 'kmedians', 'kmedoids'] = 'middle', apply_pca_flag: bool = False, pca: PCA | None = None, n_iter_hmm: Integral = 100, n_fits_hmm: Integral = 10, blocks_as_hidden_states_flag: bool = False, random_seed: Integral | None = None)[source]

A class for sampling from a Markov chain with given transition probabilities.

This class allows for the combination of block-based bootstrapping and Hidden Markov Model (HMM) fitting.

transition_matrix_calculator

An instance of MarkovTransitionMatrixCalculator to calculate transition probabilities.

Type:: MarkovTransitionMatrixCalculator

block_compressor

An instance of BlockCompressor to perform block summarization/compression.

Type:: BlockCompressor

__init__(method: str = 'middle', apply_pca_flag: bool = False, pca: PCA | None = None, n_iter_hmm: Integral = 100, n_fits_hmm: Integral = 10, blocks_as_hidden_states_flag: bool = False, random_seed: Integral | None = None) → None[source]: Initialize the MarkovSampler instance.

_validate_n_states(n_states: Integral, blocks) → Integral: Validate the number of states.

_validate_n_iter_hmm(n_iter_hmm: Integral) → Integral: Validate the number of iterations for the HMM.

_validate_n_fits_hmm(n_fits_hmm: Integral) → Integral: Validate the number of fits for the HMM.

_validate_blocks_as_hidden_states_flag(blocks_as_hidden_states_flag: bool) → bool: Validate the blocks_as_hidden_states_flag.

_validate_random_seed(random_seed: Integral | None) → Integral | None: Validate the random seed.

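fit_hidden_markov_model(X: np.ndarray, n_states: Integral = 5, transmat_init: np.ndarray | None = None, means_init: np.ndarray | None = None, lengths: np.ndarray | None = None) → hmm.GaussianHMM[source]: Fit a Gaussian Hidden Markov Model (HMM) to the input data.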

fit(blocks, n_states: Integral = 5) → MarkovSampler[source]: Fit the MarkovSampler instance to the input blocks.

sample(X: np.ndarray | None = None, random_seed: Integral | None = None) → Tuple[np.ndarray, np.ndarray][source]: Sample from the fitted MarkovSampler instance.

Examples

>>> sampler = MarkovSampler(n_iter_hmm=200, n_fits_hmm=20, blocks_as_hidden_states_flag=True)
>>> blocks = [np.random.rand(10, 5) for _ in range(50)]
>>> sampler = sampler.fit(blocks, n_states=5)
>>> start_probs, trans_probs = sampler.sample()

property blocks_as_hidden_states_flag: bool: Getter for blocks_as_hidden_states_flag.

fit(blocks, n_states: Integral = 5) → MarkovSampler[source]

Fit the MarkovSampler instance to the input blocks.

Parameters:

blocks (List[np.ndarray] or np.ndarray) – A list of 2D NumPy arrays, each representing a block of data, or a 2D NumPy array, where each row represents a row of raw data.
n_states (Integral, optional) – The number of states in the hidden Markov model. Default is 5.

Returns:

Current instance of the MarkovSampler class, with the model trained.

Return type:

MarkovSampler

Examples

>>> sampler = MarkovSampler(n_iter_hmm=200, n_fits_hmm=20)
>>> blocks = [np.random.rand(10, 5) for _ in range(50)]
>>> sampler.fit(blocks, n_states=5)

fit_hidden_markov_model(X: ndarray, n_states: Integral = 5, transmat_init: ndarray | None = None, means_init: ndarray | None = None, lengths: ndarray | None = None)[source]

Fit a Gaussian Hidden Markov Model on the input data.

Parameters:

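X (np.ndarray) – A 2D NumPy array, where each row represents a summarized block of data.
n_states (Integral, optional) – The number of states in the hidden Markov model. Default is 5.
transmat_init (np.ndarray, optional) – Initial transition matrix for the hidden Markov model. Default is None.
means_init (np.ndarray, optional) – Initial state means for the hidden Markov model. Default is None.
lengths (np.ndarray, optional) – Lengths of the individual sequences in X. Default is None.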

Returns:

The trained Gaussian Hidden Markov Model.

Return type:

hmm.GaussianHMM
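
Assuming `lengths` follows the hmmlearn convention (the row counts of the individual sequences stacked in X, summing to the number of rows of X), the inputs could be prepared as in the sketch below. The sequences are made-up random data, and the sketch stops short of fitting a model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypothetical sequences with the same features but different lengths.
sequences = [rng.normal(size=(n, 2)) for n in (4, 6, 5)]

# Stack all rows into one 2D array and record how many rows each sequence has.
X = np.concatenate(sequences, axis=0)
lengths = np.array([len(s) for s in sequences])

# The boundaries implied by `lengths` recover the original sequences.
boundaries = np.cumsum(lengths)[:-1]
recovered = np.split(X, boundaries)

print(X.shape, lengths)
```

Keeping `lengths` alongside X lets a model treat the sequences as separate, so no transition is learned across the boundary between two unrelated sequences.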

property n_fits_hmm: Integral: Getter for n_fits_hmm.

property n_iter_hmm: Integral: Getter for n_iter_hmm.

property random_seed: Getter for random_seed.

sample(X: ndarray | None = None, random_seed: Integral | None = None)[source]

Sample from a Markov chain with given transition probabilities.

Parameters:

X (Optional[np.ndarray]) – A 2D NumPy array, where each row represents a summarized block of data. If not provided, the model will be sampled using the data used to fit the model.
random_seed (Optional[Integral]) – The seed for the random number generator. If not provided, the random seed used to fit the model will be used.

Returns:

A tuple containing the start probabilities and transition probabilities of the Markov chain.

Return type:

Tuple[np.ndarray, np.ndarray]
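
To make the roles of start and transition probabilities concrete, the NumPy-only sketch below draws a hidden state sequence from a Markov chain and one Gaussian observation per state. It is not the library's sampling routine: `sample_gaussian_chain` is a hypothetical helper, all numeric values are made up, and unit-variance emissions are assumed for brevity.

```python
import numpy as np


def sample_gaussian_chain(start_probs, trans_probs, means, n_samples, seed=None):
    """Draw hidden states from a Markov chain and one Gaussian observation per state."""
    rng = np.random.default_rng(seed)
    n_states = len(start_probs)
    states = np.empty(n_samples, dtype=int)
    # The first state is drawn from the start probabilities.
    states[0] = rng.choice(n_states, p=start_probs)
    for t in range(1, n_samples):
        # The next state depends only on the current state (Markov property).
        states[t] = rng.choice(n_states, p=trans_probs[states[t - 1]])
    # Assumption: unit-variance Gaussian emissions centred on each state's mean.
    observations = means[states] + rng.normal(size=(n_samples, means.shape[1]))
    return observations, states


start_probs = np.array([0.6, 0.4])
trans_probs = np.array([[0.9, 0.1], [0.2, 0.8]])
means = np.array([[0.0, 0.0], [5.0, 5.0]])

observations, states = sample_gaussian_chain(
    start_probs, trans_probs, means, n_samples=100, seed=42
)
print(observations.shape, states.shape)
```

Passing the same `seed` reproduces the same draw, which is the usual reason for exposing a seed parameter on a sampling method.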

class tsbootstrap.markov_sampler.MarkovTransitionMatrixCalculator[source]

MarkovTransitionMatrixCalculator computes the transition matrix for a set of data blocks from the dynamic time warping (DTW) distances between pairs of blocks.

The transition matrix is normalized to obtain transition probabilities. The underlying assumption is that the data blocks are generated from a Markov chain. In other words, the next block is generated based on the current block and not on any previous blocks.

__init__() → None: Initialize the MarkovTransitionMatrixCalculator instance.

_calculate_dtw_distances(blocks, eps: float = 1e-5) → np.ndarray[source]: Calculate the DTW distances between all pairs of blocks.

calculate_transition_probabilities(blocks) → np.ndarray[source]: Calculate the transition probability matrix based on DTW distances between all pairs of blocks.

Examples

>>> calculator = MarkovTransitionMatrixCalculator()
>>> blocks = [np.random.rand(10, 5) for _ in range(50)]
>>> transition_matrix = calculator.calculate_transition_probabilities(blocks)

static calculate_transition_probabilities(blocks) → ndarray[source]

Calculate the transition probability matrix based on DTW distances between all pairs of blocks.

Parameters:: blocks (List[np.ndarray]) – A list of numpy arrays, each of shape (num_timestamps, num_features), representing the time series data blocks.
Returns:: A transition probability matrix of shape (len(blocks), len(blocks)).
Return type:: np.ndarray
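
The computation can be sketched with NumPy alone. The code below is an assumption-laden illustration, not the library's code: `dtw_distance` and `transition_matrix_from_blocks` are hypothetical helpers, the DTW uses a Euclidean cost between rows, and the pairwise distances (plus a small `eps`) are row-normalized directly. Whether the library transforms distances before normalizing is not stated in this reference.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic dynamic-programming DTW with Euclidean cost between rows."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def transition_matrix_from_blocks(blocks, eps=1e-5):
    """Row-normalize pairwise DTW distances (assumed scoring rule)."""
    k = len(blocks)
    scores = np.zeros((k, k))
    for i in range(k):
        for j in range(i, k):
            # eps keeps every entry strictly positive, including the diagonal.
            scores[i, j] = scores[j, i] = dtw_distance(blocks[i], blocks[j]) + eps
    # Dividing each row by its sum turns the scores into probabilities.
    return scores / scores.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
blocks = [rng.random((6, 3)) for _ in range(4)]
P = transition_matrix_from_blocks(blocks)
print(P.shape)
```

Each row of the result sums to 1, matching the documented shape (len(blocks), len(blocks)). Note that normalizing raw distances gives more distant blocks larger weights; a similarity transform would reverse that ordering.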