Block Generator

class tsbootstrap.block_generator.BlockGenerator(*, block_length_sampler: BlockLengthSampler, input_length: Annotated[int, Ge(ge=3), Gt(gt=0)], wrap_around_flag: bool = False, rng: Generator | Integral | None = None, overlap_length: Annotated[int, Gt(gt=0)] | None = 0, min_block_length: Annotated[Annotated[int, Gt(gt=0)] | None, Ge(ge=1)] = 1)

Generates blocks of indices for time series bootstrapping.

This class is responsible for creating blocks of indices that can be used to sample segments from a time series. It supports both overlapping and non-overlapping blocks and can optionally wrap around the end of the time series.

Parameters:
  • input_length (PositiveInt) – The length of the input time series. Must be at least 3.

  • block_length_sampler (BlockLengthSampler) – An instance of BlockLengthSampler to determine block lengths.

  • wrap_around_flag (bool, optional) – If True, blocks can wrap around the end of the time series. Default is False.

  • rng (RngTypes, optional) – Random number generator for sampling. Defaults to a new RNG instance.

  • overlap_length (PositiveInt, optional) – The length of overlap between consecutive blocks. If not provided, defaults to half the average block length.

  • min_block_length (PositiveInt, optional) – The minimum allowed block length. Defaults to MIN_BLOCK_LENGTH.

Examples

>>> from tsbootstrap.block_generator import BlockGenerator
>>> from tsbootstrap.block_length_sampler import BlockLengthSampler, DistributionTypes
>>> sampler = BlockLengthSampler(avg_block_length=5, block_length_distribution=DistributionTypes.GAMMA)
>>> generator = BlockGenerator(input_length=100, block_length_sampler=sampler, wrap_around_flag=True)
>>> blocks = generator.generate_blocks(overlap_flag=True)
>>> print(blocks)
[array([...]), array([...]), ...]

check_consistency() → BlockGenerator

Perform inter-field validation to ensure consistency among fields.

This validator runs after all field validators have processed their respective fields, ensuring that interdependent fields maintain logical consistency.

Returns:

The validated BlockGenerator instance.

Return type:

BlockGenerator

Raises:

ValueError – If any of the consistency checks fail.

generate_blocks(overlap_flag: bool = False) → list[ndarray]

Generate block indices based on the overlap flag.

This method serves as a general entry point to generate either overlapping or non-overlapping blocks based on the provided flag.

Parameters:

overlap_flag (bool, optional) – If True, generate overlapping blocks. Otherwise, generate non-overlapping blocks. Default is False.

Returns:

list of numpy arrays representing the generated blocks.

Return type:

list[np.ndarray]
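A minimal sketch of how the returned index blocks can be used to pull segments out of a series. The `series` values and the hand-written `blocks` list are hypothetical stand-ins for real data and for the output of generate_blocks; plain Python lists are used here in place of numpy arrays so the snippet has no dependencies.

```python
# Hypothetical series and hand-written index blocks (stand-ins for real
# data and for the list returned by generate_blocks).
series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
blocks = [[0, 1, 2], [3, 4], [5]]

# Each block is a collection of integer positions into the series.
segments = [[series[i] for i in block] for block in blocks]

print(segments)
```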

generate_non_overlapping_blocks() → list[ndarray]

Generate non-overlapping block indices in the time series.

Returns:

list of numpy arrays containing the indices for each non-overlapping block.

Return type:

list[np.ndarray]

Raises:

ValueError – If the block length sampler is not set.

generate_overlapping_blocks() → list[ndarray]

Generate overlapping block indices in the time series.

Returns:

A list of numpy arrays where each array represents the indices of a block in the time series.

Return type:

list[np.ndarray]

Notes

The block indices are generated as follows:

  1. A starting index is determined based on the wrap_around_flag.

  2. A block length is sampled from the block_length_sampler.

  3. An overlap length is calculated from the block length.

  4. A block is created from the starting index and block length.

  5. The starting index is updated to the next starting index.

  6. Steps 2-5 are repeated until the total length covered equals the length of the time series.

The next starting index is the current starting index plus the block length minus the overlap length, taken modulo input_length so that it wraps around if necessary.
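The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `sketch_overlapping_blocks` is hypothetical, a uniform draw stands in for the BlockLengthSampler, the starting index is assumed to be random when wrapping and 0 otherwise, and "length covered" is assumed to count only the new indices each block contributes.

```python
import random


def sketch_overlapping_blocks(input_length, avg_block_length, overlap_length,
                              wrap_around=False, seed=0):
    """Illustrative only; follows the numbered steps, not the library source."""
    rng = random.Random(seed)
    # Step 1 (assumption): random start when wrapping, otherwise start at 0.
    start = rng.randrange(input_length) if wrap_around else 0
    blocks = []
    covered = 0
    while covered < input_length:
        # Step 2: sample a block length (uniform stand-in with the requested
        # mean), capped so the block never runs past the uncovered remainder.
        length = rng.randint(1, 2 * avg_block_length - 1)
        length = min(length, input_length - covered)
        # Step 3: the overlap must stay below the block length so that the
        # starting index always advances by at least one position.
        overlap = min(overlap_length, length - 1)
        # Step 4: build the block; the modulo only has an effect when wrapping.
        blocks.append([(start + i) % input_length for i in range(length)])
        # Step 5: advance by block length minus overlap, modulo input_length.
        step = length - overlap
        covered += step
        start = (start + step) % input_length
    return blocks


blocks = sketch_overlapping_blocks(input_length=20, avg_block_length=5,
                                   overlap_length=2)
print(blocks)
```

Capping the overlap at one less than the block length is a design choice of this sketch: without it, a short block could leave the starting index unchanged and the loop would never terminate.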

model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'validate_assignment': True}

Configuration for the model; it should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'block_length_sampler': FieldInfo(annotation=BlockLengthSampler, required=True, description='Sampler for determining block lengths.'), 'input_length': FieldInfo(annotation=int, required=True, description='The length of the input time series.', metadata=[Ge(ge=3), Gt(gt=0)]), 'min_block_length': FieldInfo(annotation=Union[Annotated[int, Gt], NoneType], required=False, default=1, description='Minimum block length.', metadata=[Ge(ge=1)]), 'overlap_length': FieldInfo(annotation=Union[Annotated[int, Gt], NoneType], required=False, default=0, description='Overlap length between blocks.'), 'rng': FieldInfo(annotation=Union[Generator, Integral, NoneType], required=False, default_factory=<lambda>, description='Random number generator.'), 'wrap_around_flag': FieldInfo(annotation=bool, required=False, default=False, description='Flag to allow wrap-around in block generation.')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

classmethod validate_block_length_sampler(v: BlockLengthSampler, info: ValidationInfo) → BlockLengthSampler

Validate the block_length_sampler to ensure its average block length does not exceed the input_length.

Parameters:
  • v (BlockLengthSampler) – The BlockLengthSampler instance to validate.

  • info (ValidationInfo) – Provides context about the validation, including other fields.

Returns:

The validated BlockLengthSampler instance.

Return type:

BlockLengthSampler

Raises:

ValueError – If the sampler’s average block length exceeds the input_length.
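The rule described above amounts to a single comparison. The following standalone sketch shows it outside of Pydantic; the function name `check_avg_block_length` is hypothetical, and the plain integer argument stands in for the sampler's average block length.

```python
def check_avg_block_length(avg_block_length: int, input_length: int) -> int:
    """Hypothetical helper mirroring the documented rule, not the library code."""
    # Reject a sampler whose average block is longer than the series itself.
    if avg_block_length > input_length:
        raise ValueError(
            f"avg_block_length ({avg_block_length}) must not exceed "
            f"input_length ({input_length})"
        )
    return avg_block_length


print(check_avg_block_length(5, 100))
```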

classmethod validate_min_block_length(v: int | None, info: ValidationInfo) → int

Validate and adjust the min_block_length parameter.

If min_block_length is provided, it must be between MIN_BLOCK_LENGTH and the sampler’s average block length. Otherwise, it defaults to MIN_BLOCK_LENGTH.

Parameters:
  • v (Optional[int]) – The minimum block length to validate.

  • info (ValidationInfo) – Provides context about the validation, including other fields.

Returns:

The validated and possibly adjusted minimum block length.

Return type:

int

Raises:

ValueError – If block_length_sampler is not provided.
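One possible reading of "validate and adjust" is sketched below. Everything here is an assumption made for illustration: the helper name `resolve_min_block_length` is hypothetical, MIN_BLOCK_LENGTH is taken to be 1 (matching the default in the class signature), and out-of-range values are clamped with a warning rather than rejected. The library may handle these cases differently.

```python
import warnings

MIN_BLOCK_LENGTH = 1  # assumed value, matching the documented default


def resolve_min_block_length(v, avg_block_length):
    """Hypothetical helper; clamping behaviour is an assumption."""
    if v is None:
        # Not provided: fall back to the global minimum.
        return MIN_BLOCK_LENGTH
    if v < MIN_BLOCK_LENGTH:
        warnings.warn("min_block_length raised to MIN_BLOCK_LENGTH")
        return MIN_BLOCK_LENGTH
    if v > avg_block_length:
        warnings.warn("min_block_length lowered to the average block length")
        return avg_block_length
    return v


print(resolve_min_block_length(None, 5))
print(resolve_min_block_length(3, 5))
```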

classmethod validate_overlap_length(v: int | None, info: ValidationInfo) → int

Validate and adjust the overlap_length parameter.

If overlap_length is provided and is greater than or equal to input_length, it is adjusted to input_length - 1 with a warning. If overlap_length is not provided, it defaults to half of the average block length.

Parameters:
  • v (Optional[int]) – The overlap length to validate.

  • info (ValidationInfo) – Provides context about the validation, including other fields.

Returns:

The validated and possibly adjusted overlap length.

Return type:

int

Raises:

ValueError – If input_length or block_length_sampler is not provided.
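The adjustment rules described for overlap_length can be sketched as a standalone function. The name `resolve_overlap_length` is hypothetical, and "half of the average block length" is assumed to mean integer division; the actual rounding used by the library is not stated here.

```python
import warnings


def resolve_overlap_length(v, input_length, avg_block_length):
    """Hypothetical helper mirroring the documented rules, not the library code."""
    if v is None:
        # Not provided: default to half the average block length
        # (integer division is an assumption).
        return avg_block_length // 2
    if v >= input_length:
        # Too large: cap at input_length - 1 and warn, as documented.
        warnings.warn("overlap_length reduced to input_length - 1")
        return input_length - 1
    return v


print(resolve_overlap_length(None, input_length=100, avg_block_length=5))
print(resolve_overlap_length(3, input_length=100, avg_block_length=5))
```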