PrepareDALI

class lightning_pose.data.dali.PrepareDALI

Bases: object

Factory for DALI video-reading pipelines used during training and prediction.

Construction is split into two phases:

  1. __init__: validates inputs (file existence, multiview frame-count consistency), pre-computes pipe arguments for all four combinations of {train, predict} × {base, context}, and stores them in _pipe_dict.

  2. __call__: builds the DALI pipe for the requested train_stage / model_type combination and returns a ready-to-iterate LitDaliWrapper.

Splitting validation from pipe-building lets callers inspect num_iters and other properties before committing GPU memory, and makes it straightforward to call __call__ again with a different stage without repeating validation.
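The two-phase split described above can be sketched with a small stand-in class. This is a minimal illustration of the validate-then-build pattern, not the library's implementation: TwoPhaseFactory, build_count, and the "sequential" argument are hypothetical names introduced here, and the real pipe arguments are not shown in this page.

```python
from itertools import product


class TwoPhaseFactory:
    """Hypothetical stand-in illustrating validate-then-build; not the real class."""

    def __init__(self, train_stage, model_type, filenames):
        # Phase 1: cheap checks and argument pre-computation only.
        if train_stage not in ("train", "predict"):
            raise ValueError(f"unknown train_stage: {train_stage!r}")
        if model_type not in ("base", "context"):
            raise ValueError(f"unknown model_type: {model_type!r}")
        self.train_stage = train_stage
        self.model_type = model_type
        self.build_count = 0  # counts how often the expensive phase ran
        self._pipe_dict = {"train": {}, "predict": {}}
        # Pre-compute arguments for all four stage/model combinations.
        for stage, mtype in product(("train", "predict"), ("base", "context")):
            self._pipe_dict[stage][mtype] = {
                "filenames": list(filenames),
                "sequential": stage == "predict",  # illustrative argument only
            }

    def __call__(self):
        # Phase 2: the expensive build happens only when requested.
        self.build_count += 1
        return dict(self._pipe_dict[self.train_stage][self.model_type])


factory = TwoPhaseFactory("predict", "base", ["video.mp4"])
pipe_args = factory()
```

Constructing the object does no building, so properties derived from the pre-computed arguments can be read before any pipeline resources are allocated.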

Attributes Summary

num_iters

Number of dataloader iterations required to process all frames.

Methods Summary

__call__()

Build the DALI pipeline and return a ready-to-iterate LitDaliWrapper.

Attributes Documentation

num_iters

Number of dataloader iterations required to process all frames.

Returns:

Integer count of how many times the dataloader must be enumerated to exhaust all video frames for the current train_stage and model_type configuration.
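As a rough sketch of what such a count means, assume each iteration consumes a fixed number of frames and a final partial batch still counts as one iteration. Both assumptions, along with the names estimate_num_iters and frames_per_iter, are illustrative; the library's actual computation may differ, for example for context models.

```python
import math


def estimate_num_iters(frame_counts, frames_per_iter):
    """Estimate iterations needed to cover all frames (illustrative only).

    frame_counts: number of frames in each video.
    frames_per_iter: frames consumed per dataloader iteration (assumed fixed).
    """
    if frames_per_iter <= 0:
        raise ValueError("frames_per_iter must be positive")
    # Round up so that a trailing partial batch is still visited.
    return math.ceil(sum(frame_counts) / frames_per_iter)


n_iters = estimate_num_iters([1000, 250], frames_per_iter=64)
```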

Methods Documentation

__call__() → LitDaliWrapper

Build the DALI pipeline and return a ready-to-iterate LitDaliWrapper.

Returns:

LitDaliWrapper configured for self.train_stage and self.model_type.
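A usage sketch based only on the signature documented on this page. The helper build_predict_loader and the 256 × 256 resize target are assumptions made for illustration; the import is deferred so the function can be defined without lightning-pose and NVIDIA DALI installed, but calling it requires both.

```python
def build_predict_loader(video_path, resize_dims=(256, 256)):
    """Hypothetical helper: build a prediction loader for one video."""
    # Deferred import: needs lightning-pose and NVIDIA DALI at call time.
    from lightning_pose.data.dali import PrepareDALI

    prep = PrepareDALI(
        train_stage="predict",
        model_type="base",
        filenames=[video_path],          # flat list for a single-view model
        resize_dims=list(resize_dims),   # [height, width]
    )
    n_iters = prep.num_iters  # available before the pipe is built
    loader = prep()           # builds the pipe, returns a LitDaliWrapper
    return loader, n_iters
```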

__init__(train_stage: Literal['predict', 'train'], model_type: Literal['base', 'context'], filenames: list[str] | list[list[str]], resize_dims: list[int], dali_config: dict | DictConfig | ListConfig | None = None, imgaug: str | None = 'default', num_threads: int = 1, bbox_df: DataFrame | None = None) → None

Initialize DALI pipelines and dataloaders for training or prediction.

Parameters:
  • train_stage – whether to set up pipelines for "train" or "predict".

  • model_type – "base" for standard single-frame models, "context" for MHCRNN models that consume a temporal window.

  • filenames – for single-view models, a flat list of video file paths; for multi-view models, a list of per-view lists of video file paths.

  • resize_dims – [height, width] to resize frames to before feeding the model. Also used as the post-crop resize target when bbox_df is provided.

  • dali_config – DALI-specific config dict; falls back to package defaults when None.

  • imgaug – name of the augmentation pipeline to apply during training (e.g. "dlc"); pass "default" for resize-only or None to disable.

  • num_threads – number of CPU threads used by DALI pipelines.

  • bbox_df – optional DataFrame with columns ["x", "y", "h", "w"], one row per frame. When provided, the predict pipeline loads full-resolution frames (DALI resize is disabled) and LitDaliWrapper crops each frame to its bbox before resizing to resize_dims.
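The crop-then-resize behavior described for bbox_df can be sketched on a single frame. This is a dependency-free illustration using nested lists, assuming (x, y) is the top-left corner of the box in pixels and using nearest-neighbor sampling; the function crop_and_resize is hypothetical, and the real wrapper operates on batched tensors with its own interpolation.

```python
def crop_and_resize(frame, bbox, resize_dims):
    """Crop one frame to its bbox, then resize to [height, width].

    frame: 2D list indexed as frame[row][col].
    bbox: mapping with keys "x", "y", "h", "w" (assumed top-left origin).
    resize_dims: [height, width] of the output.
    """
    x, y, h, w = (int(bbox[k]) for k in ("x", "y", "h", "w"))
    # Crop at full resolution first.
    crop = [row[x:x + w] for row in frame[y:y + h]]
    out_h, out_w = resize_dims
    # Nearest-neighbor resize of the crop to the target size.
    return [
        [crop[r * h // out_h][c * w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A 6 x 8 frame whose pixel value encodes its position (row * 10 + col).
frame = [[r * 10 + c for c in range(8)] for r in range(6)]
resized = crop_and_resize(frame, {"x": 2, "y": 1, "h": 4, "w": 4}, [2, 2])
```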

Raises:
  • FileNotFoundError – if any path in filenames does not exist or is not a file.

  • ValueError – for multiview inputs, if views have differing numbers of sessions or if a session has differing frame counts across views (which would desynchronize the per-view readers).
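The checks behind these exceptions can be sketched as follows. The helper names are hypothetical, and frame counts are passed in directly rather than read from video files, since how the library obtains them is not described here.

```python
from pathlib import Path


def check_files_exist(filenames):
    """Raise FileNotFoundError if any path is missing or is not a file."""
    # Accept either a flat list (single view) or a list of per-view lists.
    nested = bool(filenames) and isinstance(filenames[0], list)
    views = filenames if nested else [filenames]
    for view in views:
        for name in view:
            if not Path(name).is_file():
                raise FileNotFoundError(f"{name} does not exist or is not a file")


def check_multiview_consistency(frame_counts_per_view):
    """frame_counts_per_view[v][s] is the frame count of session s in view v."""
    if len({len(view) for view in frame_counts_per_view}) > 1:
        raise ValueError("views have differing numbers of sessions")
    # Compare the same session across all views.
    for session, counts in enumerate(zip(*frame_counts_per_view)):
        if len(set(counts)) > 1:
            raise ValueError(
                f"session {session} has differing frame counts across views: {counts}"
            )


# Two views, two sessions each, with matching frame counts: passes silently.
check_multiview_consistency([[100, 200], [100, 200]])
```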

__new__(**kwargs)