lightning_pose.data

lightning_pose.data.augmentations Module

Functions to build augmentation pipeline.

lightning_pose.data.cameras Module

Camera geometry utilities for multi-view 2D-to-3D projection and triangulation.

lightning_pose.data.dali Module

Data pipelines built on efficient video reading with the NVIDIA DALI package.

Import warning

The nvidia-dali-cuda110 package is not installed on CPU-only machines (macOS, GPU-less Linux). Importing this module at the top level of any other module therefore raises ImportError on those platforms and breaks "import lightning_pose" entirely. Always import from this module lazily, inside the body of the function or method that uses it:

def my_function(*args, **kwargs):
    from lightning_pose.data.dali import PrepareDALI  # lazy: avoids ImportError on cpu-only
    ...
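
The lazy-import pattern above can also be wrapped in a small helper that defers the import and reports a clearer error when an optional dependency is missing. This is a minimal sketch using only the standard-library importlib; lazy_import and the "json" default are hypothetical illustrations and are not part of lightning_pose.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str) -> ModuleType:
    """Import `module_name` at call time rather than at module load time.

    Hypothetical helper for illustration; not part of lightning_pose.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' could not be imported; it may depend on an "
            "optional package (for example a GPU-only one) that is not "
            "installed on this machine."
        ) from err


def my_function(module_name: str = "json") -> str:
    # The import runs only when the function is called, so importing the file
    # that defines my_function never touches the optional dependency.
    module = lazy_import(module_name)
    return module.__name__
```

With this shape, a CPU-only machine fails only when the DALI-dependent code path is actually executed, not when the package is imported.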

Architecture overview

PrepareDALI is the entry point. Its __init__ validates inputs and pre-computes pipeline arguments for all four combinations of stage ("train", "predict") and model type ("base", "context"). Calling the instance (__call__) builds the DALI pipe for the requested combination and returns a ready-to-iterate LitDaliWrapper.

LitDaliWrapper extends DALIGenericIterator and converts raw DALI output into typed UnlabeledBatchDict or MultiviewUnlabeledBatchDict instances on every __next__.
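
The division of labour described above (validate and pre-compute in __init__, build on __call__, convert to typed batches on __next__) can be sketched without DALI. Everything below is a toy stand-in: ToyPrepare, ToyWrapper, ToyBatch, and their arguments are hypothetical names chosen for illustration and do not reflect the real PrepareDALI or LitDaliWrapper signatures.

```python
from itertools import product
from typing import Dict, List, Tuple, TypedDict


class ToyBatch(TypedDict):
    # Hypothetical stand-in for a typed batch dict.
    frames: List[int]
    stage: str


class ToyWrapper:
    """Stand-in iterator: converts raw output into typed batches on __next__."""

    def __init__(self, raw_batches: List[List[int]], stage: str) -> None:
        self._raw = iter(raw_batches)
        self._stage = stage

    def __iter__(self) -> "ToyWrapper":
        return self

    def __next__(self) -> ToyBatch:
        raw = next(self._raw)  # StopIteration propagates when exhausted
        return ToyBatch(frames=raw, stage=self._stage)


class ToyPrepare:
    """Stand-in entry point: validate once, pre-compute args, build on call."""

    STAGES = ("train", "predict")
    MODEL_TYPES = ("base", "context")

    def __init__(self, train_batch_size: int, predict_batch_size: int) -> None:
        if train_batch_size <= 0 or predict_batch_size <= 0:
            raise ValueError("batch sizes must be positive")
        sizes = {"train": train_batch_size, "predict": predict_batch_size}
        # Pre-compute arguments for all four (stage, model_type) combinations.
        self.pipe_args: Dict[Tuple[str, str], Dict[str, int]] = {
            (stage, model_type): {"batch_size": sizes[stage]}
            for stage, model_type in product(self.STAGES, self.MODEL_TYPES)
        }

    def __call__(self, stage: str, model_type: str, n_frames: int = 6) -> ToyWrapper:
        # Look up the pre-computed arguments; unknown combinations raise KeyError.
        size = self.pipe_args[(stage, model_type)]["batch_size"]
        frames = list(range(n_frames))
        raw = [frames[i:i + size] for i in range(0, n_frames, size)]
        return ToyWrapper(raw, stage)
```

For example, list(ToyPrepare(2, 3)("predict", "base")) yields two typed batches of three frame indices each.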

Two prediction modes

Standard mode (no bbox_df): DALI resizes frames to resize_dims on the GPU. The bbox field of each returned batch covers the full frame (x=0, y=0, h=H, w=W).
Bbox-crop mode (bbox_df supplied to PrepareDALI): DALI delivers full-resolution frames (resize_dims=None in the predict pipe) so that LitDaliWrapper._apply_bbox_crop can crop each frame to its per-frame bounding box and resize to the original resize_dims using torch.nn.functional.interpolate. The bbox field of each batch contains the actual crop coordinates, so downstream code can remap predictions back to the original coordinate space.
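
To illustrate how the bbox field lets downstream code remap predictions, here is a minimal sketch of the coordinate arithmetic. It assumes keypoints are given in pixel coordinates of the resized image, that bbox is ordered (x, y, h, w) as in the description above, and that pixel-center offsets can be ignored; remap_to_original is a hypothetical helper, not a lightning_pose function.

```python
from typing import Tuple


def remap_to_original(
    keypoint_xy: Tuple[float, float],
    bbox_xyhw: Tuple[float, float, float, float],
    resize_hw: Tuple[int, int],
) -> Tuple[float, float]:
    """Map a keypoint from resized-image pixels back to original-frame pixels.

    Hypothetical helper; assumes bbox order (x, y, h, w) and a plain linear
    rescale with no pixel-center correction.
    """
    px, py = keypoint_xy
    x0, y0, box_h, box_w = bbox_xyhw
    resize_h, resize_w = resize_hw
    # Undo the resize (scale by box size / resized size), then undo the crop
    # (shift by the top-left corner of the box).
    x_orig = x0 + px * (box_w / resize_w)
    y_orig = y0 + py * (box_h / resize_h)
    return x_orig, y_orig
```

In standard mode the box is the full frame (x=0, y=0), so the same function reduces to a pure rescale; in bbox-crop mode the offset term restores the crop position.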

lightning_pose.data.datamodules Module

Data modules split a dataset into train, validation, and test subsets.

lightning_pose.data.datasets Module

Dataset objects store images, labels, and functions for manipulation.

lightning_pose.data.datatypes Module

Classes to streamline data typechecking.

Classes

`BaseLabeledExampleDict`	Return type when calling __getitem__() on BaseTrackingDataset.
`HeatmapLabeledExampleDict`	Return type when calling __getitem__() on HeatmapTrackingDataset.
`MultiviewLabeledExampleDict`	Return type when calling __getitem__() on MultiviewDataset.
`MultiviewHeatmapLabeledExampleDict`	Return type when calling __getitem__() on MultiviewHeatmapDataset.
`BaseLabeledBatchDict`	Batch type for base labeled data.
`HeatmapLabeledBatchDict`	Batch type for heatmap labeled data.
`MultiviewLabeledBatchDict`	Batch type for multiview labeled data.
`MultiviewHeatmapLabeledBatchDict`	Batch type for multiview heatmap labeled data.
`UnlabeledBatchDict`	Batch type for unlabeled data.
`MultiviewUnlabeledBatchDict`	Batch type for multiview unlabeled data.
`SemiSupervisedBatchDict`	Batch type for base labeled+unlabeled data.
`SemiSupervisedHeatmapBatchDict`	Batch type for heatmap labeled+unlabeled data.
`SemiSupervisedDataLoaderDict`	Return type when calling train/val/test_dataloader() on semi-supervised models.

lightning_pose.data.extractor Module

Helper class to extract labeled data from a data module.

lightning_pose.data.factory Module

Factory functions to build data pipeline components from a Hydra config.

Three public functions, typically called in order:

get_imgaug_transform() — builds an imgaug augmentation pipeline from cfg.training.imgaug.
get_dataset() — wraps the labeled CSV data in the appropriate dataset class (regression, single-view heatmap, or multiview heatmap).
get_data_module() — wraps a dataset in a data module that handles train/val/test splitting; selects UnlabeledDataModule for semi-supervised training (adds DALI video loader) or BaseDataModule for supervised-only training.

Adding a new model type (data-side changes only — see models/factory.py for the model-side steps):

If the new type can reuse an existing dataset class (e.g. it is a heatmap variant), extend the appropriate elif branch in get_dataset() to match the new cfg.model.model_type string. If it needs a new dataset class, define that class in datasets.py, import it here, and add a new elif branch.
If the new type needs a different BaseDataModule subclass, add a branch in get_data_module(); otherwise no change is needed there.
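
The two options for adding a model type (reuse an existing branch, or add a new class plus a new elif branch) might look roughly like the sketch below. It is a self-contained toy: the dataset classes, the get_toy_dataset function, and every model_type string are hypothetical placeholders, not the real names used in get_dataset().

```python
from dataclasses import dataclass


@dataclass
class ToyRegressionDataset:
    csv_file: str


@dataclass
class ToyHeatmapDataset:
    csv_file: str


@dataclass
class ToyNewDataset:
    # Placeholder for a new dataset class that would be defined in datasets.py.
    csv_file: str


def get_toy_dataset(model_type: str, csv_file: str):
    """Dispatch on the model-type string (placeholder strings throughout)."""
    if model_type == "regression":
        return ToyRegressionDataset(csv_file)
    elif model_type in ("heatmap", "heatmap_variant"):
        # Option 1: a new heatmap variant reuses the existing dataset class,
        # so only the membership test on this branch is extended.
        return ToyHeatmapDataset(csv_file)
    elif model_type == "brand_new":
        # Option 2: a type that needs its own dataset class gets a new branch.
        return ToyNewDataset(csv_file)
    else:
        raise NotImplementedError(f"unknown model_type: {model_type!r}")
```

Keeping an explicit else branch that raises makes a misspelled or unsupported model type fail early instead of silently falling through.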

lightning_pose.data.utils Module

Dataset/data module utilities.