lightning_pose.data

lightning_pose.data.augmentations Module

Functions to build augmentation pipeline.

Functions

imgaug_transform(params_dict)

Create simple and flexible data transform pipeline that augments images and keypoints.

lightning_pose.data.cameras Module

Camera geometry utilities for multi-view 2D-to-3D projection and triangulation.

Functions

project_camera_pairs_to_3d(points, ...)

Project 2D keypoints from each pair of cameras into 3D world space.

project_3d_to_2d(points_3d, intrinsics, ...)

Project 3D keypoints to 2D using camera parameters.
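
As an illustration of what such a projection involves, the sketch below applies a plain pinhole camera model to a single point. It is not the library's implementation: the helper name project_point, the separate rotation and translation arguments, and the omission of skew and lens distortion are all assumptions made for this example.

```python
def project_point(point_3d, intrinsics, rotation, translation):
    """Project one 3D world point to pixel coordinates (pinhole, no distortion).

    Hypothetical helper for illustration; not part of lightning_pose.
    """
    # World -> camera coordinates: X_cam = R @ X_world + t
    cam = [
        sum(rotation[i][j] * point_3d[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    # Perspective division onto the normalized image plane
    x_n, y_n = cam[0] / cam[2], cam[1] / cam[2]
    # Apply focal lengths and principal point from the intrinsic matrix
    fx, fy = intrinsics[0][0], intrinsics[1][1]
    cx, cy = intrinsics[0][2], intrinsics[1][2]
    return (fx * x_n + cx, fy * y_n + cy)


K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # camera aligned with world
t = [0.0, 0.0, 0.0]
u, v = project_point([0.1, -0.2, 2.0], K, R, t)
```

With the identity pose above, the point lands 25 pixels right of and 50 pixels above the principal point.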

Classes

CameraGroup

Inherits from the Anipose camera group and adds a non-jitted triangulation method for use in dataloaders.

lightning_pose.data.dali Module

Data pipelines built on efficient video reading with the NVIDIA DALI package.

Architecture overview

PrepareDALI is the entry point. Its __init__ validates inputs and pre-computes pipeline arguments for all four combinations of stage ("train", "predict") and model type ("base", "context"). Calling the instance (__call__) builds the DALI pipe for the requested combination and returns a ready-to-iterate LitDaliWrapper.

LitDaliWrapper extends DALIGenericIterator and converts raw DALI output into typed UnlabeledBatchDict or MultiviewUnlabeledBatchDict instances on every __next__.

Two prediction modes

Standard mode (no bbox_df):

DALI resizes frames to resize_dims on the GPU. The bbox field of each returned batch covers the full frame (x=0, y=0, h=H, w=W).

Bbox-crop mode (bbox_df supplied to PrepareDALI):

DALI delivers full-resolution frames (resize_dims=None in the predict pipe) so that LitDaliWrapper._apply_bbox_crop can crop each frame to its per-frame bounding box and resize to the original resize_dims using torch.nn.functional.interpolate. The bbox field of each batch contains the actual crop coordinates, so downstream code can remap predictions back to the original coordinate space.
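
The remapping step mentioned above can be sketched as follows. This is a minimal illustration, assuming the bbox is ordered (x, y, h, w) as in the standard-mode description and that predictions are in pixel units of the resized crop; the function name remap_to_frame is hypothetical and is not the library's API.

```python
def remap_to_frame(keypoints, bbox, resize_dims):
    """Map (x, y) keypoints from resized-crop pixels to original-frame pixels.

    Hypothetical helper for illustration.
    bbox: (x, y, h, w), top-left corner plus crop height and width (assumed order).
    resize_dims: (height, width) of the model input.
    """
    x0, y0, h, w = bbox
    model_h, model_w = resize_dims
    # Undo the resize (scale back to crop size), then undo the crop (add offset)
    return [
        (x0 + kx * w / model_w, y0 + ky * h / model_h)
        for kx, ky in keypoints
    ]


preds = [(128.0, 64.0), (0.0, 0.0)]
remapped = remap_to_frame(preds, bbox=(100, 50, 200, 400), resize_dims=(256, 256))
```

In standard mode the same formula applies with x=0, y=0, h=H, w=W, which reduces it to a plain rescale from model input size to frame size.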

Functions

video_pipe(filenames[, resize_dims, ...])

Generic video reader pipeline that loads videos, resizes, augments, and normalizes.

Classes

LitDaliWrapper

Typed wrapper around a DALI pipeline iterator for Lightning Pose models.

PrepareDALI

Factory for DALI video-reading pipelines used during training and prediction.

lightning_pose.data.datamodules Module

Data modules split a dataset into train, val, and test data loaders.

Classes

BaseDataModule

Splits a labeled dataset into train, val, and test data loaders.

UnlabeledDataModule

Data module that contains labeled and unlabeled data loaders.

lightning_pose.data.datasets Module

Dataset objects store images, labels, and functions for manipulation.

Classes

BaseTrackingDataset

Base dataset that contains images and keypoints as (x, y) pairs.

HeatmapDataset

Heatmap dataset that contains the images and keypoints in 2D arrays.

MultiviewHeatmapDataset

Heatmap dataset that contains the images and keypoints in 2D arrays from all the cameras.

lightning_pose.data.datatypes Module

Classes to streamline data typechecking.

Classes

BaseLabeledExampleDict

Return type when calling __getitem__() on BaseTrackingDataset.

HeatmapLabeledExampleDict

Return type when calling __getitem__() on HeatmapDataset.

MultiviewLabeledExampleDict

Return type when calling __getitem__() on MultiviewDataset.

MultiviewHeatmapLabeledExampleDict

Return type when calling __getitem__() on MultiviewHeatmapDataset.

BaseLabeledBatchDict

Batch type for base labeled data.

HeatmapLabeledBatchDict

Batch type for heatmap labeled data.

MultiviewLabeledBatchDict

Batch type for multiview labeled data.

MultiviewHeatmapLabeledBatchDict

Batch type for multiview heatmap labeled data.

UnlabeledBatchDict

Batch type for unlabeled data.

MultiviewUnlabeledBatchDict

Batch type for multiview unlabeled data.

SemiSupervisedBatchDict

Batch type for base labeled+unlabeled data.

SemiSupervisedHeatmapBatchDict

Batch type for heatmap labeled+unlabeled data.

SemiSupervisedDataLoaderDict

Return type when calling train/val/test_dataloader() on semi-supervised models.

lightning_pose.data.extractor Module

Helper class to extract labeled data from a data module.

Classes

DataExtractor

Helper class to extract all data from a data module.

lightning_pose.data.factory Module

Factory functions to build data pipeline components from config.

Functions

get_imgaug_transform(cfg)

Create simple and flexible data transform pipeline that augments images and keypoints.

get_dataset(cfg, data_dir, imgaug_transform)

Create a dataset that contains labeled data.

get_data_module(cfg, dataset[, video_dir])

Create a data module that splits a dataset into train/val/test iterators.

lightning_pose.data.utils Module

Dataset/data module utilities.

Functions

split_sizes_from_probabilities(total_number, ...)

Return the number of examples for train, val, and test given split probabilities.
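
One way such a split can be computed is sketched below. The rounding rule (floor for train and val, remainder to test) and the name sketch_split_sizes are assumptions for illustration; the library function may round or validate differently.

```python
def sketch_split_sizes(total_number, train_prob, val_prob):
    """Return [n_train, n_val, n_test] for a dataset of total_number examples.

    Hypothetical helper: train and val sizes are floored, and the test split
    receives whatever is left so the three sizes always sum to total_number.
    """
    if not 0.0 <= train_prob + val_prob <= 1.0:
        raise ValueError("train_prob + val_prob must lie in [0, 1]")
    n_train = int(total_number * train_prob)
    n_val = int(total_number * val_prob)
    n_test = total_number - n_train - n_val
    return [n_train, n_val, n_test]


sizes = sketch_split_sizes(105, train_prob=0.8, val_prob=0.1)
```

Assigning the remainder to one split avoids losing examples to rounding.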

clean_any_nans(data, dim)

Remove samples from a data array that contain nans.
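
The idea can be illustrated on nested lists. This sketch ignores the dim argument and tensor types used by the library function; drop_nan_samples is a hypothetical name.

```python
import math


def drop_nan_samples(samples):
    """Keep only samples (rows) in which no value is NaN. Hypothetical helper."""
    return [row for row in samples if not any(math.isnan(v) for v in row)]


data = [[1.0, 2.0], [float("nan"), 3.0], [4.0, 5.0]]
cleaned = drop_nan_samples(data)
```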

count_frames(video_file)

Simple function to count the number of frames in a video.

compute_num_train_frames(len_train_dataset)

Quickly compute number of training frames for a given dataset.

generate_heatmaps(keypoints, height, width, ...)

Generate 2D Gaussian heatmaps from mean and sigma.
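
A minimal sketch of a single heatmap, assuming an isotropic Gaussian centered on the keypoint with peak value 1. The library function operates on batched tensors and may normalize differently; gaussian_heatmap is a hypothetical name.

```python
import math


def gaussian_heatmap(x, y, height, width, sigma=1.0):
    """Return a height x width grid with an unnormalized Gaussian peaked at (x, y).

    Hypothetical helper: rows index y, columns index x.
    """
    return [
        [
            math.exp(-((col - x) ** 2 + (row - y) ** 2) / (2 * sigma ** 2))
            for col in range(width)
        ]
        for row in range(height)
    ]


heatmap = gaussian_heatmap(x=3, y=1, height=4, width=6, sigma=1.0)
peak = max(max(row) for row in heatmap)
```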

evaluate_heatmaps_at_location(heatmaps, locs)

Evaluate 4D heatmaps using a 3D location tensor (last dim is x, y coords).
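
The indexing involved can be sketched with nested lists. The shapes used here, heatmaps as (batch, keypoints, height, width) and locations as (batch, keypoints, 2), are an assumption based on the description above, as are integer-valued locations; values_at_locations is a hypothetical name.

```python
def values_at_locations(heatmaps, locs):
    """Look up one heatmap value per (batch, keypoint) at its (x, y) location.

    Hypothetical helper: x selects the column and y selects the row.
    """
    return [
        [heatmaps[b][k][int(y)][int(x)] for k, (x, y) in enumerate(locs[b])]
        for b in range(len(heatmaps))
    ]


# One batch element, two keypoints, each heatmap 2 rows x 3 columns
hm = [[
    [[0.0, 0.1, 0.2], [0.3, 0.4, 0.5]],
    [[1.0, 1.1, 1.2], [1.3, 1.4, 1.5]],
]]
locations = [[(2, 0), (1, 1)]]
values = values_at_locations(hm, locations)
```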

undo_affine_transform(keypoints, transform)

Undo an affine transform given a tensor of keypoints and the transform matrix.
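
A sketch of the underlying arithmetic, assuming the transform is a 2x3 matrix [A | b] that maps original points p to A p + b, so the inverse is A^-1 (p' - b). The matrix layout and the name undo_affine are assumptions for illustration, not the library's signature.

```python
def undo_affine(keypoints, transform):
    """Invert a 2x3 affine transform applied to a list of (x, y) keypoints.

    Hypothetical helper for illustration.
    """
    (a, b, tx), (c, d, ty) = transform
    det = a * d - b * c
    if det == 0:
        raise ValueError("transform is not invertible")
    restored = []
    for x, y in keypoints:
        # Remove the translation, then apply the inverse of the 2x2 linear part
        dx, dy = x - tx, y - ty
        restored.append(((d * dx - b * dy) / det, (-c * dx + a * dy) / det))
    return restored


# Forward transform: scale by 2, then shift by (10, -5); (3, 4) maps to (16, 3)
M = [[2.0, 0.0, 10.0], [0.0, 2.0, -5.0]]
original = undo_affine([(16.0, 3.0)], M)
```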

undo_affine_transform_batch(...[, is_multiview])

Potentially undo an affine transform given a tensor of keypoints and the transform matrix.

normalized_to_bbox(keypoints, bbox)

Transform keypoints from normalized coordinates to bbox coordinates.

convert_bbox_coords(batch_dict, ...[, in_place])

Transform keypoints from bbox coordinates to absolute frame coordinates.

convert_original_to_model_coords(batch_dict, ...)

Transform keypoints from original frame coordinates to model input coordinates.

original_to_model(keypoints, bbox, ...)

Convert keypoints from original image coordinates to model input coordinates.