Lightning Pose API

Train function

lightning_pose.train.train(cfg: DictConfig | ListConfig, model_dir: str | Path | None = None, skip_evaluation: bool = False) -> Model

Train a model using the configuration cfg, saving outputs to model_dir.

Parameters:
  • cfg – hydra config object.

  • model_dir – directory to save model outputs; defaults to cwd if unspecified.

  • skip_evaluation – if True, skip post-training evaluation.

Returns:

trained Model instance.

To train a model using config.yaml and output to outputs/doc_model:

import os
from lightning_pose.train import train
from omegaconf import OmegaConf

cfg = OmegaConf.load("config.yaml")
# model_dir is not passed, so outputs are saved to the current working directory
os.makedirs("outputs/doc_model", exist_ok=True)
os.chdir("outputs/doc_model")
train(cfg)

To override settings before training:

cfg = OmegaConf.load("config.yaml")
overrides = {
    "training": {
        "min_epochs": 5,
        "max_epochs": 5
    }
}
cfg = OmegaConf.merge(cfg, overrides)
train(cfg)
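
The merge keeps every key that the override does not mention. As a rough illustration only, the plain-dict sketch below mimics that recursive behaviour for nested mappings. `deep_merge` is a hypothetical helper written for this example, not part of OmegaConf or lightning-pose, and it ignores OmegaConf features such as interpolation. The config values are illustrative, not library defaults.

```python
from copy import deepcopy


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied recursively."""
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the override replaces the base value.
            merged[key] = deepcopy(value)
    return merged


# Illustrative values only (assumption, not taken from a real config.yaml).
base_cfg = {"training": {"min_epochs": 300, "max_epochs": 300, "train_batch_size": 16}}
overrides = {"training": {"min_epochs": 5, "max_epochs": 5}}

merged_cfg = deep_merge(base_cfg, overrides)
print(merged_cfg["training"])
```

Only the two epoch settings change; `train_batch_size` is carried over from the base config untouched.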

Training returns a Model object, which is described next.

Model class

The Model class provides an easy-to-use interface to a lightning-pose model. It supports running inference and accessing model metadata. The set of supported Model operations will expand as we continue development.

You create a model object using Model.from_dir:

from lightning_pose.api.model import Model

model = Model.from_dir("outputs/doc_model")

Then, to predict on new data:

model.predict_on_video_file("path/to/video.mp4")

or:

model.predict_on_label_csv("path/to/csv_file.csv")

To predict on a single numpy frame (no file I/O):

import numpy as np

frame = np.array(...)  # (H, W, 3) uint8 RGB
result = model.predict_frame(frame)
keypoints = result["keypoints"]   # (num_kp, 2) float32
confidence = result["confidence"] # (num_kp,) float32

API Reference

class lightning_pose.api.model.Model

High-level interface for inference with a trained lightning-pose model.

Load a saved model with Model.from_dir, then call prediction methods directly. Model weights are loaded lazily on the first prediction call.

model_dir

absolute path to the directory the model is stored in.

Type:

pathlib.Path

config

the model configuration as a ModelConfig object.

Type:

lightning_pose.api.model_config.ModelConfig

model

the underlying PyTorch model; None until the first prediction call.

Type:

lightning_pose.models.HeatmapTracker | lightning_pose.models.SemiSupervisedHeatmapTracker | lightning_pose.models.HeatmapTrackerMHCRNN | lightning_pose.models.SemiSupervisedHeatmapTrackerMHCRNN | lightning_pose.models.HeatmapTrackerMultiviewTransformer | lightning_pose.models.SemiSupervisedHeatmapTrackerMultiviewTransformer | lightning_pose.models.RegressionTracker | lightning_pose.models.SemiSupervisedRegressionTracker | None

Examples

>>> from lightning_pose.api import Model
>>> model = Model.from_dir("outputs/2024-01-01/12-00-00")

Single-frame inference (no file I/O):

>>> import numpy as np
>>> frame = np.zeros((256, 256, 3), dtype=np.uint8)
>>> result = model.predict_frame(frame)
>>> result["keypoints"].shape    # (num_keypoints, 2)
>>> result["confidence"].shape   # (num_keypoints,)

Predict on a video file:

>>> pred_result = model.predict_on_video_file("path/to/video.mp4")
>>> pred_result.predictions   # pd.DataFrame with MultiIndex columns
>>> pred_result.metrics       # ComputeMetricsSingleResult or None

Predict on a labeled CSV (also computes pixel error):

>>> pred_result = model.predict_on_label_csv("path/to/CollectedData.csv")

property cfg: DictConfig | ListConfig

The model configuration as an omegaconf.DictConfig.

config: ModelConfig

The model configuration stored as a ModelConfig object. ModelConfig wraps the omegaconf.DictConfig and provides util functions over it.

cropped_csv_file_path(csv_file_path: str | Path) -> Path

Return the path where a cropzoom-adjusted CSV file will be saved.

Parameters:

csv_file_path – path to the original labeled CSV file.

Returns:

path of the form {model_dir}/image_preds/{csv_name}/cropped_{csv_name}.
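
The path pattern above can be reproduced with pathlib, for example to check whether a cropped CSV already exists. This is a minimal sketch, not the library's implementation: `expected_cropped_csv_path` is a hypothetical helper, and it assumes {csv_name} means the file name including its extension.

```python
from pathlib import Path


def expected_cropped_csv_path(model_dir, csv_file_path) -> Path:
    """Build {model_dir}/image_preds/{csv_name}/cropped_{csv_name}."""
    # Assumption: csv_name is the file name with its extension.
    csv_name = Path(csv_file_path).name
    return Path(model_dir) / "image_preds" / csv_name / f"cropped_{csv_name}"


path = expected_cropped_csv_path("outputs/doc_model", "data/CollectedData.csv")
print(path.as_posix())
```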

cropped_data_dir() -> Path

Return the directory where cropzoom-cropped images are saved.

cropped_videos_dir() -> Path

Return the directory where cropzoom-cropped videos are saved.

static from_dir(model_dir: str | Path) -> Model

Create a Model instance for a model stored at model_dir.

Parameters:

model_dir – path to a model output directory containing config.yaml and a .ckpt checkpoint file.

Returns:

Model ready for inference. Weights are loaded lazily on the first prediction call.

Examples

>>> from lightning_pose.api import Model
>>> model = Model.from_dir("outputs/2024-01-01/12-00-00")
>>> model.config.is_multi_view()
False

image_preds_dir() -> Path

Return the directory where image/CSV predictions are saved.

labeled_videos_dir() -> Path

Return the directory where prediction-annotated videos are saved.

model_dir: Path

Directory the model is stored in.

predict_frame(frame_rgb: ndarray, bbox: tuple[int, int, int, int] | None = None) -> dict[str, ndarray]

Single-frame inference. No file I/O, no DALI.

Preprocessing uses cv2 (not DALI). Results will differ numerically from predict_on_video_file due to interpolation and normalization differences. Do not mix results from the two paths in quantitative analysis.

For MHCRNN (context) models, pass a (T, H, W, 3) array where T is the temporal context length (typically 5). Passing a single frame to a context model raises ValueError; use predict_on_video_file for proper temporal inference.
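
One way to assemble such a window is to pick T frame indices around the frame of interest. The sketch below makes two assumptions that are not stated in this reference: the window is centred on the target frame, and indices beyond the ends of the video are clamped so that the first or last frame is repeated. `context_indices` is a hypothetical helper.

```python
def context_indices(center: int, n_frames: int, context_len: int = 5) -> list[int]:
    """Indices of a window of `context_len` frames centred on `center`.

    Indices outside [0, n_frames - 1] are clamped, which repeats the first
    or last frame at the edges of the video (an assumption of this sketch).
    """
    if context_len < 1 or context_len % 2 == 0:
        raise ValueError("context_len must be a positive odd integer")
    if not 0 <= center < n_frames:
        raise ValueError("center must be a valid frame index")
    half = context_len // 2
    return [min(max(i, 0), n_frames - 1) for i in range(center - half, center + half + 1)]


print(context_indices(0, 100))   # window at the start of the video
print(context_indices(50, 100))  # window in the middle of the video
```

With frames held in a sequence of (H, W, 3) arrays, `np.stack([frames[i] for i in context_indices(...)])` would then yield the (T, H, W, 3) input.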

The first call triggers model loading and CUDA initialization, which may take several seconds. Subsequent calls are fast (~5-50ms depending on backbone). For latency-sensitive loops, call once on a dummy frame before entering the loop.
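
The warm-up pattern can be sketched as follows. `StubModel` is a hypothetical stand-in that only simulates lazy loading with a short sleep; with a real Model the same two calls would be made on `model.predict_frame` with a dummy uint8 frame.

```python
import time


class StubModel:
    """Hypothetical stand-in for Model that loads lazily (illustration only)."""

    def __init__(self) -> None:
        self.loaded = False

    def predict_frame(self, frame):
        if not self.loaded:
            time.sleep(0.05)  # simulate the one-off weight loading cost
            self.loaded = True
        return {"keypoints": [], "confidence": []}


def timed_predict(model, frame) -> float:
    """Run one prediction and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    model.predict_frame(frame)
    return time.perf_counter() - start


stub = StubModel()
dummy_frame = None  # with a real Model: np.zeros((H, W, 3), dtype=np.uint8)

warmup_latency = timed_predict(stub, dummy_frame)  # pays the loading cost once
steady_latency = timed_predict(stub, dummy_frame)  # cost seen inside the loop
print(f"warm-up: {warmup_latency:.3f}s, steady state: {steady_latency:.6f}s")
```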

Parameters:
  • frame_rgb – (H, W, 3) uint8 RGB array for standard models, or (T, H, W, 3) uint8 RGB array for context (MHCRNN) models.

  • bbox – Optional (x, y, w, h) crop region. Note: this is (x, y, width, height), NOT (x1, y1, x2, y2). If provided, crops first, then remaps keypoints back to original coordinates.

Returns:

dict with keys:
  • "keypoints": (num_kp, 2) float32 array of (x, y) in original frame coords.

  • "confidence": (num_kp,) float32 array in [0, 1], the likelihood/confidence per keypoint. For regression models, confidence is always 1.0.

Return type:

dict[str, ndarray]

Raises:

ValueError – If frame_rgb has wrong shape/dtype, bbox has non-positive dimensions, bbox produces an empty crop, or a context model receives single-frame input.

Examples

>>> import numpy as np
>>> frame = np.zeros((256, 256, 3), dtype=np.uint8)
>>> result = model.predict_frame(frame)
>>> result["keypoints"].shape    # (num_keypoints, 2)
>>> result["confidence"].shape   # (num_keypoints,)

With a bounding-box crop (x, y, width, height):

>>> result = model.predict_frame(frame, bbox=(100, 50, 128, 128))
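
Detectors often report boxes as corner pairs, so a small conversion step helps avoid mixing up the two layouts. The helper below is a hypothetical sketch, not part of lightning-pose; it converts (x1, y1, x2, y2) corners to the (x, y, width, height) layout that the bbox parameter expects and rejects non-positive sizes early.

```python
def corners_to_xywh(x1: int, y1: int, x2: int, y2: int) -> tuple[int, int, int, int]:
    """Convert (x1, y1, x2, y2) corners to (x, y, width, height)."""
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError("bbox must have positive width and height")
    return (x1, y1, width, height)


bbox = corners_to_xywh(100, 50, 228, 178)
print(bbox)
```

The resulting tuple can be passed directly as `bbox=` to predict_frame.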

predict_on_label_csv(csv_file: str | Path, data_dir: str | Path | None = None, compute_metrics: bool = True, add_train_val_test_set: bool = False, bbox_file: str | Path | None = None) -> PredictionResult

Predicts on a labeled dataset and computes error/loss metrics if applicable.

Parameters:
  • csv_file – path to the CSV file of images and keypoint locations.

  • data_dir – root path for relative image paths in the CSV file. Defaults to the data_dir used during training.

  • compute_metrics – whether to compute pixel error and loss metrics on predictions.

  • add_train_val_test_set – set to True when predicting on the training dataset to add a set column to the output.

  • bbox_file – optional path to a bbox CSV produced by litpose create_bbox (or any compatible source). When provided, each frame is cropped to its bounding box before being passed to the model, and predictions are returned in the original (un-cropped) coordinate space.

Returns:

A PredictionResult object containing the predictions and metrics.

Return type:

PredictionResult

Examples

>>> result = model.predict_on_label_csv("path/to/CollectedData.csv")
>>> result.predictions              # pd.DataFrame with MultiIndex columns
>>> result.metrics.pixel_error_df   # pd.DataFrame of pixel errors

Skip metric computation for faster inference:

>>> result = model.predict_on_label_csv(
...     "path/to/CollectedData.csv",
...     compute_metrics=False,
... )

predict_on_label_csv_multiview(csv_file_per_view: list[str] | list[Path], bbox_file_per_view: list[str] | list[Path] | None = None, camera_params_file: str | Path | None = None, data_dir: str | Path | None = None, compute_metrics: bool = True, add_train_val_test_set: bool = False) -> MultiviewPredictionResult

Version of predict_on_label_csv that gives models access to all views of each frame.

Parameters:

csv_file_per_view – a list of csv files each from a different view of the same session; order must match view_names in the config file.

See predict_on_label_csv docstring for other arguments.

predict_on_video_file(video_file: str | Path, output_dir: str | Path | None = 'unspecified', compute_metrics: bool = True, generate_labeled_video: bool = False, progress_file: Path | None = None, bbox_file: str | Path | None = None) -> PredictionResult

Predicts on a video file and computes unsupervised loss metrics if applicable.

Parameters:
  • video_file (str | Path) – Path to the video file.

  • output_dir (str | Path, optional) – The directory to save outputs to. Defaults to {model_dir}/video_preds. If set to None, outputs are not saved.

  • compute_metrics (bool, optional) – Whether to compute pixel error and loss metrics on predictions.

  • generate_labeled_video (bool, optional) – Whether to save a labeled video. Defaults to False.

  • progress_file (Path, optional) – Path to a file to save progress information for the App. Defaults to None.

  • bbox_file (str | Path, optional) – Path to a per-frame bbox CSV (columns x, y, h, w; one row per frame). When provided, each frame is cropped to its bounding box before being passed to the model, and predictions are returned in the original coordinate space. Single-view only. Defaults to None.

Returns:

A PredictionResult object containing the predictions and metrics.

Return type:

PredictionResult

Examples

>>> result = model.predict_on_video_file("path/to/video.mp4")
>>> result.predictions   # pd.DataFrame, one row per frame

Save a keypoint-annotated video alongside the predictions CSV:

>>> result = model.predict_on_video_file(
...     "path/to/video.mp4",
...     generate_labeled_video=True,
... )
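
If boxes come from a source other than litpose create_bbox, a per-frame bbox file could be written with the standard csv module. The sketch below assumes a plain header row with the columns x, y, h, w in that order and no index column. The exact layout expected by bbox_file is not specified here, so compare against a file produced by litpose create_bbox before relying on it. `write_bbox_csv` is a hypothetical helper.

```python
import csv
import io


def write_bbox_csv(handle, bboxes_xywh) -> None:
    """Write one (x, y, w, h) box per frame using the column order x, y, h, w."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["x", "y", "h", "w"])
    for x, y, w, h in bboxes_xywh:
        # Note the column order: height is written before width.
        writer.writerow([x, y, h, w])


# An in-memory buffer keeps the example side-effect free; pass an open file
# handle instead to write to disk.
buffer = io.StringIO()
write_bbox_csv(buffer, [(100, 50, 128, 96), (102, 51, 128, 96)])
print(buffer.getvalue())
```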

predict_on_video_file_multiview(video_file_per_view: list[str] | list[Path], output_dir: str | Path | None = 'unspecified', compute_metrics: bool = True, generate_labeled_video: bool = False, progress_file: Path | None = None) -> MultiviewPredictionResult

Version of predict_on_video_file that accesses multiple camera views of each frame.

Parameters:
  • video_file_per_view – a list of video files each from a different view of the same session; number of files must match view_names in the config; order does not matter as files are matched to views by filename.

  • output_dir – directory to save outputs to; defaults to {model_dir}/video_preds; set to None to skip saving.

  • compute_metrics – whether to compute pixel error and loss metrics on predictions.

  • generate_labeled_video – whether to save a labeled video.

  • progress_file – path to a file to save progress information for the App.

Returns:

object containing the predictions and metrics for each view.
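
Because files are matched to views by filename, it can be useful to check the pairing before starting a long inference run. The sketch below assumes that a file belongs to a view when the view name appears somewhere in its file name. That matching rule is an assumption of this example, not a statement of how lightning-pose resolves views, and `match_files_to_views` is a hypothetical helper.

```python
from pathlib import Path


def match_files_to_views(video_files, view_names) -> dict[str, Path]:
    """Pair each view name with the single file whose name contains it."""
    matched = {}
    for view in view_names:
        hits = [Path(f) for f in video_files if view in Path(f).name]
        if len(hits) != 1:
            raise ValueError(
                f"expected exactly one file for view {view!r}, found {len(hits)}"
            )
        matched[view] = hits[0]
    return matched


# Hypothetical file and view names, for illustration only.
files = ["session0/mouse_top.mp4", "session0/mouse_side.mp4"]
pairs = match_files_to_views(files, ["side", "top"])
print({view: path.name for view, path in pairs.items()})
```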

video_preds_dir() -> Path

Return the directory where video predictions are saved.

Return types

class lightning_pose.data.datatypes.PredictionResult
metrics: ComputeMetricsSingleResult | None = None
predictions: DataFrame
to_dict() -> dict[str, Any]

Return predictions and metrics as a flat dict of named numpy arrays.

All arrays have shape (n_frames, n_keypoints) and share the same row order. Metric arrays are None when the metric was not computed.

Returns:

dict with keys:
  • keypoint_names: list of keypoint name strings.

  • index: list of frame identifiers (file paths or integer indices).

  • x: float array of predicted x coordinates.

  • y: float array of predicted y coordinates.

  • confidence: float array of per-keypoint likelihood in [0, 1].

  • pixel_error: float array or None.

  • temporal_norm: float array or None.

  • pca_singleview_error: float array or None.

  • pca_multiview_error: float array or None.

Return type:

dict[str, Any]
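
Since every array shares the same (n_frames, n_keypoints) layout, the dict is convenient for simple per-keypoint checks. The sketch below lists frame/keypoint pairs whose confidence falls under a threshold. `low_confidence_keypoints` and the 0.9 threshold are hypothetical, and the input is a hand-built stand-in with nested lists in place of numpy arrays; the loop works the same on either.

```python
def low_confidence_keypoints(result_dict: dict, threshold: float = 0.9) -> list[tuple]:
    """Return (frame_id, keypoint_name, confidence) for values below threshold."""
    flagged = []
    rows = zip(result_dict["index"], result_dict["confidence"])
    for frame_id, conf_row in rows:
        for name, conf in zip(result_dict["keypoint_names"], conf_row):
            if conf < threshold:
                flagged.append((frame_id, name, float(conf)))
    return flagged


# Hand-built stand-in; real values would come from PredictionResult.to_dict().
example = {
    "keypoint_names": ["nose", "tail"],
    "index": [0, 1],
    "confidence": [[0.99, 0.42], [0.95, 0.97]],
}
print(low_confidence_keypoints(example))
```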

class lightning_pose.data.datatypes.MultiviewPredictionResult
metrics: dict[str, ComputeMetricsSingleResult] | None = None
predictions: dict[str, DataFrame]
to_dict() -> dict[str, dict[str, Any]]

Return predictions and metrics for each view as a flat dict of named numpy arrays.

Wraps PredictionResult.to_dict() for each view.

Returns:

dict keyed by view name, where each value is the to_dict() output for that view.
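
The nested layout lends itself to per-view summaries. As a minimal sketch, the hypothetical helper below averages the confidence values for each view. The view names and numbers are made up, and nested lists stand in for the numpy arrays that to_dict() returns.

```python
from statistics import fmean


def mean_confidence_per_view(multiview_dict: dict) -> dict[str, float]:
    """Average the confidence over all frames and keypoints, per view."""
    return {
        view: fmean(float(c) for row in view_dict["confidence"] for c in row)
        for view, view_dict in multiview_dict.items()
    }


# Hand-built stand-in for MultiviewPredictionResult.to_dict() output.
example = {
    "top": {"confidence": [[1.0, 0.5], [0.5, 1.0]]},
    "side": {"confidence": [[0.25, 0.25], [0.75, 0.75]]},
}
print(mean_confidence_per_view(example))
```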

class lightning_pose.data.datatypes.ComputeMetricsSingleResult
pca_mv_df: DataFrame | None = None
pca_sv_df: DataFrame | None = None
pixel_error_df: DataFrame | None = None
temporal_norm_df: DataFrame | None = None
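
Every field defaults to None, so callers need to check which metrics were actually computed. The sketch below assumes ComputeMetricsSingleResult is a dataclass, so that `dataclasses.fields` applies. `MetricsStandIn` is a hypothetical stand-in that mirrors the four fields listed above, and `computed_metrics` is a hypothetical helper.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class MetricsStandIn:
    """Hypothetical stand-in mirroring the four optional fields."""

    pca_mv_df: Optional[Any] = None
    pca_sv_df: Optional[Any] = None
    pixel_error_df: Optional[Any] = None
    temporal_norm_df: Optional[Any] = None


def computed_metrics(metrics) -> list[str]:
    """Names of the dataclass fields on `metrics` that are not None."""
    if metrics is None:
        return []
    return [f.name for f in fields(metrics) if getattr(metrics, f.name) is not None]


metrics = MetricsStandIn(pixel_error_df="placeholder for a DataFrame")
print(computed_metrics(metrics))
```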
