Lightning Pose 3D
This repo and the Lightning Pose App support multi-camera projects with an arbitrary number of cameras (tested with up to six). Before starting, cameras must be synchronized across views, and the resulting video files for a given session must each contain the same number of frames.
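The equal-frame-count requirement is easy to check before creating a project. The sketch below is a minimal illustration, not part of Lightning Pose: `check_frame_counts` is a hypothetical helper, and the camera names and counts are made up. It assumes you have already obtained a frame count per view, for example from OpenCV's `cv2.VideoCapture(path).get(cv2.CAP_PROP_FRAME_COUNT)`, which reads container metadata and may be approximate for some codecs.

```python
from collections import Counter


def check_frame_counts(frame_counts: dict[str, int]) -> None:
    """Raise ValueError if the views of one session disagree on frame count.

    `frame_counts` maps a view (camera) name to the number of frames in that
    view's video. How the counts are obtained is left to the caller.
    """
    if not frame_counts:
        raise ValueError("no videos given")
    # Treat the most common count as the expected one.
    expected, _ = Counter(frame_counts.values()).most_common(1)[0]
    mismatched = {view: n for view, n in frame_counts.items() if n != expected}
    if mismatched:
        raise ValueError(
            f"expected {expected} frames in every view, but got {mismatched}"
        )


# Hypothetical counts for a three-camera session; passes silently.
check_frame_counts({"Cam-A": 18000, "Cam-B": 18000, "Cam-C": 18000})
```

A mismatch usually points to dropped frames or a recording that started late in one view, which should be fixed at the synchronization stage rather than by trimming videos afterwards.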
Camera setup
We recommend using at least three cameras to maximize the number of views in which each keypoint is unoccluded. Cameras should be positioned at roughly orthogonal angles to one another so that each view provides complementary information.
Camera calibration (optional)
Camera calibration determines the intrinsic parameters of each camera (focal length, principal point, distortion coefficients) and the extrinsic parameters that describe how the cameras are positioned and oriented relative to each other. Together, these parameters relate 2D pixel coordinates in every view to a shared 3D world coordinate system, so that corresponding 2D points from two or more views can be triangulated into a single 3D point.
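To make the roles of the two parameter sets concrete, here is a minimal sketch of the forward direction of that mapping: projecting a 3D world point into one camera with an ideal pinhole model. It is an illustration only. `project_point` is a hypothetical helper, the camera values are invented, and lens distortion is deliberately ignored, so it does not reproduce what Anipose or Lightning Pose compute internally.

```python
def project_point(point_3d, rotation, translation, focal, principal):
    """Project a 3D world point into one camera with an ideal pinhole model.

    rotation: 3x3 nested list (world -> camera), translation: length-3 list,
    focal: (fx, fy) in pixels, principal: (cx, cy) in pixels.
    Lens distortion is ignored in this sketch.
    """
    # Extrinsics: move the point from world coordinates into the camera frame.
    x, y, z = (
        sum(r * p for r, p in zip(row, point_3d)) + t
        for row, t in zip(rotation, translation)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsics: perspective divide, then scale and shift into pixel units.
    u = focal[0] * x / z + principal[0]
    v = focal[1] * y / z + principal[1]
    return u, v


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical camera: 1000 px focal length, principal point at (320, 240),
# placed so that the world origin lies 2 units in front of the lens.
u, v = project_point(
    [0.1, -0.05, 0.0], identity, [0.0, 0.0, 2.0], (1000.0, 1000.0), (320.0, 240.0)
)
print(u, v)
```

Going the other way, from pixels to 3D, requires the same point in at least two calibrated views, because a single pixel only constrains the point to a ray.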
We recommend using the Anipose package for calibration. If you use a different calibration tool, you will need to convert your files into the expected format.
How Lightning Pose uses calibration:
3D data augmentation: calibration parameters allow geometrically consistent augmentation across views during training (see 3D augmentations and loss for details).
3D reprojection loss: calibration enables a training loss that penalizes geometrically inconsistent 2D predictions across views.
Note
Camera calibration is not required to train a multi-view Lightning Pose model. However, calibration is required to obtain 3D coordinates unified across cameras (see 3D inference below).
Data organization
Using the App
Create a multi-view project by following the Create your first project guide. The App will store your data in the correct format automatically.
Without the App (or converting from another format)
See the multi-view directory structure reference for the expected layout.
Important for all users
Calibration files must be saved manually in the correct location, regardless of whether you use the App. See the calibration file format reference for the required location and format.
Data annotation
The App provides a multi-view annotation tool that lets you label a keypoint in two views and then uses the calibration information to automatically project those labels into the remaining views. In general, we recommend keeping the automatically projected label in each view even when the body part is occluded; doing so helps the 3D data augmentation and reprojection loss learn the geometric structure of the scene.
Multi-view annotation is time-consuming even with this assistance. We recommend the following workflow:
Label approximately 100 frames across as many individuals as possible.
Train an initial model.
Run inference on videos from new individuals (preferred) or new sessions from the same individuals to surface difficult frames.
Use the Viewer tab to identify those difficult frames and add them to your labeled set.
In general, labeling a smaller number of frames from a larger number of individuals leads to better generalization. For example, if your labeling budget is 200 frames, labeling 20 frames from 10 separate individuals is preferable to labeling 200 frames from a single individual.
Model training
Model training in the App is straightforward; see the Create your first project guide for a walkthrough.
For training via the CLI, see:
Training and inference (multi-view) — general training procedure for multi-view setups.
Patch masking and 3D loss — multi-view specific training features including patch masking and the 3D reprojection loss.
Model inference
Inference in the App follows the same workflow as for single-view projects.
For inference via the CLI, see:
Training and inference (multi-view) — covers the multi-view inference procedure and expected file layout.
3D inference
Note
Camera calibration information is required for 3D inference.
Once per-view 2D predictions are available, 3D coordinates can be reconstructed across cameras. We recommend the Ensemble Kalman Smoother (EKS) tool for this step (paper). EKS can operate on predictions from a single model or from an ensemble of models; ensembling improves accuracy and provides better-calibrated uncertainty estimates than the likelihood outputs of any single network.
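As a rough intuition for why an ensemble carries uncertainty information, the sketch below computes the per-frame mean and variance of one coordinate across several models. This is a simplified illustration and not the EKS algorithm, which additionally smooths over time. `ensemble_stats` is a hypothetical helper and the numbers are invented.

```python
from statistics import fmean, pvariance


def ensemble_stats(predictions):
    """Per-frame ensemble mean and variance for one coordinate of one keypoint.

    predictions: list of per-model sequences of equal length
    (one value per frame).
    """
    n_frames = len(predictions[0])
    if any(len(p) != n_frames for p in predictions):
        raise ValueError("all models must predict the same number of frames")
    means, variances = [], []
    # zip(*predictions) yields, for each frame, the values from every model.
    for frame_values in zip(*predictions):
        means.append(fmean(frame_values))
        # Large variance: the models disagree, so the estimate is less certain.
        variances.append(pvariance(frame_values))
    return means, variances


# Hypothetical x-coordinates (pixels) from three seeds over four frames.
# The models disagree on the last frame, e.g. because of an occlusion.
seeds = [
    [100.0, 101.0, 102.0, 103.0],
    [100.0, 101.0, 102.0, 110.0],
    [100.0, 101.0, 102.0, 90.0],
]
means, variances = ensemble_stats(seeds)
```

In this toy example the variance is zero wherever the seeds agree and large on the final frame, which is the kind of signal a smoother can use to down-weight unreliable observations.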
Installation
git clone https://github.com/paninski-lab/eks
cd eks
pip install -e .
Alternatively, install from PyPI (no bundled example data):
pip install ensemble-kalman-smoother
Workflow
The recommended workflow is:
Train several Lightning Pose models with different random seeds (3+ recommended).
Run litpose predict with each model to produce per-view CSV files.
Organise the CSVs into a directory following the layout described below.
Run EKS to produce smoothed, ensembled predictions (and optionally 3D coordinates).
Input file layout
EKS expects Lightning Pose / DLC-format CSVs (three-row header: scorer, bodyparts, coords). For multi-camera setups with one CSV per view per seed, place all files in a single directory and include the camera name as a substring of each filename:
input_dir/
session_Cam-A_rng=0.csv
session_Cam-A_rng=1.csv
session_Cam-A_rng=2.csv
session_Cam-B_rng=0.csv
session_Cam-B_rng=1.csv
session_Cam-B_rng=2.csv
calibration.toml # Anipose-format calibration file
Note
Camera names must appear as substrings of the filenames, and no camera name may be a substring of another camera name.
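The naming rules above can be checked before running EKS. The following sketch is an assumption-laden illustration rather than EKS's own matching logic: `group_files_by_camera` is a hypothetical helper that groups prediction files by camera name and rejects ambiguous names.

```python
def group_files_by_camera(filenames, camera_names):
    """Map each camera name to the filenames that contain it.

    Raises ValueError if one camera name is a substring of another, or if a
    file does not match exactly one camera.
    """
    for a in camera_names:
        for b in camera_names:
            if a != b and a in b:
                raise ValueError(f"camera name {a!r} is a substring of {b!r}")
    groups = {name: [] for name in camera_names}
    for filename in filenames:
        matches = [name for name in camera_names if name in filename]
        if len(matches) != 1:
            raise ValueError(f"{filename!r} matches cameras {matches}")
        groups[matches[0]].append(filename)
    return groups


# Prediction files from the example layout (CSV files only).
csv_files = [
    "session_Cam-A_rng=0.csv",
    "session_Cam-A_rng=1.csv",
    "session_Cam-B_rng=0.csv",
    "session_Cam-B_rng=1.csv",
]
groups = group_files_by_camera(csv_files, ["Cam-A", "Cam-B"])
```

Names such as `Cam-1` and `Cam-10` would be rejected by this check, since every file from `Cam-10` would also match `Cam-1`.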
Running EKS
For calibrated multi-camera data (nonlinear EKS with 3D triangulation):
eks multicam \
--input-dir /path/to/input_dir \
--camera-names Cam-A Cam-B \
--calibration /path/to/input_dir/calibration.toml \
--make-plot
For multi-camera data without calibration (linear EKS, smoothing only):
eks multicam \
--input-dir /path/to/input_dir \
--camera-names Cam-A Cam-B \
--make-plot
See the EKS documentation for the full list of subcommands and options, including specialised workflows for mirrored multi-camera setups.