.. _lightning_pose_3d:

Lightning Pose 3D
==================

This repo and the Lightning Pose App support multi-camera projects with an arbitrary number of
cameras (tested with up to six).
Before starting, cameras must be synchronized across views, and the resulting video files for a
given session must each contain the same number of frames.

**On this page:**

* :ref:`Camera setup `
* :ref:`Camera calibration (optional) `
* :ref:`Data organization `
* :ref:`Data annotation `
* :ref:`Model training `
* :ref:`Model inference `
* :ref:`3D inference `

--------

.. _lp3d_camera_setup:

Camera setup
------------

We recommend using at least three cameras to maximize the number of views in which each keypoint
is unoccluded.
Cameras should be positioned at relatively orthogonal angles to one another so that each view
provides complementary information.

.. _lp3d_camera_calibration:

Camera calibration (optional)
------------------------------

Camera calibration determines the intrinsic parameters of each camera (focal length, principal
point, distortion coefficients) and the extrinsic parameters that describe how the cameras are
positioned and oriented relative to each other.
Together, these parameters make it possible to map 2D pixel coordinates in any view to a shared
3D world coordinate system.

We recommend using the `Anipose `_ package for calibration.
If you use a different calibration tool, you will need to convert your files into the
:doc:`expected format `.

**How Lightning Pose uses calibration:**

* **3D data augmentation:** calibration parameters allow geometrically consistent augmentation
  across views during training (see :ref:`3D augmentations and loss ` for details).
* **3D reprojection loss:** calibration enables a training loss that penalizes geometrically
  inconsistent 2D predictions across views.

.. note::

    Camera calibration is **not** required to train a multi-view Lightning Pose model.
    However, calibration **is** required to obtain 3D coordinates unified across cameras
    (see :ref:`3D inference ` below).

.. _lp3d_data_organization:

Data organization
-----------------

**Using the App**

Create a multi-view project by following the :doc:`Create your first project ` guide.
The App will store your data in the correct format automatically.

**Without the App (or converting from another format)**

See the :doc:`multi-view directory structure reference ` for the expected layout.

**Important for all users**

Calibration files must be saved manually in the correct location, regardless of whether you use
the App.
See the :doc:`calibration file format reference ` for the required location and format.

.. _lp3d_data_annotation:

Data annotation
---------------

The App provides a multi-view annotation tool that lets you label a keypoint in two views and
then uses the calibration information to automatically project those labels into the remaining
views.
In general, we recommend keeping the automatically projected label in each view even when the
body part is occluded; doing so helps the 3D data augmentation and reprojection loss learn the
geometric structure of the scene.

Multi-view annotation is time-consuming even with this assistance.
We recommend the following workflow:

1. Label approximately 100 frames across as many individuals as possible.
2. Train an initial model.
3. Run inference on videos from new individuals (preferred) or new sessions from the same
   individuals to surface difficult frames.
4. Use the Viewer tab to identify those difficult frames and add them to your labeled set.

In general, labeling a smaller number of frames from a larger number of individuals leads to
better generalization.
For example, if your labeling budget is 200 frames, labeling 20 frames from 10 separate
individuals is preferable to labeling 200 frames from a single individual.

.. _lp3d_model_training:

Model training
--------------

Model training in the App is straightforward; see the :doc:`Create your first project ` guide
for a walkthrough.

For training via the CLI, see:

* :doc:`Training and inference (multi-view) `: general training procedure for multi-view setups.
* :doc:`Patch masking and 3D loss `: multi-view specific training features including patch
  masking and the 3D reprojection loss.

.. _lp3d_model_inference:

Model inference
---------------

Inference in the App follows the same workflow as for single-view projects.

For inference via the CLI, see:

* :doc:`Training and inference (multi-view) `: covers the multi-view inference procedure and
  expected file layout.

.. _lp3d_3d_inference:

3D inference
------------

.. note::

    Camera calibration information is required for 3D inference.

Once per-view 2D predictions are available, 3D coordinates can be reconstructed across cameras.
We recommend the `Ensemble Kalman Smoother (EKS) `_ tool for this step (`paper `_).
EKS can operate on predictions from a single model or from an ensemble of models; ensembling
improves accuracy and provides better-calibrated uncertainty estimates than the likelihood
outputs of any single network.

**Installation**

.. code-block:: bash

    git clone https://github.com/paninski-lab/eks
    cd eks
    pip install -e .

Alternatively, install from PyPI (no bundled example data):

.. code-block:: bash

    pip install ensemble-kalman-smoother

**Workflow**

The recommended workflow is:

1. Train several Lightning Pose models with different random seeds (3+ recommended).
2. Run ``litpose predict`` with each model to produce per-view CSV files.
3. Organize the CSVs into a directory following the layout described below.
4. Run EKS to produce smoothed, ensembled predictions (and optionally 3D coordinates).

**Input file layout**

EKS expects Lightning Pose / DLC-format CSVs (three-row header: scorer, bodyparts, coords).
For multi-camera setups with one CSV per view per seed, place all files in a single directory
and include the camera name as a substring of each filename:

.. code-block:: text

    input_dir/
        session_Cam-A_rng=0.csv
        session_Cam-A_rng=1.csv
        session_Cam-A_rng=2.csv
        session_Cam-B_rng=0.csv
        session_Cam-B_rng=1.csv
        session_Cam-B_rng=2.csv
        calibration.toml          # Anipose-format calibration file

.. note::

    Camera names must appear as substrings of the filenames, and no camera name may be a
    substring of another camera name.

**Running EKS**

For calibrated multi-camera data (nonlinear EKS with 3D triangulation):

.. code-block:: bash

    eks multicam \
        --input-dir /path/to/input_dir \
        --camera-names Cam-A Cam-B \
        --calibration /path/to/input_dir/calibration.toml \
        --make-plot

For multi-camera data without calibration (linear EKS, smoothing only):

.. code-block:: bash

    eks multicam \
        --input-dir /path/to/input_dir \
        --camera-names Cam-A Cam-B \
        --make-plot

See the `EKS documentation `_ for the full list of subcommands and options, including
specialized workflows for mirrored multi-camera setups.
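
**Checking the input directory (optional)**

The filename rules described under *Input file layout* can be checked before running EKS.
The following is a minimal sketch using only the Python standard library.
The function names (``find_ambiguous_names``, ``group_csvs_by_camera``, ``check_input_dir``) are
hypothetical helpers written for this illustration; they are not part of EKS or Lightning Pose,
and they only mirror the substring rules stated above rather than EKS's own file-matching logic.

.. code-block:: python

    from pathlib import Path


    def find_ambiguous_names(camera_names):
        """Return (short, long) pairs where one camera name is a substring of another."""
        return [
            (a, b)
            for a in camera_names
            for b in camera_names
            if a != b and a in b
        ]


    def group_csvs_by_camera(filenames, camera_names):
        """Map each camera name to the sorted CSV filenames that contain it."""
        groups = {name: [] for name in camera_names}
        for filename in sorted(filenames):
            if not filename.endswith(".csv"):
                continue  # skip calibration.toml and any other non-CSV file
            for name in camera_names:
                if name in filename:
                    groups[name].append(filename)
        return groups


    def check_input_dir(input_dir, camera_names):
        """Raise ValueError if the directory breaks the naming rules; else return the groups."""
        clashes = find_ambiguous_names(camera_names)
        if clashes:
            raise ValueError(f"camera names overlap: {clashes}")
        filenames = [p.name for p in Path(input_dir).iterdir() if p.is_file()]
        groups = group_csvs_by_camera(filenames, camera_names)
        missing = [name for name, files in groups.items() if not files]
        if missing:
            raise ValueError(f"no CSV files found for cameras: {missing}")
        return groups

For example, ``check_input_dir("/path/to/input_dir", ["Cam-A", "Cam-B"])`` returns a dictionary
of CSV filenames per camera for the layout shown above, and raises ``ValueError`` if a camera
has no matching files or if the camera names overlap (for instance ``Cam`` and ``Cam-A``).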