HeatmapTracker
- class lightning_pose.models.HeatmapTracker
Bases: BaseSupervisedTracker
Base model that produces heatmaps of keypoints from images.
Methods Summary
forward(images): Forward pass through the network.
get_loss_inputs_labeled(batch_dict): Return predicted heatmaps and their softmaxes (estimated keypoints).
predict_step(batch_dict, batch_idx[, ...]): Predict heatmaps and keypoints for a batch of video frames.
Methods Documentation
- forward(images: TensorType['batch', 'channels': 3, 'image_height', 'image_width'] | TensorType['batch', 'views', 'channels': 3, 'image_height', 'image_width']) → TensorType['num_valid_outputs', 'num_keypoints', 'heatmap_height', 'heatmap_width']
Forward pass through the network.
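The relationship between the input image size and the heatmap size returned by forward can be sketched as below. This is a rough illustration only: the helper name `heatmap_size` is hypothetical, and the rule that each spatial dimension is divided by `2 ** downsample_factor` is an assumption that is not stated on this page and should be checked against the library source.

```python
def heatmap_size(image_height: int, image_width: int, downsample_factor: int = 2) -> tuple[int, int]:
    """Estimate (heatmap_height, heatmap_width) for a given image size.

    ASSUMPTION: each spatial dimension shrinks by 2 ** downsample_factor.
    This helper is hypothetical and not part of lightning_pose.
    """
    if downsample_factor not in (1, 2, 3):
        # The documented signature only allows 1, 2, or 3.
        raise ValueError("downsample_factor must be 1, 2, or 3")
    scale = 2 ** downsample_factor
    return image_height // scale, image_width // scale


# Example: a 256 x 384 frame with the default downsample_factor of 2.
h, w = heatmap_size(256, 384, downsample_factor=2)
```

Under this assumption, larger values of `downsample_factor` trade spatial resolution of the heatmap for lower memory use, which matches the parameter description on this page.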
- get_loss_inputs_labeled(batch_dict: HeatmapLabeledBatchDict | MultiviewHeatmapLabeledBatchDict) → dict
Return predicted heatmaps and their softmaxes (estimated keypoints).
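To illustrate how a heatmap can be turned into an estimated keypoint, here is a minimal, dependency-free sketch of a spatial soft-argmax: a softmax over all heatmap pixels followed by the expected (x, y) coordinate. It is a generic version of the technique, not the library's implementation; the function name `soft_argmax_2d` and the `temperature` parameter are hypothetical.

```python
import math


def soft_argmax_2d(heatmap: list[list[float]], temperature: float = 1.0) -> tuple[float, float]:
    """Return the expected (x, y) location of a 2D heatmap under a spatial softmax.

    Hypothetical helper for illustration; not part of lightning_pose.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(v for row in heatmap for v in row)
    # Here the logits are multiplied by `temperature` (an assumed convention).
    weights = [[math.exp((v - peak) * temperature) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Expected column index (x) and row index (y) under the softmax distribution.
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# A 3 x 5 heatmap with a single strong peak at column 2, row 1.
demo = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 10.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
x, y = soft_argmax_2d(demo)
```

Because the result is a weighted average rather than a hard argmax, the estimate can fall between pixels, which is what allows subpixel precision on a downsampled heatmap.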
- predict_step(batch_dict: HeatmapLabeledBatchDict | MultiviewHeatmapLabeledBatchDict | UnlabeledBatchDict | MultiviewUnlabeledBatchDict, batch_idx: int, return_heatmaps: bool | None = False) → Tuple[Tensor, Tensor] | Tuple[Tensor, Tensor, Tensor]
Predict heatmaps and keypoints for a batch of video frames.
Assuming a DALI video loader is passed in:
> trainer = Trainer(devices=8, accelerator="gpu")
> predictions = trainer.predict(model, data_loader)
- __init__(num_keypoints: int, num_targets: int | None = None, loss_factory: LossFactory | None = None, backbone: Literal['resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152', 'resnet50_contrastive', 'resnet50_animal_apose', 'resnet50_animal_ap10k', 'resnet50_human_jhmdb', 'resnet50_human_res_rle', 'resnet50_human_top_res', 'resnet50_human_hand', 'efficientnet_b0', 'efficientnet_b1', 'efficientnet_b2', 'vits_dino', 'vits_dinov2', 'vits_dinov3', 'vitb_dino', 'vitb_dinov2', 'vitb_dinov3', 'vitb_imagenet', 'vitb_sam'] = 'resnet50', downsample_factor: Literal[1, 2, 3] = 2, pretrained: bool = True, torch_seed: int = 123, optimizer: str = 'Adam', optimizer_params: DictConfig | dict | None = None, lr_scheduler: str = 'multisteplr', lr_scheduler_params: DictConfig | dict | None = None, **kwargs: Any) → None
Initialize a heatmap-based pose estimation model with conv or transformer backbone.
- Parameters:
num_keypoints – number of body parts
loss_factory – object to orchestrate loss computation
backbone – ResNet, EfficientNet, or ViT variant to be used
downsample_factor – make heatmaps smaller than the original frames to save memory; subpixel operations are performed for increased precision
pretrained – True to load pretrained ImageNet weights
torch_seed – make weight initialization reproducible
lr_scheduler – how to schedule the learning rate
lr_scheduler_params – params for the chosen learning rate scheduler; for multisteplr: milestones, gamma
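As a rough illustration of what the `milestones` and `gamma` parameters of a multi-step schedule mean, the sketch below computes the learning rate at a given epoch using the usual multi-step rule: the rate is multiplied by `gamma` once for every milestone that has been reached. The function name `multistep_lr` and the numeric values are hypothetical and are not the library's defaults.

```python
from bisect import bisect_right


def multistep_lr(base_lr: float, milestones: list[int], gamma: float, epoch: int) -> float:
    """Learning rate at `epoch` under a multi-step decay schedule.

    Hypothetical helper for illustration; the values used below are not
    lightning_pose defaults.
    """
    # Count how many milestones are <= epoch; each one applies one decay step.
    num_decays = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** num_decays


# Illustrative settings: halve the learning rate at epochs 100 and 200.
params = {"milestones": [100, 200], "gamma": 0.5}
lr_start = multistep_lr(1e-3, params["milestones"], params["gamma"], epoch=0)
lr_mid = multistep_lr(1e-3, params["milestones"], params["gamma"], epoch=150)
lr_end = multistep_lr(1e-3, params["milestones"], params["gamma"], epoch=250)
```

A dictionary shaped like `params` is one plausible form for `lr_scheduler_params` when `lr_scheduler` is 'multisteplr', given that the parameter description lists `milestones` and `gamma` as its keys.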
- __new__(**kwargs)