HeatmapMHCRNNHead

class lightning_pose.models.heads.HeatmapMHCRNNHead

Bases: Module

Multi-head convolutional recurrent neural network head.

This head converts a sequence of 2D feature maps to per-keypoint heatmaps for the center frame. The head is composed of two heads:

  • single-frame head: several deconvolutional layers followed by a 2D spatial softmax to generate normalized heatmaps from low-resolution feature maps for a single frame

  • multi-frame head: several deconvolutional layers are applied to each set of features in a temporal sequence; the resulting heatmaps are fed into a convolutional recurrent neural network to produce heatmaps for the center frame
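
The single-frame path described above (deconvolutional layers followed by a 2D spatial softmax) can be sketched in plain PyTorch. This is a minimal illustration, not the head's actual implementation: the layer count, kernel sizes, channel counts, and the helper name `spatial_softmax2d` are all assumptions chosen for the example.

```python
import torch
from torch import nn


def spatial_softmax2d(logits: torch.Tensor) -> torch.Tensor:
    """Normalize each (height, width) map so that it sums to 1."""
    batch, keypoints, height, width = logits.shape
    flat = logits.reshape(batch, keypoints, height * width)
    return torch.softmax(flat, dim=-1).reshape(batch, keypoints, height, width)


# Hypothetical sizes, chosen for illustration only.
in_channels, num_keypoints = 32, 5

# Two transposed convolutions; with these settings each one doubles
# the spatial resolution (8 -> 16 -> 32).
upsample = nn.Sequential(
    nn.ConvTranspose2d(in_channels, num_keypoints, kernel_size=3,
                       stride=2, padding=1, output_padding=1),
    nn.ConvTranspose2d(num_keypoints, num_keypoints, kernel_size=3,
                       stride=2, padding=1, output_padding=1),
)

features = torch.randn(2, in_channels, 8, 8)  # low-resolution feature maps
heatmaps = spatial_softmax2d(upsample(features))
print(heatmaps.shape)
```

After the softmax, every per-keypoint map is non-negative and sums to 1, so it can be read as a probability distribution over pixel locations.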

Methods Summary

forward(features, batch_shape, is_multiview)

Handle context frames then upsample to get final heatmaps.

run_subpixelmaxima(heatmaps)

Methods Documentation

forward(features: TensorType['batch', 'features', 'rep_height', 'rep_width', 'frames'], batch_shape: tensor, is_multiview: bool) → TensorType['batch', 'num_keypoints', 'heatmap_height', 'heatmap_width'][source]

Handle context frames then upsample to get final heatmaps.

Parameters:
  • features – outputs of backbone

  • batch_shape – identifies whether or not we need to do some reshaping

  • is_multiview – if batch has a view dimension
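
As a rough illustration of the input layout, the sketch below takes a tensor shaped like `features` (batch, features, rep_height, rep_width, frames), folds the frame axis into the batch axis so that a 2D layer can process every frame, and then picks out the center frame. The sizes, the stand-in `nn.Conv2d` layer, and the choice of `frames // 2` as the center index are assumptions made for the example; the reshaping that `forward` actually performs may differ.

```python
import torch
from torch import nn

# Hypothetical sizes, chosen for illustration only.
batch, channels, rep_h, rep_w, frames = 2, 16, 8, 8, 5
num_keypoints = 4

features = torch.randn(batch, channels, rep_h, rep_w, frames)

# Move the frame axis next to the batch axis, then fold the two together
# so that each frame is treated as an independent 2D input.
per_frame = features.permute(0, 4, 1, 2, 3).reshape(
    batch * frames, channels, rep_h, rep_w
)

# Stand-in for the per-frame layers (not the real head).
layer = nn.Conv2d(channels, num_keypoints, kernel_size=3, padding=1)
out = layer(per_frame)

# Restore the temporal sequence: (batch, frames, keypoints, height, width).
sequence = out.reshape(batch, frames, num_keypoints, rep_h, rep_w)

# Assumed center-frame index for an odd number of context frames.
center = sequence[:, frames // 2]
print(sequence.shape, center.shape)
```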

run_subpixelmaxima(heatmaps)[source]

__init__(backbone_arch: str, in_channels: int, out_channels: int, deconv_out_channels: int | None = None, downsample_factor: int = 2, upsampling_factor: int = 2)[source]

Parameters:
  • backbone_arch – string denoting backbone architecture; to remove in future release

  • in_channels – number of channels in the input feature map

  • out_channels – number of channels in the output heatmap (i.e. number of keypoints)

  • deconv_out_channels – output channel number for each intermediate deconv layer; defaults to number of keypoints

  • downsample_factor – make heatmaps smaller than input frames by this factor; subpixel operations are performed for increased precision

  • upsampling_factor – upsample features by this factor before feeding them to the CRNN
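
The `downsample_factor` entry mentions subpixel operations on heatmaps that are smaller than the input frames. One common way to recover a location finer than the heatmap grid is a soft argmax, the expected coordinate under a normalized heatmap. The sketch below shows that generic technique only; the function name `soft_argmax2d` is hypothetical, and this is not necessarily what `run_subpixelmaxima` computes.

```python
import torch


def soft_argmax2d(heatmaps: torch.Tensor) -> torch.Tensor:
    """Expected (x, y) location of each normalized heatmap.

    heatmaps: (batch, keypoints, height, width), each map summing to 1.
    returns:  (batch, keypoints, 2), in heatmap pixel coordinates.
    """
    _, _, height, width = heatmaps.shape
    ys = torch.arange(height, dtype=heatmaps.dtype, device=heatmaps.device)
    xs = torch.arange(width, dtype=heatmaps.dtype, device=heatmaps.device)
    # Marginalize over rows to get a distribution over columns, and vice versa.
    x = (heatmaps.sum(dim=2) * xs).sum(dim=-1)
    y = (heatmaps.sum(dim=3) * ys).sum(dim=-1)
    return torch.stack([x, y], dim=-1)


# Toy heatmap: mass split evenly between columns 2 and 3 of row 1.
heatmap = torch.zeros(1, 1, 4, 4)
heatmap[0, 0, 1, 2] = 0.5
heatmap[0, 0, 1, 3] = 0.5

coords = soft_argmax2d(heatmap)
print(coords)  # x = 2.5, y = 1.0
```

The result lands halfway between two grid cells, which a hard argmax cannot represent. Coordinates are in heatmap pixels and would still need to be scaled back to frame coordinates.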

__new__(**kwargs)