API Reference

This section contains the complete API documentation, auto-generated from docstrings.

Main Classes

libreyolo.LIBREYOLO(model_path[, size, ...])

Unified Libre YOLO factory that automatically detects model version (8 or 11) from the weights file and returns the appropriate model instance.

libreyolo.LIBREYOLO8

Libre YOLO8 model for object detection.

libreyolo.LIBREYOLO11

Libre YOLO11 model for object detection.

libreyolo.LIBREYOLOOnnx

ONNX runtime inference backend for LIBREYOLO models.

Factory Function

libreyolo.LIBREYOLO(model_path, size=None, reg_max=16, nb_classes=80, save_feature_maps=False, save_eigen_cam=False, cam_method='eigencam', cam_layer=None, device='auto', tiling=False)[source]

Unified Libre YOLO factory that automatically detects model version (8 or 11) from the weights file and returns the appropriate model instance.

Parameters:
  • model_path (str) – Path to model weights file (.pt) or ONNX file (.onnx)

  • size (str) – Model size variant. Required for .pt files (“n”, “s”, “m”, “l”, “x”), ignored for .onnx

  • reg_max (int) – Regression max value for DFL (default: 16)

  • nb_classes (int) – Number of classes (default: 80 for COCO)

  • save_feature_maps (bool) – If True, saves backbone feature map visualizations on each inference (default: False)

  • save_eigen_cam (bool) – If True, saves EigenCAM heatmap visualizations on each inference (default: False)

  • cam_method (str) – Default CAM method for explain(). Options: “eigencam”, “gradcam”, “gradcam++”, “xgradcam”, “hirescam”, “layercam”, “eigengradcam” (default: “eigencam”)

  • cam_layer (str) – Target layer for CAM computation (default: “neck_c2f22”)

  • device (str) – Device for inference. “auto” (default) uses CUDA if available, else MPS, else CPU.

  • tiling (bool) – Enable tiling for large images (default: False). When enabled, images larger than 640x640 are split into overlapping tiles for inference.

Returns:

Instance of LIBREYOLO8, LIBREYOLO11, or LIBREYOLOOnnx

Example

>>> model = LIBREYOLO("yolo11n.pt", size="n", cam_method="gradcam")
>>> result = model.explain("image.jpg", save=True)
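When given an .onnx file, the factory returns the ONNX-backed model and size can be omitted (a sketch; "model.onnx" and "image.jpg" are placeholder paths):

>>> onnx_model = LIBREYOLO("model.onnx")
>>> result = onnx_model("image.jpg")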

LIBREYOLO8

class libreyolo.LIBREYOLO8[source]

Bases: object

Libre YOLO8 model for object detection.

Parameters:
  • model_path – Path to model weights file (required)

  • size – Model size variant (required). Must be one of: “n”, “s”, “m”, “l”, “x”

  • reg_max – Regression max value for DFL (default: 16)

  • nb_classes – Number of classes (default: 80 for COCO)

  • save_feature_maps – Feature map saving mode: False disables saving (default), True saves all layers, and a list of layer names saves only those layers (e.g., ["backbone_p1", "neck_c2f21"]).

  • save_eigen_cam – If True, saves EigenCAM heatmap visualizations on each inference (default: False)

  • cam_method – CAM method for explain(). Options: “eigencam”, “gradcam”, “gradcam++”, “xgradcam”, “hirescam”, “layercam”, “eigengradcam” (default: “eigencam”)

  • cam_layer – Target layer for CAM computation (default: “neck_c2f22”)

  • device – Device for inference. “auto” (default) uses CUDA if available, else MPS, else CPU. Can also specify directly: “cuda”, “cuda:0”, “mps”, “cpu”.

  • tiling – Enable tiling for processing large/high-resolution images (default: False). When enabled, large images are automatically split into overlapping 640x640 tiles, inference is run on each tile, and results are merged using NMS.

Example

>>> model = LIBREYOLO8(model_path="path/to/weights.pt", size="x", save_feature_maps=True)
>>> detections = model(image=image_path, save=True)
>>> # Use explain() for XAI heatmaps
>>> heatmap = model.explain("image.jpg", method="gradcam")
__init__(model_path, size, reg_max=16, nb_classes=80, save_feature_maps=False, save_eigen_cam=False, cam_method='eigencam', cam_layer=None, device='auto', tiling=False)[source]

Initialize the Libre YOLO8 model.

Parameters:
  • model_path (str | dict) – Path to user-provided model weights file or loaded state dict

  • size (str) – Model size variant. Must be “n”, “s”, “m”, “l”, or “x”

  • reg_max (int) – Regression max value for DFL (default: 16)

  • nb_classes (int) – Number of classes (default: 80)

  • save_feature_maps (bool | List[str]) – Feature map saving mode: False disables saving, True saves all layers, and a List[str] saves only the named layers.

  • save_eigen_cam (bool) – If True, saves EigenCAM heatmap visualizations

  • cam_method (str) – Default CAM method for explain() (default: “eigencam”)

  • cam_layer (str | None) – Target layer for CAM computation (default: “neck_c2f22”)

  • device (str) – Device for inference (“auto”, “cuda”, “mps”, “cpu”)

  • tiling (bool) – Enable tiling for large images (default: False). When enabled, images larger than 640x640 are split into overlapping tiles for inference.

get_available_layer_names()[source]

Get list of available layer names for feature map saving.

Returns:

List of layer names that can be used with save_feature_maps parameter.

Return type:

List[str]
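A sketch of using the returned names to restrict feature map saving (placeholder weights path; layer names vary by model, so query them first):

>>> model = LIBREYOLO8("path/to/weights.pt", size="n")
>>> layers = model.get_available_layer_names()
>>> model = LIBREYOLO8("path/to/weights.pt", size="n", save_feature_maps=layers[:2])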

__call__(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Run inference on an image or directory of images.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image or directory. Supported types: str (local file path, directory path, or http/https/s3/gs URL), pathlib.Path (local file or directory path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • save (bool) – If True, saves the image with detections drawn. Defaults to False.

  • output_path (str) – Optional path to save the annotated image. If not provided, saves to ‘runs/detections/’ with a timestamped name.

  • conf_thres (float) – Confidence threshold (default: 0.25)

  • iou_thres (float) – IoU threshold for NMS (default: 0.45)

  • color_format (str) – Color format hint for NumPy/OpenCV arrays: "auto" auto-detects (default), "rgb" treats input as RGB, "bgr" treats input as BGR (e.g., OpenCV).

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Currently used for chunking at the Python level; true batched model inference is planned for future versions. Default: 1 (process one image at a time).

Returns:

Dictionary containing detection results with keys:
  • boxes: List of bounding boxes in xyxy format

  • scores: List of confidence scores

  • classes: List of class IDs

  • num_detections: Number of detections

  • source: Source image path (if available)

  • saved_path: Path to saved image (if save=True)

For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory
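For example, iterating over the documented keys (a sketch assuming a local "image.jpg"):

>>> result = model("image.jpg")
>>> for box, score, cls in zip(result["boxes"], result["scores"], result["classes"]):
...     print(box, score, cls)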

export(output_path=None, input_size=640, opset=12)[source]

Export the model to ONNX format.

Parameters:
  • output_path (str) – Path to save the ONNX file. If None, uses the model’s weights path with .onnx extension.

  • input_size (int) – The image size to export for (default: 640).

  • opset (int) – ONNX opset version (default: 12).

Returns:

Path to the exported ONNX file.

Return type:

str
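A sketch of exporting and reloading with the ONNX backend:

>>> onnx_path = model.export(input_size=640)
>>> onnx_model = LIBREYOLOOnnx(onnx_path)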

predict(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Alias for __call__ method.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image or directory. Supported types: str (local file path, directory path, or http/https/s3/gs URL), pathlib.Path (local file or directory path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • save (bool) – If True, saves the image with detections drawn. Defaults to False.

  • output_path (str) – Optional path to save the annotated image.

  • conf_thres (float) – Confidence threshold (default: 0.25)

  • iou_thres (float) – IoU threshold for NMS (default: 0.45)

  • color_format (str) – Color format hint for NumPy/OpenCV arrays (“auto”, “rgb”, “bgr”)

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Default: 1.

Returns:

Dictionary containing detection results. For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory

explain(image, method=None, target_layer=None, eigen_smooth=False, save=False, output_path=None, alpha=0.5, color_format='auto')[source]

Generate explainability heatmap for the given image using CAM methods.

This method provides visual explanations of what the model focuses on when making predictions. It supports multiple CAM (Class Activation Mapping) techniques including gradient-based and gradient-free methods.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image. Supported types: str (local file path or http/https/s3/gs URL), pathlib.Path (local file path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • method (str | None) – CAM method to use: "eigencam" (gradient-free, SVD-based; default), "gradcam" (gradient-weighted class activation), "gradcam++" (improved GradCAM with second-order gradients), "xgradcam" (axiom-based GradCAM), "hirescam" (high-resolution CAM), "layercam" (layer-wise CAM), or "eigengradcam" (eigen-based gradient CAM).

  • target_layer (str | None) – Layer name for CAM computation. Use get_available_layer_names() to see options. Defaults to “neck_c2f22”.

  • eigen_smooth (bool) – Apply SVD smoothing to the heatmap (default: False).

  • save (bool) – If True, saves the heatmap visualization to disk.

  • output_path (str | None) – Optional path to save the visualization.

  • alpha (float) – Blending factor for overlay (default: 0.5).

  • color_format (str) – Color format hint for NumPy/OpenCV arrays (“auto”, “rgb”, “bgr”).

Returns:

Dictionary with keys:
  • heatmap: Grayscale heatmap array of shape (H, W) with values in [0, 1]

  • overlay: RGB overlay image as numpy array

  • original_image: Original image as PIL Image

  • method: CAM method used

  • target_layer: Target layer used

  • saved_path: Path to saved visualization (if save=True)

Return type:

dict

Example

>>> model = LIBREYOLO8("yolo8n.pt", size="n")
>>> result = model.explain("image.jpg", method="gradcam", save=True)
>>> heatmap = result["heatmap"]
>>> overlay = result["overlay"]
static get_available_cam_methods()[source]

Get list of available CAM methods.

Returns:

List of CAM method names that can be used with explain().

Return type:

List[str]
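A sketch of sweeping every available method on one image (assumes a local "image.jpg"):

>>> model = LIBREYOLO8("yolo8n.pt", size="n")
>>> for m in LIBREYOLO8.get_available_cam_methods():
...     model.explain("image.jpg", method=m, save=True)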

LIBREYOLO11

class libreyolo.LIBREYOLO11[source]

Bases: object

Libre YOLO11 model for object detection.

Parameters:
  • model_path – Path to model weights file (required)

  • size – Model size variant (required). Must be one of: “n”, “s”, “m”, “l”, “x”

  • reg_max – Regression max value for DFL (default: 16)

  • nb_classes – Number of classes (default: 80 for COCO)

  • save_feature_maps – Feature map saving mode: False disables saving (default), True saves all layers, and a list of layer names saves only those layers (e.g., ["backbone_p1", "neck_c2f21"]).

  • save_eigen_cam – If True, saves EigenCAM heatmap visualizations on each inference (default: False)

  • cam_method – CAM method for explain(). Options: “eigencam”, “gradcam”, “gradcam++”, “xgradcam”, “hirescam”, “layercam”, “eigengradcam” (default: “eigencam”)

  • cam_layer – Target layer for CAM computation (default: “neck_c2f22”)

  • device – Device for inference. “auto” (default) uses CUDA if available, else MPS, else CPU. Can also specify directly: “cuda”, “cuda:0”, “mps”, “cpu”.

  • tiling – Enable tiling for processing large/high-resolution images (default: False). When enabled, large images are automatically split into overlapping 640x640 tiles, inference is run on each tile, and results are merged using NMS.

Example

>>> model = LIBREYOLO11(model_path="path/to/weights.pt", size="x", save_feature_maps=True)
>>> detections = model(image=image_path, save=True)
>>> # Use explain() for XAI heatmaps
>>> heatmap = model.explain("image.jpg", method="gradcam")
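A sketch of tiled inference on a high-resolution image (placeholder paths; tiles are 640x640 with overlap, merged via NMS as described above):

>>> tiled = LIBREYOLO11(model_path="path/to/weights.pt", size="n", tiling=True)
>>> result = tiled("large_image.jpg")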
__init__(model_path, size, reg_max=16, nb_classes=80, save_feature_maps=False, save_eigen_cam=False, cam_method='eigencam', cam_layer=None, device='auto', tiling=False)[source]

Initialize the Libre YOLO11 model.

Parameters:
  • model_path (str | dict) – Path to user-provided model weights file or loaded state dict

  • size (str) – Model size variant. Must be “n”, “s”, “m”, “l”, or “x”

  • reg_max (int) – Regression max value for DFL (default: 16)

  • nb_classes (int) – Number of classes (default: 80)

  • save_feature_maps (bool | List[str]) – Feature map saving mode: False disables saving, True saves all layers, and a List[str] saves only the named layers.

  • save_eigen_cam (bool) – If True, saves EigenCAM heatmap visualizations

  • cam_method (str) – Default CAM method for explain() (default: “eigencam”)

  • cam_layer (str | None) – Target layer for CAM computation (default: “neck_c2f22”)

  • device (str) – Device for inference (“auto”, “cuda”, “mps”, “cpu”)

  • tiling (bool) – Enable tiling for large images (default: False). When enabled, images larger than 640x640 are split into overlapping tiles for inference.

get_available_layer_names()[source]

Get list of available layer names for feature map saving.

Returns:

List of layer names that can be used with save_feature_maps parameter.

Return type:

List[str]

__call__(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Run inference on an image or directory of images.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image or directory. Supported types: str (local file path, directory path, or http/https/s3/gs URL), pathlib.Path (local file or directory path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • save (bool) – If True, saves the image with detections drawn. Defaults to False.

  • output_path (str) – Optional path to save the annotated image. If not provided, saves to ‘runs/detections/’ with a timestamped name.

  • conf_thres (float) – Confidence threshold (default: 0.25)

  • iou_thres (float) – IoU threshold for NMS (default: 0.45)

  • color_format (str) – Color format hint for NumPy/OpenCV arrays: "auto" auto-detects (default), "rgb" treats input as RGB, "bgr" treats input as BGR (e.g., OpenCV).

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Currently used for chunking at the Python level; true batched model inference is planned for future versions. Default: 1 (process one image at a time).

Returns:

Dictionary containing detection results with keys:
  • boxes: List of bounding boxes in xyxy format

  • scores: List of confidence scores

  • classes: List of class IDs

  • num_detections: Number of detections

  • source: Source image path (if available)

  • saved_path: Path to saved image (if save=True)

For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory

export(output_path=None, input_size=640, opset=12)[source]

Export the model to ONNX format.

Parameters:
  • output_path (str) – Path to save the ONNX file. If None, uses the model’s weights path with .onnx extension.

  • input_size (int) – The image size to export for (default: 640).

  • opset (int) – ONNX opset version (default: 12).

Returns:

Path to the exported ONNX file.

Return type:

str

predict(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Alias for __call__ method.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image or directory. Supported types: str (local file path, directory path, or http/https/s3/gs URL), pathlib.Path (local file or directory path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • save (bool) – If True, saves the image with detections drawn. Defaults to False.

  • output_path (str) – Optional path to save the annotated image.

  • conf_thres (float) – Confidence threshold (default: 0.25)

  • iou_thres (float) – IoU threshold for NMS (default: 0.45)

  • color_format (str) – Color format hint for NumPy/OpenCV arrays (“auto”, “rgb”, “bgr”)

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Default: 1.

Returns:

Dictionary containing detection results. For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory

explain(image, method=None, target_layer=None, eigen_smooth=False, save=False, output_path=None, alpha=0.5, color_format='auto')[source]

Generate explainability heatmap for the given image using CAM methods.

This method provides visual explanations of what the model focuses on when making predictions. It supports multiple CAM (Class Activation Mapping) techniques including gradient-based and gradient-free methods.

Parameters:
  • image (str | Path | Image | ndarray | Tensor | bytes | BytesIO) – Input image. Supported types: str (local file path or http/https/s3/gs URL), pathlib.Path (local file path), PIL.Image object, np.ndarray (HWC or CHW, RGB or BGR), torch.Tensor (CHW or NCHW), raw image bytes, or io.BytesIO containing image data.

  • method (str | None) – CAM method to use: "eigencam" (gradient-free, SVD-based; default), "gradcam" (gradient-weighted class activation), "gradcam++" (improved GradCAM with second-order gradients), "xgradcam" (axiom-based GradCAM), "hirescam" (high-resolution CAM), "layercam" (layer-wise CAM), or "eigengradcam" (eigen-based gradient CAM).

  • target_layer (str | None) – Layer name for CAM computation. Use get_available_layer_names() to see options. Defaults to “neck_c2f22”.

  • eigen_smooth (bool) – Apply SVD smoothing to the heatmap (default: False).

  • save (bool) – If True, saves the heatmap visualization to disk.

  • output_path (str | None) – Optional path to save the visualization.

  • alpha (float) – Blending factor for overlay (default: 0.5).

  • color_format (str) – Color format hint for NumPy/OpenCV arrays (“auto”, “rgb”, “bgr”).

Returns:

Dictionary with keys:
  • heatmap: Grayscale heatmap array of shape (H, W) with values in [0, 1]

  • overlay: RGB overlay image as numpy array

  • original_image: Original image as PIL Image

  • method: CAM method used

  • target_layer: Target layer used

  • saved_path: Path to saved visualization (if save=True)

Return type:

dict

Example

>>> model = LIBREYOLO11("yolo11n.pt", size="n")
>>> result = model.explain("image.jpg", method="gradcam", save=True)
>>> heatmap = result["heatmap"]
>>> overlay = result["overlay"]
static get_available_cam_methods()[source]

Get list of available CAM methods.

Returns:

List of CAM method names that can be used with explain().

Return type:

List[str]

ONNX Model

class libreyolo.LIBREYOLOOnnx[source]

Bases: object

ONNX runtime inference backend for LIBREYOLO models.

Provides the same API as LIBREYOLO8/LIBREYOLO11 but uses ONNX Runtime instead of PyTorch for inference.

Parameters:
  • onnx_path – Path to the ONNX model file.

  • nb_classes – Number of classes (default: 80 for COCO).

  • device – Device for inference. “auto” (default) uses CUDA if available, else CPU.

Example

>>> model = LIBREYOLOOnnx("model.onnx")
>>> detections = model("image.jpg", save=True)
__init__(onnx_path, nb_classes=80, device='auto')[source]
Parameters:
  • onnx_path (str)

  • nb_classes (int)

  • device (str)

__call__(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Run inference on an image or directory of images.

Parameters:
  • image (str | Path | Image | ndarray) – Input image or directory (file path, directory path, PIL Image, or numpy array).

  • save (bool) – If True, saves annotated image to disk.

  • output_path (str) – Optional path to save the annotated image.

  • conf_thres (float) – Confidence threshold (default: 0.25).

  • iou_thres (float) – IoU threshold for NMS (default: 0.45).

  • color_format (str) – Color format hint for NumPy/OpenCV arrays (“auto”, “rgb”, “bgr”).

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Currently used for chunking at the Python level; true batched model inference is planned for future versions. Default: 1 (process one image at a time).

Returns:

Dictionary with boxes, scores, classes, source, and num_detections. For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory

predict(image, save=False, output_path=None, conf_thres=0.25, iou_thres=0.45, color_format='auto', batch_size=1)[source]

Alias for __call__ method.

Parameters:
  • image (str | Path | Image | ndarray) – Input image or directory.

  • save (bool) – If True, saves annotated image to disk.

  • output_path (str) – Optional path to save the annotated image.

  • conf_thres (float) – Confidence threshold (default: 0.25).

  • iou_thres (float) – IoU threshold for NMS (default: 0.45).

  • color_format (str) – Color format hint for NumPy/OpenCV arrays.

  • batch_size (int) – Number of images to process per batch when handling multiple images (e.g., directories). Default: 1.

Returns:

Dictionary containing detection results. For directory: List of dictionaries, one per image processed.

Return type:

dict for a single image; list of dicts for a directory
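A sketch of directory inference with chunking (assumes a folder "images/" of image files):

>>> results = model.predict("images/", batch_size=4)
>>> len(results)  # one result dict per image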

CAM Methods

libreyolo.CAM_METHODS

Dictionary mapping CAM method names to their implementation classes:

  • "eigencam" – libreyolo.common.cam.eigen_cam.EigenCAM
  • "eigengradcam" – libreyolo.common.cam.eigengradcam.EigenGradCAM
  • "gradcam" – libreyolo.common.cam.gradcam.GradCAM
  • "gradcam++" / "gradcampp" – libreyolo.common.cam.gradcampp.GradCAMPlusPlus
  • "hirescam" – libreyolo.common.cam.hirescam.HiResCAM
  • "layercam" – libreyolo.common.cam.layercam.LayerCAM
  • "xgradcam" – libreyolo.common.cam.xgradcam.XGradCAM

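For example, resolving a method name to its implementing class (a minimal sketch):

>>> from libreyolo import CAM_METHODS
>>> cam_cls = CAM_METHODS["gradcam++"]  # same class as "gradcampp"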

class libreyolo.EigenCAM[source]

Bases: BaseCAM

EigenCAM: Class Activation Map using Principal Components.

This is a gradient-free method that uses SVD to find the first principal component of the 2D activations. It produces class-agnostic saliency maps that highlight generally important regions.

Reference:

Muhammad, M. B., & Yeasin, M. (2020). Eigen-CAM: Class Activation Map using Principal Components. arXiv:2008.00299

__init__(model, target_layers, reshape_transform=None)[source]

Initialize EigenCAM.

Parameters:
  • model (Module) – The neural network model.

  • target_layers (List[Module]) – List of target layers for CAM computation.

  • reshape_transform (Callable | None) – Optional transform for activation shapes.

Return type:

None

get_cam_weights(input_tensor, target_layer, targets, activations, grads)[source]

EigenCAM does not use channel weights; it computes the SVD projection directly.

This method returns ones since the actual computation happens in get_cam_image which is overridden.

Parameters:
  • input_tensor (Tensor) – The input image tensor.

  • target_layer (Module) – The layer being processed.

  • targets (List | None) – Ignored for EigenCAM.

  • activations (ndarray) – The layer activations of shape (B, C, H, W).

  • grads (ndarray | None) – Ignored for EigenCAM.

Returns:

Placeholder array of ones (the real computation happens in get_cam_image()).

Return type:

ndarray

get_cam_image(input_tensor, target_layer, targets, activations, grads, eigen_smooth=False)[source]

Compute EigenCAM using SVD on activations.

Parameters:
  • input_tensor (Tensor) – The input image tensor.

  • target_layer (Module) – The layer being processed.

  • targets (List | None) – Ignored for EigenCAM.

  • activations (ndarray) – The layer activations of shape (B, C, H, W).

  • grads (ndarray | None) – Ignored for EigenCAM.

  • eigen_smooth (bool) – Ignored (always uses eigen method).

Returns:

CAM array of shape (B, H, W).

Return type:

ndarray
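For intuition, a minimal NumPy sketch of the projection described above; shapes are arbitrary stand-ins and this is not the library's internal code:

>>> import numpy as np
>>> acts = np.random.rand(1, 8, 20, 20)            # stand-in activations (B, C, H, W)
>>> A = acts[0].reshape(8, -1).T                   # (H*W, C): one C-vector per position
>>> A = A - A.mean(axis=0, keepdims=True)          # center before SVD
>>> U, S, Vt = np.linalg.svd(A, full_matrices=False)
>>> proj = A @ Vt[0]                               # project onto first principal component
>>> proj = proj * np.sign(proj.sum() + 1e-12)      # fix SVD sign ambiguity
>>> cam = np.maximum(proj, 0).reshape(20, 20)      # keep positive response
>>> cam = cam / (cam.max() + 1e-8)                 # normalize to [0, 1]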

class libreyolo.GradCAM[source]

Bases: BaseCAM

GradCAM: Gradient-weighted Class Activation Mapping.

Weights the 2D activations by the average gradient to produce class-discriminative localization maps.

Reference:

Selvaraju, R. R., et al. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. arXiv:1610.02391

__init__(model, target_layers, reshape_transform=None)[source]

Initialize GradCAM.

Parameters:
  • model (Module) – The neural network model.

  • target_layers (List[Module]) – List of target layers for CAM computation.

  • reshape_transform (Callable | None) – Optional transform for activation shapes.

Return type:

None

get_cam_weights(input_tensor, target_layer, targets, activations, grads)[source]

Compute GradCAM weights by global average pooling the gradients.

The weight for each channel is the mean gradient value across the spatial dimensions (H, W).

Parameters:
  • input_tensor (Tensor) – The input image tensor.

  • target_layer (Module) – The layer being processed.

  • targets (List | None) – Optional target specifications.

  • activations (ndarray) – The layer activations of shape (B, C, H, W).

  • grads (ndarray) – The gradients of shape (B, C, H, W).

Returns:

Weights array of shape (B, C).

Return type:

ndarray
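A minimal NumPy sketch of this weighting and the subsequent weighted sum (illustrative stand-in arrays, not the library's internal code):

>>> import numpy as np
>>> acts = np.random.rand(1, 8, 20, 20)    # stand-in activations (B, C, H, W)
>>> grads = np.random.rand(1, 8, 20, 20)   # stand-in gradients (B, C, H, W)
>>> weights = grads.mean(axis=(2, 3))      # global average pool -> (B, C)
>>> cam = np.maximum((weights[:, :, None, None] * acts).sum(axis=1), 0)  # (B, H, W)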

class libreyolo.GradCAMPlusPlus[source]

Bases: BaseCAM

GradCAM++: Improved Gradient-weighted Class Activation Mapping.

Uses second-order gradients (squared gradients) for better weighting, particularly effective when multiple instances of the same class appear in the image.

Reference:

Chattopadhyay, A., et al. (2018). Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks. arXiv:1710.11063

__init__(model, target_layers, reshape_transform=None)[source]

Initialize GradCAM++.

Parameters:
  • model (Module) – The neural network model.

  • target_layers (List[Module]) – List of target layers for CAM computation.

  • reshape_transform (Callable | None) – Optional transform for activation shapes.

Return type:

None

get_cam_weights(input_tensor, target_layer, targets, activations, grads)[source]

Compute GradCAM++ weights using second-order gradient information.

GradCAM++ formula:

alpha_kc = grad^2 / (2 * grad^2 + sum(A * grad^3))
weights = sum(alpha * ReLU(grad))

Parameters:
  • input_tensor (Tensor) – The input image tensor.

  • target_layer (Module) – The layer being processed.

  • targets (List | None) – Optional target specifications.

  • activations (ndarray) – The layer activations of shape (B, C, H, W).

  • grads (ndarray) – The gradients of shape (B, C, H, W).

Returns:

Weights array of shape (B, C).

Return type:

ndarray
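A minimal NumPy sketch of the formula above (illustrative stand-in arrays; the guard against a zero denominator is an assumption of this sketch):

>>> import numpy as np
>>> acts = np.random.rand(1, 8, 20, 20)    # stand-in activations (B, C, H, W)
>>> grads = np.random.rand(1, 8, 20, 20)   # stand-in gradients (B, C, H, W)
>>> denom = 2 * grads**2 + (acts * grads**3).sum(axis=(2, 3), keepdims=True)
>>> alpha = grads**2 / np.where(denom != 0, denom, 1e-8)
>>> weights = (alpha * np.maximum(grads, 0)).sum(axis=(2, 3))  # -> (B, C)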

Helper Functions

libreyolo.create_model(version, config, reg_max=16, nb_classes=80, img_size=640)[source]

Create a fresh model instance for training.

Parameters:
  • version (str) – “8”, “11”, etc.

  • config (str) – Model size (“n”, “s”, “m”, “l”, “x”)

  • reg_max (int) – Regression max

  • nb_classes (int) – Number of classes

  • img_size (int) – Input image size (default: 640)

Returns:

Model instance (nn.Module)
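A hypothetical usage sketch (argument values taken from the signature above):

>>> net = create_model(version="8", config="n", nb_classes=80)
>>> # net is a fresh, untrained nn.Module ready for a training loop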