Full checklist for adding a new transform to AlbumentationsX. Use when the user asks to add, implement, or create a new transform/augmentation.
Follow this checklist in order. Do not skip steps.
Put the transform in the most specific matching subpackage:
albumentations/augmentations/geometric/ — spatial transforms (flip, rotate, warp, etc.)albumentations/augmentations/pixel/ — pixel-level (color, brightness, noise, etc.)albumentations/augmentations/dropout/ — masking/dropoutalbumentations/augmentations/blur/ — blurringalbumentations/augmentations/crops/ — croppingalbumentations/augmentations/mixing/ — multi-image mixingalbumentations/augmentations/transforms3d/ — 3D/volumealbumentations/augmentations/other/ — everything elseAdd the pure function in the corresponding functional.py file (no class state, no RNG):
def my_transform(img: np.ndarray, param1: float, param2: int) -> np.ndarray:
...
np.ndarray, return np.ndarrayget_params / get_params_dependent_on_datacv2 over numpy for performance (see benchmarking rules)cv2.LUT for lookup-based pixel ops (fastest)@uint8_io / @float32_io decorators if dtype conversion is neededapply or apply_to_* methods; the transform class docstring and transforms_interface are sufficient.class MyTransform(DualTransform): # or ImageOnlyTransform / NoOp
"""First paragraph (120–160 chars): elevator pitch — what the transform does, how it works in one sentence, when to use it. No "Parameters: x, y", "Targets:", return type, or "Supports uint8/float32"; no "Used by X". Two lines, wrap at 120.
More detail about what the transform does.
Args:
param_range: (min, max) tuple controlling X. Default: (0.1, 0.3).
fill: Padding value for image. Default: 0.
fill_mask: Padding value for masks. Default: 0.
p: Probability. Default: 0.5.
Targets:
image, mask, bboxes, keypoints, volume, mask3d
Image types:
uint8, float32
Examples:
>>> import numpy as np
>>> import albumentations as A
>>> image = np.random.randint(0, 256, (100, 100, 3), dtype=np.uint8)
>>> mask = np.random.randint(0, 2, (100, 100), dtype=np.uint8)
>>> bboxes = np.array([[10, 10, 50, 50]], dtype=np.float32)
>>> bbox_labels = [1]
>>> keypoints = np.array([[20, 30]], dtype=np.float32)
>>> keypoint_labels = [0]
>>>
>>> transform = A.Compose([
... A.MyTransform(param_range=(0.1, 0.3), p=1.0)
... ], bbox_params=A.BboxParams(coord_format='pascal_voc', label_fields=['bbox_labels']),
... keypoint_params=A.KeypointParams(coord_format='xy', label_fields=['keypoint_labels']))
>>>
>>> result = transform(
... image=image, mask=mask,
... bboxes=bboxes, bbox_labels=bbox_labels,
... keypoints=keypoints, keypoint_labels=keypoint_labels,
... )
"""
class InitSchema(BaseTransformInitSchema):
param_range: Annotated[tuple[float, float], AfterValidator(nondecreasing)]
# NO default values here (except discriminator fields)
def __init__(self, param_range: tuple[float, float], p: float = 0.5):
super().__init__(p=p)
self.param_range = param_range
def apply(self, img: ImageType, param1: float, **params: Any) -> ImageType:
# NO default values for param1 here
return fpixel.my_transform(img, param1)
def get_params(self) -> dict[str, Any]:
return {
"param1": self.py_random.uniform(*self.param_range),
}
get_transform_init_args_names() override — the base class auto-infers init arg names from __init__ via MRO introspection. Do not define this method._range suffix: brightness_range, not brightness_limitfill not fill_value, fill_mask not fill_mask_valueborder_mode not mode or pad_modeInitSchema (except Pydantic discriminator fields)tuple[T, T], never T | tuple[T, T] — no union with a scalar. Users always pass a tuple.apply_* methods (other than self, **params)get_params or get_params_dependent_on_data, never in apply_*self.py_random for simple random ops, self.random_generator only when numpy arrays needednp.random.* or random.* module directlyImageType for image/mask/volume type hints, np.ndarray only for bboxes/keypointsx, y, dx, dy, cx, cy. Prefer pixel_cols, norm_x, center_col, run_starts, col_x, etc. Names should read like documentation.(H, W, C) — num_channels = img.shape[-1] always. Never write img.shape[-1] if img.ndim >= 3 else 1 or guard with if img.ndim == NUM_MULTI_CHANNEL_DIMENSIONS.functional.py, never in the transform class file.apply_to_images)Override apply_to_images only if you can beat the default per-image loop. Priority patterns:
Pre-compute expensive setup once per batch (kernels, LUTs, gradient maps):
def apply_to_images(self, images: ImageType, *args: Any, **params: Any) -> ImageType:
kernel = create_kernel(params["size"]) # once, not N times
return self._apply_to_batch(images, lambda img: convolve(img, kernel))
Direct 4D indexing for simple array ops:
def apply_to_images(self, images: ImageType, channels_to_drop: list[int], **params: Any) -> ImageType:
result = images.copy()
result[:, :, :, channels_to_drop] = self.fill
return result
Pre-allocated loop as fallback when params vary per image:
def apply_to_images(self, images: ImageType, *args: Any, **params: Any) -> ImageType:
result = np.empty_like(images)
for i, image in enumerate(images):
result[i] = self.apply(image, **params)
return result
DO NOT reshape
(N,H,W,1)to(H,W,N)to call cv2 once — this is 2–4× slower in practice (transpose → non-contiguous copy + cv2 sequential channel processing).
Add to albumentations/__init__.py:
from albumentations.augmentations.<module>.transforms import MyTransform
Add to albumentations/augmentations/<module>/__init__.py if one exists.
Add to tests/test_transforms.py or tests/test_<category>.py:
@pytest.mark.parametrize(
("param_range", "expected_..."),
[
((0.1, 0.3), ...),
((0.5, 0.8), ...),
],
)
def test_my_transform(param_range, expected_...):
image = TestDataFactory.create_image((100, 100, 3), dtype=np.uint8, seed=137)
aug = A.MyTransform(param_range=param_range, p=1.0)
result = aug(image=image)
# use np.testing assertions, not plain assert
np.testing.assert_...
Also add it to the parametrized lists in tests/utils.py:
get_dual_transforms() if it's a DualTransformget_image_only_transforms() if it's ImageOnlyTransformCheck edge cases: uint8, float32, single channel, multichannel.
get_transform_init_args_names() override (auto-inferred from __init__)_range suffix on range paramsfill / fill_mask (not fill_value / fill_mask_value)InitSchemaapply_* method argsget_params / get_params_dependent_on_dataself.py_random or self.random_generator (not np.random / random)ImageType for image type hintsapply_to_images if expensive setup can be shared across batchArgs, Targets, Image types, Examples sectionsalbumentations/__init__.pynp.testing assertions)pre-commit run --all-filesuv run pytest -m "not slow"