torchimage.padding package

Submodules

torchimage.padding.pad_1d module

Pad a torch tensor along a certain dimension.

All 1-d padding utility functions share the same set of arguments.

Important note: These are in-place functions.

To avoid re-copying the input tensor, these 1d padding utility functions only accept a pre-allocated tensor (e.g. from torch.empty) that has the same shape as the final padded output and already holds the values of the input tensor at its center.

Each function returns its input x after modifying it.

Efficiency Warning: Currently, PyTorch doesn’t support returning negative-strided views like [::-1]. torch.flip() is said to be computationally expensive because it copies the data, which might put symmetric and reflect at a disadvantage.

These functions are low-level utilities only. Do NOT expect them to behave the same way as F.pad does.

param x

The input tensor to be padded.

See the important note above. Let u be the original tensor; then x is a pre-allocated tensor holding u’s values at its center, such that x[idx] == u.

type x

torch.Tensor

param idx

Indices for the ground truth tensor located at the center of the empty-padded tensor x.

Has the same length as the number of dimensions, len(idx) == x.ndim. Each element is a slice(beg, end, 1) where, at dimension dim, beg is the amount of padding at the beginning and x.shape[dim] - end is the amount of padding at the end.

Note that this has to be a tuple to properly index a high-dimensional tensor.

This tuple of index slices marks the ground-truth region, so padding at this dimension is never computed from the still-empty values.

type idx

tuple of slice

param dim

The dimension to pad.

type dim

int

The following keyword arguments are used by some padding functions only.

param negate

Whether to flip signs (+, -) when flipping the signal. Default: False.

This parameter only applies to symmetric mode. When enabled, symmetric mode turns into half-sample antisymmetric mode:

antisymmetric: -d -c -b -a | a b c d | -d -c -b -a

type negate

bool

param before

For linear_ramp_1d, the new edge value at the beginning of the padded tensor. Default: 0

For constant_1d, the constant used for padding before the ground truth.

For stat_1d, the length of the border region at the beginning used to compute the statistic.

type before

float

param after

For linear_ramp_1d, the new edge value at the end of the padded tensor. Default: 0

For constant_1d, the constant used for padding after the ground truth.

For stat_1d, the length of the border region at the end used to compute the statistic.

type after

float

returns

x – The same tensor after the padded values at dim are filled in.

rtype

torch.Tensor

torchimage.padding.pad_1d.circular_1d(x, idx, dim)
torchimage.padding.pad_1d.constant_1d(x, idx, dim, before, after)
torchimage.padding.pad_1d.linear_ramp_1d(x, idx, dim, before, after)
torchimage.padding.pad_1d.odd_reflect_1d(x, idx, dim)
torchimage.padding.pad_1d.odd_symmetric_1d(x, idx, dim)
torchimage.padding.pad_1d.periodize_1d(x, idx, dim)
torchimage.padding.pad_1d.reflect_1d(x, idx, dim)
torchimage.padding.pad_1d.replicate_1d(x, idx, dim)
torchimage.padding.pad_1d.smooth_1d(x, idx, dim)
torchimage.padding.pad_1d.stat_1d(x, idx, dim, before, after, mode)
torchimage.padding.pad_1d.symmetric_1d(x, idx, dim, negate=False)
torchimage.padding.pad_1d.zeros_1d(x, idx, dim)
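
As a usage illustration of the convention above, here is a minimal sketch (the exact behavior of replicate_1d is assumed from its documentation):

    import torch
    from torchimage.padding.pad_1d import replicate_1d

    u = torch.arange(4.0)        # ground truth signal: [0, 1, 2, 3]
    before, after = 2, 3         # padding widths at dim 0

    # Pre-allocate the full-size output and copy u into its center,
    # following the in-place convention described above.
    x = torch.empty(before + len(u) + after)
    idx = (slice(before, before + len(u), 1),)
    x[idx] = u

    replicate_1d(x, idx, dim=0)  # fills the padded regions in place
    # x should now be [0, 0, 0, 1, 2, 3, 3, 3, 3]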

torchimage.padding.tensor_pad module

torchimage.padding.utils module

Private utility functions for padding

torchimage.padding.utils.make_idx(*args, dim, ndim)

Make an index that slices exactly along a specified dimension. e.g. [:, …, :, slice(*args), :, …, :]

This helper function is similar to numpy’s _slice_at_axis.

Parameters
  • *args (int or None) – constructor arguments for the slice object at target axis

  • dim (int) – target axis; can be negative or positive

  • ndim (int) – total number of axes

Returns

idx – Can be used to index np.ndarray and torch.Tensor

Return type

tuple of slice
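
A small illustration, inferred from the description above (the exact return value is an assumption):

    import torch
    from torchimage.padding.utils import make_idx

    t = torch.arange(24).reshape(2, 3, 4)
    idx = make_idx(1, 3, dim=1, ndim=3)
    # idx should equal (slice(None), slice(1, 3), slice(None)),
    # so t[idx] is equivalent to t[:, 1:3, :]
    assert torch.equal(t[idx], t[:, 1:3, :])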

torchimage.padding.utils.modify_idx(*args, idx, dim)

Make an index that slices a specified dimension while keeping the slices for other dimensions the same.

Parameters
  • *args (tuple of int or None) – constructor arguments for the slice object at target axis

  • idx (tuple of slice) – tuple of slices in the original region of interest

  • dim (int) – target axis

Returns

new_idx – New tuple of slices with dimension dim substituted by slice(*args)

Can be used to index np.ndarray and torch.Tensor

Return type

tuple of slice
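
For example (a sketch inferred from the description; the expected value is an assumption):

    from torchimage.padding.utils import modify_idx

    idx = (slice(1, 5), slice(2, 6), slice(0, 3))
    new_idx = modify_idx(0, 10, idx=idx, dim=1)
    # expected: (slice(1, 5), slice(0, 10), slice(0, 3))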

torchimage.padding.utils.pad_width_format(padding, source='numpy', target='torch', ndim=None)

Convert between 2 padding width formats.

This function converts pad from source format to target format.

Padding width refers to the number of padded elements before and after the original tensor at a certain axis. NumPy and PyTorch have different formats to specify padding widths, because NumPy works with n-dimensional arrays while PyTorch more frequently works with (N, C, [D, ]H, W) data tensors; in the latter case, starting from the last dimension is more intuitive.

Numpy padding width format is ((before_0, after_0), (before_1, after_1), ..., (before_{n-1}, after_{n-1})).

PyTorch padding format is (before_{n-1}, after_{n-1}, before_{n-2}, after_{n-2}, ..., before_{dim}, after_{dim}), such that dimensions before dim are not padded.

Parameters
  • padding (tuple of int, or tuple of tuple of int) – the input padding width format to convert

  • source (str) – Format specification for padding width. Either “numpy” or “torch”.

  • target (str) – Format specification for padding width. Either “numpy” or “torch”.

  • ndim (int) –

    Number of dimensions in the tensor of interest.

    Only used when converting from torch to numpy format.

Returns

padding – the new padding width specification

Return type

tuple of int, or tuple of tuple of int
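
The two formats side by side (expected values inferred from the format descriptions above, not verified against the implementation):

    from torchimage.padding.utils import pad_width_format

    # numpy format: one (before, after) pair per axis, axis 0 first
    np_pad = ((1, 2), (3, 4))

    # torch format: flat tuple starting from the last axis
    torch_pad = pad_width_format(np_pad, source="numpy", target="torch")
    # expected: (3, 4, 1, 2)

    # converting back requires ndim, so unpadded leading axes (if any)
    # can be restored as (0, 0) pairs
    np_pad_again = pad_width_format(torch_pad, source="torch", target="numpy", ndim=2)
    # expected: ((1, 2), (3, 4))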

torchimage.padding.utils.same_padding_width(kernel_size, stride=1, in_size=None)

Calculate the padding width before and after a certain axis using “same padding” method.

When stride is 1, input size at that axis doesn’t matter and the output tensor will have the same shape as the input tensor, hence the name “same padding”.

When stride is greater than 1, same padding can be intuitively described as “letting the kernel cover every element of the original tensor, while making the padding widths before and after the axis roughly the same” (unlike valid padding, which doesn’t pad at all, so trailing elements are ignored when the input length doesn’t match the kernel size and stride).

This convention is taken from a TensorFlow documentation page which no longer exists.

Parameters
  • kernel_size (int) – The convolution kernel size at that axis

  • stride (int) – The convolution stride at that axis. Default: 1.

  • in_size (int) –

    The side length of the input tensor at that axis.

    Can be None if stride is 1.

Returns

pad_before, pad_after – The number of padded elements required by same padding before and after the axis.

Return type

int
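
A worked example under the stride-1 convention above (an assumption: the total padding kernel_size - 1 is split as evenly as possible):

    from torchimage.padding.utils import same_padding_width

    # stride 1: in_size is irrelevant; a kernel of size 3 needs 2 extra
    # elements in total to preserve the input length
    pad_before, pad_after = same_padding_width(kernel_size=3)
    # expected: pad_before == pad_after == 1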

Module contents

General padding functionalities.

Padding extends the border of the original signal so the output has a certain shape.

We design this padding package with these principles:

  1. Maximal compatibility with PyTorch.

    We implicitly use F.pad as much as we can (for example, we do not implement zero and constant padding ourselves). We also adjust the names of the arguments to be compatible with PyTorch.

  2. Versatility

    We try to incorporate as many functionalities from other packages as possible. Specifically, we try to reproduce the behavior of numpy.pad, MATLAB’s dwtmode, and PyWavelets signal extension modes.

Comparing with torch.nn.functional.pad, we make the following modifications:

  1. Symmetric padding is added

  2. Higher-dimension non-constant padding

    As of this release, PyTorch has not implemented reflect (ndim >= 3+2), replicate (ndim >= 4+2), or circular (ndim >= 4+2) padding for high-dimensional tensors (the +2 refers to the initial batch and channel dimensions).

    We achieve n-dimensional padding by sequentially applying 1-dimensional padding from the first axis (dim 0) to the last (dim n-1). In some cases (such as constant padding with different values before and after an axis) where different orders of applying 1d padding can change the final result, we follow numpy’s convention, going from the first to the last dimension (see the sketch after this list).

  3. Wider padding size

    Padding modes reflect and circular cause PyTorch to fail when the padding size is greater than the tensor’s shape at a certain dimension, i.e. when the padding wraps around the signal more than once; torchimage handles these wider padding sizes.
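
Below is a conceptual sketch of the sequential 1d strategy from point 2, restricted to constant padding so that F.pad alone suffices (torchimage’s actual implementation generalizes this to all modes):

    import torch
    import torch.nn.functional as F

    def pad_nd_sequential(x, pad_widths, value=0.0):
        # pad_widths: [(before_0, after_0), ..., (before_{n-1}, after_{n-1})],
        # applied one axis at a time, from the first axis to the last.
        for dim, (before, after) in enumerate(pad_widths):
            pad = [0, 0] * x.ndim         # torch-style flat pad list
            pos = 2 * (x.ndim - 1 - dim)  # slot for this axis
            pad[pos], pad[pos + 1] = before, after
            x = F.pad(x, pad, mode="constant", value=value)
        return x

    y = pad_nd_sequential(torch.zeros(2, 3), [(1, 1), (2, 2)])
    # y.shape == (4, 7)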

Comparing with numpy.pad, we make the following modifications:

  1. Bug fixes for wider padding

    For modes such as symmetric and circular, numpy padding doesn’t repeat the signal properly. For example, notice how numpy fails to repeat or flip the signal [0, 1] on the right:

    >>> import numpy as np
    >>> np.pad([0, 1], (1, 10), mode="wrap")
    array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1])
    >>> np.pad([0, 1], (1, 10), mode="symmetric")
    array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1])

    The same bug has been observed for antisymmetric as well. After reading the source code, I believe numpy’s error comes from the way it extends the padding separately in each direction (before and after), which breaks the cycle of the original signal when before and after are relatively prime.

Padding Modes

"empty" - pads with undefined values

? ? ? ? | a b c d | ? ? ? ? where the empty values are from torch.empty

"constant" - pads with a constant value

p p p p | a b c d | p p p p where p is supplied by constant_values argument.

p1 p1 p1 p1 | a b c d | p2 p2 p2 p2 where p1 and p2 are different constant values.

"zeros" - pads with 0; a special case of constant padding

0 0 0 0 | a b c d | 0 0 0 0

"symmetric" - extends signal by mirroring samples. Also known as half-sample symmetric

d c b a | a b c d | d c b a

"antisymmetric" - extends signal by mirroring and negating samples. Also known as half-sample antisymmetric.

-d -c -b -a | a b c d | -d -c -b -a

"reflect" - signal is extended by reflecting samples. This mode is also known as whole-sample symmetric

d c b | a b c d | c b a

"replicate" - replicates the border pixels

a a a a | a b c d | d d d d

"circular" - signal is treated as a periodic one

a b c d | a b c d | a b c d

"periodize" - same as circular, except the last element is replicated when signal length is odd.

a b c -> a b c c | a b c c | a b c c

Note that the signal is first extended to an even length before periodic boundary conditions are applied.

"odd_reflect" - extend the signal by a point-reflection across the edge element

2a-d 2a-c 2a-b | a b c d | 2d-c 2d-b 2d-a | 2(2d-a)-(2d-b) 2(2d-a)-(2d-c) 2(2d-a)-d

Also known as whole-sample antisymmetric.

"odd_symmetric" - extend the signal by a point-reflection across a hypothetical midpoint between the edge

and the symmetrically reflected edge. 2a-d 2a-c 2a-b a | a b c d | d 2d-c 2d-b 2d-a | 2d-a 2(2d-a)-(2d-b) 2(2d-a)-(2d-c) 2(2d-a)-d

"smooth" - extend the signal according to the first derivatives calculated on the edges

(straight line extrapolation) a-4(b-a) a-3(b-a) a-2(b-a) a-(b-a) | a b c d | d+(d-c) d+2(d-c) d+3(d-c) d+4(d-c)

If there’s only 1 element, smooth should behave the same as replicate because we can only assume a first derivative of 0.

"linear_ramp" - pads with the linear ramp between end value and the array border value.

| a b c d | d+s d+2s ... d+n*s e where e is the end value, n is the number of elements between d and e, and s = (e-d)/(n+1)

The end values are specified by argument end_values
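
Since torchimage aims to reproduce numpy.pad’s behavior, numpy’s own linear_ramp example illustrates the ramp numerically:

>>> import numpy as np
>>> np.pad([1, 2, 3, 4, 5], (2, 3), mode="linear_ramp", end_values=(5, -4))
array([ 5,  3,  1,  2,  3,  4,  5,  2, -1, -4])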

"maximum", "mean", "median", "minimum" - pads with a statistical value

of all or part of the vector along each axis.

The length of the vector used for computing the statistical value is specified by argument stat_length.

Note that PyTorch’s median behaves differently from numpy’s median. When there is an even number of elements, torch.median returns the left element at the center, whereas numpy.median returns the arithmetic mean between the two elements at the center. We retain PyTorch’s behavior here.
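
To illustrate the difference:

>>> import numpy as np, torch
>>> np.median([1.0, 2.0, 3.0, 4.0])
2.5
>>> torch.median(torch.tensor([1.0, 2.0, 3.0, 4.0]))
tensor(2.)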

<function> - extend the signal with a customized function

The function should have signature pad_func(x, idx, dim), like other functions in pad_1d.py. You may find reading the source code of pad_1d.py and using the modify_idx function from torchimage.padding.utils useful.

To pass customized keyword arguments (like end_values, constant_values, and stat_length) to a self-defined padding function, especially when they depend on dim, the user can define a separate function f: dim -> kwargs and call f inside pad_func to get the keyword arguments from dimension. I use the same method for constant (when there are multiple padding values), linear ramp and stat padding.

This is not the most elegant design but it works when it needs to, so please let me know if any improvement is urgently needed and I can fix it.
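
For instance, a hypothetical custom padding function that fills both padded regions with a fixed value could look like this (the name fill_42_1d and the fill logic are illustrative only):

    import torch

    def fill_42_1d(x, idx, dim):
        # idx[dim] = slice(beg, end, 1) marks the ground truth region
        beg, end = idx[dim].start, idx[dim].stop
        before_idx = list(idx)
        before_idx[dim] = slice(0, beg)            # region before ground truth
        x[tuple(before_idx)] = 42.0
        after_idx = list(idx)
        after_idx[dim] = slice(end, x.shape[dim])  # region after ground truth
        x[tuple(after_idx)] = 42.0
        return x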

Padding mode name correspondence across libraries:

torchimage     PyWavelets     MATLAB       numpy.pad                      Scipy
zeros          zero           zpd          constant, cval=0               N/A
constant       N/A            N/A          constant                       constant
replicate      constant       sp0          edge                           nearest
smooth         smooth         spd, sp1     N/A                            N/A
circular       periodic       ppd          wrap                           wrap
periodize      periodization  per          N/A                            N/A
symmetric      symmetric      sym, symh    symmetric                      reflect
reflect        reflect        symw         reflect                        mirror
antisymmetric  antisymmetric  asym, asymh  N/A                            N/A
odd_reflect    antireflect    asymw        reflect, reflect_type='odd'    N/A
odd_symmetric  N/A            N/A          symmetric, reflect_type='odd'  N/A
linear_ramp    N/A            N/A          linear_ramp                    N/A
maximum        N/A            N/A          maximum                        N/A
mean           N/A            N/A          mean                           N/A
median         N/A            N/A          median                         N/A
minimum        N/A            N/A          minimum                        N/A
empty          N/A            N/A          empty                          N/A
<function>     N/A            N/A          <function>                     N/A

class torchimage.padding.Padder(pad_width=0, mode='constant', constant_values=0, end_values=0.0, stat_length=None)

Bases: object

forward(x: torch.Tensor, axes=slice(2, None, None))

Pads a tensor sequentially at all specified dimensions

Parameters
  • x (torch.Tensor) – The input n-dimensional tensor to be padded.

  • axes (sequence of int, slice, None) –

    The sequence of dimensions to be padded with the exact ordering.

    If axes is not provided (None), the padder will automatically right-justify the NdSpecs such that the “rightmost” axis corresponds to the “rightmost” item in an NdSpec and other entries are aligned accordingly.

    If the padder has larger ndim than x (or axes), the leftmost dimensions in the padder are ignored. If x (or axes) has larger ndim than the padder, the leftmost dimensions of x are left unchanged.

    If the input format is PyTorch batch-like (first 2 dimensions are batch and channel dimensions), we recommend using axes=slice(2, None).

Returns

y – Padded tensor.

Return type

torch.Tensor
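
A minimal usage sketch (assuming a scalar pad_width broadcasts to (2, 2) on every padded axis):

    import torch
    from torchimage.padding import Padder

    x = torch.rand(1, 3, 32, 32)                # (N, C, H, W)
    padder = Padder(pad_width=2, mode="reflect")
    y = padder.forward(x, axes=slice(2, None))  # pad H and W only
    # expected y.shape: (1, 3, 36, 36)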

pad_axis(x: torch.Tensor, i: int, axis: int)

Pad a specific axis according to predefined padder parameters

This method will create a new tensor that is larger at the axis of interest (unless padding width is 0, in which case the original tensor will be returned).

It is only useful together with separable filtering and in very specific circumstances. The padder pads each axis right before the separable filtering (unfold -> matrix-vector multiplication) at that axis. This algorithm creates many more intermediate tensors, which not only takes time to copy but also consumes memory if the input tensor requires gradient.

Therefore, it is only useful when the number of dimensions is extremely high and the padding width is much larger than the size of the input tensor.

Parameters
  • x (torch.Tensor) – Input tensor to be padded.

  • i (int) – The index of self (padder) to be used

  • axis (int) – The axis in x to be padded. Can be a positive or negative index.

Returns

y – Padded tensor.

Return type

torch.Tensor

to_same(kernel_size, stride=1, in_shape=None)

Modify this padder in-place, such that the padding width follows the convention of same padding.

Parameters
  • kernel_size (int or sequence of int) – Kernel size of convolution operation

  • stride (int or sequence of int) – Stride of convolution at each axis

  • in_shape (int or sequence of int) – Size of input tensor at each axis of interest

Returns

new_padder – self with pad_width re-initialized

Return type

Padder
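
For example (a sketch; the broadcasting of scalar arguments is assumed):

    from torchimage.padding import Padder

    padder = Padder(mode="replicate").to_same(kernel_size=5, stride=1)
    # with stride 1 and kernel size 5, pad_width should become (2, 2)
    # at each padded axis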