torchimage.padding package¶
Submodules¶
torchimage.padding.pad_1d module¶
Pad a torch tensor along a certain dimension.
All 1-d padding utility functions share the same set of arguments.
Important note: these are in-place functions.
To avoid re-copying the input tensor, the 1-d padding utility functions only accept an empty tensor that has the same shape as the final padded output and holds the values of the input tensor at its center.
Each function returns its input x after modifying it.
Efficiency warning: PyTorch currently doesn't support returning negative-strided views like [::-1], and torch.flip() is said to be computationally expensive because it copies the data, which may put symmetric and reflect at a disadvantage.
These functions are for lower-level utility only. Do NOT expect them to behave the same way as F.pad does.
- param x
The input tensor to be padded. See the important note above: let u be the original tensor; then x is an empty tensor holding u's values at its center, such that x[idx] == u.
- type x
torch.Tensor
- param idx
Indices for the ground-truth tensor located at the center of the empty-padded tensor x. Has the same length as the number of dimensions, len(idx) == x.ndim. Each element is a slice(beg, end, 1) such that at dimension dim, beg is the amount of padding at the beginning and x.shape[dim] - end is the amount of padding at the end. Note that this has to be a tuple to properly index a high-dimensional tensor. This tuple of index slices prevents computing padding from empty values at this dimension.
- type idx
tuple of slice
- param dim
The dimension to pad.
- type dim
int
The following keyword arguments are used by some padding functions only.
- param negate
Whether to flip signs (+, -) when flipping the signal. Default: False. This parameter only applies to symmetric mode; when enabled, it turns into half-sample antisymmetric mode: -d -c -b -a | a b c d | -d -c -b -a
- type negate
bool
- param before
For linear_ramp_1d, the new edge value before the padded tensor (default: 0). For constant_1d, the constant used for padding before the ground truth. For stat_1d, the length at the border used to compute the statistic.
- type before
float
- param after
Same as before, but applied to the padding after the ground truth.
- type after
float
- returns
x – The same tensor, after the padded values at dim are filled in.
- rtype
torch.Tensor
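As an illustration of this calling convention, here is a minimal sketch; replicate_1d and make_idx are used with their documented signatures, but the exact call sites are an assumption:

import torch
from torchimage.padding.pad_1d import replicate_1d
from torchimage.padding.utils import make_idx

u = torch.arange(4.0)       # original signal: a b c d
before, after = 2, 3        # padding widths at dim 0

# allocate the final padded shape and copy u into the center
x = torch.empty(before + u.shape[0] + after)
idx = make_idx(before, before + u.shape[0], 1, dim=0, ndim=1)
x[idx] = u

x = replicate_1d(x, idx, dim=0)   # fills the borders in place and returns x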
- torchimage.padding.pad_1d.circular_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.constant_1d(x, idx, dim, before, after)¶
- torchimage.padding.pad_1d.linear_ramp_1d(x, idx, dim, before, after)¶
- torchimage.padding.pad_1d.odd_reflect_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.odd_symmetric_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.periodize_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.reflect_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.replicate_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.smooth_1d(x, idx, dim)¶
- torchimage.padding.pad_1d.stat_1d(x, idx, dim, before, after, mode)¶
- torchimage.padding.pad_1d.symmetric_1d(x, idx, dim, negate=False)¶
- torchimage.padding.pad_1d.zeros_1d(x, idx, dim)¶
torchimage.padding.tensor_pad module¶
torchimage.padding.utils module¶
Private utility functions for padding
- torchimage.padding.utils.make_idx(*args, dim, ndim)¶
Make an index that slices exactly along a specified dimension, e.g. [:, ..., :, slice(*args), :, ..., :].
This helper function is similar to numpy's _slice_at_axis.
- Parameters
*args (int or None) – constructor arguments for the slice object at target axis
dim (int) – target axis; can be negative or positive
ndim (int) – total number of axes
- Returns
idx – Can be used to index np.ndarray and torch.Tensor
- Return type
tuple of slice
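A hypothetical session, assuming the behavior described above:

>>> from torchimage.padding.utils import make_idx
>>> make_idx(1, 3, dim=1, ndim=3)
(slice(None, None, None), slice(1, 3, None), slice(None, None, None))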
- torchimage.padding.utils.modify_idx(*args, idx, dim)¶
Make an index that slices a specified dimension while keeping the slices for other dimensions the same.
- Parameters
*args (tuple of int or None) – constructor arguments for the slice object at target axis
idx (tuple of slice) – tuple of slices in the original region of interest
dim (int) – target axis
- Returns
new_idx – New tuple of slices with dimension dim substituted by slice(*args)
Can be used to index np.ndarray and torch.Tensor
- Return type
tuple of slice
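A hypothetical session, assuming the behavior described above:

>>> from torchimage.padding.utils import modify_idx
>>> idx = (slice(1, 3), slice(2, 5), slice(0, 4))
>>> modify_idx(0, 2, idx=idx, dim=1)
(slice(1, 3, None), slice(0, 2, None), slice(0, 4, None))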
- torchimage.padding.utils.pad_width_format(padding, source='numpy', target='torch', ndim=None)¶
Convert between two padding width formats.
This function converts padding from the source format to the target format.
Padding width refers to the number of padded elements before and after the original tensor at a certain axis. Numpy and PyTorch specify padding widths differently, because numpy works with n-dimensional arrays while PyTorch more frequently works with (N, C, [D, ]H, W) data tensors; in the latter case, starting from the last dimension is more intuitive.
The numpy padding width format is ((before_0, after_0), (before_1, after_1), ..., (before_{n-1}, after_{n-1})).
The PyTorch padding width format is (before_{n-1}, after_{n-1}, before_{n-2}, after_{n-2}, ..., before_{dim}, after_{dim}), such that dimensions before dim are not padded.
- Parameters
padding (tuple of int, or tuple of tuple of int) – the input padding width format to convert
source (str) – Format specification for padding width. Either “numpy” or “torch”.
target (str) – Format specification for padding width. Either “numpy” or “torch”.
ndim (int) –
Number of dimensions in the tensor of interest.
Only used when converting from torch to numpy format.
- Returns
padding – the new padding width specification
- Return type
tuple of int, or tuple of tuple of int
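A hypothetical session based on the format definitions above (whether unpadded leading dimensions are trimmed from the torch output is an assumption):

>>> from torchimage.padding.utils import pad_width_format
>>> pad_width_format(((0, 0), (1, 2), (3, 4)), source="numpy", target="torch")
(3, 4, 1, 2)
>>> pad_width_format((3, 4, 1, 2), source="torch", target="numpy", ndim=3)
((0, 0), (1, 2), (3, 4))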
- torchimage.padding.utils.same_padding_width(kernel_size, stride=1, in_size=None)¶
Calculate the padding width before and after a certain axis using “same padding” method.
When stride is 1, input size at that axis doesn’t matter and the output tensor will have the same shape as the input tensor, hence the name “same padding”.
When stride is greater than 1, same padding can be intuitively described as letting the kernel cover every element of the original tensor while keeping the padding widths before and after the axis roughly equal. (Valid padding, by contrast, doesn't pad at all, so the last pixels are ignored when the input tensor's side length doesn't match the kernel size and stride.)
This convention is taken from a TensorFlow documentation page which no longer exists.
- Parameters
kernel_size (int) – The convolution kernel size at that axis
stride (int) – The convolution stride at that axis. Default: 1.
in_size (int) –
The side length of the input tensor at that axis.
Can be None if stride is 1.
- Returns
pad_before, pad_after – The number of padded elements required by same padding before and after the axis.
- Return type
int
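The arithmetic behind this convention can be sketched as follows; this illustrates the rule described above and is not necessarily the exact implementation:

import math

def same_padding_width_sketch(kernel_size, stride=1, in_size=None):
    if stride == 1:
        # output size equals input size regardless of in_size
        total = kernel_size - 1
    else:
        # smallest total padding that lets the kernel cover every element
        out_size = math.ceil(in_size / stride)
        total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    # split the total "roughly the same" on both sides; placing the extra
    # element (if any) after the axis is an assumption borrowed from TensorFlow
    pad_before = total // 2
    pad_after = total - pad_before
    return pad_before, pad_after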
Module contents¶
General padding functionalities.
Padding extends the border of the original signal so the output has a certain shape.
We design this padding package with these principles:
- Maximal compatibility with PyTorch.
We implicitly use F.pad as much as we can (for example, we do not implement zero and constant padding ourselves). We also adjust the names of the arguments to be compatible with PyTorch.
- Versatility
We try to incorporate as many functionalities available in other packages as possible. Specifically, we try to reproduce the behavior of numpy.pad, MATLAB dwtmode, and PyWavelets signal extension modes.
Compared with torch.nn.functional.pad, we make the following modifications:
- Symmetric padding is added
- Higher-dimension non-constant padding
As of this release, PyTorch has not implemented reflect (ndim >= 3+2), replicate (ndim >= 4+2), or circular (ndim >= 4+2) padding for high-dimensional tensors (the +2 refers to the initial batch and channel dimensions).
We achieve n-dimensional padding by sequentially applying 1-dimensional padding from the first axis (dim 0) to the last (dim n-1). In cases where a different order of applying 1d padding would change the final result (such as constant padding with different values before and after an axis), we follow numpy's convention of going from the first to the last dimension; see the sketch after this list.
- Wider padding size
Padding modes reflect and circular cause PyTorch to fail when the padding size is greater than the tensor's shape at a certain dimension, i.e. when the padding wraps around the original signal more than once.
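The sequential application can be illustrated with a short numpy sketch (numpy is used here only for demonstration; torchimage applies its own 1-d routines):

import numpy as np

def pad_sequentially(arr, pad_width, mode="reflect"):
    """Apply 1-d padding one axis at a time, from dim 0 to dim n-1."""
    out = arr
    for axis, (before, after) in enumerate(pad_width):
        per_axis = [(0, 0)] * out.ndim
        per_axis[axis] = (before, after)
        out = np.pad(out, per_axis, mode=mode)
    return out

# pads H first, then W, of a (H, W) image
y = pad_sequentially(np.arange(12).reshape(3, 4), [(1, 1), (2, 2)])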
Compared with numpy.pad, we make the following modifications:
- Bug fixes for wider padding
For modes such as symmetric and circular, numpy padding doesn't repeat the signal properly. For example, notice how numpy fails to repeat or flip the signal [0, 1] on the right:
>>> import numpy as np
>>> np.pad([0, 1], (1, 10), mode="wrap")
array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1])
>>> np.pad([0, 1], (1, 10), mode="symmetric")
array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1])
The same bug has been observed for antisymmetric as well. After reading the source code, I believe numpy's error comes from extending the padding separately in each direction (before and after), which breaks the cycle of the original signal when before and after are relatively prime.
Padding Modes¶
"empty"
- pads with undefined values? ? ? ? | a b c d | ? ? ? ?
where the empty values are fromtorch.empty
"constant"
- pads with a constant valuep p p p | a b c d | p p p p
wherep
is supplied byconstant_values
argument.p1 p1 p1 p1 | a b c d | p2 p2 p2 p2
wherep1
andp2
are different constant values."zeros"
- pads with 0; a special case of constant padding0 0 0 0 | a b c d | 0 0 0 0
"symmetric"
- extends signal by mirroring samples. Also known as half-sample symmetricd c b a | a b c d | d c b a
"antisymmetric"
- extends signal by mirroring and negating samples. Also known as half-sample antisymmetric.-d -c -b -a | a b c d | -d -c -b -a
"reflect"
- signal is extended by reflecting samples. This mode is also known as whole-sample symmetricd c b | a b c d | c b a
"replicate"
- replicates the border pixelsa a a a | a b c d | d d d d
"circular"
- signal is treated as a periodic onea b c d | a b c d | a b c d
"periodize"
- same as circular, except the last element is replicated when signal length is odd.a b c -> a b c c | a b c c | a b c c
Note that it first extends the signal to an even length prior to using periodic boundary conditions"odd_reflect"
- extend the signal by a point-reflection across the edge element2a-d 2a-c 2a-b | a b c d | 2d-c 2d-b 2d-a | 2(2d-a)-(2d-b) 2(2d-a)-(2d-c) 2(2d-a)-d
Also known has whole-sample antisymmetric."odd_symmetric"
- extend the signal by a point-reflection across a hypothetical midpoint between the edgeand the symmetrically reflected edge.
2a-d 2a-c 2a-b a | a b c d | d 2d-c 2d-b 2d-a | 2d-a 2(2d-a)-(2d-b) 2(2d-a)-(2d-c) 2(2d-a)-d
"smooth"
- extend the signal according to the first derivatives calculated on the edges(straight line extrapolation)
a-4(b-a) a-3(b-a) a-2(b-a) a-(b-a) | a b c d | d+(d-c) d+2(d-c) d+3(d-c) d+4(d-c)
If there’s only 1 element, smooth should behave the same as replicate because we can only assume a first derivative of 0.
"linear_ramp"
- pads with the linear ramp between end value and the array border value.| a b c d | d+s d+2s ... d+n*s e
wheree
is the end value,n
is the number of elements betweend
ande
, ands = (e-d)/(n+1)
The end values are specified by argument
end_values
"maximum"
,"mean"
,"median"
,"minimum"
- pads with a statistical valueof all or part of the vector along each axis.
The length of the vector used for computing the statistical value is specified by argument
stat_length
.Note that PyTorch’s median behaves differently from numpy’s median. When there is an even number of elements,
torch.median
returns the left element at the center, whereasnumpy.median
returns the arithmetic mean between the two elements at the center. We retain PyTorch’s behavior here.<function>
- extend the signal with a customized functionThe function should have signature
pad_func(x, idx, dim)
, like other functions in pad_1d.py. You may find reading the source code from pad_1d.py and using the_modify_index
function useful.To pass customized keyword arguments (like end_values, constant_values, and stat_length) to a self-defined padding function, especially when they depend on
dim
, the user can define a separate functionf: dim -> kwargs
and callf
insidepad_func
to get the keyword arguments from dimension. I use the same method for constant (when there are multiple padding values), linear ramp and stat padding.This is not the most elegant design but it works when it needs to, so please let me know if any improvement is urgently needed and I can fix it.
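As a sketch of this protocol (the semantics of idx follow the pad_1d documentation above; the NaN fill rule is an arbitrary example, not a built-in mode):

import torch
from torchimage.padding.utils import modify_idx

def nan_pad_1d(x, idx, dim):
    # custom pad_func: fill the border regions at `dim` with NaN
    beg, end = idx[dim].start, idx[dim].stop
    x[modify_idx(0, beg, idx=idx, dim=dim)] = float("nan")             # before
    x[modify_idx(end, x.shape[dim], idx=idx, dim=dim)] = float("nan")  # after
    return x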
torchimage | PyWavelets | Matlab | numpy.pad | Scipy
---|---|---|---|---
zeros | zero | zpd | constant, cval=0 | N/A
constant | N/A | N/A | constant | constant
replicate | constant | sp0 | edge | nearest
smooth | smooth | spd, sp1 | N/A | N/A
circular | periodic | ppd | wrap | wrap
periodize | periodization | per | N/A | N/A
symmetric | symmetric | sym, symh | symmetric | reflect
reflect | reflect | symw | reflect | mirror
antisymmetric | antisymmetric | asym, asymh | N/A | N/A
odd_reflect | antireflect | asymw | reflect, reflect_type='odd' | N/A
odd_symmetric | N/A | N/A | symmetric, reflect_type='odd' | N/A
linear_ramp | N/A | N/A | linear_ramp | N/A
maximum | N/A | N/A | maximum | N/A
mean | N/A | N/A | mean | N/A
median | N/A | N/A | median | N/A
minimum | N/A | N/A | minimum | N/A
empty | N/A | N/A | empty | N/A
<function> | N/A | N/A | <function> | N/A
- class torchimage.padding.Padder(pad_width=0, mode='constant', constant_values=0, end_values=0.0, stat_length=None)¶
Bases:
object
- forward(x: torch.Tensor, axes=slice(2, None, None))¶
Pads a tensor sequentially at all specified dimensions
- Parameters
x (torch.Tensor) – The input n-dimensional tensor to be padded.
axes (sequence of int, slice, None) –
The sequence of dimensions to be padded with the exact ordering.
If axes is not provided (None), the padder will automatically right-justify the NdSpecs such that the “rightmost” axis corresponds to the “rightmost” item in an NdSpec and other entries are aligned accordingly.
If the padder has larger ndim than x (or axes), the leftmost dimensions in the padder are ignored. If x (or axes) has larger ndim than the padder, the leftmost dimensions of x are left unchanged.
If the input format is PyTorch batch-like (the first 2 dimensions are batch and channel dimensions), we recommend using axes=slice(2, None).
- Returns
y – Padded tensor.
- Return type
torch.Tensor
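A hypothetical usage example based on the documented constructor and forward signatures (the broadcasting of pad_width=(2, 2) to (before, after) on every padded axis is an assumption):

import torch
from torchimage.padding import Padder

x = torch.rand(1, 3, 32, 32)                  # (N, C, H, W)
padder = Padder(pad_width=(2, 2), mode="reflect")
y = padder.forward(x, axes=slice(2, None))    # pad only H and W
# expected shape: (1, 3, 36, 36)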
- pad_axis(x: torch.Tensor, i: int, axis: int)¶
Pad a specific axis according to predefined padder parameters
This method will create a new tensor that is larger at the axis of interest (unless padding width is 0, in which case the original tensor will be returned).
It is only useful together with separable filtering and in very specific circumstances. The padder pads each axis right before the separable filtering (unfold -> matrix-vector multiplication) at that axis. This algorithm creates many more intermediate tensors, which not only takes time for copying but also consumes memory if the input tensor requires gradient.
Therefore, it is only useful when the dimension is extremely high and padding width is much larger than the size of the input tensor.
- Parameters
x (torch.Tensor) – Input tensor to be padded.
i (int) – The index of self (padder) to be used
axis (int) – The axis in x to be padded. Can be a positive or negative index.
- Returns
y – Padded tensor.
- Return type
torch.Tensor
- to_same(kernel_size, stride=1, in_shape=None)¶
Modify this padder in-place, such that the padding width follows the convention of same padding.
- Parameters
kernel_size (int or sequence of int) – Kernel size of convolution operation
stride (int or sequence of int) – Stride of convolution at each axis
in_shape (int or sequence of int) – Size of input tensor at each axis of interest
- Returns
new_padder – self with pad_width re-initialized
- Return type
Padder
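For instance (a sketch; the broadcasting of a scalar kernel_size across axes is an assumption):

from torchimage.padding import Padder

# with stride 1 and kernel_size 3, same padding implies pad_width (1, 1)
padder = Padder(mode="replicate").to_same(kernel_size=3)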