The difference between the SAME and VALID padding modes in TensorFlow

Date: 2020-12-25 22:48:18

TensorFlow has two padding modes. SAME is the unusual one: it can produce asymmetric padding, a pitfall I ran into while validating a TensorFlow network.

TensorFlow formula

2-D convolution interface

tf.nn.conv2d(
    input, filters, strides, padding, data_format='NHWC', dilations=None, name=None
)

Padding formula

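Pay attention to how padding is configured. As a string it takes one of two values, SAME or VALID. As a list of numbers it states explicitly how much padding to add on each dimension.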

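First, when padding gives exact numbers, it holds twice as many values as input has dimensions, because every dimension has two sides to pad (left and right for width, top and bottom for height). TensorFlow does not pad the N and C dimensions, only H and W, so for NHWC padding = [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]]. For NCHW the pairs are reordered to match: padding = [[0, 0], [0, 0], [pad_top, pad_bottom], [pad_left, pad_right]].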
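As a minimal sketch of the layout just described, the helper below arranges four pad amounts into the nested list for either data format. The name explicit_padding is hypothetical and is not part of TensorFlow; it only builds the list that would be passed as the padding argument.

```python
def explicit_padding(pad_top, pad_bottom, pad_left, pad_right, data_format="NHWC"):
    """Hypothetical helper: order the spatial pads to match the data format."""
    height = [pad_top, pad_bottom]
    width = [pad_left, pad_right]
    if data_format == "NHWC":
        # N, H, W, C: batch and channel are never padded
        return [[0, 0], height, width, [0, 0]]
    if data_format == "NCHW":
        # N, C, H, W: the spatial pairs move to the last two positions
        return [[0, 0], [0, 0], height, width]
    raise ValueError(f"unsupported data_format: {data_format}")


print(explicit_padding(1, 2, 1, 1, "NHWC"))  # [[0, 0], [1, 2], [1, 1], [0, 0]]
print(explicit_padding(1, 2, 1, 1, "NCHW"))  # [[0, 0], [0, 0], [1, 2], [1, 1]]
```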

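Second, when a string option is given, the padding it implies can always be expressed through the explicit form of the padding parameter.

See the following code: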

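import numpy as np
import tensorflow as tf  # TF 2.x, eager execution

inputH, inputW = 7, 8
strideH, strideW = 3, 3
filterH = 4
filterW = 4
inputC = 16
outputC = 3
inputData = np.ones([1, inputH, inputW, inputC])  # format [N, H, W, C]
# filter format [H, W, inC, outC]; 0.67 is rounded to float16 first
filterData = np.float16(np.ones([filterH, filterW, inputC, outputC]) - 0.33)
strides = (1, strideH, strideW, 1)
# let TensorFlow choose the padding
convOutputSame = tf.nn.conv2d(inputData, filterData, strides, padding='SAME')
# the same padding written out explicitly: top=1, bottom=2, left=1, right=1
convOutput = tf.nn.conv2d(inputData, filterData, strides, padding=[[0, 0], [1, 2], [1, 1], [0, 0]])
print("output1, ", convOutputSame)
print("output2, ", convOutput)
# a sum of squared differences of 0.0 means the two results are identical
print("Sum of a - b is ", np.sum(np.square(convOutputSame - convOutput)))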

The result is:

    output1,  tf.Tensor(
    [[[[ 96.46875  96.46875  96.46875]
       [128.625   128.625   128.625  ]
       [ 96.46875  96.46875  96.46875]]

      [[128.625   128.625   128.625  ]
       [171.5     171.5     171.5    ]
       [128.625   128.625   128.625  ]]

      [[ 64.3125   64.3125   64.3125 ]
       [ 85.75     85.75     85.75   ]
       [ 64.3125   64.3125   64.3125 ]]]], shape=(1, 3, 3, 3), dtype=float64)

    output2,  tf.Tensor(
    [[[[ 96.46875  96.46875  96.46875]
       [128.625   128.625   128.625  ]
       [ 96.46875  96.46875  96.46875]]

      [[128.625   128.625   128.625  ]
       [171.5     171.5     171.5    ]
       [128.625   128.625   128.625  ]]

      [[ 64.3125   64.3125   64.3125 ]
       [ 85.75     85.75     85.75   ]
       [ 64.3125   64.3125   64.3125 ]]]], shape=(1, 3, 3, 3), dtype=float64)
    Sum of a - b is  0.0
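
The explicit pads [1, 2] and [1, 1] in the code above are not arbitrary. The sketch below follows the SAME rule as TensorFlow documents it: the output size is ceil(input / stride), the total padding is whatever is needed to fit that many windows, and when the total is odd the extra element goes on the bottom or right side. The function name same_padding is an assumption made for illustration, and dilation is left out to keep it short.

```python
import math


def same_padding(input_size, filter_size, stride):
    """Return (output_size, pad_before, pad_after) along one spatial axis."""
    output_size = math.ceil(input_size / stride)
    # total padding needed so that output_size windows fit
    total = max((output_size - 1) * stride + filter_size - input_size, 0)
    pad_before = total // 2          # top / left gets the smaller half
    pad_after = total - pad_before   # bottom / right gets the extra element
    return output_size, pad_before, pad_after


print(same_padding(7, 4, 3))  # height: (3, 1, 2)
print(same_padding(8, 4, 3))  # width:  (3, 1, 1)
```

The height axis needs three padding rows in total, which cannot be split evenly, and this is where the asymmetric [1, 2] comes from.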

ONNX formula

The ONNX interface, taken from the operator schema definition in the IR, is as follows:

std::function<void(OpSchema&)> ConvOpSchemaGenerator(const char* filter_desc) {
  return [=](OpSchema& schema) {
    std::string doc = R"DOC(
The convolution operator consumes an input tensor and {filter_desc}, and
computes the output.)DOC";
    ReplaceAll(doc, "{filter_desc}", filter_desc);
    schema.SetDoc(doc);
    schema.Input(
        0,
        "X",
        "Input data tensor from previous layer; "
        "has size (N x C x H x W), where N is the batch size, "
        "C is the number of channels, and H and W are the "
        "height and width. Note that this is for the 2D image. "
        "Otherwise the size is (N x C x D1 x D2 ... x Dn). "
        "Optionally, if dimension denotation is "
        "in effect, the operation expects input data tensor "
        "to arrive with the dimension denotation of [DATA_BATCH, "
        "DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].",
        "T");
    schema.Input(
        1,
        "W",
        "The weight tensor that will be used in the "
        "convolutions; has size (M x C/group x kH x kW), where C "
        "is the number of channels, and kH and kW are the "
        "height and width of the kernel, and M is the number "
        "of feature maps. For more than 2 dimensions, the "
        "kernel shape will be (M x C/group x k1 x k2 x ... x kn), "
        "where (k1 x k2 x ... kn) is the dimension of the kernel. "
        "Optionally, if dimension denotation is in effect, "
        "the operation expects the weight tensor to arrive "
        "with the dimension denotation of [FILTER_OUT_CHANNEL, "
        "FILTER_IN_CHANNEL, FILTER_SPATIAL, FILTER_SPATIAL ...]. "
        "X.shape[1] == (W.shape[1] * group) == C "
        "(assuming zero based indices for the shape array). "
        "Or in other words FILTER_IN_CHANNEL should be equal to DATA_CHANNEL. ",
        "T");
    schema.Input(
        2,
        "B",
        "Optional 1D bias to be added to the convolution, has size of M.",
        "T",
        OpSchema::Optional);
    schema.Output(
        0,
        "Y",
        "Output data tensor that contains the result of the "
        "convolution. The output dimensions are functions "
        "of the kernel size, stride size, and pad lengths.",
        "T");
    schema.TypeConstraint(
        "T",
        {"tensor(float16)", "tensor(float)", "tensor(double)"},
        "Constrain input and output types to float tensors.");
    schema.Attr(
        "kernel_shape",
        "The shape of the convolution kernel. If not present, should be inferred from input W.",
        AttributeProto::INTS,
        OPTIONAL);
    schema.Attr(
        "dilations",
        "dilation value along each spatial axis of the filter. If not present, the dilation defaults is 1 along each spatial axis.",
        AttributeProto::INTS,
        OPTIONAL);
    schema.Attr(
        "strides",
        "Stride along each spatial axis. If not present, the stride defaults is 1 along each spatial axis.",
        AttributeProto::INTS,
        OPTIONAL);
    schema.Attr(
        "auto_pad",
        auto_pad_doc,
        AttributeProto::STRING,
        std::string("NOTSET"));
    schema.Attr(
        "pads",
        pads_doc,
        AttributeProto::INTS,
        OPTIONAL);
    schema.Attr(
        "group",
        "number of groups input channels and output channels are divided into.",
        AttributeProto::INT,
        static_cast<int64_t>(1));
    schema.TypeAndShapeInferenceFunction([](InferenceContext& ctx) {
      propagateElemTypeFromInputToOutput(ctx, 0, 0);
      convPoolShapeInference(ctx, true, false, 0, 1);
    });
  };
}

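Note the auto_pad attribute above. Apart from the default NOTSET, which means the explicit pads attribute is used, it offers three options much like TensorFlow's: SAME_UPPER, SAME_LOWER, and VALID. With SAME_UPPER an odd padding amount puts the extra element at the end of the axis (as TensorFlow's SAME does); with SAME_LOWER it goes at the beginning.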
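
ONNX's auto_pad distinguishes SAME_UPPER from SAME_LOWER, and the difference only shows when the total padding is odd. Here is a sketch that computes the pads attribute for a 2-D convolution, assuming the ONNX layout pads = [h_begin, w_begin, h_end, w_end]. onnx_auto_pad is a hypothetical helper, not an ONNX API, and dilation is ignored.

```python
import math


def onnx_auto_pad(input_hw, kernel_hw, strides_hw, auto_pad):
    """Hypothetical helper: derive ONNX-style pads from an auto_pad mode."""
    if auto_pad == "VALID":
        return [0, 0, 0, 0]
    if auto_pad not in ("SAME_UPPER", "SAME_LOWER"):
        raise ValueError(f"unsupported auto_pad: {auto_pad}")
    begins, ends = [], []
    for size, kernel, stride in zip(input_hw, kernel_hw, strides_hw):
        out = math.ceil(size / stride)
        total = max((out - 1) * stride + kernel - size, 0)
        small, large = total // 2, total - total // 2
        if auto_pad == "SAME_UPPER":
            # extra element at the end of the axis
            begins.append(small)
            ends.append(large)
        else:
            # SAME_LOWER: extra element at the beginning of the axis
            begins.append(large)
            ends.append(small)
    return begins + ends


print(onnx_auto_pad((7, 8), (4, 4), (3, 3), "SAME_UPPER"))  # [1, 1, 2, 1]
print(onnx_auto_pad((7, 8), (4, 4), (3, 3), "SAME_LOWER"))  # [2, 1, 1, 1]
```

For the 7 x 8 input used earlier, SAME_UPPER gives top=1 and bottom=2, the same split as the TensorFlow example, while SAME_LOWER would shift every window by one row.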

Example

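The Stack Overflow question "python - What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?" has a concrete example. It shows that SAME pads the input so that the strided windows cover all of it, with nothing left over and nothing missing, whereas VALID drops trailing elements that do not fill a complete window.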
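
A small one-dimensional sketch of that idea follows. The numbers (input width 13, window 6, stride 5) are illustrative values chosen here, and coverage is a hypothetical helper: it reports how many windows fit, how many trailing input elements VALID never visits, and how many zeros SAME adds.

```python
import math


def coverage(input_size, window, stride, padding):
    """Hypothetical helper comparing VALID and SAME along one axis."""
    if padding == "VALID":
        # assumes input_size >= window; only complete windows are used
        output = (input_size - window) // stride + 1
        dropped = input_size - ((output - 1) * stride + window)
        return {"output": output, "dropped": dropped, "padded": 0}
    if padding == "SAME":
        # every input element is covered; zeros fill the last window
        output = math.ceil(input_size / stride)
        padded = max((output - 1) * stride + window - input_size, 0)
        return {"output": output, "dropped": 0, "padded": padded}
    raise ValueError(f"unsupported padding: {padding}")


print(coverage(13, 6, 5, "VALID"))  # {'output': 2, 'dropped': 2, 'padded': 0}
print(coverage(13, 6, 5, "SAME"))   # {'output': 3, 'dropped': 0, 'padded': 3}
```

With VALID the last two elements never enter a window. With SAME three zeros are added, which by the rule above would be split as 1 before and 2 after.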


Original: https://www.cnblogs.com/bugxch/p/14190955.html
