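Learning Paddle from Scratch, Part 1: Paddle's CNN-Related APIs in Detail

Preface

The previous post gave a rough overview of Paddle. This time we take a detailed look at how to use its CV-related functions.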

paddle.fluid.dygraph

Conv2D

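This layer performs 2D convolution. It takes the following parameters:
- num_channels: number of input channels
- num_filters: number of filters (output channels)
- filter_size: size of the convolution kernel
- stride: stride
- padding: padding size
- dilation: dilation rate
- groups: number of groups
- act: activation function; for example, act='relu' applies a ReLU right after the convolution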

from paddle.fluid.dygraph import Conv2D
# input_channel and output_channel stand for your own channel counts
conv = Conv2D(input_channel, output_channel, filter_size=3, stride=1)
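
To see how filter_size, stride, padding and dilation together determine the spatial size of the output, here is a minimal sketch in plain Python. It uses the standard convolution output-size formula with floor rounding; the helper name conv2d_out_size is made up for illustration and is not part of Paddle.

```python
def conv2d_out_size(in_size, filter_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis (hypothetical helper, not a Paddle API)."""
    # A dilated kernel covers dilation * (filter_size - 1) + 1 input positions.
    effective_kernel = dilation * (filter_size - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


# A 3x3 kernel with stride 1 and no padding shrinks a 24x24 map to 22x22.
print(conv2d_out_size(24, filter_size=3, stride=1))             # 22
# padding=1 keeps the size unchanged for a 3x3 kernel with stride 1.
print(conv2d_out_size(24, filter_size=3, stride=1, padding=1))  # 24
```

The same arithmetic applies independently to height and width.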

BatchNorm

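This is the batch normalization layer. Paddle's BN layer is used a little differently from other frameworks in that you must pass in the channel count of the preceding tensor. The commonly used parameters are:
- num_channels: number of channels of the tensor to normalize; a mismatch raises an error
- act: activation function to apply; as with Conv2D, an activation usually follows BN, so setting it here is convenient
- momentum: used by the exponential moving average that estimates the mean and variance; the default is 0.9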

from paddle.fluid.dygraph import BatchNorm
bn1 = BatchNorm(input_channel, act='relu')
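
As a rough illustration of what momentum does, the sketch below applies the moving-average update moving = moving * momentum + batch_stat * (1 - momentum) in plain Python. The helper name update_moving_stat and the batch means are invented for the example; this is not how Paddle implements the layer internally.

```python
def update_moving_stat(moving, batch_stat, momentum=0.9):
    """One exponential-moving-average step (hypothetical helper, not a Paddle API)."""
    return moving * momentum + batch_stat * (1.0 - momentum)


moving_mean = 0.0
# Pretend three consecutive mini-batches all have mean 1.0.
for batch_mean in [1.0, 1.0, 1.0]:
    moving_mean = update_moving_stat(moving_mean, batch_mean)
    print(round(moving_mean, 4))
# Prints 0.1, 0.19, 0.271: the estimate moves slowly toward the batch mean.
```

A larger momentum makes the running statistics change more slowly.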

Pool2D

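This is the 2D pooling layer. Its parameters are:
- pool_size: size of the pooling window
- pool_type: pooling mode, 'avg' or 'max', for average pooling and max pooling respectively
- pool_stride: pooling stride
- pool_padding: pooling padding size
- global_pooling: whether to pool globally; when True, pool_size and pool_padding are ignored and the whole feature map is pooled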

from paddle.fluid.dygraph import Pool2D
# Average pooling with a window of size 3
pool = Pool2D(pool_size=3, pool_type='avg', pool_stride=1, global_pooling=False)
# Global average pooling
pool2 = Pool2D(pool_type='avg', global_pooling=True)
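
To make global average pooling concrete, here is a small plain-Python sketch that reduces each channel of a [C][H][W] nested list to a single value. The function name global_avg_pool and the sample data are illustrative assumptions; the real layer works on batched tensors.

```python
def global_avg_pool(feature_maps):
    """Average each channel of a [C][H][W] nested list down to one value."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


x = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0
    [[0.0, 0.0], [0.0, 8.0]],  # channel 1
]
print(global_avg_pool(x))  # [2.5, 2.0]
```

Whatever the input height and width, the result has one value per channel, which is why pool_size and pool_padding no longer matter.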

Dropout

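This random-dropout layer was added to dynamic graph mode in Paddle 1.8; before that, the fluid.layers.dropout API was used. Dropout takes the following parameters:
- p: probability of dropping an input unit, default 0.5
- seed: integer random seed
- dropout_implementation: how units are dropped, either 'downgrade_in_infer' or 'upscale_in_train'; the default is 'downgrade_in_infer'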

from paddle.fluid.dygraph import Dropout
dropout = Dropout(p=0.5)

With fluid.layers.dropout, the usage is as follows:

x = fluid.layers.dropout(x, 0.5)
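
The two dropout_implementation options differ only in where the rescaling happens. The sketch below mirrors the scaling rules as I understand them from the Paddle documentation, in plain Python with a fixed keep mask so the result is reproducible. The helper apply_dropout and its arguments are hypothetical, not a Paddle API.

```python
def apply_dropout(values, keep_mask, p, implementation, is_test):
    """Hypothetical helper mirroring the two documented scaling rules."""
    if implementation == 'downgrade_in_infer':
        if is_test:
            # Inference: scale everything down by the keep probability.
            return [v * (1.0 - p) for v in values]
        # Training: just zero out the dropped units.
        return [v * m for v, m in zip(values, keep_mask)]
    if implementation == 'upscale_in_train':
        if is_test:
            # Inference: pass the input through unchanged.
            return list(values)
        # Training: zero out dropped units and scale the kept ones up.
        return [v * m / (1.0 - p) for v, m in zip(values, keep_mask)]
    raise ValueError('unknown implementation: ' + implementation)


x = [2.0, 4.0, 6.0, 8.0]
mask = [1, 0, 1, 0]  # 1 = keep, 0 = drop
print(apply_dropout(x, mask, 0.5, 'downgrade_in_infer', is_test=False))  # [2.0, 0.0, 6.0, 0.0]
print(apply_dropout(x, mask, 0.5, 'downgrade_in_infer', is_test=True))   # [1.0, 2.0, 3.0, 4.0]
print(apply_dropout(x, mask, 0.5, 'upscale_in_train', is_test=False))    # [4.0, 0.0, 12.0, 0.0]
print(apply_dropout(x, mask, 0.5, 'upscale_in_train', is_test=True))     # [2.0, 4.0, 6.0, 8.0]
```

With 'upscale_in_train' the inference path needs no rescaling, which is the behavior most other frameworks use.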

Linear

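This is the fully connected layer. The old API, FC, has been deprecated.
- input_dim: number of input units
- output_dim: number of output units
- act: activation function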

from paddle.fluid.dygraph import Linear
fc = Linear(1024, 512, act='sigmoid')

Sequential

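A sequential container that holds layers. Layers can be added directly, or as (name, layer) pairs, much like dictionary key-value entries.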

# Add layers directly
model1 = fluid.dygraph.Sequential(
        fluid.Linear(10, 1), fluid.Linear(1, 2)
    )
# Add layers as (name, layer) pairs
model2 = fluid.dygraph.Sequential(
        ('l1', fluid.Linear(10, 2)),
        ('l2', fluid.Linear(2, 3))
    )

All the APIs above belong to paddle.fluid.dygraph. Next we look at the finer-grained API family, paddle.fluid.layers.

paddle.fluid.layers

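elementwise_add (sub/mul/div/max, etc.)

This is the family of element-wise operations, covering addition, subtraction, multiplication, division and so on. You can call these APIs or simply use operators such as + and -, which Paddle has overloaded. The parameters are:
- x: multi-dimensional tensor
- y: multi-dimensional tensor
- axis: the index in x at which y's dimensions are aligned; for a plain element-wise operation on same-shaped tensors it does not need to be set
- act: name of the activation function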

sumval = fluid.layers.elementwise_add(x, y)

This is equivalent to:

sumval = x + y

The other operations are used the same way, so we will not go through them.

sigmoid

Applies the sigmoid function to the input tensor.

x = fluid.layers.sigmoid(x)

concat

Concatenates a group of tensors along a given dimension.
- input: a list of multi-dimensional tensors
- axis: the dimension to concatenate along

from paddle import fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable

with fluid.dygraph.guard():
    x_1= np.random.randn(1, 32, 24, 24).astype('float32')
    x_2= np.random.randn(1, 32, 24, 24).astype('float32')
    x_1 = to_variable(x_1)
    x_2 = to_variable(x_2)
    # Concatenate along the channel dimension
    concat_tensor = fluid.layers.concat([x_1, x_2], axis=1)
    # Printing tensor shapes is very handy when debugging a network
    print(concat_tensor.shape)
    # [1, 64, 24, 24]

split

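Splits a tensor and returns a list of the resulting tensors. The parameters are:

- input: the input tensor
- num_or_sections: if it is an integer, the number of equally sized sub-tensors to split into. If it is a list or tuple, its length is the number of sub-tensors, and its elements, which may be integers or Tensors of shape [1], give the size of each sub-tensor along the split dimension in order. The list or tuple cannot be longer than the size of the split dimension. At most one element may be -1, meaning that size is inferred from the size of the split dimension and the remaining elements.
- dim: the dimension to split along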

from paddle import fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid import dygraph as dygraph
with fluid.dygraph.guard():
    x = np.random.randn(1, 32, 24, 24).astype('float32')
    x = to_variable(x)
    x1, x2, x3, x4 = fluid.layers.split(x, num_or_sections=4, dim=1)
    print("X1 shape is ", x1.shape)
    print("X2 shape is ", x2.shape)
    print("X3 shape is ", x3.shape)
    print("X4 shape is ", x4.shape)
Output:
X1 shape is  [1, 8, 24, 24]
X2 shape is  [1, 8, 24, 24]
X3 shape is  [1, 8, 24, 24]
X4 shape is  [1, 8, 24, 24]

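When num_or_sections is a list, it gives the size of each sub-tensor along the dimension specified by dim. One -1 is allowed in the list and is inferred automatically.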

from paddle import fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid import dygraph as dygraph

with fluid.dygraph.guard():
    x = np.random.randn(1, 32, 24, 24).astype('float32')
    x = to_variable(x)
    x1, x2, x3, x4 = fluid.layers.split(x, num_or_sections=[16, 8, 4, -1], dim=1)
    print("X1 shape is ", x1.shape)
    print("X2 shape is ", x2.shape)
    print("X3 shape is ", x3.shape)
    print("X4 shape is ", x4.shape)
Output:
X1 shape is  [1, 16, 24, 24]
X2 shape is  [1, 8, 24, 24]
X3 shape is  [1, 4, 24, 24]
X4 shape is  [1, 4, 24, 24]

16 + 8 + 4 + 4 = 32, the size of dim 1, i.e. the channel dimension.
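
The rule for turning num_or_sections into concrete sizes, including the -1 inference, can be sketched in plain Python as follows. The helper resolve_sections is a hypothetical name used only for this illustration and is not how Paddle exposes the logic.

```python
def resolve_sections(dim_size, num_or_sections):
    """Return the size of each piece along the split dimension (illustrative helper)."""
    if isinstance(num_or_sections, int):
        if dim_size % num_or_sections != 0:
            raise ValueError('dimension is not evenly divisible')
        return [dim_size // num_or_sections] * num_or_sections
    sections = list(num_or_sections)
    if sections.count(-1) > 1:
        raise ValueError('at most one section may be -1')
    if -1 in sections:
        # The unknown piece takes whatever is left over.
        known = sum(s for s in sections if s != -1)
        sections[sections.index(-1)] = dim_size - known
    if sum(sections) != dim_size:
        raise ValueError('sections do not add up to the dimension size')
    return sections


print(resolve_sections(32, 4))               # [8, 8, 8, 8]
print(resolve_sections(32, [16, 8, 4, -1]))  # [16, 8, 4, 4]
```

Both results match the channel sizes printed by the two split examples in this section.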

stack

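This API stacks multiple tensors along a new dimension inserted at the given position. Note how it differs from concat, which joins tensors along an existing dimension. The parameters are:
- x: the input, either a single tensor or a list of tensors
- axis: the dimension along which to stack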

from paddle import fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable
with fluid.dygraph.guard():
    x_1= np.random.randn(1, 32, 24, 24).astype('float32')
    x_2= np.random.randn(1, 32, 24, 24).astype('float32')
    x_1 = to_variable(x_1)
    x_2 = to_variable(x_2)
    # Stack along the second dimension (axis=1)
    x = fluid.layers.stack([x_1, x_2], axis=1)
    print(x.shape)
Output: [1, 2, 32, 24, 24]
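
The difference between concat and stack is easiest to see in the output shapes. The sketch below computes those shapes in plain Python for same-shaped inputs; concat_shape and stack_shape are illustrative helper names, not Paddle functions.

```python
def concat_shape(shapes, axis):
    """Shape after concatenation: the existing axis grows."""
    out = list(shapes[0])
    out[axis] = sum(s[axis] for s in shapes)
    return out


def stack_shape(shapes, axis):
    """Shape after stacking: a new axis of length len(shapes) is inserted."""
    out = list(shapes[0])
    out.insert(axis, len(shapes))
    return out


shapes = [[1, 32, 24, 24], [1, 32, 24, 24]]
print(concat_shape(shapes, axis=1))  # [1, 64, 24, 24]
print(stack_shape(shapes, axis=1))   # [1, 2, 32, 24, 24]
```

concat keeps the number of dimensions, while stack adds one.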

flatten

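Flattens a tensor into a 2D matrix.
- x: input tensor
- axis: the split axis; the dimensions before axis are merged into the first output dimension, and the dimensions from axis onward are merged into the second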

# x shape is [4, 4, 3]
out = fluid.layers.flatten(x=x, axis=2)
# out shape is [16, 3]
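
The output shape in the example above follows from multiplying the dimensions on each side of axis. A minimal plain-Python sketch, where flatten_shape is a made-up helper and the default axis=1 is assumed to match fluid.layers.flatten:

```python
from functools import reduce
from operator import mul


def flatten_shape(shape, axis=1):
    """2D shape produced by flattening around `axis` (illustrative helper)."""
    outer = reduce(mul, shape[:axis], 1)  # product of dims before axis
    inner = reduce(mul, shape[axis:], 1)  # product of dims from axis onward
    return [outer, inner]


print(flatten_shape([4, 4, 3], axis=2))  # [16, 3]
print(flatten_shape([5, 512, 1, 1]))     # [5, 512]
```

The second call shows the typical use after global pooling: the batch dimension is kept and everything else is merged.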

cross_entropy

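Computes the cross-entropy loss; the inputs differ depending on whether hard or soft labels are used.
- input: multi-dimensional tensor whose last dimension is the number of classes D
- label: the labels for input. If soft_label=False, label must have shape [N1,N2,...,Nk] or [N1,N2,...,Nk,1] and dtype int64, with values greater than or equal to 0 and less than D. If soft_label=True, label must have the same shape and dtype as input, and the soft labels of each sample must sum to 1.
- soft_label (bool): whether label is a soft label. The default, False, means hard labels; True means soft labels.

First, the soft-label case: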
from paddle import fluid as fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable

with fluid.dygraph.guard():
    x = np.array([[0.35, 0.65]]).astype('float32')
    # With soft_label=True, label must have the same shape and dtype as input, and each sample's soft labels must sum to 1.
    label = np.array([[0.1, 0.9]]).astype('float32')
    x = to_variable(x)
    label = to_variable(label)
    loss = fluid.layers.cross_entropy(x, label, soft_label=True)
    # Convert the loss tensor to a NumPy array
    print("softloss is", loss.numpy())
Output:
softloss is [[0.49268687]]

Now the hard-label case, i.e. a one-hot target given as a class index:

with fluid.dygraph.guard():
    x = np.array([[0.35, 0.65]]).astype('float32')
    label = np.array([0]).astype('int64')
    x = to_variable(x)
    label = to_variable(label)
    loss = fluid.layers.cross_entropy(x, label)
    # Convert the loss tensor to a NumPy array
    print("hardloss is", loss.numpy())
Output:
hardloss is [1.0498221]
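
Both results can be reproduced by hand, which suggests that the input is treated directly as class probabilities rather than as logits. The sketch below recomputes them with the standard cross-entropy formulas in plain Python; the two helper functions are illustrative and not part of Paddle.

```python
import math


def soft_cross_entropy(probs, soft_label):
    """-sum(y_i * log(p_i)) over all classes."""
    return -sum(y * math.log(p) for p, y in zip(probs, soft_label))


def hard_cross_entropy(probs, class_index):
    """-log of the probability assigned to the true class."""
    return -math.log(probs[class_index])


probs = [0.35, 0.65]
print(round(soft_cross_entropy(probs, [0.1, 0.9]), 4))  # 0.4927
print(round(hard_cross_entropy(probs, 0), 4))           # 1.0498
```

The hard-label value equals the soft-label formula evaluated with the one-hot vector [1, 0].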

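A note on parameter initialization

Weights can be initialized inside a layer using fluid.initializer (the initializer classes) and fluid.ParamAttr (the parameter-attribute class). To keep this short, here is just one simple example that initializes a convolution layer with Xavier.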

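from paddle.fluid import ParamAttr
from paddle.fluid.dygraph import Conv2D
from paddle.fluid.initializer import Xavier

def conv3x3(in_planes, out_planes, stride=1):
    # 3x3 convolution with padding=1 and Xavier-initialized weights
    return Conv2D(in_planes,
                  out_planes,
                  filter_size=3,
                  stride=stride,
                  padding=1,
                  param_attr=ParamAttr(initializer=Xavier()))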
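
For a feel of the scale Xavier initialization produces, the sketch below computes the sampling bound of the uniform Xavier (Glorot) scheme, sqrt(6 / (fan_in + fan_out)), for a convolution weight. It assumes the uniform variant and the usual convention that fan_in and fan_out are the channel counts multiplied by the kernel area; xavier_uniform_bound is a hypothetical helper, not a Paddle function.

```python
import math


def xavier_uniform_bound(in_channels, out_channels, filter_size):
    """Bound x such that weights are drawn from U(-x, x) (illustrative helper)."""
    receptive_field = filter_size * filter_size
    fan_in = in_channels * receptive_field
    fan_out = out_channels * receptive_field
    return math.sqrt(6.0 / (fan_in + fan_out))


# Bound for a 3x3 convolution mapping 64 channels to 128 channels.
print(round(xavier_uniform_bound(64, 128, 3), 4))  # 0.0589
```

Wider layers get a smaller bound, which keeps the variance of activations roughly stable from layer to layer.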

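Hands-on: writing an SEBlock

SEBlock is one of the simplest attention modules; let's write one in Paddle. In Paddle's dynamic graph mode, a network module must inherit from fluid.dygraph.Layer, define its layers inside the __init__ method, and implement the forward-pass logic in the forward function.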

from paddle import fluid as fluid
from paddle.fluid.dygraph import Conv2D, Pool2D, Linear, BatchNorm, Sequential
class SELayer(fluid.dygraph.Layer):
    def __init__(self, channel, reduction=4):
        super(SELayer, self).__init__()
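        # Global average pooling layer
        self.avg_pool = Pool2D(global_pooling=True, pool_type='avg')
        # Use a Sequential container for the two fully connected layers that squeeze and restore the channel count; the first uses ReLU, the second sigmoid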
        self.fc = fluid.dygraph.Sequential(
            (Linear(channel, channel // reduction, act='relu')),
            (Linear(channel // reduction, channel, act='sigmoid'))
        )

    def forward(self, x):
        # Forward-pass logic
        batch, channel, _, __ = x.shape
        y = self.avg_pool(x)
        y = fluid.layers.flatten(y)
        y = self.fc(y)
        y = fluid.layers.reshape(y, shape=(batch, channel, 1, 1))
        # Finally, multiply x by y element-wise
        return fluid.layers.elementwise_mul(x, y)

Hands-on: writing ResNet18

This is a classic small network. We start from the basic block and then assemble the blocks into ResNet18.

from paddle import fluid
import numpy as np
from paddle.fluid.dygraph.base import to_variable
from paddle.fluid import dygraph as dygraph
from paddle.fluid.dygraph import BatchNorm, Conv2D, Sequential, Pool2D
from paddle.fluid.layers import relu


class BasicBlock(dygraph.Layer):
    def __init__(self, in_channels, out_channels, stride=1):
        super(BasicBlock, self).__init__()

        self.conv1 = Conv2D(in_channels, out_channels, filter_size=3, stride=stride, padding=1)
        self.bn1 = BatchNorm(out_channels, act='relu')
        self.conv2 = Conv2D(out_channels, out_channels, filter_size=3, padding=1)
        self.bn2 = BatchNorm(out_channels)

        if in_channels != out_channels or stride != 1:
            self.shortcut = Conv2D(in_channels, out_channels, filter_size=1, stride=stride)
            self.bn3 = BatchNorm(out_channels)
        else:
            self.shortcut = None

    def forward(self, x):
        residual = x
        x = self.bn1(self.conv1(x))
        x = self.bn2(self.conv2(x))

        if self.shortcut:
            residual = self.bn3(self.shortcut(residual))

        return relu(x + residual)


class Resnet18(dygraph.Layer):
    def __init__(self):
        super(Resnet18, self).__init__()
        self.inplanes = [64, 128, 256, 512]
        self.blocknums = [2, 2, 2, 2]
        self.stride = [1, 2, 2, 2]
        self.FirstConv = Conv2D(3, 64, filter_size=7, stride=2, padding=3)
        self.Pool = Pool2D(pool_type='max', pool_size=3, pool_stride=2, pool_padding=1)

        self.block1 = self.add_layer(0, self.blocknums[0], self.inplanes[0], self.inplanes[0], self.stride[0])
        self.block2 = self.add_layer(1, self.blocknums[1], self.inplanes[0], self.inplanes[1], self.stride[1])
        self.block3 = self.add_layer(2, self.blocknums[2], self.inplanes[1], self.inplanes[2], self.stride[2])
        self.block4 = self.add_layer(3, self.blocknums[3], self.inplanes[2], self.inplanes[3], self.stride[3])

        self.globalPool = Pool2D(pool_type='avg', global_pooling=True)

    def add_layer(self, block_index, block_repeats, in_channels, out_channels, stride=1):
        blocklist = Sequential()
        blocklist.add_sublayer('Block{}_{}'.format(block_index, 0), BasicBlock(in_channels, out_channels, stride))

        for num in range(1, block_repeats):
            blocklist.add_sublayer('Block{}_{}'.format(block_index, num), BasicBlock(out_channels, out_channels))
        return blocklist

    def forward(self, x):
        x = self.FirstConv(x)
        x = self.Pool(x)
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        x = self.block4(x)

        return self.globalPool(x)


with fluid.dygraph.guard():
    x = np.random.randn(5, 3, 224, 224).astype('float32')
    x = to_variable(x)

    net = Resnet18()
    out = net(x)

    print(out.shape)
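
When debugging a network like this, it helps to know what shape to expect after each stage. The sketch below traces the shapes in plain Python from the layer settings in the Resnet18 class above, assuming floor rounding for both convolution and pooling. The helpers out_size and trace_resnet18_shapes are hypothetical and do not call Paddle.

```python
def out_size(in_size, kernel, stride, padding):
    """Spatial output size of a conv/pool layer with floor rounding (hypothetical helper)."""
    return (in_size + 2 * padding - kernel) // stride + 1


def trace_resnet18_shapes(batch=5, size=224):
    """Return (layer name, output shape) pairs for the Resnet18 class above."""
    trace = []
    size = out_size(size, kernel=7, stride=2, padding=3)
    trace.append(('FirstConv', [batch, 64, size, size]))
    size = out_size(size, kernel=3, stride=2, padding=1)
    trace.append(('Pool', [batch, 64, size, size]))
    # Only the first BasicBlock of each stage uses the stage stride;
    # every 3x3 convolution in the blocks uses padding=1.
    for name, channels, stride in [('block1', 64, 1), ('block2', 128, 2),
                                   ('block3', 256, 2), ('block4', 512, 2)]:
        size = out_size(size, kernel=3, stride=stride, padding=1)
        trace.append((name, [batch, channels, size, size]))
    # Global average pooling collapses the spatial dimensions to 1x1.
    trace.append(('globalPool', [batch, 512, 1, 1]))
    return trace


for name, shape in trace_resnet18_shapes():
    print(name, shape)
# The last line printed is: globalPool [5, 512, 1, 1]
```

Under these assumptions the spatial size goes 224, 112, 56, 56, 28, 14, 7 and then 1 after global pooling.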