Fixing trainable=False Having No Effect When Mixing Keras and TensorFlow
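I ran into this problem recently. First, a description:
I have a trained model (VGG16, for example) and want to modify it, for instance by adding a fully connected layer. For various reasons, I can only use TensorFlow to optimize the model. By default, a tf optimizer updates the weights of every variable in tf.trainable_variables(), and that is where the problem lies: even though the VGG16 model has been set to trainable=False, the tf optimizer still updates the VGG16 weights.
That is the problem. After searching Google, Baidu and elsewhere, I finally found a solution. Below we reproduce the whole problem step by step.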
trainable=False has no effect
First, we load the trained VGG16 model and set it to trainable=False.
from keras.applications import VGG16
import tensorflow as tf
from keras import layers
# Load the model
base_mode = VGG16(include_top=False)
# Inspect the trainable variables
tf.trainable_variables()
[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
# Set trainable=False
# base_mode.trainable = False also seems to work
for layer in base_mode.layers:
    layer.trainable = False
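After setting trainable=False, inspect the trainable variables again. Nothing has changed, which means the setting has no effect as far as TensorFlow is concerned.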
# Inspect the trainable variables again
tf.trainable_variables()
[<tf.Variable 'block1_conv1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block1_conv2/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv2/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block2_conv1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block2_conv2/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv2/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block3_conv1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv2/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv2/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv3/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv3/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block4_conv1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv2/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv3/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv2/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv2/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv3/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv3/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block1_conv1_1/kernel:0' shape=(3, 3, 3, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv1_1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block1_conv2_1/kernel:0' shape=(3, 3, 64, 64) dtype=float32_ref>,
<tf.Variable 'block1_conv2_1/bias:0' shape=(64,) dtype=float32_ref>,
<tf.Variable 'block2_conv1_1/kernel:0' shape=(3, 3, 64, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv1_1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block2_conv2_1/kernel:0' shape=(3, 3, 128, 128) dtype=float32_ref>,
<tf.Variable 'block2_conv2_1/bias:0' shape=(128,) dtype=float32_ref>,
<tf.Variable 'block3_conv1_1/kernel:0' shape=(3, 3, 128, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv1_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv2_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv2_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block3_conv3_1/kernel:0' shape=(3, 3, 256, 256) dtype=float32_ref>,
<tf.Variable 'block3_conv3_1/bias:0' shape=(256,) dtype=float32_ref>,
<tf.Variable 'block4_conv1_1/kernel:0' shape=(3, 3, 256, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block4_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block4_conv3_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv1_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv1_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv2_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv2_1/bias:0' shape=(512,) dtype=float32_ref>,
<tf.Variable 'block5_conv3_1/kernel:0' shape=(3, 3, 512, 512) dtype=float32_ref>,
<tf.Variable 'block5_conv3_1/bias:0' shape=(512,) dtype=float32_ref>]
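Why the flag is ignored can be pictured with a toy model. The following is a minimal pure-Python sketch, not real Keras or TensorFlow code: GLOBAL_TRAINABLE, ToyVariable and ToyLayer are hypothetical stand-ins. It assumes, as in TensorFlow 1.x graph mode, that a variable is registered in the global trainable collection once, when it is created, while a Keras-style trainable flag is only consulted by the layer's own trainable_weights view.

```python
# Toy illustration only: these classes are hypothetical, not Keras/TensorFlow APIs.
GLOBAL_TRAINABLE = []  # stands in for the graph's TRAINABLE_VARIABLES collection


class ToyVariable:
    def __init__(self, name):
        self.name = name
        # Registration happens exactly once, at creation time.
        GLOBAL_TRAINABLE.append(self)


class ToyLayer:
    def __init__(self, name):
        self.trainable = True
        self.weights = [ToyVariable(name + "/kernel:0"),
                        ToyVariable(name + "/bias:0")]

    @property
    def trainable_weights(self):
        # The layer-level view honours the flag ...
        return self.weights if self.trainable else []


layer = ToyLayer("block1_conv1")
layer.trainable = False  # ... but flipping it never touches GLOBAL_TRAINABLE

layer_view = [v.name for v in layer.trainable_weights]
global_view = [v.name for v in GLOBAL_TRAINABLE]
print(layer_view)
print(global_view)
```

In this sketch the layer reports nothing left to train, yet the global list still holds both variables, which mirrors the unchanged tf.trainable_variables() output above.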
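The solution
The solution is to create a variable_scope when loading the model and to put the variables that need training in a different variable_scope. The variables to train are then retrieved with tf.get_collection and passed to the tf optimizer through its var_list argument.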
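Before the real code, the effect of var_list can be shown with a toy gradient-descent step. This is only a sketch in plain Python: sgd_step, the parameter names and the numbers are all made up for illustration, and it does not reproduce the internals of any tf optimizer. It merely assumes that an optimizer updates every known variable unless an explicit list is given.

```python
# Toy illustration only: sgd_step is a hypothetical helper, not a TensorFlow API.
def sgd_step(params, grads, var_list=None, lr=0.5):
    """Return a copy of params in which only the names in var_list are updated."""
    if var_list is None:
        # Default behaviour: update everything, like using the global collection.
        var_list = list(params)
    updated = dict(params)
    for name in var_list:
        updated[name] = params[name] - lr * grads[name]
    return updated


params = {"base_model/kernel": 2.0, "xxx/dense/kernel": 2.0}
grads = {"base_model/kernel": 1.0, "xxx/dense/kernel": 1.0}

default_step = sgd_step(params, grads)
restricted_step = sgd_step(params, grads, var_list=["xxx/dense/kernel"])
print(default_step)
print(restricted_step)
```

With the default, both entries move; with the explicit list, the base_model entry keeps its original value, which is the behaviour wanted for the frozen VGG16 weights.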
from keras import models
# Build the pretrained base inside its own scope
with tf.variable_scope('base_model'):
    base_model = VGG16(include_top=False, input_shape=(224, 224, 3))
# Build the new, trainable layers inside a separate scope
with tf.variable_scope('xxx'):
    model = models.Sequential()
    model.add(base_model)
    model.add(layers.Flatten())
    model.add(layers.Dense(10))
# Get the variables that need training
trainable_var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'xxx')
trainable_var
[<tf.Variable 'xxx_2/dense_1/kernel:0' shape=(25088, 10) dtype=float32_ref>, <tf.Variable 'xxx_2/dense_1/bias:0' shape=(10,) dtype=float32_ref>]
# Define a tf optimizer for training; a loss is assumed here
loss = model.output / 2  # defined arbitrarily, just for the demo
train_step = tf.train.AdamOptimizer().minimize(loss, var_list=trainable_var)
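The scope argument of tf.get_collection filters the collection by matching the scope string against the start of each item's name (the TensorFlow 1.x documentation describes this as re.match). The sketch below imitates that filter on plain strings; filter_by_scope and the name list are hypothetical and only stand in for the real variables. It also suggests why a scope of 'xxx' still picked up the variables named 'xxx_2/...' in the output above.

```python
import re

# Hypothetical variable names, standing in for entries of the real collection.
names = [
    "base_model/block1_conv1/kernel:0",
    "base_model/block1_conv1/bias:0",
    "xxx_2/dense_1/kernel:0",
    "xxx_2/dense_1/bias:0",
]


def filter_by_scope(variable_names, scope):
    """Keep the names whose beginning matches `scope` (prefix-style matching)."""
    pattern = re.compile(scope)
    return [name for name in variable_names if pattern.match(name)]


selected = filter_by_scope(names, "xxx")
frozen = filter_by_scope(names, "base_model")
print(selected)
print(frozen)
```

Because the match is anchored only at the start of the name, scope names that are prefixes of one another would be selected together, so distinct prefixes are the safer choice.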
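Summary
When Keras and TensorFlow are mixed, setting trainable=False in Keras has no effect on TensorFlow: the variables remain in tf.trainable_variables().
The solution is to separate the variables with variable_scope, retrieve the ones that need training with tf.get_collection, and pass them to the tf optimizer through var_list.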