How to Stop a TensorFlow Program from Taking Unlimited GPU Memory
Date: 2022-07-27
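Today I ran into something odd: while using tensorflow-gpu, the program ran out of GPU memory. That would be understandable if I were training on a large dataset, but all I had written was y = W*x.
The program is as follows: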
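import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
w = tf.Variable([[1.0, 2.0]])  # shape (1, 2)
b = tf.Variable([[2.], [3.]])  # shape (2, 1)
y = tf.multiply(w, b)  # element-wise product, broadcast to shape (2, 2)
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(y))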
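Error output:
Memory usage kept climbing. When the program crashed, the whole computer went down with it, because the graphics card's memory had been completely consumed.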
2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) - physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
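The failed sizes in the log shrink in a regular pattern: each retry asks for roughly 90% of the previous request. The short sketch below only does arithmetic on byte counts copied from the log. Reading the pattern as TF backing off and retrying with a smaller request after each failure is an interpretation, not something the log states outright.

```python
# Sizes (in bytes) of the first six failed allocations, copied from the log above.
failed_bytes = [
    4970853120, 4473767936, 4026391040,
    3623751936, 3261376768, 2935238912,
]

# Ratio between each retry and the attempt just before it.
ratios = [nxt / prev for prev, nxt in zip(failed_bytes, failed_bytes[1:])]

for prev, ratio in zip(failed_bytes, ratios):
    print(f"{prev / 2**30:.2f} GiB failed -> next attempt is {ratio:.3f}x as large")
```

Every ratio comes out at about 0.9, so the very first request (4.63 GiB on a card reporting 4.97 GiB free) is the one that matters: the program is trying to reserve nearly all free GPU memory up front, not just what a 2x2 product needs.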
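Cause analysis:
- The graphics driver is not the latest version. Update it with a driver-management tool, or download and install the update yourself.
- Too many TF programs are running. Close all of them and start again.
- Because of how TF's core is written, by default it claims all of the GPU's memory for its own work, a kind of "me first" policy.
In this situation there are two things to configure:
- The allocation mode: claim memory up front, or allocate on demand?
- The maximum share of GPU memory to use. Taking too much can crash other programs, so it is best kept below 0.7.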
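To make the 0.7 figure concrete, here is a small arithmetic sketch using the totalMemory and freeMemory values from the log. It assumes the fraction is measured against the card's total memory, which is how `per_process_gpu_memory_fraction` is usually described; treat that as an assumption and check it against your TF version.

```python
GIB = 2**30

total_memory_gib = 6.00  # "totalMemory" reported in the log
free_memory_gib = 4.97   # "freeMemory" reported in the log
fraction = 0.7           # intended value for per_process_gpu_memory_fraction

# Assumption: the cap is a fraction of total (not free) GPU memory.
cap_gib = total_memory_gib * fraction

print(f"cap: {cap_gib:.2f} GiB ({cap_gib * GIB:.0f} bytes)")
print("cap fits in currently free memory:", cap_gib <= free_memory_gib)
```

Under that assumption the cap is about 4.2 GiB, which is below the 4.97 GiB reported free, whereas the default behaviour tried to reserve 4.63 GiB.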
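First, update the driver.
Then configure the TF program.
Note: setting only one of the two options is not enough! I tried that, following blog posts found online, and the result was still poor (GPU memory usage stayed high).
TF settings:
- Allocate on demand
- Use at most 70% of GPU memory
The modified code is as follows: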
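import tensorflow as tf  # TensorFlow 1.x API (tf.Session)
w = tf.Variable([[1.0, 2.0]])
b = tf.Variable([[2.], [3.]])
y = tf.multiply(w, b)
init_op = tf.global_variables_initializer()
# Cap this process at 70% of GPU memory and allocate on demand within that cap.
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7, allow_growth=True)
# gpu_options must be passed into the config; otherwise the 70% cap is never applied.
config = tf.ConfigProto(allow_soft_placement=True, gpu_options=gpu_options)
with tf.Session(config=config) as sess:
    sess.run(init_op)
    print(sess.run(y))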
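The code above uses the TensorFlow 1.x session API. On TensorFlow 2.x the same two ideas are exposed through `tf.config` instead. The following is a minimal sketch, assuming TF 2.x; the helper name `limit_gpu_memory` and its `memory_limit_mb` parameter are made up for illustration. It applies either on-demand growth or a hard limit to each GPU rather than both, and it has to run before TensorFlow first touches the GPU.

```python
def limit_gpu_memory(memory_limit_mb=None):
    """Configure every visible GPU and return how many were configured.

    Hypothetical helper: returns 0 when TensorFlow is not installed
    or no GPU is visible.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return 0

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        if memory_limit_mb is None:
            # Allocate on demand instead of reserving memory up front.
            tf.config.experimental.set_memory_growth(gpu, True)
        else:
            # Expose one logical GPU with a hard memory cap (in MB).
            tf.config.set_logical_device_configuration(
                gpu,
                [tf.config.LogicalDeviceConfiguration(memory_limit=memory_limit_mb)],
            )
    return len(gpus)


configured = limit_gpu_memory()  # on-demand growth on every visible GPU
print("GPUs configured:", configured)
```

For a cap comparable to 70% of the 6 GiB card in this article, `limit_gpu_memory(memory_limit_mb=4300)` would be the rough equivalent; that number is an approximation of 0.7 x 6 GiB, not a value from the original post.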
Problem solved:
2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) - physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[[2. 4.]
 [3. 6.]]