GpuArrayException: No cuda device available: troubleshooting attempts
Author: Internet
Problem:
The following error appears on import keras or import theano:
>>> import keras
Using Theano backend.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
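Searching turned up very few solutions, which was maddening.
After running pip uninstall theano and reinstalling with conda install theano, even stranger problems appeared. Searching suggested they came from incompatibilities between Theano 1.0.4 and NumPy 1.16.0, so I uninstalled it again.
After reinstalling with pip install theano and retrying, the same error came back: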
>>> import theano
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/ib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
The rest of the configuration is as follows:
[global]
floatX = float32
device =cuda
[cuda]
root=/usr/local/cuda-8.0
## .theanorc file
echo $PATH
/data_d/old_home/home/.conda/envs/bin:/usr/local/cuda-8.0/bin:/data_d/public/miniconda2/bin:/usr/local/cuda-9.0/bin:/usr/local/sbin: /usr/local/bin:/usr/sbin:/usr/bin:/s:/usr/local/cuda-8.0/bin/local/games:/snap/bin:/usr/local/cuda-8.0/bin
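The PATH above contains entries for both cuda-8.0 and cuda-9.0, and the shell uses the first directory that contains a matching binary. A minimal sketch for listing the CUDA-related entries in lookup order follows; cuda_path_entries is a hypothetical helper name, and it assumes the Linux ":" separator.

```python
import os


def cuda_path_entries(path_value, sep=":"):
    """Return (position, directory) pairs for PATH entries that mention 'cuda'.

    Hypothetical helper for illustration; assumes a ':'-separated PATH.
    """
    entries = [p.strip() for p in path_value.split(sep)]
    return [(i, p) for i, p in enumerate(entries) if "cuda" in p.lower()]


if __name__ == "__main__":
    # Print the CUDA directories of the current shell in lookup order.
    for position, directory in cuda_path_entries(os.environ.get("PATH", "")):
        print(position, directory)
```

The entry with the lowest position is the one whose nvcc would be picked up first.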
CUDA_VISIBLE_DEVICES=1
CUDA_HOME=/usr/local/cuda-8.0
PATH="$PATH:/usr/local/cuda-8.0/bin"
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64"
# .bashrc file
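The .bashrc settings include CUDA_VISIBLE_DEVICES=1, which asks CUDA to expose only the physical device with index 1. The sketch below models that filtering for plain integer indices, so its effect on a given machine can be reasoned about. visible_devices is a hypothetical helper, not part of CUDA or Theano, and it ignores UUID-style entries.

```python
def visible_devices(env_value, physical_count):
    """Model which physical GPU indices stay visible for a CUDA_VISIBLE_DEVICES value.

    Hypothetical helper: only plain integer indices are handled. Enumeration
    stops at the first entry that does not name an existing device.
    """
    if env_value is None:
        # Variable unset: every device is visible.
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        if not token.isdigit() or int(token) >= physical_count:
            break  # an invalid entry hides itself and everything after it
        visible.append(int(token))
    return visible
```

If the driver enumerates only one CUDA-capable device, visible_devices("1", 1) returns an empty list. That is one situation that could lead to "No cuda device available"; whether it applies here depends on how many devices the driver actually enumerates.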
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
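The same three macros can be read programmatically, which is convenient when comparing several CUDA installations. This is a minimal sketch; parse_cudnn_version is a hypothetical helper, and it assumes the header uses the #define layout shown above.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patchlevel) from the text of cudnn.h.

    Hypothetical helper; assumes '#define CUDNN_<NAME> <number>' lines.
    """
    fields = {}
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_%s\s+(\d+)" % name, header_text)
        if match is None:
            raise ValueError("CUDNN_%s not found" % name)
        fields[name] = int(match.group(1))
    return fields["MAJOR"], fields["MINOR"], fields["PATCHLEVEL"]
```

It could be called on the contents of /usr/local/cuda/include/cudnn.h, the header path used in the command above.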
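The Theano version in use is 1.0.4, with the matching pygpu 0.7.6.
I also suspected that the owner of the cuda-8.0 folder had been changed. Right after installation it should have been me, but it later became root. Changing the owner back to me did not help, so the approach here is to uninstall and reinstall CUDA.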
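Ownership of a directory can also be checked from Python before and after a change. This is a minimal sketch for Unix systems; owner_name and owned_by_current_user are hypothetical helper names.

```python
import os
import pwd


def owner_name(path):
    """Return the user name that owns path, or the numeric uid as a string."""
    uid = os.stat(path).st_uid
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)  # the uid has no passwd entry


def owned_by_current_user(path):
    """True if path is owned by the user running this script."""
    return os.stat(path).st_uid == os.getuid()
```

For example, owner_name("/usr/local/cuda-8.0") would show whether the folder currently belongs to root.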
Running a test program produces the same error:
Using Theano backend.
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/xhs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: No cuda device available
Training -----------
('train cost: ', array(4.1908903, dtype=float32))
('train cost: ', array(0.10415509, dtype=float32))
('train cost: ', array(0.01151281, dtype=float32))
('train cost: ', array(0.00458441, dtype=float32))
Testing ------------
40/40 [==============================] - 0s 5us/step
('test cost:', 0.005374030210077763)
('Weights=', array([[0.56634265]], dtype=float32), '\nbiases=', array([2.001063], dtype=float32))
// So why can't CUDA be detected?
Attempt 1:
I changed the device in the config file to cuda0. The result on import theano:
[global]
floatX = float32
device =cuda0
[cuda]
root=/usr/local/cuda-8.0
>>> import theano
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
Traceback (most recent call last):
  File "/data_d/old_home/home/.conda/env/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 227, in <module>
    use(config.device)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 214, in use
    init_dev(device, preallocate=preallocate)
  File "/data_d/old_home/home/.conda/envs/lib/python2.7/site-packages/theano/gpuarray/__init__.py", line 99, in init_dev
    **args)
  File "pygpu/gpuarray.pyx", line 658, in pygpu.gpuarray.init
  File "pygpu/gpuarray.pyx", line 587, in pygpu.gpuarray.pygpu_init
GpuArrayException: GPU is too old for CUDA version
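Setting this problem aside for now: https://blog.csdn.net/qq_33200967/article/details/80689543 says to check whether CUDA was installed successfully. Running make directly on the samples failed (see https://devtalk.nvidia.com/default/topic/1048902/cuda-setup-and-installation/cuda-samples-ubuntu-make-file-errors/), so I used sudo make -k instead, and the output was: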
./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVS 315"
  CUDA Driver Version / Runtime Version          9.0 / 8.0
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 963 MBytes (1010040832 bytes)
  ( 1) Multiprocessors, ( 48) CUDA Cores/MP:     48 CUDA Cores
  GPU Max Clock rate:                            1046 MHz (1.05 GHz)
  Memory Clock rate:                             875 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 65536 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):     (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = NVS 315
Result = PASS
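The deviceQuery output reports compute capability 2.1 with driver version 9.0 and runtime version 8.0. To my knowledge CUDA 8.0 was the last toolkit release that supported compute capability 2.x devices, which may be related to the "GPU is too old for CUDA version" message. The sketch below extracts these fields for comparison. parse_device_query and supported_by are hypothetical helpers, and the values in MIN_CAPABILITY are assumptions to be checked against NVIDIA's release notes.

```python
import re

# Assumed minimum compute capability per CUDA major version; verify against
# NVIDIA's release notes before relying on these values.
MIN_CAPABILITY = {8: (2, 0), 9: (3, 0)}


def parse_device_query(text):
    """Pull driver/runtime versions and compute capability out of deviceQuery output."""
    versions = re.search(
        r"CUDA Driver Version / Runtime Version\s+(\d+)\.(\d+)\s*/\s*(\d+)\.(\d+)",
        text)
    capability = re.search(
        r"CUDA Capability Major/Minor version number:\s+(\d+)\.(\d+)", text)
    if versions is None or capability is None:
        raise ValueError("unrecognized deviceQuery output")
    d_major, d_minor, r_major, r_minor = (int(g) for g in versions.groups())
    return {
        "driver": (d_major, d_minor),
        "runtime": (r_major, r_minor),
        "capability": tuple(int(g) for g in capability.groups()),
    }


def supported_by(cuda_major, capability):
    """Check a capability tuple against the assumed MIN_CAPABILITY table."""
    return capability >= MIN_CAPABILITY[cuda_major]
```

For the output above, the parsed capability is (2, 1), which passes the assumed CUDA 8 threshold but not the assumed CUDA 9 one.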
Check the NVIDIA driver version (see https://blog.csdn.net/s_sunnyy/article/details/64121826):
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.130 Wed Mar 21 03:37:26 PDT 2018
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
List the NVIDIA devices on this machine:
:/dev$ ls -l nvidia*
crw-rw-rw- 1 root root 195, 0 5月 17 12:53 nvidia0
crw-rw-rw- 1 root root 195, 1 5月 17 12:53 nvidia1
crw-rw-rw- 1 root root 195, 255 5月 17 12:53 nvidiactl
crw-rw-rw- 1 root root 195, 254 5月 17 12:53 nvidia-modeset
crw-rw-rw- 1 root root 240, 0 5月 17 12:53 nvidia-uvm
Check the cuDNN version with conda list -n username:
cudatoolkit 10.0.130 0
cudnn 7.3.1 cuda10.0_0
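Conda's cudatoolkit and cudnn packages live inside the environment, separately from the system install under /usr/local, so conda can presumably pull them in as dependencies without a manual CUDA install. A minimal sketch for picking the CUDA-related rows out of conda list output follows; parse_conda_list and cuda_related are hypothetical helpers, and the three-column name/version/build layout is assumed from the lines above.

```python
def parse_conda_list(text):
    """Map package name -> (version, build) from `conda list` output lines.

    Hypothetical helper; assumes whitespace-separated name, version, build.
    """
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comments conda prints
        parts = line.split()
        if len(parts) >= 3:
            packages[parts[0]] = (parts[1], parts[2])
    return packages


def cuda_related(packages):
    """Keep only packages whose name suggests a CUDA dependency."""
    prefixes = ("cuda", "cudnn", "cupti")
    return {name: info for name, info in packages.items()
            if name.startswith(prefixes)}
```

The build string of cudnn (cuda10.0_0 here) indicates which CUDA version the package was built against.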
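These versions seem too high (see https://blog.csdn.net/li57681522/article/details/82491617).
The installed cudatoolkit and cudnn packages are for CUDA 10.0.
But in fact, I never installed CUDA 10.0 at all.
So I tried uninstalling them: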
conda uninstall cudnn
Fetching package metadata ...........
Solving package specifications: .
Package plan for package removal in environment /data_d/old_home/home/xhs/.conda/envs:
The following packages will be REMOVED:
    cudnn: 7.3.1-cuda10.0_0
Proceed ([y]/n)? y
conda uninstall cudatoolkit
Fetching package metadata ...........
Solving package specifications: .
Package plan for package removal in environment /data_d/old_home/home/xhs/.conda/envs:
The following packages will be REMOVED:
    cudatoolkit: 10.0.130-0
    cupti: 10.0.130-0
Proceed ([y]/n)? y
Source: https://www.cnblogs.com/BlueBlueSea/p/10989917.html