RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changi
Author: Internet
While training a network model recently, I ran into the following error:
/home/xw/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448265233/work/c10/cuda/CUDAFunctions.cpp:115.)
return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda-11.1'
Traceback (most recent call last):
File "configs/trainval/tinaface/test_widerface.py", line 132, in <module>
main()
File "configs/trainval/tinaface/test_widerface.py", line 125, in main
engine, data_loader = prepare(cfg, args.checkpoint,args.device)
File "configs/trainval/tinaface/test_widerface.py", line 77, in prepare
device = torch.cuda.current_device()
File "/home/xw/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/cuda/__init__.py", line 432, in current_device
_lazy_init()
File "/home/xw/anaconda3/envs/openmmlab/lib/python3.7/site-packages/torch/cuda/__init__.py", line 172, in _lazy_init
torch._C._cuda_init()
RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
Check the CUDA toolkit version with nvcc -V (this reports the compiler toolkit, not the GPU driver itself):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0
Check the GPUs and the driver version: watch nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.103.01 Driver Version: 470.103.01 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:19:00.0 Off | N/A |
| 0% 57C P8 26W / 370W | 11MiB / 10018MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce ... Off | 00000000:1A:00.0 Off | N/A |
| 0% 57C P8 25W / 370W | 11MiB / 10018MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce ... Off | 00000000:67:00.0 Off | N/A |
| 0% 84C P2 275W / 370W | 9381MiB / 10018MiB | 96% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA GeForce ... Off | 00000000:68:00.0 Off | N/A |
| 0% 83C P2 282W / 370W | 7904MiB / 10015MiB | 97% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
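The two outputs above can be compared mechanically: nvcc reports toolkit release 11.1, and the nvidia-smi header reports that driver 470.103.01 supports up to CUDA 11.4. The following is a minimal sketch of that comparison, not part of the original post. The function names are hypothetical, and the "toolkit version must not exceed the driver's CUDA version" rule is only a coarse check under that assumption.

```python
import re


def parse_version(text, pattern):
    """Return (major, minor) from the first regex match, or None if absent."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def toolkit_supported_by_driver(nvcc_output, smi_output):
    """Coarse check: is the nvcc toolkit release <= the CUDA version
    shown in the nvidia-smi header? Returns None if either is unparsable."""
    toolkit = parse_version(nvcc_output, r"release (\d+)\.(\d+)")
    driver_max = parse_version(smi_output, r"CUDA Version:\s*(\d+)\.(\d+)")
    if toolkit is None or driver_max is None:
        return None
    return toolkit <= driver_max


# Lines copied from the outputs shown in this post.
nvcc_output = "Cuda compilation tools, release 11.1, V11.1.74"
smi_output = "| NVIDIA-SMI 470.103.01 Driver Version: 470.103.01 CUDA Version: 11.4 |"

print(toolkit_supported_by_driver(nvcc_output, smi_output))  # True
```

Since this check passes here, a version mismatch is unlikely to be the cause, which is consistent with the fix at the end of the post lying elsewhere.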
Run: print(torch.cuda.is_available())
It fails with the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/kumar/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/cuda/__init__.py", line 430, in current_device
_lazy_init()
File "/home/kumar/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/cuda/__init__.py", line 170, in _lazy_init
torch._C._cuda_init()
RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero.
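When reproducing this, it can help to gather the relevant facts in one place without the script crashing on the first failed call. The sketch below is an illustration, not the original author's code: `probe_cuda` is a hypothetical helper, and it reports only the CUDA_VISIBLE_DEVICES variable named in the error message plus what PyTorch itself reports.

```python
import os


def probe_cuda():
    """Collect CUDA-related facts without raising on a failed CUDA init."""
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version this PyTorch build was compiled against (None for CPU builds).
    report["built_for_cuda"] = torch.version.cuda
    report["is_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    try:
        # This is the call that raised in the traceback above.
        report["current_device"] = torch.cuda.current_device()
    except Exception as exc:  # e.g. RuntimeError when CUDA initialization fails
        report["init_error"] = str(exc)
    return report


for key, value in probe_cuda().items():
    print(f"{key}: {value}")
```

On a machine in the state described here, one would expect `is_available` to be False and `init_error` to carry the "CUDA unknown error" message, although the exact output depends on the PyTorch version.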
Solution:
apt-get install nvidia-modprobe
This fixed the problem in under a minute.
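The post does not explain why this works. One plausible reading is that nvidia-modprobe, NVIDIA's helper for loading the kernel module and creating the /dev/nvidia* device files, was missing, so the device nodes that CUDA needs were not created for the user's process. The sketch below checks for that condition. It is an assumption-based illustration: `check_device_nodes` is a hypothetical helper, and the node names (`nvidiactl`, `nvidia-uvm`) are the ones usually seen on Linux.

```python
import glob
import os
import shutil


def find_nvidia_device_nodes(dev_dir="/dev"):
    """Return the sorted names of entries matching nvidia* under dev_dir."""
    paths = glob.glob(os.path.join(dev_dir, "nvidia*"))
    return sorted(os.path.basename(path) for path in paths)


def check_device_nodes(dev_dir="/dev"):
    """Summarize whether the usual NVIDIA device nodes and helper exist."""
    nodes = find_nvidia_device_nodes(dev_dir)
    return {
        "nodes": nodes,
        "has_control_node": "nvidiactl" in nodes,
        "has_uvm_node": "nvidia-uvm" in nodes,  # used by the CUDA runtime
        "nvidia_modprobe_on_path": shutil.which("nvidia-modprobe") is not None,
    }


for key, value in check_device_nodes().items():
    print(f"{key}: {value}")
```

If `has_uvm_node` is False while nvidia-smi still lists the GPUs, installing nvidia-modprobe as above (or rebooting so the modules load) is a reasonable thing to try.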
Source: https://blog.csdn.net/weixin_44777827/article/details/123232314