PaddlePaddle: paddle.utils.run_check() reports "PaddlePaddle meets some problem with 8 GPUs"

Author: Internet

WARNING:root:PaddlePaddle meets some problem with 8 GPUs. This may be caused by:
1. There is not enough GPUs visible on your system
2. Some GPUs are occupied by other process now
3. NVIDIA-NCCL2 is not installed correctly on your system. Please follow instruction on https://github.com/NVIDIA/nccl-tests
to test your NCCL, or reinstall it following https://docs.nvidia.com/deeplearning/sdk/nccl-install-guide/index.html
WARNING:root:
Original Error is: (External) NCCL error(2), unhandled system error.
[Hint: 'ncclSystemError'. A call to the system failed.] (at /paddle/paddle/fluid/platform/device/gpu/nccl_helper.h:155)

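Solution:

Create the container with the --shm-size 8g option. NCCL uses shared memory (/dev/shm) to communicate between GPU processes on the same machine, and Docker gives a container only 64 MB of /dev/shm by default, which is likely too small for 8 GPUs.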

docker run --name paddle_docker_v2 --gpus all --shm-size 8g -it -v $PWD:/paddle paddlepaddle/paddle:2.3.1-gpu-cuda11.2-cudnn8 /bin/bash
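
To confirm that the option took effect, the size of /dev/shm can be checked from inside the container before running paddle.utils.run_check() again. The following is a minimal sketch using only the Python standard library; the helper names shm_size_gib and shm_is_large_enough and the 1 GiB threshold are illustrative assumptions, not part of PaddlePaddle or NCCL.

```python
import shutil


def shm_size_gib(path="/dev/shm"):
    """Return the total size, in GiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).total / 2**30


def shm_is_large_enough(path="/dev/shm", min_gib=1.0):
    """Return True if the filesystem at `path` is at least `min_gib` GiB.

    The 1 GiB default is an arbitrary illustrative threshold (assumption);
    the original fix simply uses 8 GiB.
    """
    return shm_size_gib(path) >= min_gib


# Example usage inside the container:
#   print(f"/dev/shm: {shm_size_gib():.2f} GiB")
#   if not shm_is_large_enough():
#       print("Shared memory looks small; recreate the container with --shm-size 8g")
```

In a container started with the command above, the reported size should be about 8 GiB; a value near 0.06 GiB suggests the container is still using Docker's 64 MB default.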

Source: https://www.cnblogs.com/codeit/p/16453640.html