TensorFlow : Use Docker Image (GPU)  2021/04/14 |
Install TensorFlow, a machine learning library.
This example downloads the official TensorFlow Docker image and uses TensorFlow from a container.
The Docker image used here is the one with GPU support. |
[1] | |
[2] | Example of using the TensorFlow Docker image (GPU) as the root user. |
# pull the TensorFlow 2.4 image
[root@dlp ~]# podman pull tensorflow/tensorflow:2.4.1-gpu
[root@dlp ~]# podman images
REPOSITORY                       TAG         IMAGE ID      CREATED       SIZE
docker.io/tensorflow/tensorflow  2.4.1-gpu   edb49f6a133b  2 months ago  5.55 GB

# verify with [nvidia-smi]
[root@dlp ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all --rm tensorflow:2.4.1-gpu nvidia-smi
Wed Apr 14 02:40:21 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:05:00.0 Off |                  N/A |
| 26%   34C    P5    24W / 180W |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# verify TensorFlow
[root@dlp ~]# podman run -e NVIDIA_VISIBLE_DEVICES=all --rm tensorflow:2.4.1-gpu \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
2021-04-14 02:40:55.090070: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:40:57.452613: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-14 02:40:57.453836: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-14 02:40:57.682842: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.683734: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2021-04-14 02:40:57.683793: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:40:57.690427: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-04-14 02:40:57.690515: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-04-14 02:40:57.693389: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-14 02:40:57.695669: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-14 02:40:57.703652: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-14 02:40:57.705661: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-04-14 02:40:57.705995: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-04-14 02:40:57.706190: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.707262: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.708184: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-04-14 02:40:57.709312: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-14 02:40:57.709470: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.710391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2021-04-14 02:40:57.710445: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:40:57.710475: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-04-14 02:40:57.710529: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-04-14 02:40:57.710555: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-14 02:40:57.710578: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-14 02:40:57.710602: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-14 02:40:57.710652: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-04-14 02:40:57.710680: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-04-14 02:40:57.710781: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.711748: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:57.712652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-04-14 02:40:57.712736: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:40:58.437327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-14 02:40:58.437377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-04-14 02:40:58.437393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-04-14 02:40:58.437676: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:58.438425: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:58.439151: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:40:58.439816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7446 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:05:00.0, compute capability: 6.1)
tf.Tensor(916.9441, shape=(), dtype=float32) |
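The one-liner above confirms that a tensor operation runs, but the detected device only shows up in the log lines. As a minimal sketch, the script below asks TensorFlow directly which GPUs it can see. It assumes it is saved to a file and run with `python` inside the `tensorflow:2.4.1-gpu` container; the function name `list_gpu_names` is hypothetical, and it returns `None` where TensorFlow is not installed.

```python
def list_gpu_names():
    """Return the names of the GPUs TensorFlow can see, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        # Not running inside the TensorFlow container (or TensorFlow is missing).
        return None
    # list_physical_devices returns PhysicalDevice objects; keep only their names.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    names = list_gpu_names()
    if names is None:
        print("TensorFlow is not installed in this environment")
    elif not names:
        print("TensorFlow is installed, but no GPU is visible")
    else:
        print("Visible GPUs:", ", ".join(names))
```

An empty list inside the container would suggest that the GPU was not passed through, for example because `NVIDIA_VISIBLE_DEVICES` was not set on the `podman run` command line.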
[3] | If SELinux is enabled, the policy must be changed. |
[root@dlp ~]# vi my-python.te
# create a new file with the following content
module my-python 1.0;

require {
        type container_t;
        type xserver_misc_device_t;
        type device_t;
        class chr_file { getattr ioctl map open read write };
}

#============= container_t ==============
allow container_t device_t:chr_file map;
allow container_t device_t:chr_file { getattr ioctl open read write };
allow container_t xserver_misc_device_t:chr_file map;

[root@dlp ~]# checkmodule -m -M -o my-python.mod my-python.te
[root@dlp ~]# semodule_package --outfile my-python.pp --module my-python.mod
[root@dlp ~]# semodule -i my-python.pp |
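Before building the policy module, it can help to confirm which mode SELinux is in on the host. The following is a minimal sketch that reads the kernel's enforce flag. It assumes selinuxfs is mounted at the usual `/sys/fs/selinux` location, and the function name `selinux_mode` is made up for illustration.

```python
from pathlib import Path


def selinux_mode(enforce_file="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive', or 'disabled' for the current host."""
    try:
        value = Path(enforce_file).read_text().strip()
    except OSError:
        # The file is absent when selinuxfs is not mounted, i.e. SELinux is off.
        return "disabled"
    # The kernel writes "1" for enforcing and "0" for permissive.
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print("SELinux mode:", selinux_mode())
```

Denials are only enforced in `enforcing` mode, so that is the case in which the module above is needed for the container to reach the GPU device files.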
[4] | To run containers as a regular user, a configuration change is required. |
[root@dlp ~]# vi /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
# uncomment and change to [true]
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
#alpha-merge-visible-devices-envvars = false

[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"

# log in as any regular user and verify
[cent@dlp ~]$ podman pull tensorflow/tensorflow:2.4.1-gpu
[cent@dlp ~]$ podman images
REPOSITORY                       TAG         IMAGE ID      CREATED       SIZE
docker.io/tensorflow/tensorflow  2.4.1-gpu   edb49f6a133b  2 months ago  5.55 GB

# verify with [nvidia-smi]
[cent@dlp ~]$ podman run --rm --security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
tensorflow:2.4.1-gpu /usr/bin/nvidia-smi
Wed Apr 14 02:48:39 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:05:00.0 Off |                  N/A |
| 27%   34C    P5    19W / 180W |      0MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

# verify with a Hello World test script
[cent@dlp ~]$ podman run -e NVIDIA_VISIBLE_DEVICES=all --rm --security-opt=label=disable \
--hooks-dir=/usr/share/containers/oci/hooks.d/ \
tensorflow:2.4.1-gpu \
python -c "import tensorflow as tf; hello = tf.constant('Hello, TensorFlow World'); tf.print(hello)"
2021-04-14 02:49:23.279865: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:49:25.600040: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-14 02:49:25.602451: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-04-14 02:49:25.826271: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.830279: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2021-04-14 02:49:25.830343: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:49:25.843274: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-04-14 02:49:25.843332: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-04-14 02:49:25.851911: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-14 02:49:25.855268: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-14 02:49:25.864591: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-14 02:49:25.868827: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-04-14 02:49:25.871655: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-04-14 02:49:25.873914: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.878057: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.882200: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-04-14 02:49:25.885576: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-04-14 02:49:25.887802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.891768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:05:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7845GHz coreCount: 15 deviceMemorySize: 7.93GiB deviceMemoryBandwidth: 238.66GiB/s
2021-04-14 02:49:25.891820: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:49:25.891851: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-04-14 02:49:25.891878: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-04-14 02:49:25.891901: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-04-14 02:49:25.891924: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-04-14 02:49:25.891948: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-04-14 02:49:25.891971: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-04-14 02:49:25.891997: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-04-14 02:49:25.912150: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.916250: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:25.920061: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-04-14 02:49:25.921970: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-04-14 02:49:26.648781: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-14 02:49:26.648842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-04-14 02:49:26.648861: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-04-14 02:49:26.654316: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:26.659628: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:26.663482: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-14 02:49:26.667005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7446 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:05:00.0, compute capability: 6.1)
Hello, TensorFlow World |
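The root and regular-user invocations differ only in two extra options, `--security-opt=label=disable` and `--hooks-dir`. As a rough sketch of that difference, the helper below assembles the argument list for either case without running anything. The function name `podman_gpu_command` and the `rootless` parameter are hypothetical, and the helper always passes `NVIDIA_VISIBLE_DEVICES=all`, as the Hello World example does.

```python
import shlex

# Hook directory used by the regular-user examples in this article.
HOOKS_DIR = "/usr/share/containers/oci/hooks.d/"


def podman_gpu_command(image, command, rootless=False):
    """Build the `podman run` argument list for a GPU container (nothing is executed)."""
    args = ["podman", "run", "-e", "NVIDIA_VISIBLE_DEVICES=all", "--rm"]
    if rootless:
        # Extra options used in the regular-user examples.
        args += ["--security-opt=label=disable", f"--hooks-dir={HOOKS_DIR}"]
    args.append(image)
    args += list(command)
    return args


if __name__ == "__main__":
    image = "tensorflow:2.4.1-gpu"
    print(shlex.join(podman_gpu_command(image, ["nvidia-smi"])))
    print(shlex.join(podman_gpu_command(image, ["/usr/bin/nvidia-smi"], rootless=True)))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting issues with the `python -c "..."` one-liners.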