RuntimeError: CUDA error: all CUDA-capable devices are busy or unavailable

Topic: "all CUDA-capable devices are busy or unavailable (46) at line 33" when aligning

Hello,

I have a very simple project with 20 photos which aligns easily on CPU, but fails to align when using the GPU. Since this fails, I cannot do any of the other steps. Any ideas?

Error message: "Error: cudaMemGetInfo(&free_mem_size, &total_mem_size): all CUDA-capable devices are busy or unavailable (46) at line 33"

Specs:

Agisoft Metashape Professional Version: 1.5.5 build 9097 (64 bit)
Platform: Windows
CPU: Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz (server)
CPU family: 6 model: 79 signature: 406F1h
RAM: 511.9 GB
OpenGL Vendor: NVIDIA Corporation
OpenGL Renderer: GeForce GTX TITAN X/PCIe/SSE2
OpenGL Version: 4.6.0 NVIDIA 461.72

Thanks in advance!

Hello Marcel,

I suggest doing a clean driver install and checking whether it helps.

Also, please check whether you observe similar issues in versions 1.6.6 and 1.7.1.

Best regards,
Alexey Pasumansky,
Agisoft LLC

Hello Marcel,

If you plan to update your Windows version, I recommend deactivating the Metashape license first and re-activating it once all the changes are made and one of the latest Metashape versions is installed.
The depth map generation algorithms differ considerably between versions 1.6 and 1.7, so the behavior could be different.

Best regards,
Alexey Pasumansky,
Agisoft LLC

Hello Alexey,

I did the following:

1) Updated Windows 10 to the latest version:
Edition: Windows 10 Pro
Version: 20H2
Installed on: 10.03.2021
OS Build: 19042.867
Windows Feature Experience Pack: 120.2212.551.0

The issue persisted.

2) Installed Nvidia Driver 461.72 (clean install!)

The issue persisted.

3) Installed Metashape 1.6.6 (1.5.5 was deleted in the process)

Ran the procedure twice and got 2 different results (possibly with different data sets):

First: "Error: CUDA_ERROR_UNKNOWN (999) at line 146". Console output in output1.txt

Second: "Error: cudaMemGetInfo(&free_mem_size, &total_mem_size): all CUDA-capable devices are busy or unavailable (46) at line 40". Console output in output2.txt

4) Installed Metashape 1.7.2 (I renamed the 1.6.6 folder to keep both versions)

Ran the procedure twice and got the same 2 results, in different order:

First: "Error: cudaMemGetInfo(&free_mem_size, &total_mem_size): all CUDA-capable devices are busy or unavailable (46) at line 40". Console output in output3.txt

Second: "Error: CUDA_ERROR_UNKNOWN (999) at line 146". Console output in output4.txt
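
Since the two errors alternate between runs, it can help to tally them across several console logs. The following is only a rough sketch: it assumes the console output contains the error lines in exactly the form quoted above, and the names `ERROR_RE`, `parse_errors`, `tally_codes` and `SAMPLE` are made up for illustration. The attached output files are not reproduced here, so the sample text is assembled from the two messages quoted in this post.

```python
import re
from collections import Counter

# Matches messages of the form quoted in this thread, e.g.
#   Error: CUDA_ERROR_UNKNOWN (999) at line 146
# Assumption: every error line in the console log follows this shape.
ERROR_RE = re.compile(
    r"Error: (?P<text>.+?) \((?P<code>\d+)\) at line (?P<line>\d+)"
)


def parse_errors(log_text):
    """Return a list of (error_code, source_line, message_text) tuples."""
    return [
        (int(m.group("code")), int(m.group("line")), m.group("text"))
        for m in ERROR_RE.finditer(log_text)
    ]


def tally_codes(log_text):
    """Count how often each numeric error code appears in the log text."""
    return Counter(code for code, _, _ in parse_errors(log_text))


# Sample built from the two messages quoted above (not a real log file).
SAMPLE = (
    "Error: CUDA_ERROR_UNKNOWN (999) at line 146\n"
    "Error: cudaMemGetInfo(&free_mem_size, &total_mem_size): "
    "all CUDA-capable devices are busy or unavailable (46) at line 40\n"
)

if __name__ == "__main__":
    for code, line, text in parse_errors(SAMPLE):
        print(code, line, text)
```

To use it on real logs, read each saved console file into a string and pass it to `tally_codes`.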

The output of nvidia-smi when the last project was running:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 461.72       Driver Version: 461.72       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT... WDDM  | 00000000:03:00.0  On |                  N/A |
| 22%   47C    P8    27W / 250W |    331MiB / 12288MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1224    C+G   Insufficient Permissions        N/A      |
|    0   N/A  N/A      3524    C+G   ...nputApp\TextInputHost.exe    N/A      |
|    0   N/A  N/A      8920    C+G   ...5n1h2txyewy\SearchApp.exe    N/A      |
|    0   N/A  N/A      9612    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     10652    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     11056    C+G   ...artMenuExperienceHost.exe    N/A      |
|    0   N/A  N/A     12632    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     14696    C+G   ...b3d8bbwe\WinStore.App.exe    N/A      |
|    0   N/A  N/A     16064    C+G   ...e Pro 1.7.2\metashape.exe    N/A      |
+-----------------------------------------------------------------------------+
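
For repeated checks, the same information can be pulled from nvidia-smi in CSV form instead of reading the table by eye. This is a minimal sketch, assuming `nvidia-smi` is on the PATH and that the queried memory fields come back as plain integers (on some setups they may be reported as "[N/A]", which this sketch does not handle). The helper names `FIELDS`, `parse_gpu_line` and `query_gpus` are hypothetical. The compute mode is included because a non-default mode such as exclusive-process is one possible cause of a "busy or unavailable" error; in the table above it is reported as Default.

```python
import subprocess

# Fields requested from nvidia-smi via --query-gpu.
FIELDS = ["name", "driver_version", "memory.used", "memory.total", "compute_mode"]


def parse_gpu_line(line):
    """Parse one CSV line produced with --format=csv,noheader,nounits."""
    values = [v.strip() for v in line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"unexpected nvidia-smi output: {line!r}")
    info = dict(zip(FIELDS, values))
    # With 'nounits', memory values are plain numbers in MiB.
    info["memory.used"] = int(info["memory.used"])
    info["memory.total"] = int(info["memory.total"])
    return info


def query_gpus():
    """Run nvidia-smi and return one dict per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

Running it once before starting Metashape and once after a failed alignment would show whether memory usage or compute mode differs between the two states.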

I don't have any more ideas about what to try. Do you have a guess as to what we could do?

Thanks
Marcel

Hello Marcel,

Which output do you get after creating the main/gpu_enable_cuda tweak and setting its value to False? (The application should be restarted afterwards.)

Best regards,
Alexey Pasumansky,
Agisoft LLC

Hello Alexey,

Thanks for the suggestion. I tried it and the results are below.

However, what I tried first was restarting the computer multiple times and running Metashape each time. It always crashed with either error 999 or error 46. But one time it did not crash and successfully aligned the photos! So I opened the proper project with over 1000 photos without closing Metashape in between, and ran the whole processing without issues. Then I briefly restarted Metashape (the PC kept running), and again it could not complete even a small alignment. It crashed with the same errors as always.

What struck me is that, on the one occasion it did not throw the error at the start, it never threw it later for as long as the software kept running.

Then we did a fresh Windows install and used driver 461.72 (I know that 461.92 has become available in the meantime, but I did not try installing it). With this setup, it is still crashing.

Last thing I tried was disabling CUDA as described here: //www.agisoft.com/forum/index.php?topic=10100.msg46133#msg46133
It worked, and the output is attached. However, we would prefer to use CUDA, as other users have been unhappy with the processing times when using OpenCL (//www.agisoft.com/forum/index.php?topic=10100.msg46142#msg46142). Any suggestions?

Thanks,
Marcel

Hello Marcel,

I don't think there should be a considerable performance difference between the OpenCL and CUDA implementations of the algorithms in version 1.7.

You are referring to the 1.5.0 pre-release version; almost 2.5 years have passed since then.

Best regards,
Alexey Pasumansky,
Agisoft LLC
