Description
I’m using TensorRT with an engine I created using the C++ API. When I try to load the model using nvinfer1::createInferRuntime(), I randomly get the error: createInferRuntime: Error Code 6: API Usage Error (CUDA initialization failure with error: 3). Since this is basically the first TensorRT method I call, I suspect it’s something related to my environment.
The only potentially relevant detail is that I’ve been porting my CMake files and dependencies from Linux to Windows, which required me to change how I enable CUDA. Initially I was using find_package(CUDAToolkit) and then linking with CUDA::cudart. Now I simply use enable_language(CUDA).
Does anyone have an idea of what might cause this error, or if there’s a way to get more information about what’s triggering it?
Environment
TensorRT Version: 10.11, 10.12
GPU Type: NVIDIA GeForce RTX 2080
Nvidia Driver Version: 575.64
CUDA Version: 12.9
cuDNN Version: 9.10.2.21-1
Operating System + Version: Linux arch2080x 6.15.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Tue, 10 Jun 2025 21:32:33 +0000 x86_64 GNU/Linux
Steps To Reproduce
Call nvinfer1::createInferRuntime().
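For reference, a minimal sketch of the failing call; the SimpleLogger below is illustrative, not my actual logger:

```cpp
#include <NvInfer.h>
#include <iostream>
#include <memory>

// Illustrative logger; TensorRT requires an ILogger to create the runtime.
class SimpleLogger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        if (severity <= Severity::kWARNING)
            std::cerr << msg << "\n";
    }
};

int main()
{
    SimpleLogger logger;
    // Essentially the first TensorRT call in the application; this is
    // where "Error Code 6: ... CUDA initialization failure with error: 3"
    // intermittently shows up.
    std::unique_ptr<nvinfer1::IRuntime> runtime{nvinfer1::createInferRuntime(logger)};
    if (!runtime)
    {
        std::cerr << "createInferRuntime() failed\n";
        return 1;
    }
    return 0;
}
```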
Hi @diederick,
The error you’re encountering, “CUDA initialization failure with error: 3”, typically indicates that CUDA itself failed to initialize (CUDA error 3 is cudaErrorInitializationError).
- CUDA Version Compatibility: Ensure that the CUDA version you’re using is compatible with your NVIDIA driver version. The error can occur if there’s a mismatch between the CUDA toolkit version and the NVIDIA driver version; check the CUDA documentation for version compatibility.
- CUDA Initialization: The fact that you’ve changed how you enable CUDA in your project might be the culprit. When using the C++ API, especially in a cross-platform scenario, it’s crucial to ensure that CUDA is properly initialized before using any CUDA or TensorRT functions. Make sure you’re calling cudaFree(0) or a similar function to force CUDA initialization before creating the TensorRT inference runtime (see the sketch after this list).
- DLL Dependencies: On Windows, ensure that all necessary DLLs are in the PATH or in the same directory as your executable. This includes the CUDA DLLs (e.g., cudart64_110.dll for CUDA 11.0) and potentially other dependencies required by TensorRT.
- Static vs. Dynamic Linking: Changing from static to dynamic linking of the CUDA libraries might affect how CUDA is initialized. If you’re dynamically linking, ensure that the CUDA DLLs can be found by your application.
- Environment Variables: Sometimes environment variables can affect how CUDA and TensorRT behave. Check that PATH, CUDA_PATH, and other relevant environment variables are correctly set.
- Driver Updates: Ensure your NVIDIA drivers are up to date. Outdated drivers can cause compatibility issues with the CUDA toolkit and TensorRT.
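A minimal sketch of the initialization check, assuming your application links against the CUDA runtime library (the function name initCuda is illustrative):

```cpp
#include <cuda_runtime_api.h>
#include <iostream>

// Force CUDA initialization before any TensorRT call and report
// the raw CUDA error if it fails.
bool initCuda()
{
    cudaError_t err = cudaFree(nullptr); // harmless call that triggers lazy CUDA init
    if (err != cudaSuccess)
    {
        std::cerr << "CUDA init failed: " << cudaGetErrorName(err)
                  << " (" << static_cast<int>(err) << "): "
                  << cudaGetErrorString(err) << "\n";
        return false;
    }
    return true;
}
```

Calling this at startup, before createInferRuntime(), lets you see the underlying CUDA error code directly instead of TensorRT’s wrapped “Error Code 6” message.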
Additionally, increase the verbosity of TensorRT logging to see if it provides more detailed information about the error. This can be done by using a logger that reports all messages down to Severity::kVERBOSE before creating the inference runtime; a sketch follows.
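A minimal sketch of a logger that does not filter by severity, so kVERBOSE diagnostics become visible (the class name VerboseLogger is illustrative, not part of TensorRT):

```cpp
#include <NvInfer.h>
#include <iostream>

// Logger that forwards every message, including VERBOSE ones,
// so TensorRT's internal diagnostics are visible.
class VerboseLogger : public nvinfer1::ILogger
{
    void log(Severity severity, const char* msg) noexcept override
    {
        // No severity filtering: print everything TensorRT reports.
        std::cerr << "[TRT " << static_cast<int>(severity) << "] "
                  << msg << "\n";
    }
};
```

TensorRT hands every message to the installed ILogger; whether kVERBOSE output appears is decided entirely by your log() implementation, so dropping the usual severity filter is the simplest way to raise verbosity.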
Please let us know if the issue persists.
Thanks
Hi @AakankshaS,
Thanks for your elaborate reply! My main development platform is Linux, which is where I’ve been encountering this issue.
Also, thanks for sharing all these potential issues. What I don’t understand is why this behavior is so random: I can start my application ~50 times without any issues, then it suddenly fails several times in a row, after which it works fine again for many restarts. If I weren’t initializing CUDA correctly, I would expect it to fail every time. Or can CUDA initialization sometimes succeed and sometimes not?
I’ve added maximum logging and will paste some logs soon to get to the bottom of this issue.
Thanks!