NVIDIA NIM API invoked via LangChain returns status code 500

Hi! When I invoke the NVIDIA NIM API (hosted by NVIDIA, not self-hosted) via LangChain, using the meta/llama-3.1-70b-instruct model and parsing the output as structured output, I always get this error:

```
Traceback (most recent call last):
  File "/localhome/wtest/nv_wso copy.py", line 154, in <module>
    agent.invoke(
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1334, in invoke
    for chunk in self.stream(
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1020, in stream
    _panic_or_proceed(all_futures, loop.step)
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/__init__.py", line 1450, in _panic_or_proceed
    raise exc
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/executor.py", line 60, in done
    task.result()
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
  File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.12/site-packages/langgraph/pregel/retry.py", line 26, in run_with_retry
    task.proc.invoke(task.input, task.config)
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2876, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/langgraph/utils.py", line 102, in invoke
    ret = context.run(self.func, input, **kwargs)
  File "/localhome/wtest/nv_wso copy.py", line 104, in respond
    response = structured_llm.invoke(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 2876, in invoke
    input = context.run(step.invoke, input, config, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/langchain_core/runnables/base.py", line 5092, in invoke
    return self.bound.invoke(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 277, in invoke
    self.generate_prompt(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 777, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 634, in generate
    raise e
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 624, in generate
    self._generate_with_cache(
  File "/usr/local/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py", line 846, in _generate_with_cache
    result = self._generate(
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/chat_models.py", line 289, in _generate
    response = self._client.get_req(payload=payload)
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 449, in get_req
    response, session = self._post(self.infer_url, payload)
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 346, in _post
    self._try_raise(response)
  File "/usr/local/lib/python3.12/site-packages/langchain_nvidia_ai_endpoints/_common.py", line 439, in _try_raise
    raise Exception(f"{header}\n{body}") from None
Exception: [500] Internal Server Error
'bool' object has no attribute 'get'
RequestID: 75efc63a-f9c1-4891-b83a-c8a76987c2c8
```

Does anyone know how to fix this error? On a related note, I also noticed (both on NVIDIA's web interface and via the API) that tool calls take very long (around 28 seconds), whereas with Ollama the response time is normal.
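Since the traceback shows the failure happens inside `structured_llm.invoke`, one client-side workaround I'm considering is to skip `with_structured_output` (which seems to route through the server-side tool-calling path) and instead ask the model for JSON in the prompt and parse the reply myself. A minimal stdlib sketch, where the `Answer` fields are purely illustrative, not my real schema:

```python
# Fallback sketch: request JSON in the prompt and parse the model reply
# client-side, instead of relying on with_structured_output.
# The Answer fields below are illustrative placeholders.
import json
from dataclasses import dataclass

@dataclass
class Answer:
    answer: str
    confidence: float

def parse_answer(raw: str) -> Answer:
    """Parse a model reply into Answer, tolerating markdown code fences."""
    text = raw.strip()
    if text.startswith("```"):
        # Drop the opening fence line (e.g. ```json) and the closing fence.
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(text)
    return Answer(answer=data["answer"], confidence=float(data["confidence"]))

# Example with a mocked model reply:
raw_reply = '```json\n{"answer": "Paris", "confidence": 0.97}\n```'
print(parse_answer(raw_reply))  # Answer(answer='Paris', confidence=0.97)
```

This avoids the tool-calling request entirely, at the cost of doing schema validation myself.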

Here's also a screenshot of the LangSmith output for it:

If anyone knows how to solve these problems, please help. Thanks in advance!

Can you provide a bit more context on how this is being called?