onnxruntime-gpu doesn't find libcudnn_adv.so.9 #3

Open
yuzhichang opened this issue Jan 12, 2025 · 1 comment

Comments

@yuzhichang

My Dockerfile:

#FROM pytorch/torchserve:latest-gpu
FROM fabridamicelli/torchserve:latest-gpu-python3.10

ARG NEED_MIRROR=0
ENV DEBIAN_FRONTEND=noninteractive

USER root
SHELL ["/bin/bash", "-c"]

RUN if [ "$NEED_MIRROR" == "1" ]; then \
        sed -i 's|http://archive.ubuntu.com|https://mirrors.tuna.tsinghua.edu.cn|g' /etc/apt/sources.list; \
        pip3 config set global.index-url https://mirrors.pku.edu.cn/pypi/web/simple; \
        pip3 config set global.trusted-host mirrors.pku.edu.cn; \
        export UV_INDEX="https://mirrors.pku.edu.cn/pypi/web/simple"; \
    fi && \
    pip3 install uv && \
    uv pip install -U transformers onnxruntime-gpu numpy pillow levenshtein nltk

COPY config.properties ./

The container fails to initialize a model that depends on onnxruntime-gpu:

2025-01-12T07:22:36,124 [WARN ] W-9001-bge-m3_1.0-stderr MODEL_LOG - 2025-01-12 07:22:36.122822954 [E:onnxruntime:Default, provider_bridge_ort.cc:1862 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1539 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn_adv.so.9: cannot open shared object file: No such file or directory
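
The message above comes from the dynamic loader while libonnxruntime_providers_cuda.so is being opened, so the same lookup can be tried without onnxruntime. A minimal sketch, assuming it is run with the container's Python; `can_load` is a hypothetical helper name, not part of any library, and `ctypes.CDLL` follows the loader's usual search rules when given a bare soname:

```python
import ctypes


def can_load(library: str) -> tuple[bool, str]:
    """Try to open `library` (a bare soname or a full path).

    Returns (True, "") on success, or (False, <loader error text>) on failure.
    """
    try:
        ctypes.CDLL(library)
    except OSError as exc:
        # The loader's own message, e.g. "cannot open shared object file".
        return False, str(exc)
    return True, ""
```

Calling `can_load("libcudnn_adv.so.9")` and then `can_load` with the full path of the file may help separate two cases: if only the bare soname fails, the search path is the likely problem; if the full path fails too, a dependency of the library itself may be missing.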

Some info on this container:

root@73a268bc8e2f:/home/model-server# ls /home/venv/lib
python3.10
root@73a268bc8e2f:/home/model-server# find / -name libcudnn_adv.so.9
/home/venv/lib/python3.10/site-packages/nvidia/cudnn/lib/libcudnn_adv.so.9
root@73a268bc8e2f:/home/model-server# pip list|grep onnxruntime
onnxruntime-gpu          1.20.1
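
Since `find` shows the library inside site-packages rather than in a system library directory, one thing worth checking is whether that directory is on the loader's search path. A rough sketch, assuming the pip-installed NVIDIA packages follow the `nvidia/<package>/lib` layout seen above; both function names are hypothetical, and this is a diagnostic aid, not a confirmed fix:

```python
import os
from pathlib import Path


def nvidia_lib_dirs(site_packages: str) -> list[str]:
    """Return every nvidia/<package>/lib directory under site_packages, sorted."""
    root = Path(site_packages) / "nvidia"
    if not root.is_dir():
        return []
    return sorted(str(p) for p in root.glob("*/lib") if p.is_dir())


def missing_from_loader_path(lib_dirs: list[str], ld_library_path: str) -> list[str]:
    """Return the entries of lib_dirs that do not appear in ld_library_path."""
    present = {entry for entry in ld_library_path.split(os.pathsep) if entry}
    return [d for d in lib_dirs if d not in present]
```

In this container the call would be `nvidia_lib_dirs("/home/venv/lib/python3.10/site-packages")`, with the result compared against `os.environ.get("LD_LIBRARY_PATH", "")`. A missing entry does not prove this is the cause, because a library can also be found by other means, for example if another module has already loaded it into the process.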

However, after switching back to pytorch/torchserve:latest-gpu, everything works:

root@09559d5a29e2:/home/model-server# ls /home/venv/lib
python3.9
root@09559d5a29e2:/home/model-server# pip list|grep onnxruntime
onnxruntime-gpu          1.19.2
root@09559d5a29e2:/home/model-server# find / -name libcudnn_adv.so.9
/home/venv/lib/python3.9/site-packages/nvidia/cudnn/lib/libcudnn_adv.so.9
@fabridamicelli
Owner

Hi @yuzhichang,
thanks for your feedback. I'll take a look and get back to you as soon as I have any insights or questions.
