ImportError: No module named 'nvdiffrast_plugin' #46
I installed nvdiffrast in my own Docker image and installed the dependencies as in the Dockerfile, but this issue still exists.
It looks like the building of the plugin somehow fails silently. This should not happen with the Ninja build system, and without an error message telling what went wrong, it is difficult to debug the issue. Just to double-check: are you seeing this behavior using the provided Docker setup, or only in your own?
I also hit this problem! Could someone tell me how to solve it?
Hi @LCY850729436, can you be a bit more specific? Is this with the Docker configuration provided by us, or in a different environment? If the latter, do you have the Ninja build system installed?
I have solved this problem. I think the cause is the compatibility between the GPU and the CUDA version. This problem occurs when I use a 2080 Ti, but not when I use a Titan.
I use a 3090 GPU.
I use two 2080 Tis on Docker, and the same problem occurred!
Hi everyone, I'm eager to help in solving this problem, but more information is needed about what exactly goes wrong. We know there are plenty of working installations out there, so something must be different in the setups that exhibit this problem. To start, I repeat my question to everyone who experiences this problem: is this with the Docker configuration provided by us, or in a different environment? If the latter, do you have the Ninja build system installed? Second, I would like to ask you to enable verbose output for the plugin build, so that the compiler's error messages become visible. Finally, if someone has seen this problem and found a way to fix it, please share your solution. The error indicates that the nvdiffrast C++/Cuda plugin could not be loaded, and the most likely reason is that it could not be compiled. I imagine this could occur for a variety of reasons, and therefore there could be multiple different root causes for the same issue.
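To illustrate the kind of verbose build being asked for, here is a hedged sketch of a small wrapper around PyTorch's `torch.utils.cpp_extension.load` (the loader that appears in the traceback in this thread); the function name `load_plugin_verbose` is my own, not part of nvdiffrast:

```python
def load_plugin_verbose(name, sources, **kwargs):
    """Build and load a PyTorch C++/CUDA extension with verbose=True so
    the full ninja/compiler log (and any error) is printed instead of
    the build failing silently. Sketch only; not nvdiffrast's own code."""
    import torch.utils.cpp_extension  # deferred so the helper can be defined without torch
    kwargs.setdefault('verbose', True)
    return torch.utils.cpp_extension.load(name=name, sources=sources, **kwargs)
```

Calling this with the same `name`, `sources`, and flags used by `nvdiffrast/torch/ops.py` should surface the underlying compilation error.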
Hi @s-laine, I use the Docker configuration provided by you, as below (excerpt):
ARG BASE_IMAGE=pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
RUN apt-get update && apt-get install -y --no-install-recommends
#x forward update
ENV PYTHONDONTWRITEBYTECODE=1
#for GLEW
#nvidia-container-runtime
#Default pyopengl to EGL for good headless rendering support
COPY docker/10_nvidia.json /usr/share/glvnd/egl_vendor.d/10_nvidia.json
RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple imageio imageio-ffmpeg
COPY nvdiffrast /tmp/pip/nvdiffrast/
And when I run triangle.py, the ImportError happens.
@xjcvip007, thank you for the information. It appears that you are not running the Dockerfile provided in our repo, as the base image in yours is pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel. Can you try the same experiment with a container built using our Dockerfile?
@s-laine, I cannot use your default Dockerfile because of our GPU cloud platform's constraints, so we changed the base image from pytorch/pytorch:1.7.1-cuda11.0-cudnn8-devel to pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel and added some packages for our sshd support, but all the needed config and files are included in the Dockerfile.
I tried this on a Linux machine, and I'm unfortunately unable to replicate the problem even when using your Dockerfile (with the missing backslashes added, and imageio/imageio-ffmpeg installed from the default source).
As the container looks to be fine, I suspect you may have outdated graphics drivers, because those depend on the host operating system instead of the container. Alternatively, building the container does not produce the same result for one reason or another, but I don't know enough about Docker to tell why this might happen. What I don't understand is why there are no useful error messages, so I still don't know what exactly fails when you try to run the example. For reference, below is the exact Dockerfile that I used:
ARG BASE_IMAGE=pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel
FROM $BASE_IMAGE
RUN apt-get update && apt-get install -y --no-install-recommends \
pkg-config \
libglvnd0 \
libgl1 \
libglx0 \
libegl1 \
libgles2 \
libglvnd-dev \
libgl1-mesa-dev \
libegl1-mesa-dev \
libgles2-mesa-dev \
cmake \
curl \
build-essential \
git \
curl \
vim \
wget \
ca-certificates \
libjpeg-dev \
libpng-dev \
apt-utils \
bzip2 \
tmux \
gcc \
g++ \
openssh-server \
software-properties-common \
xauth \
zip \
unzip \
&& apt-get clean
#x forward update
RUN echo "X11UseLocalhost no" >> /etc/ssh/sshd_config \
&& mkdir -p /run/sshd
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
#for GLEW
ENV LD_LIBRARY_PATH /usr/lib64:$LD_LIBRARY_PATH
#nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility,graphics
#Default pyopengl to EGL for good headless rendering support
ENV PYOPENGL_PLATFORM egl
COPY docker/10_nvidia.json /usr/share/glvnd/egl_vendor.d/10_nvidia.json
RUN pip install imageio imageio-ffmpeg
#RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple imageio imageio-ffmpeg
COPY nvdiffrast /tmp/pip/nvdiffrast/
COPY README.md setup.py /tmp/pip/
RUN cd /tmp/pip && pip install .
I built the container with docker build using this Dockerfile, and I also tried launching a shell into the container and running the triangle sample manually; I could not reproduce the error.
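As a quick way to check from inside a container whether the headless-rendering settings from the Dockerfile above are actually in effect, here is a small hypothetical helper (the variable name and vendor-file path are taken directly from the Dockerfile; the function itself is my own sketch):

```python
import os

def check_egl_env():
    """Return a list of problems with the headless EGL setup that the
    Dockerfile configures (PYOPENGL_PLATFORM and the EGL vendor file).
    An empty list means both settings look fine."""
    problems = []
    if os.environ.get('PYOPENGL_PLATFORM') != 'egl':
        problems.append('PYOPENGL_PLATFORM is not set to "egl"')
    if not os.path.isfile('/usr/share/glvnd/egl_vendor.d/10_nvidia.json'):
        problems.append('EGL vendor file 10_nvidia.json is missing')
    return problems
```

Running this before the samples can separate "container misconfigured" from "plugin failed to compile".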
@s-laine thanks for your effort. I will try the Dockerfile with newer graphics drivers; below is my nvidia-smi result:
This appears to be an incompatibility between PyTorch and the C++ compiler in your Linux distribution. A discussion here mentions this error when trying to build PyTorch extensions on Arch Linux. So this issue isn't specific to nvdiffrast; it prevents building any C++-based PyTorch extension on your system. If PyTorch refuses to work with the system compiler, there unfortunately isn't anything we can do about it. We recommend using an Ubuntu distribution, as that's what we have tested everything on.
I have solved the problem. I hit it on Windows, and it was due to ninja failing to compile the plugin.
Got the same problem on Ubuntu 18.04 under WSL2 (Windows Subsystem for Linux), with an RTX 3060 laptop GPU.
OpenGL/Cuda interop isn't currently supported in WSL2, and thus it won't be able to run the OpenGL rasterizer in nvdiffrast. The next release of nvdiffrast will include a Cuda-based rasterizer that sidesteps the compatibility issues on platforms where OpenGL doesn't work. The release should be out early next week.
The Cuda rasterizer is now released in v0.3.0. Documentation notes here.
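For readers on WSL2 or other platforms without OpenGL/Cuda interop, a minimal sketch of preferring the Cuda rasterizer when it is available (this assumes nvdiffrast >= 0.3.0 and a CUDA-capable GPU at call time; the helper name is mine, not part of the library):

```python
def make_raster_context(prefer_cuda=True):
    """Return an nvdiffrast rasterizer context, preferring the Cuda
    rasterizer (available from v0.3.0) over the OpenGL one.
    Sketch only: calling this requires nvdiffrast and a GPU."""
    import nvdiffrast.torch as dr  # deferred import; needs nvdiffrast installed
    if prefer_cuda and hasattr(dr, 'RasterizeCudaContext'):
        return dr.RasterizeCudaContext()
    return dr.RasterizeGLContext()
```

With this, samples like triangle.py can be adapted to run where `RasterizeGLContext` fails.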
Has anyone successfully solved this problem on Windows, or on Linux?
I had an interesting experience with this. I used to download CuDNN and add its path to the system environment variables, but the way I added CuDNN to the environment was wrong, which broke the plugin build. After fixing the environment variable, I manually compiled nvdiffrast and it worked. I think my case may not cover the general case, but I hope sharing it can help people who make the same mistake as me.
@icewired-yy Thanks for the report! Nvdiffrast does not do anything special about CuDNN or look for the related environment variables, but PyTorch's cpp extension builder seems to have some logic related to it here. Upon a quick glance, it looks like PyTorch expects the CuDNN location in the CUDNN_HOME (or CUDNN_PATH) environment variable. Good to have this noted here if others bump into the same issue.
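A small sanity check along these lines; note the variable names `CUDNN_HOME`/`CUDNN_PATH` are what PyTorch's extension builder appears to consult, so treat them as an assumption rather than documented nvdiffrast behavior:

```python
import os

def stale_cudnn_vars():
    """Return CuDNN-related environment variables that point at
    directories which do not exist. PyTorch's extension builder may
    pick these up during the plugin build, so a stale value can break
    compilation even though nvdiffrast itself never reads them."""
    suspects = {}
    for var in ('CUDNN_HOME', 'CUDNN_PATH'):
        path = os.environ.get(var)
        if path and not os.path.isdir(path):
            suspects[var] = path
    return suspects
```

An empty result means these variables are either unset or point at real directories.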
I spent... like 10 hours trying to get this to work today on Windows 10 and Visual Studio 2022 using Git Bash (note the Unix-style /c paths). I was able to solve the ninja compilation issues with:
# fixes functional crtdbg.h basetsd.h
export INCLUDE="/c/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/include:/c/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/ucrt:/c/Program Files (x86)/Windows Kits/10/Include/10.0.22621.0/shared"
# fixes kernel32.Lib ucrt.lib
export LIB="/c/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/lib/x64/:/c/Program Files (x86)/Windows Kits/10/Lib/10.0.22621.0/um/x64/:/c/Program Files (x86)/Windows Kits/10/Lib/10.0.22621.0/ucrt/x64"
There are no more errors building the plugin with ninja. However, I still see the same error when building the plugin from threestudio on export.
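When hand-maintaining path-list variables like `INCLUDE` and `LIB` as above, a typo in any one entry silently breaks the build. Here is a hypothetical helper (my own sketch, not part of any tool) that lists entries which don't exist on disk; note the separator is an assumption, since Git Bash uses `:` while native Windows tools use `;`:

```python
import os

def missing_path_entries(var, sep=os.pathsep):
    """Return the entries of a path-list environment variable
    (e.g. INCLUDE or LIB) that do not exist as directories, so
    misspelled SDK/MSVC paths can be spotted before running ninja."""
    value = os.environ.get(var, '')
    return [p for p in value.split(sep) if p and not os.path.isdir(p)]
```

Running it for both `INCLUDE` and `LIB` before building narrows down which export is wrong.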
Has anyone found a fix?
I cloned this repo and ran the install, but then if I try to import the plugin, I get the same ImportError.
Long story short, nvdiffrast_plugin is built against a different CUDA toolkit than the one your PyTorch build uses.
@s-laine @nurpax @jannehellsten I think this should be mentioned in the official documentation as one of the pre-installation steps. Since many ML practitioners (you included) use different CUDA versions for different projects, guessing where things went wrong can sometimes be tricky.
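A quick way to check for this mismatch is to compare `torch.version.cuda` against the `nvcc` on your PATH. The sketch below only assumes nvcc's standard banner format ("Cuda compilation tools, release X.Y, ..."); the helper names are mine:

```python
import re
import subprocess

def parse_nvcc_release(nvcc_output):
    """Extract the CUDA release (e.g. '11.3') from `nvcc --version` output."""
    m = re.search(r'release (\d+\.\d+)', nvcc_output)
    return m.group(1) if m else None

def nvcc_release():
    """Run `nvcc --version` and return its CUDA release string,
    for comparison against torch.version.cuda."""
    out = subprocess.run(['nvcc', '--version'],
                         capture_output=True, text=True).stdout
    return parse_nvcc_release(out)
```

If `nvcc_release()` and `torch.version.cuda` disagree, the plugin is likely being compiled against the wrong toolkit.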
I have the same problem, also on a 3090; looking at the comments, most people seem to hit it with a 3090.
Successfully solved the problem. Here is my full solution and reference link:
Solved!
I solved this problem. Note that I had one more issue: after installing CUDA 12.1.0, the CUDA_PATH environment variable was not generated automatically in the system environment variables, and I had to point it to the cuda 12.1.0 directory manually. This is also why, during the ninja build, nvcc resolved to conda's nvcc, which caused the ninja compilation to fail!
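To see which `nvcc` the ninja build will actually pick up (conda's versus the system CUDA install), a tiny diagnostic like the following can help; `CUDA_PATH` is the variable discussed above, and the function name is my own:

```python
import os
import shutil

def diagnose_cuda_toolchain():
    """Report the nvcc that is first on PATH and the CUDA_PATH variable.
    If nvcc_on_path lives under a conda environment while CUDA_PATH
    points elsewhere (or is unset), the build may use the wrong toolkit."""
    return {
        'nvcc_on_path': shutil.which('nvcc'),
        'CUDA_PATH': os.environ.get('CUDA_PATH'),
    }
```

Printing this dictionary before installing nvdiffrast makes the mismatch described above easy to spot.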
When I run code in ./samples/torch, there is always an error: No module named 'nvdiffrast_plugin'
Traceback (most recent call last):
File "triangle.py", line 21, in <module>
glctx = dr.RasterizeGLContext()
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 142, in __init__
self.cpp_wrapper = _get_plugin().RasterizeGLStateWrapper(output_db, mode == 'automatic')
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/nvdiffrast/torch/ops.py", line 83, in _get_plugin
torch.utils.cpp_extension.load(name=plugin_name, sources=source_paths, extra_cflags=opts, extra_cuda_cflags=opts, extra_ldflags=ldflags, with_cuda=True, verbose=False)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1091, in load
keep_intermediates=keep_intermediates)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1317, in _jit_compile
return _import_module_from_library(name, build_directory, is_python_module)
File "/opt/conda/envs/fomm/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1706, in _import_module_from_library
file, path, description = imp.find_module(module_name, [path])
File "/opt/conda/envs/fomm/lib/python3.7/imp.py", line 299, in find_module
raise ImportError(_ERR_MSG.format(name), name=name)
ImportError: No module named 'nvdiffrast_plugin'
It seems that some files are missing.
I installed nvdiffrast as instructed in the documentation: cd ./nvdiffrast and pip install .
I uninstalled and reinstalled many times, but the error still exists. I tried installing with CUDA 10.0 + torch 1.6, CUDA 11.1 + torch 1.8.1, and CUDA 9.0 + torch 1.6, but all of these setups show the error. I use an Nvidia 3090 GPU.
Is there anyone who can solve this problem? Thanks.