[BUG]: wrong argtype definitions when calling the c++ lib from python #124

Open
Naplesoul opened this issue Sep 19, 2024 · 0 comments
Describe the bug
Calling convolution() in libintel_npu_acceleration_library.so from Python throws std::bad_alloc.
The likely cause is incorrect argtypes definitions in bindings.py.

To Reproduce
Steps to reproduce the behavior:
Environment: python==3.10, torch==2.4.1+cpu, torchvision==0.19.1+cpu, openvino==2024.3.0, intel_npu_acceleration_library==1.3.0
Run the following Python code:

import torch
from torchvision import models
import intel_npu_acceleration_library

pytorch_model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
optimized_model = intel_npu_acceleration_library.compile(pytorch_model, dtype=torch.int8)
input = torch.rand(1, 3, 224, 224, dtype=torch.float32)
output = optimized_model(input)

output:

terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Possible Reason
Stack trace:
Note: -O0 -ggdb was added in CMakeLists.txt to export debug symbols, and the library was installed from source with python setup.py install.
Running python under gdb gives the following backtrace at the point where std::bad_alloc is thrown:
[image: gdb backtrace]
In frame 15 (convolution), the argument pad_begins_size=140733193388034 (that is, 0x7fff00000002).

In the source of 1.3.0 and later, the argtypes in intel_npu_acceleration_library/backend/bindings.py do not match the parameter types in src/bindings.cpp.

[images: screenshots of the two definitions]

In the Python file the size parameters are declared as ctypes.c_int, while in the C++ source they are declared as size_t. The two types are incompatible: on this platform c_int is 32 bits wide and size_t is 64 bits.
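
To illustrate the mismatch, the value from the backtrace can be split into its 32-bit halves. This is only a minimal sketch. Reading the low half as the intended size is an interpretation, not something the backtrace proves.

```python
import ctypes

# pad_begins_size exactly as reported in the gdb backtrace
observed = 140733193388034

# Split the 64-bit value into the part a 32-bit c_int argument would carry
# and the upper part that a 32-bit argument never defines.
low32 = observed & 0xFFFFFFFF
high32 = observed >> 32

print(hex(observed))
print(low32, hex(high32))

# Widths of the two ctypes types on the current platform, in bytes
print(ctypes.sizeof(ctypes.c_int), ctypes.sizeof(ctypes.c_size_t))
```

The low half is 2, which would be a plausible length for the padding array of a 2D convolution, while the high half (0x7fff) looks like leftover bits that the Python side never wrote. That is consistent with a 32-bit value being read as a 64-bit size_t.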

Possible Solution
Changing these argtypes to ctypes.c_uint64 fixes the bug.
BTW, many other argtypes in bindings.py are also wrong, e.g. dim0 and dim1 in linear(). They have not been fixed in the latest commit.
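
As a rough sketch of the suggested change, a size_t parameter can be declared with ctypes.c_size_t (or ctypes.c_uint64 on a 64-bit platform). The echo_size function below is a hypothetical stand-in built from a Python callback so that the snippet runs without the real library. It is not part of intel_npu_acceleration_library, and the actual convolution() signature is not reproduced here.

```python
import ctypes

# Hypothetical stand-in for a C++ export such as `size_t echo_size(size_t n)`.
# CFUNCTYPE takes the return type first, then the argument types; declaring
# both as c_size_t mirrors a size_t parameter on the C++ side.
ECHO_PROTO = ctypes.CFUNCTYPE(ctypes.c_size_t, ctypes.c_size_t)
echo_size = ECHO_PROTO(lambda n: n)

# For a function loaded from a shared library the same idea would be written
# as an argtypes list (illustrative only, `lib.some_function` is hypothetical):
#   lib.some_function.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

# Largest value representable in size_t on this platform
size_t_bits = 8 * ctypes.sizeof(ctypes.c_size_t)
max_size = 2**size_t_bits - 1

small = echo_size(2)
large = echo_size(max_size)
print(small, large == max_size)
```

Using c_size_t rather than a fixed-width type keeps the Python declaration in step with size_t on whatever platform the library is built for. On 64-bit Linux it is equivalent to the c_uint64 proposed above.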

Platform

  • HW: Intel Core Ultra 9 185H (Meteor Lake)
  • OS: Ubuntu 22.04
  • Version: 1.3.0 and later