Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for fnmatch patterns in --exclude args #411

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rdbisme
Copy link

@rdbisme rdbisme commented Feb 22, 2023

Context: need to build a wheel where I exclude libarrow libraries. I'd love to avoid changing the build script everytime arrow is updated. Using libarrow* would make the repair script more future proof.

The PR also integrates a couple of commit that I needed to add to make checks and test work.

@codecov
Copy link

codecov bot commented Feb 22, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change

Comparison is base (3b26567) 92.47% compared to head (b0e22dc) 92.48%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #411   +/-   ##
=======================================
  Coverage   92.47%   92.48%           
=======================================
  Files          23       23           
  Lines        1289     1290    +1     
  Branches      300      300           
=======================================
+ Hits         1192     1193    +1     
  Misses         56       56           
  Partials       41       41           
Impacted Files Coverage Δ
src/auditwheel/repair.py 86.82% <100.00%> (+0.10%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@auvipy auvipy requested a review from mayeut March 4, 2023 12:25
@mayeut
Copy link
Member

mayeut commented Mar 5, 2023

I'm not sure we want to allow that.

There's probably a high chance that you also require to change a requirement somewhere else (are you using pyarrow to bring libarrow libraries ?) and keeping this a non wildcard feature probably is a good thing to keep things in check.

@rdbisme, please do expand on the use-case if I misinterpreted something here.

+cc @lkollar to get another opinion.

@lkollar
Copy link
Contributor

lkollar commented Mar 5, 2023

I'm with @mayeut here. This could make it really easy to accidentally exclude a library and produce a bad wheel. But yes, maybe we are missing something, so a real-world example would help.

@rdbisme
Copy link
Author

rdbisme commented Mar 5, 2023

The use case is shipping a wheel of a package containing a compiled extension that leverages the Arrow C++ API and links against the libarrow<version>.so and libarrow_python.so libraries coming from pyarrow. Since, indeed, they are shipped by the pyarrow package, we don't want to embed them in the wheel (even if there is a problem with exclude that is: the detected manylinux policy doesn't take into account the excluded libraries, therefore it falls back to manylinux_2_5 while it should be manylinux_2_17 as pyarrow is manylinux_2_17. As a reference, relocate seems to ship a copy_filt_func to handle this?)

Right now, in our dependency specification, we specify just pyarrow. Without version pinning. That means whenever pyarrow will get an update, We'll need to update the cibuildwheel repair script in which I hard-coded libarrow1100.so and libarrow_python.so (or I could hack a script to introspect the version and produce the right name for the library dynamically... but I thought this should be eventually handled at the auditwheel level). Or we will accidentally onboard them and make the wheel very fat.

Allowing fnmatch like patterns, will allow to not touch the repair script anymore potentially. And won´t change the current behaviour, won´t it? People that want to be more safe, will be able to keep listing fully qualified library names?

Otherwise, what workflow do you suggest? Pinning pyarrow version in the runtime dependencies, in the build-system.requires and in the repair script? That looks a lot of places to keep in sync, isn´t it?

@mayeut
Copy link
Member

mayeut commented Apr 8, 2023

@rdbisme,
thanks for the detailed answer & sorry for the long delay to answer it.

the detected manylinux policy doesn't take into account the excluded libraries, therefore it falls back to manylinux_2_5 while it should be manylinux_2_17 as pyarrow is manylinux_2_17 only

I don't have a clear view on this one. There's nothing to say pyarrow couldn't be built with a compatible ABI on glibc linux<2.17 (or even if there's a compatible manylinux_2_5 on PyPI)
The easiest is probably to just use the --only-plat to enforce the wheel only get's the manylinux_2_17 tag rather than manylinux_2_5.manylinux_2_17.

Otherwise, what workflow do you suggest? Pinning pyarrow version in the runtime dependencies, in the build-system.requires and in the repair script?

Yes, you do have a hard requirement on pyarrow version in your wheel, it should be reflected everywhere that matters.
I agree that it's painful but I don't see a safe way around that.
It might not be required in the build-system.requires but in this case it should be dynamically updated in the runtime dependencies and repair script at build time.
The most concerning is to get the upper bound pinning right depending how arrow manages that.

That looks a lot of places to keep in sync, isn´t it?

Yes it does and would most likely require some script to keep in sync rather than manually update all of these places which would be error prone. The build might also be able to generate the runtime dependencies and the repair script for example ?

Or we will accidentally onboard them and make the wheel very fat.

This one could benefit from an fnmatch for a new blacklist option but that would be a new option.

Allowing fnmatch like patterns, will allow to not touch the repair script anymore potentially. And won´t change the current behaviour, won´t it? People that want to be more safe, will be able to keep listing fully qualified library names?

Yes, It would't change the current behavior however, IMHO, accepting fnmatch sends a message that "it's ok to use fnmatching without taking care of version compatibility" and this will ultimately lower the quality of the wheels uploaded to PyPI with end-users trying to figure out how to pin the correct versions of their dependencies when it should be reflected in the wheel's metadata.

@rdbisme
Copy link
Author

rdbisme commented Jun 26, 2023

Or we will accidentally onboard them and make the wheel very fat.

This one could benefit from an fnmatch for a new blacklist option but that would be a new option.
How do you think this blacklist option should work? I can try to implement it (since I'm again dealing with pyarrow and the will of not onboarding the binaries since they're shipped with pyarrow wheel).

How do you think this blacklist option should work? I can try to implement it (since I'm again dealing with pyarrow and the will of not onboarding the binaries since they're shipped with pyarrow wheel).

@kroggen
Copy link

kroggen commented Apr 20, 2024

Why this was not merged yet?
I thought this was already working

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants