Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split file utils into dedicated submodule #1873

Draft
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

MusicalNinjaDad
Copy link
Contributor

@MusicalNinjaDad MusicalNinjaDad commented Jun 12, 2024

This is a first step towards #1857 designed to get feedback and validate the approach.

Considerations

  1. Don't make a breaking change: anyone currently using from cibuildwheel.util import *, from cibuildwheel.util import foo or import cibuildwheel.util should not notice the change. This has driven the implementation of util/__init__.py Based on comments below by @mayeut & @henryiii ... Expect from util import x to be explicit in future: Adjust the relevant imports within the codebase while keeping __all__ in place for cases where from utils import * is in use.
  2. Split by family: grouping functions based on the domain that they support rather than keeping functions together that rely on each other. This means we will be importing between util submodules - via from util.submodule import bar. To me it feels like it is more likely to find valid and obvious fracture plains with this approach. Also if something is worth an independent function then it most likely either expects reuse in multiple contexts, abstracts away something that doesn't belong in the original flow but belongs somewhere else, or both.
  3. Provide insights to developers in IDE: prefer docstrings to inline comments to explain the "what" as these are rendered on mouse-over etc. Use comments to explain the "why" behind specfic design / implementation decisions

Checks performed

  • pylint doesn't produce any E0611 (no-name-in-module) errors
  • pylint doesn't produce any errors including the term "import" which are related to the these changes
  • pytest unit_test passes

Relying on the pipeline to perform more extensive integration testing

@MusicalNinjaDad
Copy link
Contributor Author

@mayeut , @henryiii - does this look like the right sort of approach for splitting out utils? I've detailed my thought patterns in the description and would welcome your insights.

@mayeut
Copy link
Member

mayeut commented Jun 23, 2024

IMHO, we don't need the first item. cibuildwheel is not a library so this module is internal.
I would move download to files.py

@henryiii
Copy link
Contributor

henryiii commented Jul 1, 2024

I think the next step is correcting the imports so we can avoid import *.

@joerick
Copy link
Contributor

joerick commented Jul 2, 2024

Hey @MusicalNinjaDad ! Thanks for putting this together. However, I must say I'm not totally convinced by the need for this change. Admittedly, the utils module is large, but utils are generally self-contained chunks of code that have very few interdependencies - i.e. they're simple. But happy to be outvoted, perhaps I'm missing some extra context here, please let me know if so - I've been out of the project for a few weeks!

(I think perhaps this started from a problem with circular imports. Regarding that, utils should never import another module in the codebase, with the possible exception of a file of basic types or constants or something.)

Having said that, your points around docstrings are valid, and a bit of effort here could improve things for sure.

@henryiii
Copy link
Contributor

henryiii commented Jul 2, 2024

IMO, having a bit more separation here would improve readability and one massive "utils" file for everything is not ideal. (See various articles on this, like https://breadcrumbscollector.tech/stop-naming-your-python-modules-utils/).

I think it will help when this PR is closer to a final form, without the import *s.

@henryiii
Copy link
Contributor

henryiii commented Jul 2, 2024

This could also be done as utils.py and fileutils.py, rather than nested. I like nested, but utils/misc.py seems even worse than utils.py! :D

@MusicalNinjaDad
Copy link
Contributor Author

This could also be done as utils.py and fileutils.py, rather than nested. I like nested, but utils/misc.py seems even worse than utils.py! :D

The goal would be that utils/misc.py is only temporary, and a few more PRs over the (northern) summer would break it down into more meaningful units.

@MusicalNinjaDad
Copy link
Contributor Author

@joerick , I agree this is very much a style question and not a must-do.

As a new contributor I personally found that utils.py came across as a jumble of functions and it was hard jumping around in the file to find the "flow" I was trying to understand. Not a big problem, but "could be better". My personal view on style is: if you need to implement an internal structure and grouping within a module; it's worthwhile thinking about why not to split it into more logical units.

The original need was driven by a circular import with logging.py which needs to use CI-identification functionality from utils. Currently there is no logging used in utils, but I would guess that if / as additional clean-ups are done around consolidating output, error handling, etc. then this kind of circular import risk will grow.

For that reason I'm happy to invest time on this change, if it is considered worthwhile, or, at least, harmless and possibly useful.

Copy link
Contributor Author

@MusicalNinjaDad MusicalNinjaDad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, we don't need the first item. cibuildwheel is not a library so this module is internal.
I would move download to files.py

Done & done :)

@MusicalNinjaDad MusicalNinjaDad force-pushed the MusicalNinjaDad/issue1857 branch from 642a9dc to ea7577d Compare July 7, 2024 13:19
@joerick
Copy link
Contributor

joerick commented Jul 13, 2024

I think that the change is either going to be worth it done at once, or not at all. Here's a go at how we might organise things- how does this feel to people?

util/cmd.py:
    call
    shell

util/file.py:
    download
    extract_zip
    extract_tar
    move_file
    chdir
    resources_dir
    install_certifi_script
    free_thread_enable_313
    test_fail_cwd_file
    DEFAULT_CIBW_CACHE_PATH
    CIBW_CACHE_PATH

util/selectors.py:
    BuildSelector
    TestSelector
    selector_matches

util/ci.py:
    detect_ci_provider
    CIProvider
    fix_ansi_codes_for_github_actions

util/packaging.py:
    virtualenv
    _parse_constraints_for_virtualenv
    _ensure_virtualenv
    find_compatible_wheel
    BuildFrontendName
    BuildFrontendConfig
    DependencyConstraints
    get_pip_version
    combine_constraints
    find_uv
    get_build_verbosity_extra_flags
    split_config_settings

util/helpers.py:
    format_safe
    prepare_command
    strtobool
    unwrap
    parse_key_value_string
    read_python_configs

move to __main__:
    FileReport
    print_new_wheels

move to pyodide.py:
    ensure_node

move to options.py:
    MANYLINUX_ARCHS
    MUSLLINUX_ARCHS

inline where needed:
    IS_WIN

I think that could work. I don't think there are any circular dependencies there, but that is a concern with breaking the file up.

@joerick joerick marked this pull request as draft September 6, 2024 11:29
This was referenced Oct 21, 2024
@henryiii henryiii added the Hold for future release This PR might be complete, but is scheduled to be merged in a future release. Don't merge yet. label Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Hold for future release This PR might be complete, but is scheduled to be merged in a future release. Don't merge yet.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants