Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

oidc-exchange: warn on reusable workflow #306

Open
wants to merge 2 commits into
base: unstable/v1
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ repos:
WPS102,
WPS110,
WPS111,
WPS202,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: I disabled this check because it's noisy on what (IMO) are reasonable levels of complexity within files: it's unhappy that oidc-exchange.py contains 9 > 7 module members, but I think decomposing those members further would make the code harder, not easier to read.

I can re-enable this and figure out a workaround, but I figured I'd leave a rationale here 🙂

Copy link
Member

@webknjaz webknjaz Dec 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@woodruffw I prefer rationale as a code comment where it's disabled. Though, ideally, # noqas should be used in place rather than disabling rules globally. There's also per-file ignores in flake8. But if the complexity level is known to be higher across the project, there's --max-module-members that can be increased instead of disabling the rule too: https://wemake-python-styleguide.readthedocs.io/en/latest/pages/usage/violations/complexity.html#wemake_python_styleguide.violations.complexity.TooManyModuleMembersViolation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for decomposing the members, I don't think it'd make it harder to read. We just have to choose wisely. For example, the helpers for interfacing with GHA (like messages/warnings) could as well be put into a shared module that more scripts would reuse. This wouldn't make it harder to reason about.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example, the helpers for interfacing with GHA (like messages/warnings) could as well be put into a shared module that more scripts would reuse. This wouldn't make it harder to reason about.

That's true, although in that case I think we need to switch to a whole package structure here, rather than just Python files in the repo root 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was going to look into it at some point. Although, having it in the root as the first step isn't half-bad. We may need to set a PYTHONPATH or turn the invocation into a python -m modulename for it to be importable until the structure is changed.

WPS305,
WPS326,
WPS332,
Expand Down
66 changes: 60 additions & 6 deletions oidc-exchange.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,9 @@
import json
import os
import sys
import typing
from http import HTTPStatus
from pathlib import Path
from typing import NoReturn
from urllib.parse import urlparse

import id # pylint: disable=redefined-builtin
Expand Down Expand Up @@ -90,6 +90,28 @@
See https://docs.pypi.org/trusted-publishers/troubleshooting/ for more help.
"""

_REUSABLE_WORKFLOW_WARNING = """
The claims in this token suggest that the calling workflow is a reusable workflow.

In particular, this action was initiated by:

{job_workflow_ref}

Whereas its parent workflow is:

{workflow_ref}

Reusable workflows are **not currently supported** by PyPI's Trusted Publishing
functionality, and are subject to breakage. Users are **strongly encouraged**
to avoid using reusable workflows for Trusted Publishing until support
becomes official.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would

Suggested change
becomes official.
becomes official. Please, do not report bugs if this breaks.

make this message stronger?


For more information, see:

* https://docs.pypi.org/trusted-publishers/troubleshooting/#reusable-workflows-on-github
* https://github.com/pypa/gh-action-pypi-publish/issues/166
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about asking the users to subscribe for updates. Can you think of some wording to communicated that?

"""

# Rendered if the package index's token response isn't valid JSON.
_SERVER_TOKEN_RESPONSE_MALFORMED_JSON = """
Token request failed: the index produced an unexpected
Expand All @@ -110,7 +132,7 @@
""" # noqa: S105; not a password


def die(msg: str) -> NoReturn:
def die(msg: str) -> typing.NoReturn:
with _GITHUB_STEP_SUMMARY.open('a', encoding='utf-8') as io:
print(_ERROR_SUMMARY_MESSAGE.format(message=msg), file=io)

Expand All @@ -122,6 +144,14 @@ def die(msg: str) -> NoReturn:
sys.exit(1)


def warn(msg: str) -> None:
with _GITHUB_STEP_SUMMARY.open('a', encoding='utf-8') as io:
print(msg, file=io)

msg = msg.replace('\n', '%0A')
print(f'::warning::Potential workflow misconfiguration: {msg}', file=sys.stderr)


def debug(msg: str):
print(f'::debug::{msg.title()}', file=sys.stderr)

Expand Down Expand Up @@ -161,13 +191,15 @@ def assert_successful_audience_call(resp: requests.Response, domain: str):
)


def render_claims(token: str) -> str:
def extract_claims(token: str) -> dict[str, typing.Any]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would object work here?

Suggested change
def extract_claims(token: str) -> dict[str, typing.Any]:
def extract_claims(token: str) -> dict[str, object]:

I learned recently that MyPy docs recommend this instead of Any, whenever possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so! TIL they recommend that, I've been using Any for years.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been exposed to more MyPy things this year, between observing what the pytest and aiohttp co-maintainers do, and experimenting a lot with MyPy coverage reports which I didn't know were a thing a few years ago — I even started uploading those to Codecov and will hopefully bring that practice here.

@woodruffw could you also apply object to other places that I didn't explicitly mark?

_, payload, _ = token.split('.', 2)

# urlsafe_b64decode needs padding; JWT payloads don't contain any.
payload += '=' * (4 - (len(payload) % 4))
claims = json.loads(base64.urlsafe_b64decode(payload))
return json.loads(base64.urlsafe_b64decode(payload))


def render_claims(claims: dict[str, typing.Any]) -> str:
def _get(name: str) -> str: # noqa: WPS430
return claims.get(name, 'MISSING')

Expand All @@ -182,6 +214,21 @@ def _get(name: str) -> str: # noqa: WPS430
)


def warn_on_reusable_workflow(claims: dict[str, typing.Any]) -> None:
# A reusable workflow is identified by having different values
# for its workflow_ref (the initiating workflow) and job_workflow_ref
# (the reusable workflow).
if claims.get('workflow_ref') == claims.get('job_workflow_ref'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, store these into vars and reuse below?

return

warn(
_REUSABLE_WORKFLOW_WARNING.format(
job_workflow_ref=claims.get('job_workflow_ref'),
workflow_ref=claims.get('workflow_ref'),
),
)


def event_is_third_party_pr() -> bool:
# Non-`pull_request` events cannot be from third-party PRs.
if os.getenv('GITHUB_EVENT_NAME') != 'pull_request':
Expand Down Expand Up @@ -223,12 +270,19 @@ def event_is_third_party_pr() -> bool:
oidc_token = id.detect_credential(audience=oidc_audience)
except id.IdentityError as identity_error:
cause_msg_tmpl = (
_TOKEN_RETRIEVAL_FAILED_FORK_PR_MESSAGE if event_is_third_party_pr()
_TOKEN_RETRIEVAL_FAILED_FORK_PR_MESSAGE
if event_is_third_party_pr()
else _TOKEN_RETRIEVAL_FAILED_MESSAGE
)
for_cause_msg = cause_msg_tmpl.format(identity_error=identity_error)
die(for_cause_msg)


# Perform a non-fatal check to see if we're running on a reusable
# workflow, and emit a warning if so.
oidc_claims = extract_claims(oidc_token)
warn_on_reusable_workflow(oidc_claims)

# Now we can do the actual token exchange.
mint_token_resp = requests.post(
token_exchange_url,
Expand All @@ -255,7 +309,7 @@ def event_is_third_party_pr() -> bool:
for error in mint_token_payload['errors']
)

rendered_claims = render_claims(oidc_token)
rendered_claims = render_claims(oidc_claims)

die(
_SERVER_REFUSED_TOKEN_EXCHANGE_MESSAGE.format(
Expand Down
Loading