
[frontend] KFP v1 Pipeline Run Details Page Component Status shows "Execution was skipped" and "ML Metadata not found" #11457

Open
ahsan-habib-ta opened this issue Dec 10, 2024 · 0 comments


Environment

  • How did you deploy Kubeflow Pipelines (KFP)?

Kubeflow manifest 1.9.1 (multi-user)
k8s version: 1.29 (AWS EKS)

  • KFP version:
  • kubeflow pipeline 2.3.0
  • Argo-workflow: v3.4.17
  • KFP Python SDK: 1.8.2

Steps to reproduce

  • Create a KFP v1 pipeline based on the hello_world example, with component caching disabled by setting max_cache_staleness = "P0D".
  • Create an Experiment from the UI and create a run for the hello_world pipeline.
  • Go to the Run Details page, wait for the run to complete, and refresh the page.
  • The component status shows "Execution was skipped and outputs were taken from cache" even though the component was actually executed.
  • Clicking the "ML Metadata" tab in the component details section shows "Corresponding ML Metadata not found." even though database inspection shows the metadata exists.
(Screenshot: v1_run_status_and_metadata)

Expected result

The component status should not show the incorrect message "Execution was skipped and outputs were taken from cache", and the ML Metadata tab should display the execution's metadata.

Materials and Reference

This issue is related to Argo Workflows v3.4.17: starting with v3.4.0 (Changelog), Argo Workflows uses pod naming format v2.

Argo Workflows v2 pod-naming format: [workflow-name]-[step-template-name]-[random-number-string] [Ref], which differs from the node_id format [workflow-name]-[random-number-string].

Because of this formatting change, the function wasNodeCached fails to determine the cache status [similar issue].
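To make the mismatch concrete, here is a minimal sketch (the workflow and template names are hypothetical, not from the actual run) deriving the v2 pod name from a node ID:

```typescript
// Hypothetical names for illustration; real values come from the Argo
// WorkflowStatus. Pod-naming v2 inserts the template name before the hash,
// so the pod name no longer equals the node ID.
function podNameV2(nodeId: string, templateName: string): string {
  const parts = nodeId.split('-');
  const hash = parts[parts.length - 1];            // trailing random suffix
  const workflowName = parts.slice(0, -1).join('-'); // [workflow-name]
  return `${workflowName}-${templateName}-${hash}`;
}

const nodeId = 'hello-world-x7k2p';            // [workflow-name]-[random-string]
console.log(podNameV2(nodeId, 'say-hello'));   // hello-world-say-hello-x7k2p
```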

Updating the function as follows fixes the cache status issue:


function wasNodeCached(node: NodeStatus): boolean {
  const artifacts = node.outputs?.artifacts;
  // HACK: There is a way to detect the skipped pods based on the WorkflowStatus alone.
  // All output artifacts have the pod name (same as node ID) in the URI. But for skipped
  // pods, the pod name does not match the URIs.
  // (And now there are always some output artifacts since we've enabled log archiving).
  // Reconstruct the Argo v2 pod name ([workflow-name]-[template-name]-[hash]) so that
  // artifacts keyed by either naming format are recognized.
  const split = node.id.split('-');
  const hash = split[split.length - 1];
  const prefix = split.slice(0, split.length - 1).join('-');
  const pod_name = prefix.concat('-', node.templateName).concat('-', hash);

  return !artifacts || !node.id || node.type !== 'Pod'
    ? false
    : artifacts.some(
        artifact =>
          artifact.s3 && !(artifact.s3.key.includes(node.id) || artifact.s3.key.includes(pod_name)),
      );
}
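As a sanity check, the patched logic can be exercised against mock node objects. The interfaces below are simplified stand-ins for the real Argo/KFP types, and the IDs and keys are made up for illustration:

```typescript
// Simplified stand-ins for the Argo NodeStatus / artifact shapes.
interface S3Artifact { key: string; }
interface Artifact { s3?: S3Artifact; }
interface NodeStatus {
  id: string;
  type: string;
  templateName: string;
  outputs?: { artifacts?: Artifact[] };
}

// Patched cache detection: a node is "cached" only if none of its artifact
// keys contain the node ID or the reconstructed v2 pod name.
function wasNodeCached(node: NodeStatus): boolean {
  const artifacts = node.outputs?.artifacts;
  const split = node.id.split('-');
  const hash = split[split.length - 1];
  const prefix = split.slice(0, split.length - 1).join('-');
  const pod_name = prefix.concat('-', node.templateName).concat('-', hash);
  return !artifacts || !node.id || node.type !== 'Pod'
    ? false
    : artifacts.some(
        a => a.s3 && !(a.s3.key.includes(node.id) || a.s3.key.includes(pod_name)),
      );
}

// Node that actually ran: its artifact key contains the v2 pod name.
const ran: NodeStatus = {
  id: 'hello-world-12345',
  type: 'Pod',
  templateName: 'say-hello',
  outputs: { artifacts: [{ s3: { key: 'artifacts/hello-world-say-hello-12345/main.log' } }] },
};

// Node whose outputs came from cache: the key points at some other pod.
const cached: NodeStatus = {
  id: 'hello-world-67890',
  type: 'Pod',
  templateName: 'say-hello',
  outputs: { artifacts: [{ s3: { key: 'artifacts/other-run-say-hello-99999/main.log' } }] },
};

console.log(wasNodeCached(ran));    // false — executed, not cached
console.log(wasNodeCached(cached)); // true  — outputs taken from cache
```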

Because of the same pod name formatting change, the UI also fails to fetch the Metadata information:

    const selectedExecution = mlmdExecutions?.find(
      execution => ExecutionHelpers.getKfpPod(execution) === selectedNodeId,
    );

The Metadata display issue in the UI can be fixed with the following modification:

    const selectedExecution = mlmdExecutions?.find(
      execution => (ExecutionHelpers.getKfpPod(execution) === selectedNodeId || ExecutionHelpers.getKfpPod(execution) === selectedNodeName),
    );
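A minimal mock of why the fallback comparison finds the execution under v2 naming — note that the execution objects and getKfpPod below are simplified stand-ins for MLMD's Execution protos and the real ExecutionHelpers.getKfpPod, and the IDs are hypothetical:

```typescript
// Stand-in for an MLMD execution record; the real one stores the pod name
// as a custom property read via ExecutionHelpers.getKfpPod.
interface MockExecution { kfpPod: string; }
const getKfpPod = (e: MockExecution): string => e.kfpPod;

const mlmdExecutions: MockExecution[] = [
  { kfpPod: 'hello-world-say-hello-12345' }, // pod name recorded under v2 naming
];

const selectedNodeId = 'hello-world-12345';              // Argo node ID
const selectedNodeName = 'hello-world-say-hello-12345';  // node's v2 pod name

// Matching by node ID alone misses; the fallback on the node name matches.
const selectedExecution = mlmdExecutions.find(
  e => getKfpPod(e) === selectedNodeId || getKfpPod(e) === selectedNodeName,
);
console.log(selectedExecution !== undefined); // true — execution found
```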

Impacted by this bug? Give it a 👍.
