
Add FusedLinearCrossEntropy #2485

Closed · wants to merge 6 commits into from

Conversation

@FindHao (Member) commented Oct 2, 2024

As discussed in pytorch/pytorch#136168, I'm going to migrate the operator benchmarking implementations. This PR adds several implementations of FusedLinearCrossEntropy as a starting example.

Execution command:

python run_benchmark.py triton --op FusedLinearCrossEntropy

Example output:

x_val    LMHeadCE-latency    LigerLMHeadCE-latency    inductor_fused_linear_cross_entropy-latency
-------  ------------------  -----------------------  ---------------------------------------------
      0             98.0041                  389.87                                         95.0412
      1            196.12                    652.619                                       193.219
      2            417.242                  1248.75                                        416.725
      3            824.906                  2356.25                                        809.56
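For context, the three latency columns above correspond to three registered implementations of the op. Below is a minimal sketch of how such a tritonbench operator can register multiple implementations, assuming the torchbenchmark.util.triton_op API (BenchmarkOperator, register_benchmark); the constructor signature, tensor shapes, and model classes are illustrative, not the PR's exact code:

```python
import torch

from torchbenchmark.util.triton_op import BenchmarkOperator, register_benchmark
from liger_kernel.transformers.fused_linear_cross_entropy import (
    LigerFusedLinearCrossEntropyLoss,
)


class TorchLMHeadCE(torch.nn.Module):
    """Baseline: nn.Linear (LM head) followed by CrossEntropyLoss."""

    def __init__(self, H: int, V: int, dtype: torch.dtype):
        super().__init__()
        self.lin = torch.nn.Linear(H, V, bias=False, dtype=dtype)
        self.ce_loss = torch.nn.CrossEntropyLoss()

    def forward(self, x, target):
        return self.ce_loss(self.lin(x), target)


class LigerLMHeadCE(torch.nn.Module):
    """Liger-Kernel's fused linear + cross-entropy; it takes the LM head
    weight directly instead of materializing the full logits tensor."""

    def __init__(self, H: int, V: int, dtype: torch.dtype):
        super().__init__()
        self.lin = torch.nn.Linear(H, V, bias=False, dtype=dtype)
        self.ce_loss = LigerFusedLinearCrossEntropyLoss()

    def forward(self, x, target):
        return self.ce_loss(self.lin.weight, x, target)


class Operator(BenchmarkOperator):
    def __init__(self, tb_args, extra_args):
        super().__init__(tb_args, extra_args)
        # Illustrative hidden/vocab sizes and dtype.
        H, V, dtype = 4096, 128256, torch.bfloat16
        self.baseline_model = TorchLMHeadCE(H, V, dtype).to(self.device)
        self.liger_model = LigerLMHeadCE(H, V, dtype).to(self.device)

    def get_input_iter(self):
        # Each yielded tuple is passed to every registered benchmark.
        for BT in (2**12, 2**13, 2**14, 2**15):
            x = torch.randn(BT, 4096, dtype=torch.bfloat16, device=self.device)
            target = torch.randint(128256, (BT,), device=self.device)
            yield (x, target)

    @register_benchmark(baseline=True)
    def LMHeadCE(self, x, target):
        return lambda: self.baseline_model(x, target)

    @register_benchmark()
    def LigerLMHeadCE(self, x, target):
        return lambda: self.liger_model(x, target)

    @register_benchmark()
    def inductor_fused_linear_cross_entropy(self, x, target):
        # Compile the eager baseline; warmup runs absorb the compile cost.
        compiled = torch.compile(self.baseline_model)
        return lambda: compiled(x, target)
```

In this framework, each method decorated with register_benchmark returns a callable; the harness times it for every input produced by get_input_iter, and baseline=True marks the reference column used for speedup comparisons.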

Reviewed snippet:

from torchbenchmark.util.triton_op import (
    BenchmarkOperator,
    register_benchmark,
)
from liger_kernel.transformers.fused_linear_cross_entropy import LigerFusedLinearCrossEntropyLoss
@xuzhao9 (Contributor) commented Oct 3, 2024

We probably also need to set up installation of liger_kernel (e.g., add it as a submodule, or install it in https://github.com/pytorch/benchmark/blob/main/userbenchmark/triton/install.py).
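A minimal sketch of what that install hook could look like, assuming a plain pip install is sufficient (the helper name is hypothetical; liger-kernel is the PyPI package name):

```python
# Hypothetical helper for userbenchmark/triton/install.py:
# install liger-kernel from PyPI so the operator can import it.
import subprocess
import sys


def install_liger_kernel() -> None:
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "liger-kernel"]
    )
```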

@FindHao (Member, Author) replied:

Added the installation for liger-kernel in 9580b0f.

@FindHao (Member, Author) commented Oct 3, 2024

We need to skip this operator until the pinned transformers version is bumped, because liger-kernel (https://github.com/linkedin/Liger-Kernel/blob/main/pyproject.toml#L23) requires "transformers>=4.44.2".
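One common way to skip the operator gracefully in the meantime is an import guard that marks it unavailable (a sketch, not necessarily what this PR does):

```python
# Sketch: mark the operator unavailable when liger-kernel (and therefore a
# new-enough transformers, >=4.44.2) cannot be imported.
try:
    from liger_kernel.transformers.fused_linear_cross_entropy import (
        LigerFusedLinearCrossEntropyLoss,
    )

    HAS_LIGER_KERNEL = True
except ImportError:
    LigerFusedLinearCrossEntropyLoss = None
    HAS_LIGER_KERNEL = False
```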

Contributor replied:

Bumping the transformers version in #2488.

Contributor replied:

The transformers version has been updated; can you rebase this PR on trunk?

@FindHao (Member, Author) replied:

Done.

@xuzhao9 (Contributor) commented Oct 3, 2024

Is liger_kernel only available in OSS but not in fbcode? If so, we probably need to manually bypass the internal CI: by default, the internal CI loads and tests all operators, but liger_kernel is not available in fbcode.

@facebook-github-bot (Contributor)

@FindHao has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor)

@FindHao merged this pull request in dde8528.
