
e5-mistral evaluation via vLLM #1270

Open · robertgshaw2-neuralmagic wants to merge 2 commits into main
Conversation

robertgshaw2-neuralmagic
  • Launch the server:
    vllm serve intfloat/e5-mistral-7b-instruct
  • Run the eval script (a sketch of such a script follows this list):
    python3 run-e5.py
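The PR's run-e5.py is only excerpted further down, so for context here is a minimal sketch of what an eval script along these lines could look like. It assumes vLLM's OpenAI-compatible /v1/embeddings endpoint on the default port and mteb's `encode()`-based model interface; `VLLMEmbedder`, `STS_TASKS`, and the output folder are illustrative names, and a real script would likely also handle batching and E5's instruction prompts.

```python
# Sketch (not the PR's run-e5.py): evaluate an embedding model served by
# `vllm serve` through vLLM's OpenAI-compatible API, using mteb.
import asyncio

import mteb
import numpy as np
from openai import AsyncOpenAI

MODEL = "intfloat/e5-mistral-7b-instruct"
STS_TASKS = ["STS12", "STS13"]  # illustrative task subset


class VLLMEmbedder:
    """mteb-style encoder that calls a running `vllm serve` instance."""

    def __init__(self, model: str, base_url: str = "http://localhost:8000/v1"):
        self.model = model
        # vLLM's server ignores the API key, but the client requires one.
        self.client = AsyncOpenAI(base_url=base_url, api_key="EMPTY")

    def encode(self, sentences: list[str], **kwargs) -> np.ndarray:
        async def _embed() -> np.ndarray:
            resp = await self.client.embeddings.create(
                model=self.model, input=list(sentences)
            )
            return np.array([d.embedding for d in resp.data])

        return asyncio.run(_embed())


if __name__ == "__main__":
    tasks = mteb.get_tasks(tasks=STS_TASKS)
    evaluation = mteb.MTEB(tasks=tasks)
    evaluation.run(VLLMEmbedder(MODEL), output_folder="results/e5-mistral-vllm")
```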


import numpy as np
from typing import Any
from openai import AsyncOpenAI
Collaborator (inline review comment)
You should move imports of external libraries inside the class to make them optional.
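One way to follow this suggestion (a sketch, not code from the PR; `VLLMEmbedder` and its arguments are illustrative) is to defer the `openai` import to the constructor so the package stays an optional dependency:

```python
class VLLMEmbedder:
    def __init__(self, model: str, base_url: str = "http://localhost:8000/v1"):
        try:
            # Imported here so `openai` is only required when this wrapper is used.
            from openai import AsyncOpenAI
        except ImportError as e:
            raise ImportError(
                "The `openai` package is required for this wrapper: pip install openai"
            ) from e
        self.model = model
        self.client = AsyncOpenAI(base_url=base_url, api_key="EMPTY")
```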

languages=None,
similarity_fn_name=None,)

tasks = mteb.get_tasks(tasks=TASKS)
Collaborator (inline review comment)
When this file is imported it will try to evaluate the model. I think this should be removed.
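A standard fix (sketch only, not the PR's code; `TASKS` and `VLLMEmbedder` are placeholder names, with the wrapper class assumed to be defined earlier in the file) is to put the evaluation behind a `__main__` guard so importing the module has no side effects:

```python
# Run the evaluation only when the file is executed directly
# (python3 run-e5.py), not when it is imported by another module.
import mteb

TASKS = ["STS12", "STS13"]  # illustrative task list


def main() -> None:
    tasks = mteb.get_tasks(tasks=TASKS)
    evaluation = mteb.MTEB(tasks=tasks)
    # VLLMEmbedder is the wrapper class defined earlier in the file.
    evaluation.run(VLLMEmbedder("intfloat/e5-mistral-7b-instruct"))


if __name__ == "__main__":
    main()
```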

@KennethEnevoldsen (Contributor) left a comment

Very happy to see this PR! We already have an implementation of e5 mistral. However, I assume this is notably faster, but it would be nice to have an actual comparison of both scores and speed.

We also implement models in mteb/models. I believe we could do this here as well.

I assume it produces different scores (given vLLM) so it might be worth changing the name (related to #1211), but we can take that after a test.

@robertgshaw2-neuralmagic (Author)
> Very happy to see this PR! We already have an implementation of e5 mistral. However, I assume this is notably faster, but it would be nice to have an actual comparison of both scores and speed.
>
> We also implement models in mteb/models. I believe we could do this here as well.
>
> I assume it produces different scores (given vLLM) so it might be worth changing the name (related to #1211), but we can take that after a test.

Hey @KennethEnevoldsen - apologies, I did not mean to submit this PR upstream; it was just a simple POC for myself to eval the embedding models running in vLLM. I managed to match the scores from running through GritLM on a few STS tasks. I made this to automate correctness testing in vLLM.

If this is something you would like to see in MTEB, I will clean it up

@KennethEnevoldsen (Contributor)
> If this is something you would like to see in MTEB, I will clean it up

If you can get it to match the performance at a faster speed I would love to have it added (cc @Muennighoff; it might be worth integrating this in GritLM)
