e5-mistral evaluation via vLLM #1270
base: main
Conversation

robertgshaw2-neuralmagic commented Oct 2, 2024
- Launch server:
- Run eval script:
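The two steps above could look roughly like the following. This is a sketch, not the PR's exact commands: the model id `intfloat/e5-mistral-7b-instruct` and the script name `eval_mteb.py` are assumptions, and the vLLM server flags may differ by version.

```shell
# Step 1 (assumed): launch vLLM's OpenAI-compatible server with the e5-mistral
# embedding model. The exact entrypoint/flags depend on the vLLM version.
python -m vllm.entrypoints.openai.api_server \
    --model intfloat/e5-mistral-7b-instruct

# Step 2 (assumed): run the eval script from this PR against the local server.
# `eval_mteb.py` is a placeholder for the script added in this PR.
python eval_mteb.py --base-url http://localhost:8000/v1
```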
```python
import numpy as np
from typing import Any
from openai import AsyncOpenAI
```
You should move the imports of external libraries inside the class so that the dependency stays optional.
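The suggested pattern can be sketched like this: defer the `openai` import to `__init__` so that `mteb` only requires the package when this model wrapper is actually used. The class name and constructor arguments here are hypothetical, not the PR's actual interface.

```python
class VLLMEmbeddingModel:
    """Sketch of a model wrapper with a lazily imported external dependency."""

    def __init__(self, base_url: str = "http://localhost:8000/v1"):
        # Import inside the class so `openai` is only needed when this
        # wrapper is instantiated, not when the module is imported.
        try:
            from openai import AsyncOpenAI
        except ImportError as e:
            raise ImportError(
                "This model requires the `openai` package; "
                "install it with `pip install openai`."
            ) from e
        self._client = AsyncOpenAI(base_url=base_url, api_key="EMPTY")
```

Defining or importing the class never touches `openai`; only instantiation does, and a missing dependency produces an actionable error message.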
```python
languages=None,
similarity_fn_name=None,)

tasks = mteb.get_tasks(tasks=TASKS)
```
When this file is imported, it will try to evaluate the model. I think this should be removed.
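The standard fix is a `__main__` guard, so the evaluation only runs when the file is executed as a script. The names below (`TASKS`, `run_evaluation`) are placeholders standing in for the PR's actual module-level code:

```python
# Hypothetical task list standing in for the PR's TASKS constant.
TASKS = ["STS12", "STS13"]

def run_evaluation(tasks: list[str]) -> list[str]:
    # Placeholder for the real work, i.e. mteb.get_tasks(tasks=tasks)
    # followed by running the benchmark against the vLLM server.
    return [f"evaluated {t}" for t in tasks]

if __name__ == "__main__":
    # Runs only under `python eval_script.py`, never on `import eval_script`.
    for result in run_evaluation(TASKS):
        print(result)
```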
Very happy to see this PR! We already have an implementation of e5-mistral. I assume this one is notably faster, but it would be nice to have an actual comparison of both scores and speed.
We also implement models in mteb/models; I believe we could do that here as well.
I assume it produces different scores (given vLLM), so it might be worth changing the name (related to #1211), but we can take that up after a test.
Hey @KennethEnevoldsen - apologies, I did not mean to submit this PR upstream; it was just a simple POC for myself to evaluate embedding models running in vLLM. I managed to match the scores produced through GritLM on a few STS tasks. I made this to automate correctness testing in vLLM. If this is something you would like to see in MTEB, I will clean it up.
If you can get it to match the performance at a faster speed, I would love to have it added (cc @Muennighoff; it might be worth integrating this into GritLM).