[Question] How to get runtime stats in serve mode? #3052

Open
rankaiyx opened this issue Dec 1, 2024 · 1 comment
Labels
question Question about the usage

Comments

rankaiyx commented Dec 1, 2024

❓ General Questions

How can I see the generation speed in serve mode?
Can serve mode print this information to the console, the way chat mode does?

rankaiyx commented Dec 12, 2024

I noticed this endpoint in the following script, but it now seems to return `Runtime stats: {'detail': 'Not Found'}`.

https://raw.githubusercontent.com/mlc-ai/mlc-llm/refs/heads/main/examples/rest/python/sample_client.py

```python
import requests

# Get the latest runtime stats
# (`color` is a formatting helper defined earlier in sample_client.py)
r = requests.get("http://127.0.0.1:8000/stats")
print(f"{color.BOLD}Runtime stats:{color.END} {r.json()}\n")
```

Has this endpoint been removed?
Is there a way to enable it, or an alternative way to get these stats?
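In the meantime, one client-side workaround (not an answer from this thread, just a sketch): since `mlc_llm serve` exposes an OpenAI-compatible `/v1/chat/completions` endpoint, you can estimate decode speed by timing a streaming request yourself. The base URL and model name below are placeholders, and counting SSE chunks as tokens is only an approximation (each content chunk usually carries about one token).

```python
# Client-side decode-speed estimate against an OpenAI-compatible streaming
# endpoint. Stdlib only; assumes the server follows the standard SSE
# "data: {...}" / "data: [DONE]" chat-completions framing.
import json
import time
import urllib.request


def tokens_per_second(n_tokens: int, elapsed_s: float) -> float:
    """Throughput helper; returns 0.0 for a zero or negative interval."""
    return n_tokens / elapsed_s if elapsed_s > 0 else 0.0


def measure_stream(base_url: str, model: str, prompt: str) -> float:
    """Stream a completion and estimate tokens/sec from the chunk count."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(
            {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
                "stream": True,
            }
        ).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    n_chunks = 0
    start = time.time()
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # iterate SSE lines as they arrive
            line = raw.strip()
            if not line.startswith(b"data: "):
                continue
            payload = line[len(b"data: "):]
            if payload == b"[DONE]":
                break
            chunk = json.loads(payload)
            if chunk["choices"][0]["delta"].get("content"):
                n_chunks += 1
    return tokens_per_second(n_chunks, time.time() - start)


# Example (requires a running server; model name is a placeholder):
# speed = measure_stream("http://127.0.0.1:8000", "my-model", "Hello!")
# print(f"~{speed:.1f} tok/s")
```

This measures end-to-end time including prefill, so it slightly understates pure decode speed for long prompts.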

@rankaiyx rankaiyx changed the title [Question] How to know the generation speed in serve mode? [Question] How to get runtime stats in serve mode? Dec 12, 2024