feat: latest comfyui; fix: better GPU utilization for SD15 #270
Conversation
Despite the name, `--novram` still allows the GPU to be used. However, comfyui uses this flag to much more aggressively avoid leaving tensors in VRAM. The hope is that this will reduce VRAM OOMs and/or shared memory usage (on Windows).
With some recent comfyui changes, the logic prior to this commit was not aggressive enough to avoid OOMs when relying on comfyui's internal decision-making alone. This commit causes the worker to unload models from VRAM immediately after an inference result (if the model is not about to be used again) and right before post-processing. Post-processing as implemented today almost always overestimates the amount of free VRAM and tends to cause OOMs or shared memory usage (on Windows), so more proactively unloading the model should help minimize that problem.
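The unload strategy described above can be sketched as follows. This is a minimal illustration of the control flow only; `InferenceProcess`, `run_job`, and `unload_models_from_vram` are hypothetical stand-ins, not the worker's actual API.

```python
# Hypothetical sketch of the proactive VRAM unload strategy.
# All names here are illustrative, not the worker's real classes.

class InferenceProcess:
    def __init__(self):
        self.model_in_vram = None
        self.unload_count = 0

    def unload_models_from_vram(self):
        """Drop model weights from VRAM (delegating to comfyui in reality)."""
        if self.model_in_vram is not None:
            self.model_in_vram = None
            self.unload_count += 1

    def run_job(self, model_name, next_model_name=None):
        self.model_in_vram = model_name
        result = f"image generated by {model_name}"
        # Unload immediately after the inference result, unless the same
        # model is about to be used by the next queued job.
        if next_model_name != model_name:
            self.unload_models_from_vram()
        return result

    def post_process(self, result):
        # Unload right before post-processing: its free-VRAM estimate
        # tends to be optimistic, so free as much as possible first.
        self.unload_models_from_vram()
        return result + " (post-processed)"
```

Keeping the model resident only when the next job reuses it trades some reload latency for a much lower chance of OOM during post-processing.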
The worker seems to hold onto too much system RAM on average. I previously relied on comfyui internals to handle this implicitly, but recent changes seem to have broken some assumptions I was making. This is a purposely over-zealous attempt to keep system RAM usage down.
Redefines the broken existing `high_memory_mode` to leverage the recent memory management extension.
This will clarify when situations such as the shared model manager failing to load or no models being found occur (e.g., when download_models.py isn't run).
- More fallback logic if there are jobs popped and processes available, but nothing happening.
- Resolves certain problems with the unresponsive logic:
  - The case of it ending all jobs after a long period of "No Job" messages from the server followed by successful pops.
- Now no longer shuts down in error while processes are restarting.
Tracks the time spent without any available jobs. This will help worker operators identify potential issues with their configuration. A warning will be logged if the worker spends more than 5 minutes without any jobs, suggesting possible actions to increase job demand.
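The idle-time tracking above could look something like the following sketch. The 5-minute threshold comes from the description; the class name, method names, and warning text are illustrative assumptions, not the worker's actual implementation.

```python
# Minimal sketch of the "time without jobs" warning described above.
# NoJobTracker and its methods are hypothetical names for illustration.
import time

NO_JOB_WARNING_SECONDS = 5 * 60  # warn after 5 minutes without any jobs


class NoJobTracker:
    def __init__(self, now=None):
        self.last_job_time = now if now is not None else time.monotonic()
        self.warned = False

    def on_job_received(self, now=None):
        """Reset the idle clock whenever a job pop succeeds."""
        self.last_job_time = now if now is not None else time.monotonic()
        self.warned = False

    def check(self, now=None):
        """Return a warning message once per idle period, else None."""
        now = now if now is not None else time.monotonic()
        idle = now - self.last_job_time
        if idle > NO_JOB_WARNING_SECONDS and not self.warned:
            self.warned = True
            return (
                f"No jobs for {idle:.0f}s; consider offering more models "
                "or loosening restrictions to increase job demand."
            )
        return None
```

Emitting the warning only once per idle period (via the `warned` flag) keeps the log readable when the horde is simply quiet.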
@CodiumAI-Agent /review

PR Reviewer Guide 🔍 (Review updated until commit f89812c)
Changes/fixes:
- Adds "Flux.1-Schnell fp8 (Compact)" to your `models_to_load` to offer.
- Updates `bridgeData_template.yaml` for clarity and new configuration options: `extra_slow_worker`, `limit_max_steps`, `unload_from_vram_often`, `high_memory_mode`.
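The new configuration options might appear in a worker's `bridgeData.yaml` along these lines. This is a sketch only: the values and the comments describing each option are my assumptions from the option names and the PR text; `bridgeData_template.yaml` is the authoritative source for keys, defaults, and documentation.

```yaml
# Illustrative fragment; consult bridgeData_template.yaml for the
# authoritative keys, comments, and default values.
extra_slow_worker: false      # assumption: advertise the worker as extra slow
limit_max_steps: false        # assumption: cap the steps accepted per job
unload_from_vram_often: true  # assumption: proactively unload models from VRAM
high_memory_mode: false       # assumption: enable the memory management extension
```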
Relies on:
- 7df42b9a
- hordelib#308 (`interrupt_current_processing` from comfyui)
- hordelib#311 (`aggressive_unloading` arg to `HordeLib.__new__(...)`)
- hordelib#319
- ca085976
- hordelib#328