The hang detection that was implemented in #88 only looks at the (lack of) output to stdout/stderr to detect whether the MPI program is hanging or not.
This may lead to too many false positives: CP2K, for example, can redirect its output to a file via its -o option, in which case nothing is written to stdout/stderr anymore.
Can we do better?
Workaround in case this occurs: use mympirun --disable-output-check-fatal
@stdweird Thoughts on this? Can we do better, e.g. by somehow (reliably...) detecting whether or not the program is creating output files?
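To make the "is the program creating output files?" idea concrete, here is a minimal sketch that checks whether any file under a working directory was modified recently. The function name `recent_file_activity` and its parameters are hypothetical and are not part of mympirun. It also assumes the program writes somewhere under the given directory, which is exactly the part that is hard to guarantee.

```python
import os
import time


def recent_file_activity(workdir, window_seconds, now=None):
    """Return True if any file under workdir was modified within the last window_seconds.

    Hypothetical helper for illustration only; not mympirun code.
    """
    now = time.time() if now is None else now
    for root, _dirs, files in os.walk(workdir):
        for name in files:
            try:
                mtime = os.stat(os.path.join(root, name)).st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            if now - mtime <= window_seconds:
                return True
    return False
```

A check like this could be combined with the stdout/stderr check, so that a run only counts as silent when neither shows activity. It would still miss output written outside the directory or held in buffers, so it reduces false positives rather than eliminating them.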
boegel changed the title from "hang detection solely based on lack of output to stdout/stderr not good enough?" to "hang detection solely based on lack of output to stdout/stderr not good enough" on Jul 7, 2017
yes, don't enable the check by default. i was never a big fan of this default, even with your anecdotal evidence.
replace the current default with a mode that only logs to syslog. we will evaluate in 3 to 6 months how often this actually reported an issue. worst case, we can couple it with the monitoring to notify us and/or the user.
what you are trying to solve can only be handled at the MPI lib level.
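A rough sketch of what such a log-only default might look like, using Python's standard `syslog` module. The names `syslog_report` and `on_no_output` are hypothetical and do not reflect how mympirun is actually structured. The reporting function is passed in as a parameter so the behaviour can be exercised without a syslog daemon.

```python
import syslog


def syslog_report(message):
    """Send a warning to the local syslog daemon (Unix only)."""
    syslog.syslog(syslog.LOG_WARNING, message)


def on_no_output(seconds_silent, fatal=False, report=syslog_report):
    """React to a period without stdout/stderr output.

    Hypothetical helper for illustration only; not mympirun code.
    Always reports the event; returns True only if the caller should
    terminate the MPI program (fatal mode), False in log-only mode.
    """
    message = "no stdout/stderr output for %d seconds; possible hang" % seconds_silent
    report(message)
    return fatal
```

With `fatal=False` as the default, a silent run is only recorded, which would allow counting how often the check fires before deciding whether killing the job is justified.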