Feature #10899
closed
Specify meaningful reasons for job termination
Added by Evgeny Novikov over 3 years ago.
Updated over 2 years ago.
Description
Recently Scheduler began to report issues with computational resources in a user-friendly way when starting a new job solution (#8286). The same should hold when jobs are terminated. At the moment the message looks something like:
Job failed 3a599cb3-d143-4e07-8bb5-ebb637750385: 'Execution of job 3a599cb3-d143-4e07-8bb5-ebb637750385 terminated with an exception: Exited with exit code: 1'
that does not help users much.
This happens at least when memory or disk space is exhausted. Sometimes it can take a lot of time to understand the actual cause.
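To illustrate the kind of message this request is after, here is a minimal Python sketch. It is not the Scheduler's actual code: the function name `describe_termination`, its parameters, and the message wording are all hypothetical, and it assumes the scheduler already knows the measured memory and disk usage and the corresponding limits.

```python
import signal


def describe_termination(job_id, exit_code,
                         memory_used=None, memory_limit=None,
                         disk_used=None, disk_limit=None):
    """Build a human-readable termination reason (hypothetical helper).

    All sizes are in bytes. A limit check is skipped when either the
    measured value or the limit is unknown (None).
    """
    reasons = []

    # Report exhausted resources first, since they are the likely root cause.
    if memory_used is not None and memory_limit is not None \
            and memory_used >= memory_limit:
        reasons.append(
            f"memory limit exhausted ({memory_used} of {memory_limit} bytes)")
    if disk_used is not None and disk_limit is not None \
            and disk_used >= disk_limit:
        reasons.append(
            f"disk space limit exhausted ({disk_used} of {disk_limit} bytes)")

    # Assumption: exit codes follow the subprocess convention, where a
    # negative value -N means the process was killed by signal N.
    if exit_code < 0:
        try:
            name = signal.Signals(-exit_code).name
        except ValueError:
            name = f"signal {-exit_code}"
        reasons.append(f"killed by {name}")

    # Fall back to the bare exit code only when nothing better is known.
    if not reasons:
        reasons.append(f"exited with exit code {exit_code}")

    return f"Job {job_id} terminated: " + "; ".join(reasons)
```

With such a helper, a job that hit its memory limit would be reported with the limit and the measured usage rather than only "Exited with exit code: 1".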
- Target version changed from 3.3 to 3.4
- Status changed from New to Resolved
- Assignee changed from Ilja Zakharov to Evgeny Novikov
I did this, along with a bunch of minor related fixes/improvements, in branch more-user-friendly-scheduler. Let's see the testing results.
- Status changed from Resolved to Closed
Tests passed. So I merged the branch to master in fa62e27.
BTW, I got rid of the option "Ignore other instances of Klever Core", since Native Scheduler now always removes former job/task working directories if they exist, and Core should not care about them anymore. Rationale: in most cases these working directories were removed anyway (the option was enabled by default in debug mode, while in production mode working directories are removed completely). So this change simplified the workflow and allowed us to get rid of some code in both Native Scheduler and Bridge/Core.