Feature #10899

closed

Specify meaningful reasons for job termination

Added by Evgeny Novikov over 2 years ago. Updated about 2 years ago.

Status: Closed
Priority: High
Category: Scheduling
Target version:
Start date: 08/05/2021
Due date:
% Done: 0%
Estimated time:
Published in build:

Description

Recently the Scheduler began to report issues with computational resources in a readable way when it starts solving a new job (#8286). The same should happen when jobs are terminated. At the moment users get something like:

Job failed 3a599cb3-d143-4e07-8bb5-ebb637750385: 'Execution of job 3a599cb3-d143-4e07-8bb5-ebb637750385 terminated with an exception: Exited with exit code: 1'

which does not help users much.

Actions #1

Updated by Evgeny Novikov over 2 years ago

This is the case at least when a job runs out of memory or disk space. Sometimes it takes a lot of time to understand what went wrong.
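To illustrate the kind of message that would help here, below is a minimal sketch in Python. It is not the Scheduler's actual code: the function name `describe_termination`, its parameters, and the message wording are all hypothetical, and it assumes the scheduler already knows the resource usage and limits of the terminated job.

```python
from typing import Optional


def describe_termination(exit_code: int,
                         memory_used: int, memory_limit: int,
                         disk_used: int, disk_limit: int) -> Optional[str]:
    """Return a human-readable termination reason, or None on success.

    Hypothetical helper: names and message texts are assumptions.
    """
    if exit_code == 0:
        return None
    # Check resource limits first: they explain a failure better
    # than the bare exit code does.
    if memory_used >= memory_limit:
        return "ran out of memory ({} of {} bytes used)".format(
            memory_used, memory_limit)
    if disk_used >= disk_limit:
        return "ran out of disk space ({} of {} bytes used)".format(
            disk_used, disk_limit)
    # Fall back to the exit code only when nothing more specific is known.
    return "exited with exit code {}".format(exit_code)
```

With such a helper, the generic "Exited with exit code: 1" would only be shown when neither limit was reached.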

Actions #2

Updated by Evgeny Novikov over 2 years ago

  • Target version changed from 3.3 to 3.4
Actions #3

Updated by Evgeny Novikov about 2 years ago

  • Status changed from New to Resolved
  • Assignee changed from Ilja Zakharov to Evgeny Novikov

I did this, as well as a bunch of minor related fixes/improvements, in branch more-user-friendly-scheduler. Let's see the test results.

Actions #4

Updated by Evgeny Novikov about 2 years ago

  • Status changed from Resolved to Closed

Tests passed. So I merged the branch to master in fa62e27.

BTW, I got rid of the option "Ignore other instances of Klever Core", since Native Scheduler now always removes former job/task working directories if they exist, and Core should not care about them anymore. Rationale: in most cases these working directories were removed anyway (by default that option was enabled in the debug mode, while working directories are removed completely in the production mode). This change simplified the workflow and made it possible to remove some code in both Native Scheduler and Bridge/Core.
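The "always remove former working directories" behaviour can be sketched roughly as follows. This is only an illustration using the Python standard library, not the code merged in fa62e27; the function name `prepare_work_dir` is hypothetical.

```python
import os
import shutil


def prepare_work_dir(path: str) -> str:
    """Give a job/task a clean working directory (hypothetical helper)."""
    # Remove whatever a former run left behind; a missing directory is fine.
    shutil.rmtree(path, ignore_errors=True)
    # Recreate the directory empty, so later code need not check for leftovers.
    os.makedirs(path)
    return path
```

Doing the cleanup unconditionally in one place is what makes a separate "ignore other instances" switch unnecessary.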
