Feature #8466

closed

Improve error messages when Native Scheduler terminates jobs because of reached computational resource limits

Added by Evgeny Novikov about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: Scheduling
Target version:
Start date: 09/29/2017
Due date:
% Done: 0%
Estimated time:
Published in build:

Description

For instance, when a job runs out of memory, the following error message is reported for the Corrupted status:

Execution of job caf96491839144a5566b4145e95df133 terminated with an exception: Execution of job caf96491839144a5566b4145e95df133 finished with non-zero exit code: 9

This doesn't explain what happened or what users should do.

Moreover, I would like to see dedicated statuses for such cases, since Corrupted is a special status that reflects issues within Core and its components. When a job merely reaches its computational resource limits, there may be no such issues at all.
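
To illustrate the kind of message that would be more helpful, here is a minimal sketch in Python. It is not taken from Klever or Native Scheduler: the function name, the mapping and the wording are hypothetical. It assumes a Unix-like system and that the reported code 9 is the number of the signal that killed the job (SIGKILL), which is commonly what happens when a process exceeds a memory limit.

```python
import signal

# Hypothetical mapping from signal numbers to likely causes.
# These are assumptions for illustration, not Klever's actual logic.
LIKELY_CAUSES = {
    signal.SIGKILL: "it was killed, most likely because it exceeded its memory limit",
    signal.SIGXCPU: "it exceeded its CPU time limit",
    signal.SIGTERM: "it was asked to terminate, possibly because it exceeded a time limit",
}


def explain_exit_code(job_id, exit_code):
    """Return a user-oriented message for a job's non-zero exit code."""
    try:
        # subprocess reports signals as negative codes, hence abs().
        sig = signal.Signals(abs(exit_code))
    except ValueError:
        sig = None

    cause = LIKELY_CAUSES.get(sig)
    if cause is None:
        return "Job {} finished with exit code {}: the reason is unknown.".format(
            job_id, exit_code)
    return ("Job {} finished with exit code {}: {}. "
            "Consider raising the corresponding resource limit.".format(
                job_id, exit_code, cause))


print(explain_exit_code("caf96491839144a5566b4145e95df133", 9))
```

A real implementation would rather rely on what the resource-limiting tool itself reports than guess from the exit code alone.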


Related issues 2 (0 open, 2 closed)

Related to Klever - Bug #8460: Klever stops solving tasks (Closed, Ilja Zakharov, 09/27/2017)

Related to Klever - Feature #8467: Reduce the default number of task generator workers (Closed, Evgeny Novikov, 09/29/2017)

Actions #1

Updated by Ilja Zakharov about 7 years ago

  • Status changed from New to Resolved
  • Assignee set to Ilja Zakharov

Done in 8466-termination-reason.

Actions #2

Updated by Ilja Zakharov about 7 years ago

I only added an error message. The scheduler still passes the status as "error", which is correct. Providing more information would require either changing the exchange format or parsing error messages. In my opinion, it is better to keep it as is.
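
For reference, the "parse error messages" option could look roughly like the sketch below. It is only an illustration: the pattern is derived from the single example message quoted in the description, and the helper name is hypothetical rather than part of the scheduler.

```python
import re

# Pattern based on the one example message quoted in this issue;
# the real message format may vary.
EXIT_CODE_RE = re.compile(
    r"Execution of job (?P<job>[0-9a-f]+) "
    r"finished with non-zero exit code: (?P<code>-?\d+)"
)


def parse_error_message(message):
    """Return (job_id, exit_code), or None if the message has another shape."""
    match = EXIT_CODE_RE.search(message)
    if match is None:
        return None
    return match.group("job"), int(match.group("code"))


example = (
    "Execution of job caf96491839144a5566b4145e95df133 terminated with an "
    "exception: Execution of job caf96491839144a5566b4145e95df133 finished "
    "with non-zero exit code: 9"
)
print(parse_error_message(example))
```

Such parsing breaks as soon as the message wording changes, which is one argument for extending the exchange format instead.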

Actions #3

Updated by Evgeny Novikov about 7 years ago

  • Priority changed from High to Urgent
  • Target version set to 1.0

Let's include this useful feature in Klever 0.3.

Actions #4

Updated by Evgeny Novikov about 7 years ago

  • Status changed from Resolved to Closed

I merged the branch into master in a530e6c0. The reported error messages still aren't user-friendly, but let's consider that some time later.
