Feature #8286

Clarify what resource limits are exceeded

Added by Evgeny Novikov almost 4 years ago. Updated 27 days ago.

Status: New
Priority: High
Assignee:
Category: Scheduling
Target version:
Start date: 07/11/2017
Due date:
% Done: 0%
Estimated time:
Published in build:

Description

At the moment it is unclear which resource limits are too high, since Native Scheduler doesn't name them in its error message:

Given resource limits are two high, we do not have such amount of resources

Besides, it may also be unclear when there is not enough disk space for solving verification jobs (the error reported in this case is "Execution of job 0dec5cb3-4d64-4cf8-bc44-37e25ebc2d1f terminated with an exception: Exited with exit code: 1").
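To illustrate the kind of message requested here, below is a minimal sketch of a check that names each limit that is too high. It is not the Native Scheduler code: the function names `exceeded_limits` and `check_limits`, the dictionary layout and the limit names used as keys are assumptions made only for this example.

```python
def exceeded_limits(requested, available):
    """Describe every requested limit that exceeds what is available.

    Both arguments are plain dicts mapping a limit name to a number.
    This layout is an assumption for the sketch, not the scheduler's format.
    """
    problems = []
    for name, wanted in requested.items():
        have = available.get(name)
        if have is None:
            problems.append(f"{name}: requested {wanted}, but availability is unknown")
        elif wanted > have:
            problems.append(f"{name}: requested {wanted}, available {have}")
    return problems


def check_limits(requested, available):
    """Raise an error that lists the offending limits instead of a generic message."""
    problems = exceeded_limits(requested, available)
    if problems:
        raise ValueError("Given resource limits are too high: " + "; ".join(problems))


if __name__ == "__main__":
    requested = {"memory size": 10, "number of CPU cores": 2, "disk memory size": 5}
    available = {"memory size": 8, "number of CPU cores": 4, "disk memory size": 5}
    try:
        check_limits(requested, available)
    except ValueError as err:
        print(err)
```

With such a check the user would see which particular limit has to be lowered rather than having to guess.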


Related issues

Has duplicate Klever - Feature #10610: Clarify resource limits error (Rejected, 12/03/2020)

#1

Updated by Evgeny Novikov over 3 years ago

  • Priority changed from Urgent to High

This issue does not have such a high priority.

#2

Updated by Evgeny Novikov 5 months ago

#4

Updated by Evgeny Novikov 3 months ago

  • Target version set to 3.1

Let's do this in Klever 3.1.

#5

Updated by Evgeny Novikov 2 months ago

  • Description updated

#6

Updated by Evgeny Novikov about 2 months ago

  • Target version changed from 3.1 to 3.2

We need to release Klever 3.1 faster due to an incompatibility with Clade 3.3+ and a new OpenStack cloud.

#7

Updated by Evgeny Novikov 27 days ago

Pavel revealed that the same issue also exists when there is not enough disk space to solve verification tasks. In this case RP reports an Unknown with something like this:

Raise exception:
Traceback (most recent call last):
  File "/home/novikov/work/klever/klever/core/components.py", line 395, in run
    self.main()
  File "/home/novikov/work/klever/klever/core/components.py", line 304, in callbacks_caller
    ret = attr(*args, **kwargs)
  File "/home/novikov/work/klever/klever/core/vrp/__init__.py", line 315, in fetcher
    raise RuntimeError('Failed to decide verification task: {0}'.format(self.task_error))
RuntimeError: Failed to decide verification task: Task failed 4214: SchedulerException('Execution of task 4214 terminated with an exception: Exited with exit code: 1')

The clarification can be found only in the scheduler log:
2021-03-26 12:42:27,975 SchedulerClient  INFO> Going to solve a verification task with identifier 4214
2021-03-26 12:42:27,975 SchedulerClient  INFO> Create session for user "service" at Klever Bridge "localhost:8998" 
Reached disk memory limit of 10000B, killing process 13801
root  INFO> Submit information about the workload to Bridge
13802: Cancelling process 13801
13802: Cancellation of 13801 is successfull, exiting
2021-03-26 12:42:29,102 SchedulerClient WARNING> Traceback (most recent call last):
  File "/home/novikov/work/klever/klever/scheduler/client/__init__.py", line 105, in run_benchexec
    exit_code = solve(logger, conf, mode, srv)
  File "/home/novikov/work/klever/klever/scheduler/client/__init__.py", line 136, in solve
    return solve_task(logger, conf, srv)
  File "/home/novikov/work/klever/klever/scheduler/client/__init__.py", line 175, in solve_task
    exit_code = run(logger, args, conf, logger=logger)
  File "/home/novikov/work/klever/klever/scheduler/client/__init__.py", line 358, in run
    ec = execute(args, logger=logger, disk_limitation=dl, disk_checking_period=dcp)
  File "/home/novikov/work/klever/klever/scheduler/utils/__init__.py", line 390, in execute
    raise RuntimeError("Disk space limitation of {}B is exceeded".format(disk_limitation))
RuntimeError: Disk space limitation of 10000B is exceeded
2021-03-26 12:42:29,103 SchedulerClient  INFO> Exiting with exit code 1
root WARNING> Cannot obtain key 'solutions/Klever/4214' from key-value storage: KeyError('Key not found (solutions/Klever/4214)')
root  INFO> Going to check execution of the task 4214
root  INFO> Future processor of task 4214 returned 1
root WARNING> Exited with exit code: 1
root WARNING> Task failed 4214: SchedulerException('Execution of task 4214 terminated with an exception: Exited with exit code: 1')
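The log above shows that the actual reason ("Disk space limitation of 10000B is exceeded") is lost, and only the exit code reaches the final message. One possible way to keep it is sketched below. This is a simplified illustration under assumptions: `TaskFailure` and `run_with_disk_limit` are hypothetical names, and the sketch runs everything in one process, whereas in Klever the client is a separate process, so the reason would have to be passed to the scheduler in some other way.

```python
class TaskFailure(Exception):
    """Hypothetical exception that keeps the exit code and the underlying reason."""

    def __init__(self, task_id, exit_code, reason=None):
        self.task_id = task_id
        self.exit_code = exit_code
        self.reason = reason
        message = (
            f"Execution of task {task_id} terminated with an exception: "
            f"Exited with exit code: {exit_code}"
        )
        if reason:
            # Append the original cause so that it is visible to the user.
            message += f" ({reason})"
        super().__init__(message)


def run_with_disk_limit(task_id, used_bytes, disk_limitation):
    """Stand-in for running a task: fail if the disk usage exceeds the limit."""
    try:
        if used_bytes > disk_limitation:
            raise RuntimeError(f"Disk space limitation of {disk_limitation}B is exceeded")
    except RuntimeError as err:
        # Wrap the low-level error instead of reducing it to exit code 1.
        raise TaskFailure(task_id, 1, reason=str(err)) from err
    return 0


if __name__ == "__main__":
    try:
        run_with_disk_limit(4214, used_bytes=20000, disk_limitation=10000)
    except TaskFailure as err:
        print(err)
```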
