Feature #9164

Do not fail tasks and jobs if BenchExec detects CPU throttling

Added by Evgeny Novikov about 1 year ago. Updated about 1 year ago.

Status: Closed
Priority: Urgent
Assignee:
Category: Scheduling
Target version:
Start date: 07/25/2018
Due date:
% Done: 0%
Estimated time:
Published in build:

Description

Sometimes tasks and jobs fail for the following reason:

2018-07-25 14:39:56 (components.py:435) RP ERROR> Raise exception:
Traceback (most recent call last):
  File "/var/lib/klever/workspace/Branches-and-Tags-Processing/src/core/core/components.py", line 428, in run
    self.main()
  File "/var/lib/klever/workspace/Branches-and-Tags-Processing/src/core/core/components.py", line 332, in callbacks_caller
    ret = attr(*args, **kwargs)
  File "/var/lib/klever/workspace/Branches-and-Tags-Processing/src/core/core/vrp/__init__.py", line 250, in fetcher
    raise RuntimeError('Failed to decide verification task: {0}'.format(self.task_error))
RuntimeError: Failed to decide verification task: Execution of task 758 terminated with an exception: 2018-07-25 14:39:48,786 - WARNING - CPU throttled itself during benchmarking due to overheating. Benchmark results are unreliable!

We definitely need to ignore this.

As an example, I attached the directory of a failed task. Everything in it is fine except for several warnings from BenchExec.


Files

task.tar (690 KB), Evgeny Novikov, 07/25/2018 04:19 PM

Related issues

Related to Klever - Feature #8269: Allow to use swap when solving jobs or tasks (New, 07/03/2017)

History

#1

Updated by Evgeny Novikov about 1 year ago

  • Description updated (diff)
  • Subject changed from Do not fail tasks if BenchExec detects CPU throttling to Do not fail tasks and jobs if BenchExec detects CPU throttling
#2

Updated by Evgeny Novikov about 1 year ago

  • Related to Feature #8269: Allow to use swap when solving jobs or tasks added
#3

Updated by Evgeny Novikov about 1 year ago

  • Target version deleted (1.1)
  • Priority changed from Urgent to High

There are some workarounds to avoid these issues.

#4

Updated by Evgeny Novikov about 1 year ago

  • Target version set to 1.1
  • Priority changed from High to Urgent

Unfortunately, the workarounds do not help, so CI (i.e. all computers with a specific CPU) fails constantly.

#5

Updated by Ilja Zakharov about 1 year ago

  • Status changed from New to Resolved

Implemented in 6614-speculative-scheduling.

I added an option that compares warning messages against a list of messages to ignore. The option "ignore BenchExec warnings" is set in the native scheduler configuration file. Its value is either a list of message fragments to ignore or a boolean that ignores all or none of the warnings. Note that BenchExec itself fails when it prints certain warnings, such as cgroups or swap warnings, so ignoring those messages does not help.
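
For readers who want a picture of how such an option can behave, here is a minimal sketch in Python. It is not the code from the 6614-speculative-scheduling branch: the function names is_ignored and unignored_warnings are hypothetical, and plain substring matching is an assumption about what "parts of messages" means.

```python
from typing import List, Union

# Hypothetical type for the value of the "ignore BenchExec warnings" option:
# either a boolean (ignore all / ignore none) or a list of message fragments.
IgnoreOption = Union[bool, List[str]]


def is_ignored(warning: str, option: IgnoreOption) -> bool:
    """Return True if the warning should not cause the task to fail."""
    if isinstance(option, bool):
        # True ignores every warning, False ignores none.
        return option
    # Assumption: a fragment matches if it occurs anywhere in the warning.
    return any(fragment in warning for fragment in option)


def unignored_warnings(warnings: List[str], option: IgnoreOption) -> List[str]:
    """Keep only the warnings that should still be treated as errors."""
    return [w for w in warnings if not is_ignored(w, option)]


if __name__ == "__main__":
    throttling = ("CPU throttled itself during benchmarking due to "
                  "overheating. Benchmark results are unreliable!")
    # With a fragment list, only matching warnings are dropped.
    print(unignored_warnings([throttling], ["CPU throttled itself"]))
```

Under this sketch, a task would fail only if unignored_warnings returns a non-empty list. As noted above, this cannot rescue runs where BenchExec itself terminates with an error.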

#6

Updated by Evgeny Novikov about 1 year ago

  • Status changed from Resolved to Closed

It works! I merged the branch into master in aa70aed85.
