Feature #8149

closed

Think on proper progress evaluation when several jobs are solved at once or/and much computational resources are available

Added by Evgeny Novikov almost 7 years ago. Updated over 6 years ago.

Status: Closed
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 09/20/2017
Due date: 10/26/2017
% Done: 100%
Estimated time: (Total: 0.00 h)
Published in build:
Description

At the moment, the progress of solving verification tasks and the corresponding remaining times are shown properly only when a single job is being solved and the available computational resources are enough to solve just one verification task at a time. This makes the progress and time estimates almost useless when several jobs are solved at once and/or ample computational resources are available.


Subtasks: 3 (0 open, 3 closed)

Feature #8445: Visualize useful progress | Closed | Vladimir Gratinskiy | 09/20/2017 | 10/26/2017
Feature #8446: Fix and improve progress reporting | Closed | Ilja Zakharov | 10/10/2017
Feature #8491: Update the progress reporting implementation in Bridge | Rejected | 10/10/2017

Related issues: 3 (0 open, 3 closed)

Related to Klever - Bug #8442: Names of components have been changed and tests and hardcoded names in Bridge should be updated | Rejected | 09/19/2017
Has duplicate Klever - Feature #8444: Restore progress calculation | Rejected | 09/19/2017
Blocked by Klever - Feature #8536: Untie coverage report from any components name | Closed | Vladimir Gratinskiy | 10/31/2017 | 11/01/2017

Actions #1

Updated by Evgeny Novikov almost 7 years ago

BTW, the same problem applies to jobs with several sub-jobs.

Actions #2

Updated by Evgeny Novikov almost 7 years ago

  • Assignee deleted (Ilja Zakharov)
  • Priority changed from Urgent to High

There are too many very high priority issues. This one can be done after the really important features are supported.

Actions #3

Updated by Evgeny Novikov over 6 years ago

  • Assignee set to Ilja Zakharov
  • Priority changed from High to Urgent
  • Target version set to 1.0

In addition to the requested improvements, progress calculation itself needs to be fixed (as mentioned in #8444).

Actions #4

Updated by Evgeny Novikov over 6 years ago

Ilja suggests changing the progress API between Bridge and Core so that there are additional report fields intended solely for progress calculation and there are no references to concrete Core component names (something similar to #8442).

Actions #5

Updated by Ilja Zakharov over 6 years ago

To restore progress evaluation, I propose making the following changes in Bridge first:
1) Implement a separate request to Bridge (not a report) with the following data:

{
   "total tasks to be generated": 11111,
   "tasks failed": 122,
   "tasks solved": 500,
   "average wall time to finish": 3766,
   "wall time spend on solution": 2300
}

Core can send this request several times during the solution.
If Core has several sub-jobs, it will either send no data at all or send it as if there were only one sub-job.

2) Bridge should just store and visualize the data as is, calculating the percentage as (failed + solved) * 100 / total.

3) If Core has provided data to Bridge and the job has the PROCESSING status, Bridge should send the received data to the scheduler in the service/get_jobs_and_tasks request by adding the following section:

{
   ....
   "jobs progress": {
      "job_id": {data as sent by Core}
   }
}

To save traffic, the section can be added only when the data has been updated.
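
The proposal above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names `progress_percentage` and `jobs_progress_section` and the `updated_job_ids` parameter are hypothetical and are not taken from Bridge. Only the field names (with "tasks solved" spelled out), the formula and the "jobs progress" section come from the proposal.

```python
def progress_percentage(data):
    """Percentage as proposed: (failed + solved) * 100 / total."""
    total = data["total tasks to be generated"]
    if total <= 0:
        return None  # assumption: nothing to show until tasks are known
    done = data["tasks failed"] + data["tasks solved"]
    return done * 100 / total


def jobs_progress_section(progress_by_job, updated_job_ids):
    """Build the "jobs progress" section only for jobs whose data changed."""
    section = {
        job_id: progress_by_job[job_id]
        for job_id in updated_job_ids
        if job_id in progress_by_job
    }
    # An empty dict means that nothing is added to the response at all.
    return {"jobs progress": section} if section else {}


example = {
    "total tasks to be generated": 11111,
    "tasks failed": 122,
    "tasks solved": 500,
    "average wall time to finish": 3766,
    "wall time spend on solution": 2300,
}
```

Returning an empty dictionary when no job has new data corresponds to adding the section only on updates.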

Actions #6

Updated by Evgeny Novikov over 6 years ago

Ilja Zakharov wrote:

To restore progress evaluation, I propose making the following changes in Bridge first:
1) Implement a separate request to Bridge (not a report) with the following data:

{
   "total tasks to be generated": 11111,
   "tasks failed": 122,
   "tasks solved": 500,
   "average wall time to finish": 3766,
   "wall time spend on solution": 2300
}

Minor improvements:
"tasks failed" -> "failed tasks"
"tasks solved" -> "solved tasks"
"average wall time to finish" -> "expected time for solving tasks"
"wall time spend on solution" -> "elapsed time for solving tasks"

Core can send this request several times during the solution.

My suggestion is to send only the changed values since, say, "total tasks to be generated" will never change.

If Core has several sub-jobs, it will either send no data at all or send it as if there were only one sub-job.

So, Bridge should expect that there can be no progress reports for some verification jobs.

2) Bridge should just store and visualize the data as is, calculating the percentage as (failed + solved) * 100 / total.

I suggest the following formula: 100 * solved / (total - failed). This should be calculated only if failed < total. Otherwise, if failed = total, the progress can be hidden because task generation/solution has finished.
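
The suggested formula and its guard can be sketched as below. The function name `task_progress` is hypothetical, and returning `None` to mean "hide the progress" is an assumption made for this illustration.

```python
def task_progress(total, failed, solved):
    """Return 100 * solved / (total - failed), or None when progress is hidden."""
    if failed >= total:
        # All tasks failed: there is nothing left to solve, so hide the progress.
        return None
    return 100 * solved / (total - failed)
```

With this formula, failed tasks shrink the denominator instead of counting as done, so a job reaches 100% only when every task that did not fail has been solved.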

Actions #7

Updated by Ilja Zakharov over 6 years ago

My suggestion is to send only the changed values since, say, "total tasks to be generated" will never change.

I disagree with this point. Since Bridge does not need to do any complicated calculations, let's allow the numbers to change. I am not sure that we will change them, but in any case such artificial restrictions are not necessary.

Actions #8

Updated by Evgeny Novikov over 6 years ago

Ilja Zakharov wrote:

My suggestion is to send only the changed values since, say, "total tasks to be generated" will never change.

I disagree with this point. Since Bridge does not need to do any complicated calculations, let's allow the numbers to change. I am not sure that we will change them, but in any case such artificial restrictions are not necessary.

I didn't mean to restrict any changes, although that wasn't clear. I just suggested sending data incrementally, i.e. sending only the changed values, and probably only after some configurable period of time. For instance, users could request that the progress be updated only every 30 seconds or every 5 minutes.
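
A minimal sketch of such incremental, rate-limited sending is given below. Everything in it is an assumption made for illustration: the class name `ProgressSender`, its parameters, and the `send` callable that stands in for the actual request to Bridge.

```python
import time


class ProgressSender:
    """Hypothetical helper: sends only changed values, at most once per interval."""

    def __init__(self, send, interval=30.0, clock=time.monotonic):
        self._send = send          # callable that delivers a dict to Bridge (assumed)
        self._interval = interval  # minimal number of seconds between two requests
        self._clock = clock
        self._last_sent = {}       # values that have already been delivered
        self._last_time = None

    def update(self, data):
        """Send the changed fields of data if the interval has elapsed."""
        now = self._clock()
        if self._last_time is not None and now - self._last_time < self._interval:
            return False
        changed = {k: v for k, v in data.items() if self._last_sent.get(k) != v}
        if not changed:
            return False
        self._send(changed)
        self._last_sent.update(changed)
        self._last_time = now
        return True
```

Values may still change freely between calls. The helper only decides when to send them and which fields to include.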

Actions #9

Updated by Evgeny Novikov over 6 years ago

Ilja Zakharov wrote:

If Core has several sub-jobs, it will either send no data at all or send it as if there were only one sub-job.

I suggest an approach to evaluating progress for jobs with sub-jobs that is quite simple to implement and useful for users. As with tasks, let's track the number of sub-jobs (total, solved, failed) and the average wall time spent on solving them. Bridge will then show the progress of solving sub-jobs rather than tasks. As for the data, there can be the following additional fields:

{
   "sub-jobs to be solved": 50,
   "failed sub-jobs": 5,
   "solved sub-jobs": 25,
   "expected time for solving sub-jobs": 3766,
   "elapsed time for solving sub-jobs": 2300
}

Also, I suggest renaming "total tasks to be generated" to "tasks to be generated".

Actions #10

Updated by Ilja Zakharov over 6 years ago

As for the data, there can be the following additional fields

OK, let's do that as well. I would propose sending the data within the same request but making it possible to attach data about both tasks and sub-jobs, only about sub-jobs, or only about tasks, because Core will likely calculate the corresponding information in different places.
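
One way to accept such a combined request is sketched below. The field names are the ones suggested earlier in this thread, and the final specification may differ. The function name `split_progress` and the decision to reject unknown fields are assumptions made for illustration.

```python
TASK_FIELDS = {
    "tasks to be generated", "failed tasks", "solved tasks",
    "expected time for solving tasks", "elapsed time for solving tasks",
}
SUBJOB_FIELDS = {
    "sub-jobs to be solved", "failed sub-jobs", "solved sub-jobs",
    "expected time for solving sub-jobs", "elapsed time for solving sub-jobs",
}


def split_progress(data):
    """Split one request into task and sub-job parts; either may be absent."""
    unknown = set(data) - TASK_FIELDS - SUBJOB_FIELDS
    if unknown:
        raise ValueError("unknown progress fields: {}".format(sorted(unknown)))
    tasks = {k: v for k, v in data.items() if k in TASK_FIELDS}
    subjobs = {k: v for k, v in data.items() if k in SUBJOB_FIELDS}
    if not tasks and not subjobs:
        raise ValueError("progress request carries no data")
    return tasks, subjobs
```

Because the two parts are independent, different places in Core can report tasks and sub-jobs separately through the same request.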

Actions #11

Updated by Evgeny Novikov over 6 years ago

We need a specification that concludes this discussion and describes all new requests and their semantics in full detail.

Actions #12

Updated by Ilja Zakharov over 6 years ago

Let's discuss it here: https://goo.gl/H2WFsw.

Actions #13

Updated by Ilja Zakharov over 6 years ago

Added issue #8536 as blocking, since I am implementing progress calculation on top of a refactoring of Job.py, which I cannot finish without a separate total coverage request.

Actions #14

Updated by Ilja Zakharov over 6 years ago

  • Status changed from New to Resolved

The updated progress implementation is available in branch 8149-new-progress. I will perform additional tests, but for simple examples everything works well, both for jobs with sub-jobs and for jobs without them.

Actions #15

Updated by Ilja Zakharov over 6 years ago

  • Status changed from Resolved to Open

Decided to update the implementation.

Actions #16

Updated by Ilja Zakharov over 6 years ago

  • Status changed from Open to Resolved

The final implementation is in branch "8149-new-progress". I do not see any obvious bugs there at the moment.

Actions #17

Updated by Evgeny Novikov over 6 years ago

  • Status changed from Resolved to Closed

I merged the branch into master in 459f75e7. At last we have proper progress evaluation and visualization, which is extremely valuable for large production jobs.
