Feature #8445
Feature #8149 (closed): Think on proper progress evaluation when several jobs are solved at once or/and much computational resources are available
Visualize useful progress
Added by Evgeny Novikov about 7 years ago. Updated about 7 years ago.
100%
Description
After the recent Core refactoring, it does not work at all. It should also become useful in all cases.
Updated by Vladimir Gratinskiy about 7 years ago
- Due date set to 10/26/2017
- Status changed from New to Resolved
- % Done changed from 0 to 100
Implemented in branch "useful-progress". New request for updating progresses:
URL: /service/update_progresses/
Parameters: 'jobs progresses' - JSON string with the format:
[
  {
    "job id": <job identifier>,
    "estimated total tasks": <the estimated number of tasks for the job with id 'job id'>,
    "local average time": <average time for one task decision based on finished decisions for this job only>,
    "global average time": <average time for one task decision based on finished decisions for all jobs>
  },
  ...
]
Returns: empty json in case of success; json {"error": <error message>} in case of error.
Uploading data reports with data for progress is not working anymore.
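For illustration only, the 'jobs progresses' parameter described above could be assembled roughly as follows. This is a minimal sketch: the helper name build_jobs_progresses, the tuple layout of its argument, and the sample values are hypothetical; only the parameter name and the four keys come from the comment above.

```python
import json


def build_jobs_progresses(entries):
    """Serialize per-job estimates into the 'jobs progresses' JSON string.

    `entries` is an iterable of (job_id, total_tasks, local_avg, global_avg)
    tuples. The helper and this argument layout are hypothetical.
    """
    payload = [
        {
            "job id": job_id,
            "estimated total tasks": total_tasks,
            "local average time": local_avg,
            "global average time": global_avg,
        }
        for job_id, total_tasks, local_avg, global_avg in entries
    ]
    return json.dumps(payload)


# Sample values are made up; the dict would be sent as the request parameters.
form_data = {"jobs progresses": build_jobs_progresses([("job-1", 9, 12.5, 14.0)])}
```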
Updated by Evgeny Novikov about 7 years ago
- Subject changed from Calculate and visualize useful progress to Visualize useful progress
- Status changed from Resolved to Open
I am wondering why the suggested interface differs so much from the one we discussed within parent issue #8149. In particular, progress requests will be sent by Core, which most likely means that job identifiers aren't needed. I guess that we don't need statistics on the basis of all jobs, while there is a lot of other information to be sent, say, the number of solved/failed tasks, sub-job decision progress, etc.
Updated by Vladimir Gratinskiy about 7 years ago
Evgeny Novikov wrote:
I am wondering why the suggested interface differs so much from the one we discussed within parent issue #8149. In particular, progress requests will be sent by Core, which most likely means that job identifiers aren't needed. I guess that we don't need statistics on the basis of all jobs, while there is a lot of other information to be sent, say, the number of solved/failed tasks, sub-job decision progress, etc.
Ilja said that this feature would work the way I implemented it. It seems I understood it incorrectly, so I need documentation for this feature.
Updated by Ilja Zakharov about 7 years ago
I meant that a new implementation in Bridge would just be simpler than the existing one and that not many things would have to be done there (from my perspective).
If you need more detailed documentation, it is better to wait until I finish the implementation in Core, because a lot of changes still have to be made there and some tricky issues have to be solved.
Updated by Vladimir Gratinskiy about 7 years ago
- Status changed from Open to Resolved
New implementation is in the same branch "useful-progress".
URL: /service/update_progress/
Parameters: 'progress' - json string with format described in google docs.
Returns: empty json in case of success; json {"error": <error message>} in case of error.
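A caller could interpret the reply described above along these lines. This is only a sketch: it assumes that "empty json" means an empty JSON object ({}), and both the function name and the exception type are hypothetical.

```python
import json


class ProgressUpdateError(Exception):
    """Raised when the service reports an error (hypothetical exception type)."""


def check_update_response(body):
    """Return None on success; raise if the reply carries an "error" message."""
    reply = json.loads(body)
    if isinstance(reply, dict) and "error" in reply:
        raise ProgressUpdateError(reply["error"])


check_update_response("{}")  # success case: assumed to be an empty JSON object
```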
Updated by Ilja Zakharov about 7 years ago
All works quite fine. I merged the branch to 8149-new-progress (with my implementation in Core).
My only note is about the guards in bridge.service.utils. They assume that process is present, but for failed jobs this is sometimes not the case, so I slightly modified two checks. Please review the fix (the last commit in branch 8149-new-progress).
Updated by Ilja Zakharov about 7 years ago
- Status changed from Resolved to Open
After running the progress on real examples we decided to introduce several changes. See comments in the feature documentation.
Updated by Vladimir Gratinskiy about 7 years ago
- Status changed from Open to Resolved
Ilja Zakharov wrote:
After running the progress on real examples we decided to introduce several changes. See comments in the feature documentation.
Fixed in branch "8149-new-progress".
Updated by Ilja Zakharov about 7 years ago
- Status changed from Resolved to Open
In general it works, but a couple of issues are left:
- Progress is still hidden for solved jobs (even if it had been calculated for them), although I added this requirement to the latest version of the document (see the update comments which I added in the morning).
- I am not completely sure, but once I saw that if progress is kept on the main page (for a failed job), then after the job is started again the fields with task and subjob solution dates keep their previous values from the last run and are not cleared.
- I also encountered several other problems, which I had to fix to be able to run jobs. So please review the changes I made in the last commit.
Updated by Evgeny Novikov about 7 years ago
Perhaps some issues will go away after reloading the JavaScript, but I am not sure.
Updated by Vladimir Gratinskiy about 7 years ago
Ilja Zakharov wrote:
In general it works, but a couple of issues are left:
- Progress is still hidden for solved jobs (even if it had been calculated for them), although I added this requirement to the latest version of the document (see the update comments which I added in the morning).
It should work. Updating the cached JavaScript will not help because the dependency on the job status is in the Python script. Maybe you had the wrong version, or the server was not reloaded after the version change.
- I am not completely sure, but once I saw that if progress is kept on the main page (for a failed job), then after the job is started again the fields with task and subjob solution dates keep their previous values from the last run and are not cleared.
That can't happen, because the function __create_solving_progress() is called for every starting job (service/utils.py: 892). SolvingProgress is created for all started jobs, and it is deleted in this function together with JobProgress (where the progress dates are stored) before the job starts.
- I also encountered several other problems, which I had to fix to be able to run jobs. So please review the changes I made in the last commit.
Reviewed.
Updated by Ilja Zakharov about 7 years ago
It should work. Updating the cached JavaScript will not help because the dependency on the job status is in the Python script. Maybe you had the wrong version, or the server was not reloaded after the version change.
I tried this on my machine after restarting Bridge and the browser, and in a virtual machine with a completely new installation. If the job is successfully solved, then Bridge hides all progress information. Bridge also hides progress information for failed jobs if the progress is 100%.
Also, you reverted one of my fixes, but without it I sometimes see this exception:
Traceback (most recent call last):
  File "/var/www/klever-bridge/jobs/views.py", line 313, in get_job_data
    progress = GetJobsProgresses(request.user, [job.id]).data[job.id]
  File "/var/www/klever-bridge/service/utils.py", line 1001, in __init__
    self.data = self.__get_data(jobs_ids)
  File "/var/www/klever-bridge/service/utils.py", line 1026, in __get_data
    data[j_id] = self.__job_values(j_id)
  File "/var/www/klever-bridge/service/utils.py", line 1067, in __job_values
    elif self._j_progress[j_id].gag_text_ts is not None:
KeyError: 5
Updated by Vladimir Gratinskiy about 7 years ago
Ilja Zakharov wrote:
It should work. Updating the cached JavaScript will not help because the dependency on the job status is in the Python script. Maybe you had the wrong version, or the server was not reloaded after the version change.
I tried this on my machine after restarting Bridge and the browser, and in a virtual machine with a completely new installation. If the job is successfully solved, then Bridge hides all progress information. Bridge also hides progress information for failed jobs if the progress is 100%.
Could you download such a job and send it to me?
Also, you reverted one of my fixes, but without it I sometimes see this exception:
Sorry, I thought it couldn't happen because of line 1030.
Updated by Ilja Zakharov about 7 years ago
Could you download such a job and send it to me?
Sent it by e-mail.
Sorry, I thought it couldn't happen because of line 1030.
It seems that, in particular, it happens when the status is Pending or Processing and a new progress status has not been received yet.
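The failure mode described here (a job in the Pending or Processing state with no progress record yet) can be guarded against by looking the job up with .get() instead of indexing. The sketch below is not the actual fix from the branch: the JobProgress stand-in, the function name, and the choice to return None for such jobs are assumptions; only the names _j_progress (modelled as a plain dict) and gag_text_ts are taken from the traceback.

```python
class JobProgress:
    """Stand-in for the stored progress record; only one field is modelled."""

    def __init__(self, gag_text_ts=None):
        self.gag_text_ts = gag_text_ts


def tasks_stub_text(j_progress, j_id):
    """Return the stub text for a job, tolerating a missing progress record.

    `j_progress` maps job ids to progress records, like the dictionary
    indexed in the traceback above.
    """
    record = j_progress.get(j_id)  # .get() returns None instead of raising KeyError
    if record is None:
        return None  # no progress has been received for this job yet
    return record.gag_text_ts
```

With this guard, tasks_stub_text({}, 5) returns None instead of raising KeyError: 5.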
Updated by Vladimir Gratinskiy about 7 years ago
Ilja Zakharov wrote:
Could you download such a job and send it to me?
Sent it by e-mail.
The reason is that "Expected time for solving subjobs" and "Expected time for solving tasks" are both null. They should be 0 after a job is solved, or another number if it has failed.
Updated by Ilja Zakharov about 7 years ago
Hmm, maybe it is because I send a textual value at the end of the solution? This is an example of the data with progress:
{'start tasks solution': True}
{'expected time for solving tasks': 60, 'solved tasks': 3, 'failed tasks': 0, 'total tasks to be generated': 9}
{'solved tasks': 3, 'failed tasks': 0, 'expected time for solving tasks': 50}
{'solved tasks': 3, 'failed tasks': 0, 'expected time for solving tasks': 38}
{'solved tasks': 8, 'failed tasks': 0, 'expected time for solving tasks': 8}
{'finish tasks solution': True, 'solved tasks': 9, 'failed tasks': 0, 'expected time for solving tasks': 'Solution finished'}
Updated by Vladimir Gratinskiy about 7 years ago
Ilja Zakharov wrote:
Hmm, maybe it is because I send a textual value at the end of the solution? This is an example of the data with progress:
[...]
Yes, that is the reason. The expected time is shown only for jobs that are being processed. In other cases it is not shown, so the textual value is not needed. The progress can be shown if it can be calculated and the expected time is known (as described in the docs). The second condition is false for textual values.
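The rule stated in this comment can be sketched as follows. This is an illustration, not Bridge's code: the function name is hypothetical, and computing progress as (solved + failed) / total is an assumption. The sample calls reuse values from the progress data quoted earlier in the thread.

```python
def visible_progress(solved, failed, total, expected_time):
    """Return the progress in percent, or None when it would be hidden.

    Progress is shown only if it can be calculated (the total is known) and
    the expected time is a number; a textual expected time hides it.
    """
    time_known = (isinstance(expected_time, (int, float))
                  and not isinstance(expected_time, bool))
    if not total or not time_known:
        return None
    return round(100 * (solved + failed) / total)


shown = visible_progress(8, 0, 9, 8)                     # numeric time: shown
hidden = visible_progress(9, 0, 9, "Solution finished")  # textual time: hidden
```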
Updated by Ilja Zakharov about 7 years ago
- Status changed from Open to Resolved
I fixed the last value of the estimated time and it works perfectly. I do not see any other problems.
Updated by Evgeny Novikov about 7 years ago
- Status changed from Resolved to Open
One small remark. Indeed, the progress specification is too coarse when it requires a progress to be calculated and an expected time in seconds to be received simultaneously. Recent changes allow strings to be sent as the expected time. This was needed first of all for the case when the expected time decreases below 0. But when Core sends a string, Bridge also prints the stub "Estimating progress" for the progress, although the progress is known, since it is connected not with time but with the number of solved/failed tasks/subjobs and their total numbers.
My suggestion is to get rid of that coarse requirement and to calculate/visualize progresses and expected times independently. You can still check/assume that when you can calculate a progress for the first time, there should be a corresponding expected time in seconds. But later the progress should always be calculated and visualized, while stubs can be printed for the expected time.
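One possible reading of this suggestion is sketched below: the progress text and the expected-time text are decided independently, so a string sent as the expected time no longer hides a progress that is already known. The function name, the output formats, the (solved + failed) / total formula, and the "Estimating time" stub are assumptions; only the "Estimating progress" stub comes from the remark above.

```python
def render_progress(solved, failed, total, expected_time):
    """Return (progress text, expected time text), each decided on its own."""
    if total:
        progress = "{}%".format(round(100 * (solved + failed) / total))
    else:
        progress = "Estimating progress"  # the total is not known yet

    if isinstance(expected_time, str):
        expected = expected_time  # stub sent by Core, printed as is
    elif expected_time is None:
        expected = "Estimating time"  # hypothetical stub text
    else:
        expected = "{} s".format(expected_time)
    return progress, expected
```

For example, render_progress(8, 0, 9, "Estimating time") keeps the progress visible next to the stub for the expected time.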
Updated by Evgeny Novikov about 7 years ago
- Status changed from Resolved to Closed
I merged the branch to master in 459f75e7.