Bug #7668

closed

Consider a more reliable way to track the processes and threads of running tasks and jobs

Added by Ilja Zakharov over 7 years ago. Updated over 7 years ago.

Status: Closed
Priority: Immediate
Assignee:
Category: Scheduling
Target version: -
Start date: 11/01/2016
Due date:
% Done: 0%
Estimated time:
Detected in build: svn
Platform:
Published in build:

Description

There are rare cases in which Python's multiprocessing or subprocess process-communication methods (it is unclear which one is responsible) fail to register the termination of a child process. As a result, control never returns from the task, and the task is still considered to be processing even though the corresponding process is dead. This is hard to reproduce and even harder to debug, but we can add extra checks of the running state to detect such cases, or implement another way of running child processes in the native scheduler.
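
A minimal sketch of what such an extra running-state check could look like, assuming the task is started with subprocess.Popen. The helper names wait_with_liveness_check and run_task are hypothetical and are not taken from the scheduler code. Instead of blocking indefinitely on the child, the parent polls its state at a fixed interval and gives up after an optional timeout:

```python
import subprocess
import sys
import time


def wait_with_liveness_check(proc, poll_interval=0.5, timeout=None):
    """Wait for `proc` by polling instead of blocking indefinitely.

    Returns the exit code, or raises TimeoutError if the process is
    still running after `timeout` seconds (hypothetical helper).
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        # poll() is non-blocking: it returns None while the child runs
        # and the exit code once the child has terminated.
        code = proc.poll()
        if code is not None:
            return code
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError(
                f"process {proc.pid} still running after {timeout}s"
            )
        time.sleep(poll_interval)


def run_task(cmd, timeout=None):
    """Start `cmd` and return its exit code (hypothetical helper)."""
    proc = subprocess.Popen(cmd)
    try:
        return wait_with_liveness_check(proc, timeout=timeout)
    except TimeoutError:
        # Do not leave a stuck child behind: kill it and reap it.
        proc.kill()
        proc.wait()
        raise
```

For example, run_task([sys.executable, "-c", "import sys; sys.exit(3)"]) returns the child's exit code. Polling does not explain why the termination was missed, but it bounds how long a dead or hung child can keep a task in the processing state.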


Related issues 1 (0 open, 1 closed)

Has duplicate Klever - Bug #7669: Scheduler is waiting forever for already done task (Rejected, 11/01/2016)

Actions #1

Updated by Ilja Zakharov over 7 years ago

  • Assignee set to Ilja Zakharov
  • Priority changed from Normal to Urgent
Actions #2

Updated by Evgeny Novikov over 7 years ago

  • Priority changed from Urgent to Immediate

This issue will hurt us badly when we perform massive launches, unless it is fixed.

Actions #3

Updated by Ilja Zakharov over 7 years ago

  • Status changed from New to Feedback
  • Priority changed from Immediate to High

I implemented some improvements, and preliminary nightly runs have passed successfully. Let's see how it behaves once more experiments have been performed.

Actions #4

Updated by Evgeny Novikov over 7 years ago

  • Status changed from Feedback to Closed
  • Priority changed from High to Immediate

Reopen it if your fixes fail.
