Feature #817
open
Smart management of node capabilities
0%
Description
A user should be able to run a distributed task and set TIME_LIMIT and MEMORY_LIMIT for it.
Even if not all available nodes have the required capabilities, they should still be used to execute the task.
Of course, if the execution fails because of TIME_LIMIT or MEMORY_LIMIT, it should be restarted on a more capable node.
Open question: how to measure TIME_LIMIT correctly if we have different CPUs?
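The restart behaviour described above could be sketched roughly as follows. This is only an illustration, not the cluster's actual code: `Node`, `LimitExceeded`, `run_with_escalation` and the `run_on` callback are hypothetical names, and ordering nodes by RAM and then CPU clock is an assumed notion of "more capable".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical description of a cluster node.
    name: str
    cpu_mhz: int
    ram_mb: int


class LimitExceeded(Exception):
    """Hypothetical error: the task hit TIME_LIMIT or MEMORY_LIMIT."""


def run_with_escalation(task, nodes, run_on):
    """Try nodes from least to most capable; move up only on a limit failure."""
    if not nodes:
        raise ValueError("no nodes available")
    # Assumption: "more capable" means more RAM first, then a faster CPU.
    ordered = sorted(nodes, key=lambda n: (n.ram_mb, n.cpu_mhz))
    last_error = None
    for node in ordered:
        try:
            return node.name, run_on(task, node)
        except LimitExceeded as err:
            last_error = err  # restart on the next, more capable node
    raise last_error
```

Any other exception raised by `run_on` propagates immediately, so only limit failures trigger a restart elsewhere.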
Updated by Pavel Shved almost 14 years ago
- Category set to Cluster
TIME_LIMIT could be expressed in "seconds on a 2k MHz CPU", and be scaled proportionally.
The RAM limit isn't scalable, and dynamically re-arranging tasks to more capable nodes is an interesting idea. Nothing in the current architecture prevents such a scenario. However, it's still a prototype, and we may postpone this bug until we design the final version of the cluster algorithms.
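A minimal sketch of the proportional scaling suggested above, assuming the limit is given in seconds on a 2000 MHz reference CPU. The names `REFERENCE_MHZ` and `scale_time_limit` are made up for illustration, and clock rate is only a rough proxy for real speed, so this does not settle the open question from the description.

```python
REFERENCE_MHZ = 2000.0  # the "2k MHz" reference CPU from the comment above


def scale_time_limit(reference_seconds: float, node_mhz: float) -> float:
    """Convert a limit in reference-CPU seconds to wall seconds on a node.

    Assumes run time is inversely proportional to clock rate.
    """
    if node_mhz <= 0:
        raise ValueError("node_mhz must be positive")
    return reference_seconds * REFERENCE_MHZ / node_mhz
```

With these assumptions, a 60-second limit becomes 120 seconds on a 1000 MHz node and 30 seconds on a 4000 MHz node.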
Updated by Pavel Shved almost 14 years ago
For example, if all "less capable" nodes are busy and we have a task to route, we might consider holding it instead of routing it to a "more capable" node outright. I guess we should first finish the straightforward scenario before diving into such matters.
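The hold-or-route choice mentioned here could look something like the sketch below. Everything in it is an assumption made for illustration: `NodeState`, `choose_node` and the `allow_overqualified` flag are hypothetical, and capability is reduced to RAM alone.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    # Hypothetical snapshot of one node as seen by the router.
    name: str
    ram_mb: int
    busy: bool


def choose_node(required_ram_mb, nodes, allow_overqualified=False):
    """Return the name of the node to route to, or None to hold the task."""
    fitting = [n for n in nodes if n.ram_mb >= required_ram_mb]
    free = [n for n in fitting if not n.busy]
    if not free:
        return None  # nothing suitable is idle: hold
    best_fit_ram = min(n.ram_mb for n in fitting)
    candidate = min(free, key=lambda n: n.ram_mb)
    # A tighter-fitting node exists but is busy: hold unless told otherwise.
    if candidate.ram_mb > best_fit_ram and not allow_overqualified:
        return None
    return candidate.name
```

Holding keeps the larger node free for tasks that really need it, at the cost of extra latency for the held task. Which trade-off is right depends on the workload.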
Updated by Alexey Khoroshilov over 13 years ago
- Priority changed from Normal to High
Updated by Pavel Shved over 13 years ago
- Assignee set to Pavel Shved
- Priority changed from High to Normal