Project

General

Profile

Actions

Feature #2289

open

We need to add statistics caches for problems and times

Added by Evgeny Novikov over 12 years ago. Updated over 12 years ago.

Status:
Open
Priority:
Normal
Category:
Statistics server
Start date:
01/25/2012
Due date:
% Done:

0%

Estimated time:
Published in build:

Description

As one can notice, the statistics server needs too much time to show a lot amount of data. For instance, a main page containing a summary on reference launch results is loaded for about 47.2 seconds. Detailed statistics:
Zend_Db_Profiler_Firebug (10 @ 46.80597 sec)
Time    Event    Parameters
0.00078    connect    NULL
0.00013    SET SESSION group_concat_max_len = @@max_allowed_packet    NULL
0.00079    DESCRIBE `launches`    NULL
7.18087    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, COUNT(TR.id) AS `Result`, SUM(IF(`TR`.`result`='Safe', 1, 0)) AS `Safe`, SUM(IF(`TR`.`result`='Unsafe', 1, 0)) AS `Unsafe`, SUM(IF(`TR`.`result`='Unknown', 1, 0)) AS `Unknown`, GROUP_CONCAT(`RECA`.`Tags` SEPARATOR ';') AS `KB Tags`, SUM(IF(`RECA`.`Verdict`='True positive', 1, 0)) AS `True positive`, SUM(IF(`RECA`.`Verdict`='False positive', 1, 0)) AS `False positive`, SUM(IF(`TR`.`result`='Unsafe', 1, 0)) - SUM(IF(`RECA`.`Verdict`='True positive', 1, 0)) - SUM(IF(`RECA`.`Verdict`='False positive', 1, 0)) - SUM(IF(`RECA`.`Verdict`='Inconclusive', 1, 0)) AS `KB Unknown`, SUM(IF(`RECA`.`Verdict`='Inconclusive', 1, 0)) AS `Inconclusive`, COUNT(`ST_BCE`.`id`) AS `BCE In`, SUM(IF(`ST_BCE`.`success`=TRUE, 1, 0)) AS `BCE Ok`, SUM(IF(`ST_BCE`.`success`=FALSE, 1, 0)) AS `BCE Fail`, COUNT(`ST_DEG`.`id`) AS `DEG In`, SUM(IF(`ST_DEG`.`success`=TRUE, 1, 0)) AS `DEG Ok`, SUM(IF(`ST_DEG`.`success`=FALSE, 1, 0)) AS `DEG Fail`, SUM(`ST_DEG`.`loc`) AS `DEG LOC`, COUNT(`ST_DSCV`.`id`) AS `DSCV In`, SUM(IF(`ST_DSCV`.`success`=TRUE, 1, 0)) AS `DSCV Ok`, SUM(IF(`ST_DSCV`.`success`=FALSE, 1, 0)) AS `DSCV Fail`, COUNT(`ST_RI`.`id`) AS `RI In`, SUM(IF(`ST_RI`.`success`=TRUE, 1, 0)) AS `RI Ok`, SUM(IF(`ST_RI`.`success`=FALSE, 1, 0)) AS `RI Fail`, COUNT(`ST_RCV`.`id`) AS `RCV In`, SUM(IF(`ST_RCV`.`success`=TRUE, 1, 0)) AS `RCV Ok`, SUM(IF(`ST_RCV`.`success`=FALSE, 1, 0)) AS `RCV Fail` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `results_kb_calculated` AS `RECA` ON `TR`.`id`=`RECA`.`trace_id` LEFT JOIN `stats` AS `ST_BCE` ON `TR`.`build_id`=`ST_BCE`.`id` LEFT JOIN `stats` AS `ST_DEG` ON `TR`.`maingen_id`=`ST_DEG`.`id` LEFT JOIN `stats` AS `ST_DSCV` ON `TR`.`dscv_id`=`ST_DSCV`.`id` LEFT JOIN `stats` AS `ST_RI` ON `TR`.`ri_id`=`ST_RI`.`id` LEFT JOIN `stats` AS `ST_RCV` ON `TR`.`rcv_id`=`ST_RCV`.`id` GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
3.12732    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PR`.`name` AS `BCE Problems`, COUNT(*) AS `BCE Problems number` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `stats` AS `ST_BCE` ON `TR`.`build_id`=`ST_BCE`.`id` LEFT JOIN `problems_stats` AS `PRST` ON `PRST`.`stats_id`=`ST_BCE`.`id` LEFT JOIN `problems` AS `PR` ON `PRST`.`problem_id`=`PR`.`id` WHERE (`ST_BCE`.`success` = FALSE) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PR`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
3.06379    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PR`.`name` AS `DEG Problems`, COUNT(*) AS `DEG Problems number` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `stats` AS `ST_DEG` ON `TR`.`maingen_id`=`ST_DEG`.`id` LEFT JOIN `problems_stats` AS `PRST` ON `PRST`.`stats_id`=`ST_DEG`.`id` LEFT JOIN `problems` AS `PR` ON `PRST`.`problem_id`=`PR`.`id` WHERE (`ST_DEG`.`success` = FALSE) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PR`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
3.26135    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PR`.`name` AS `DSCV Problems`, COUNT(*) AS `DSCV Problems number` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `stats` AS `ST_DSCV` ON `TR`.`dscv_id`=`ST_DSCV`.`id` LEFT JOIN `problems_stats` AS `PRST` ON `PRST`.`stats_id`=`ST_DSCV`.`id` LEFT JOIN `problems` AS `PR` ON `PRST`.`problem_id`=`PR`.`id` WHERE (`ST_DSCV`.`success` = FALSE) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PR`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
3.69311    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PR`.`name` AS `RI Problems`, COUNT(*) AS `RI Problems number` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `stats` AS `ST_RI` ON `TR`.`ri_id`=`ST_RI`.`id` LEFT JOIN `problems_stats` AS `PRST` ON `PRST`.`stats_id`=`ST_RI`.`id` LEFT JOIN `problems` AS `PR` ON `PRST`.`problem_id`=`PR`.`id` WHERE (`ST_RI`.`success` = FALSE) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PR`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
3.74814    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PR`.`name` AS `RCV Problems`, COUNT(*) AS `RCV Problems number` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `stats` AS `ST_RCV` ON `TR`.`rcv_id`=`ST_RCV`.`id` LEFT JOIN `problems_stats` AS `PRST` ON `PRST`.`stats_id`=`ST_RCV`.`id` LEFT JOIN `problems` AS `PR` ON `PRST`.`problem_id`=`PR`.`id` WHERE (`ST_RCV`.`success` = FALSE) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PR`.`name` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC    NULL
22.7297    SELECT `TA`.`id` AS `Task id`, `TA`.`description` AS `Task description`, `EN`.`version` AS `Environment version`, `RU`.`name` AS `Rule name`, `PRO`.`name` AS `Tool name`, `PRO`.`pattern` AS `Process pattern`, SUM(`PRO`.`time_average`) AS `Time Average` FROM `launches` AS `LA` LEFT JOIN `tasks` AS `TA` ON `LA`.`task_id`=`TA`.`id` LEFT JOIN `environments` AS `EN` ON `LA`.`environment_id`=`EN`.`id` LEFT JOIN `rule_models` AS `RU` ON `LA`.`rule_model_id`=`RU`.`id` LEFT JOIN `traces` AS `TR` ON `LA`.`trace_id`=`TR`.`id` LEFT JOIN `processes` AS `PRO` ON `TR`.`id`=`PRO`.`trace_id` WHERE (`PRO`.`trace_id` IS NOT NULL) GROUP BY `TA`.`id`, `TA`.`description`, `EN`.`version`, `RU`.`name`, `PRO`.`name`, `PRO`.`pattern` ORDER BY `TA`.`id` ASC, `TA`.`description` ASC, `EN`.`version` ASC, `RU`.`name` ASC, `PRO`.`name` ASC, `PRO`.`pattern` ASC    NULL

shows, that the most of this time is required to make DB requests (0.4 seconds, i.e. less then 1% of the total time, is required for statistics visualization). Requests are so:
  1. A general request to count safes, unsafes, inputs, outputs, etc. (about 7 seconds).
  2. Requests to get problems for each tool (about 15 seconds: 3 seconds 5 times).
  3. A request to get a time statistics for each tool (about 23 seconds).

So, the most of time is required to calculate statistics on problems and times. We can avoid this in a way similar to the Knowledge Base cache calculation: we can count statistics for problems and times at upload time and then change it if something, which it's depend on, is changed by means of DB triggers. I guess then statistics caches for problems and times will speed up statistics server up to 5 times in some cases.

Actions

Also available in: Atom PDF