Bug #1649
closed
LDV reporters handle big reports badly
Description
It turned out that we haven't paid enough attention to making our report fix-up scripts use stream XML parsing. On the big reports that big cluster launches create (such as those with 5-6 rules checked at once), these scripts take a lot of unnecessary time and sometimes even segfault (the results are recoverable after these segfaults).
I have already fixed one script in 4006c092. These scripts (written by Alexander, so it would take more time to deal with them) should be fixed:
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
drv-env-gen/drv-env-gen-reporter
Meanwhile, you should not run full-kernel launches with more than 2-3 rules. If you do, the results will not be lost, but recovering them would require some manual work.
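For reference, the difference stream parsing makes can be sketched as below. This is only an illustration in Python with `xml.etree.ElementTree.iterparse`, not the code of the actual reporters (those are Perl scripts). The element name `ld`, the attribute `verdict`, and the function `count_verdicts` are hypothetical and do not come from the real report format.

```python
import io
import xml.etree.ElementTree as ET

def count_verdicts(source, tag="ld"):
    """Tally a per-element attribute without building the whole tree in memory."""
    counts = {}
    # iterparse hands over each element as soon as its closing tag has been read
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            verdict = elem.get("verdict", "unknown")  # hypothetical attribute
            counts[verdict] = counts.get(verdict, 0) + 1
            # Drop the children processed so far, so memory use stays bounded
            root.clear()
    return counts

# Tiny made-up report, only to show the call
sample = b"<reports><ld verdict='safe'/><ld verdict='unsafe'/><ld verdict='safe'/></reports>"
result = count_verdicts(io.BytesIO(sample))
```

The point is that memory use depends on the size of one element rather than on the size of the whole report.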
Updated by Pavel Shved over 13 years ago
- Status changed from New to Open
Just checked drv-env-gen/drv-env-gen-reporter: it doesn't need to be fixed, as it only deals with relatively small XMLs.
However, found one more file to consider:
build-cmd-extractor/cmd-stream-divider-reporter
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
Updated by Pavel Shved over 13 years ago
Ok, I figured out how to fix the DEG and CSD reporters, but the fix now requires them to exchange non-valid XMLs (those without a single root per file)! These particular reporters work faster now, but I am not 100% sure that it's a good idea...
These are left:
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
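On the multi-root files mentioned above: one way a consumer can still read such a file with a standard parser is to wrap the stream in a synthetic root element. The sketch below shows this in Python with `xml.etree.ElementTree.XMLPullParser`; it is not how the DEG and CSD reporters do it. The names `iter_fragments`, `stream`, `cmd`, and `opt` are made up for the example, and it assumes the fragments carry no XML declaration of their own.

```python
import xml.etree.ElementTree as ET

def iter_fragments(chunks):
    """Yield each top-level element from text that has several roots."""
    parser = ET.XMLPullParser(events=("start", "end"))
    state = {"depth": 0, "root": None}

    def drain():
        for event, elem in parser.read_events():
            if event == "start":
                if state["depth"] == 0:
                    state["root"] = elem  # the synthetic wrapper element
                state["depth"] += 1
            else:
                state["depth"] -= 1
                if state["depth"] == 1:  # a direct child of the wrapper just closed
                    yield elem
                    state["root"].clear()  # forget fragments already handed out

    parser.feed("<stream>")  # synthetic root (hypothetical name)
    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()
    parser.feed("</stream>")
    parser.close()  # events still queued remain readable after close()
    yield from drain()

# Two made-up fragments; the second one is split across chunks
chunks = ["<cmd id='1'><opt>-O2</opt></cmd><cm", "d id='2'/>"]
ids = [frag.get("id") for frag in iter_fragments(chunks)]
```

The intermediate files stay non-valid on disk, so every tool that reads them has to know about the convention; that is the trade-off questioned above.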
Updated by Pavel Shved over 13 years ago
Ok, all scripts are fixed, and are being tested on the cluster.
Updated by Pavel Shved over 13 years ago
- Status changed from Open to Resolved
- % Done changed from 0 to 100
Ok, the fixes have been merged in 19f82a0d. They have dramatically reduced the memory consumption of report fix-ups, and sometimes the time consumption as well. I hope we won't see Perl segfaulting anymore.
Note to self: the work has taken ~11 man-hours, and has included recovering what 4 Perl scripts do and rewriting them to use stream parsing with XML::Twig.
Updated by Pavel Shved over 13 years ago
By the way, report processing for one rule now takes 10 minutes instead of 30 minutes.