Bug #1649
LDV reporters handle big reports badly
Description
It turned out that we haven't paid enough attention to making our report fix-up scripts use streaming XML parsing. On the big reports that large cluster launches produce (such as those with 5-6 rules checked at once), these scripts take a lot of unnecessary time and sometimes even segfault (the results are recoverable after such segfaults).
I have already fixed one script in 4006c092. These scripts (written by Alexander, so it would take more time to deal with them) should be fixed:
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
drv-env-gen/drv-env-gen-reporter
Meanwhile, you should not run full-kernel launches with more than 2-3 rules. If you do, the results will not be lost, but recovering them will require some manual work.
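To illustrate the difference that stream parsing makes, here is a minimal sketch in Python (not the Perl used by the actual reporters). It assumes a hypothetical report layout in which a root element holds many `ld` children, each with a `verdict` child; these names are invented for the example and are not taken from the LDV report format. Each record is processed and then dropped, so memory use does not grow with the size of the report.

```python
import io
import xml.etree.ElementTree as ET


def count_verdicts(source, tag="ld"):
    """Tally <verdict> values per record without building the whole tree.

    Assumes (hypothetically) that the `tag` elements are direct children
    of the root element.
    """
    counts = {}
    # Ask for "start" events too, so the first event hands us the root.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            verdict = elem.findtext("verdict", default="unknown")
            counts[verdict] = counts.get(verdict, 0) + 1
            # Detach already-processed records from the root so that
            # they can be garbage-collected.
            root.clear()
    return counts


# Hypothetical sample report, small enough to read at a glance.
sample = (
    b"<reports>"
    b"<ld><verdict>safe</verdict></ld>"
    b"<ld><verdict>unsafe</verdict></ld>"
    b"<ld><verdict>safe</verdict></ld>"
    b"<ld/>"
    b"</reports>"
)
print(count_verdicts(io.BytesIO(sample)))
```

A DOM-style parser would hold every record in memory until the end of the file, which is roughly the behaviour this ticket complains about.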
Updated by Pavel Shved over 13 years ago
- Status changed from New to Open
Just checked drv-env-gen/drv-env-gen-reporter: it doesn't need to be fixed, as it only deals with relatively small XMLs.
However, found one more file to consider:
build-cmd-extractor/cmd-stream-divider-reporter
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
Updated by Pavel Shved over 13 years ago
OK, I figured out how to fix the DEG and CSD reporters, but the fix now requires them to exchange XML files that are not well-formed (they have more than one root element per file)! These particular reporters work faster now, but I am not 100% sure that it's a good idea...
These are left:
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
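For reference, a stream with several root elements per file can still be consumed by a standard XML parser if the reader wraps it in a synthetic root. The Python sketch below shows one way to do that; it is an assumption about how such files could be read, not a description of what the DEG and CSD reporters actually do. The `cmd` and `in` element names and the `stream` wrapper are hypothetical, and the fragments are assumed to be text chunks that carry no XML declaration of their own.

```python
import itertools
import xml.etree.ElementTree as ET


def iter_top_level(chunks, wrapper="stream"):
    """Yield each top-level element from a multi-root XML text stream.

    The chunks are fed to the parser between a synthetic opening and
    closing tag, so the parser sees one well-formed document.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    state = {"depth": 0, "root": None}

    def drain():
        for event, elem in parser.read_events():
            if event == "start":
                state["depth"] += 1
                if state["depth"] == 1:
                    state["root"] = elem  # the synthetic wrapper
            else:
                state["depth"] -= 1
                if state["depth"] == 1:
                    # A direct child of the wrapper has just closed.
                    yield elem
                    # Drop handled elements so memory stays bounded.
                    state["root"].clear()

    pieces = itertools.chain([f"<{wrapper}>"], chunks, [f"</{wrapper}>"])
    for piece in pieces:
        parser.feed(piece)
        yield from drain()
    # Closing the parser makes sure no pending events are left behind.
    parser.close()
    yield from drain()


# Hypothetical fragments; note that a chunk may end in the middle of a tag.
frags = [
    "<cmd id='1'><in>a.c</in></cmd>",
    "<cmd id='2'/>\n<cmd id",
    "='3'><in>b.c</in></cmd>",
]
for elem in iter_top_level(frags):
    print(elem.get("id"), elem.findtext("in"))
```

The trade-off is the one this comment hints at: such files are cheap to append to and to stream, but any tool that expects a single root element will reject them unless it applies a wrapper like this.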
Updated by Pavel Shved about 13 years ago
Ok, all scripts are fixed, and are being tested on the cluster.
Updated by Pavel Shved about 13 years ago
- Status changed from Open to Resolved
- % Done changed from 0 to 100
OK, the fixes have been merged in 19f82a0d. They have dramatically reduced the memory consumption of the report fix-ups, and sometimes the running time as well. I hope we won't see Perl segfaulting anymore.
Note to self: the work took ~11 man-hours, which included figuring out what 4 Perl scripts do and rewriting them to use stream parsing with XML::Twig.
Updated by Pavel Shved about 13 years ago
By the way, report processing for one rule now takes 10 minutes instead of 30 minutes.
Updated by Pavel Shved about 13 years ago
- Status changed from Resolved to Closed