Bug #1649

LDV reporters handle big reports badly

Added by Pavel Shved over 8 years ago. Updated over 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Infrastructure
Start date:
08/16/2011
Due date:
% Done:

100%

Estimated time:
8.00 h
Spent time:
Detected in build:
svn
Platform:
Published in build:

Description

It turned out that we haven't paid enough attention to making our report fix-up scripts use stream XML parsing. On the big reports that large cluster launches produce (such as those with 5-6 rules checked at once), these scripts take a lot of unnecessary time and sometimes even segfault (the results can still be recovered after such a segfault).
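To illustrate what stream parsing buys here, below is a minimal sketch in Python. It is only an analogue: the actual reporters are Perl scripts, and the element names `ld` and `verdict` are hypothetical, not taken from the real LDV report format. The point is that each record is handled and discarded as soon as its closing tag is seen, so memory use does not grow with the report size.

```python
import io
import xml.etree.ElementTree as ET


def count_verdicts(source, tag="ld"):
    """Tally the <verdict> text of every <tag> element without building the whole tree.

    The element names are hypothetical; adapt them to the real report schema.
    """
    counts = {}
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            verdict = elem.findtext("verdict", default="unknown")
            counts[verdict] = counts.get(verdict, 0) + 1
            # Drop the records processed so far, so memory stays bounded.
            root.clear()
    return counts


sample = io.BytesIO(
    b"<reports>"
    b"<ld><verdict>safe</verdict></ld>"
    b"<ld><verdict>unsafe</verdict></ld>"
    b"<ld><verdict>safe</verdict></ld>"
    b"</reports>"
)
print(count_verdicts(sample))
```

A DOM-style parser would instead hold every record in memory at once, which is the likely source of the slowdowns and crashes on multi-rule reports.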

I have already fixed one script in 4006c092. The following scripts still need to be fixed (they were written by Alexander, so dealing with them will take more time):

build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
drv-env-gen/drv-env-gen-reporter

Meanwhile, you should not run full-kernel launches with more than 2-3 rules. If you do, the results will not be lost, but recovering them will require some manual work.

History

#1

Updated by Pavel Shved over 8 years ago

  • Status changed from New to Open

Just checked drv-env-gen/drv-env-gen-reporter: it doesn't need to be fixed, as it only deals with relatively small XMLs.

However, I found one more file to consider:

build-cmd-extractor/cmd-stream-divider-reporter
build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter

#2

Updated by Pavel Shved over 8 years ago

Ok, I figured out how to fix the DEG (drv-env-gen) and CSD (cmd-stream-divider) reporters, but the fix now requires them to exchange XML files that are not well-formed (they lack a single root element per file)! These particular reporters work faster now, but I am not 100% sure that it's a good idea...

These are left:

build-cmd-extractor/build-cmd-extractor-reporter
ldv/ldv-task-reporter
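
For reference, a file without a single root can still be consumed by a streaming parser if the reader supplies a synthetic root itself. The sketch below shows this in Python as an analogue only; it is not how the Perl reporters do it. The `wrapper` element and the `cmd` records are hypothetical, and it assumes that the fragments carry no `<?xml ...?>` declarations of their own.

```python
import io
import xml.etree.ElementTree as ET


def iter_fragments(stream, chunk_size=65536):
    """Yield each top-level element from a byte stream holding several XML roots."""
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0
    wrapper = None

    def drain():
        nonlocal depth, wrapper
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
                if depth == 1:
                    wrapper = elem  # the synthetic root opened below
            else:
                depth -= 1
                if depth == 1:
                    yield elem       # a complete top-level fragment
                    wrapper.clear()  # forget it once the caller is done

    parser.feed(b"<wrapper>")  # synthetic root (an assumption of this sketch)
    yield from drain()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        parser.feed(chunk)
        yield from drain()
    parser.feed(b"</wrapper>")
    parser.close()
    yield from drain()


data = io.BytesIO(b"<cmd id='1'><opt>a</opt></cmd>\n<cmd id='2'/>\n<cmd id='3'/>")
for fragment in iter_fragments(data):
    print(fragment.tag, fragment.get("id"))
```

The trade-off is the one raised above: the intermediate files are no longer standalone XML documents, so generic XML tools will reject them unless they are wrapped the same way.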

#3

Updated by Pavel Shved over 8 years ago

Ok, all scripts are fixed, and are being tested on the cluster.

#4

Updated by Pavel Shved over 8 years ago

  • Status changed from Open to Resolved
  • % Done changed from 0 to 100

Ok, the fixes have been merged in 19f82a0d. They have dramatically reduced the memory consumption of report fix-ups, and sometimes the running time as well. I hope we won't see Perl segfaulting anymore.

Note to self: the work took ~11 man-hours, which included working out what 4 Perl scripts do and rewriting them to use stream parsing with XML::Twig.

#5

Updated by Pavel Shved over 8 years ago

By the way, report processing for one rule now takes 10 minutes instead of 30 minutes.

#6

Updated by Pavel Shved over 8 years ago

  • Status changed from Resolved to Closed
