Project

General

Profile

Actions

Feature #9414

closed

Reuse archive with source files between different error traces on report uploading

Added by Vitaly Mordan about 6 years ago. Updated about 5 years ago.

Status:
Closed
Priority:
Normal
Category:
-
Target version:
Start date:
01/25/2019
Due date:
07/02/2019
% Done:

100%

Estimated time:
(Total: 0.00 h)
Published in build:

Description

Feature #8704 allowed to separated source files from error traces and put them into a single archive, which can be used from any error trace in report to be uploaded.
This helps to reduce report size in several times.
But during its uploading Klever duplicate archive with source file for each error trace in bridge/media/Unsafes/Sources, and thus the original goal (which is to reduce amount of uploaded sources) is not satisfied, feature #8704 is not working as intended.
In order to solve this problem, archive with the same source files should be reused between different error traces not only inside report to be uploaded, but also after uploading.
For example, here is a fragment of report:

{
    "attrs": [
        {
            "name": "Id",
            "value": "1" 
        }
    ],
    "error traces": ["unsafe_1.zip"],
    "id": "/CPAchecker/unsafe_1",
    "parent id": "/CPAchecker",
    "sources": "sources.zip",
    "type": "unsafe" 
},
{
    "attrs": [
        {
            "name": "Id",
            "value": "2" 
        }
    ],
    "error traces": ["unsafe_2.zip"],
    "id": "/CPAchecker/unsafe_2",
    "parent id": "/CPAchecker",
    "sources": "sources.zip",
    "type": "unsafe" 
}

After uploading there should be only one archive with source files, which is reuse by both error traces.
Note, that we cannot merge those two nodes into
{
    "attrs": [
        {
            "name": "Id",
            "value": "1" 
        }
    ],
    "error traces": ["unsafe_1.zip", "unsafe_2.zip"],
    "id": "/CPAchecker/unsafe_1",
    "parent id": "/CPAchecker",
    "sources": "sources.zip",
    "type": "unsafe" 
}

because they have different attributes, some of which will be used during filtration of error traces.


Subtasks 2 (0 open2 closed)

Feature #9460: Support new reports with sourcesClosedVladimir Gratinskiy01/25/201907/02/2019

Actions
Feature #9461: Generate new reports with sourcesClosedEvgeny Novikov01/25/2019

Actions

Related issues 2 (0 open2 closed)

Related to Klever - Feature #8704: Support processing of multiple error traces per one unsafe reportClosedVladimir Gratinskiy02/06/2018

Actions
Related to Klever - Feature #9356: Do not deploy the same build basesClosedIlya Shchepetkov10/29/2018

Actions
Actions #1

Updated by Vitaly Mordan about 6 years ago

  • Related to Feature #8704: Support processing of multiple error traces per one unsafe report added
Actions #2

Updated by Evgeny Novikov about 6 years ago

  • Status changed from New to Rejected

There is not this issue, since you do can specify various attributes for different error traces within one report. Klever already does this - you can find that Klever does not report any attributes for error traces in reports explicitly. The trick is including Klever attributes within witnesses. Witnesses with data races started to use this trick a long ago, so, you can learn the format from them.

Actions #3

Updated by Pavel Andrianov about 6 years ago

Some attributes can be related not to a witness, but to a whole launch. So, the feature is only not to clone the same "sources.zip", if it occurs several times.

Actions #4

Updated by Vitaly Mordan about 6 years ago

Now each unsafe node is tied to verification node (with resources, logs, etc.).
Unlike variable names in races, some attributes do not relate to witness at all.
Do you suggest to create a single node for all error trace to lose all that information and components hierarchy?
This is very bad suggestion.

Actions #5

Updated by Evgeny Novikov about 6 years ago

Pavel Andrianov wrote:

Some attributes can be related not to a witness, but to a whole launch. So, the feature is only not to clone the same "sources.zip", if it occurs several times.

If you need the same attributes for all underlying reports, you need to upload them within a parent report, most likely within verification reports.

As far as I understand, very old Bridge did detect file duplicates, but it was quite hard to do.

Actions #6

Updated by Evgeny Novikov about 6 years ago

Vitaly Mordan wrote:

Now each unsafe node is tied to verification node (with resources, logs, etc.).
Unlike variable names in races, some attributes do not relate to witness at all.
Do you suggest to create a single node for all error trace to lose all that information and components hierarchy?
This is very bad suggestion.

What is an unsafe node? What is a verification node?

If attributes are not related with witnesses, you should not upload them with help of unsafe reports.

Actions #7

Updated by Pavel Andrianov about 6 years ago

If you need the same attributes for all underlying reports, you need to upload them within a parent report, most likely within verification reports.

Yes, but in this case we have to create several unsafe reports with the same "source.zip", which will be cloned by bridge.

As far as I understand, very old Bridge did detect file duplicates, but it was quite hard to do.

We just request to resolve file names, so if "source.zip" is contained in several unsafe reports, it should be cloned, while uploading.

Actions #8

Updated by Pavel Andrianov about 6 years ago

What is an unsafe node? What is a verification node?

This is unsafe report and verification report.

Actions #9

Updated by Vitaly Mordan about 6 years ago

For example, we have the following hierarchy of nodes in report:
-root
--Component1
--Component2
---CPAchecker_1
----unsafe (sources=sources.zip)
----unknown
---CPAchecker_N
----unsafe (sources=sources.zip)
----unknown

If you need the same attributes for all underlying reports, you need to upload them within a parent report, most likely within verification reports.

Here we need attributes, resources, logs and such of verification report (CPAchecker_i) for each leaf, therefore I will not merge them.
At the same time, each unsafe report have a link to sources.zip, which should not be duplicated on uploading.

Actions #10

Updated by Evgeny Novikov about 6 years ago

Pavel Andrianov wrote:

If you need the same attributes for all underlying reports, you need to upload them within a parent report, most likely within verification reports.

Yes, but in this case we have to create several unsafe reports with the same "source.zip", which will be cloned by bridge.

Can't you create an intermediate reports of another type? E.g.:

verification report with common attributes:
    unsafe report with common sources and one or more error traces with specific attributes

If you can't, Bridge should support this rather than the strange thing suggested below.

As far as I understand, very old Bridge did detect file duplicates, but it was quite hard to do.

We just request to resolve file names, so if "source.zip" is contained in several unsafe reports, it should be cloned, while uploading.

But there can be files with the same names but various contents. So, it seems you suggest a workaround rather than to do things in the right way.

Actions #11

Updated by Vitaly Mordan about 6 years ago

Evgeny Novikov wrote:

But there can be files with the same names but various contents. So, it seems you suggest a workaround rather than to do things in the right way.

This is not possible, since each archive contains absolute paths to the source files.
If you need different sources, you can create several archives and add links to them:
-unsafe (sources=sources_1.zip, trace=t_1.zip)
-unsafe (sources=sources_1.zip, trace=t_2.zip)
-unsafe (sources=sources_2.zip, trace=t_3.zip)
Again, in this case you will need 2 archives with sources (not 3).

Actions #12

Updated by Pavel Andrianov about 6 years ago

Can't you create an intermediate reports of another type?

We do it already (see example from Vitaly), but it does not help for the problem: anyway we need to create several different unsafe reports with the same "sources.zip".

But there can be files with the same names but various contents.

I hardly can imagine this. How can an archive contain two different files with the same name?

I can suggest another idea, if you liked it more. Could the Bridge get sources not only from the unsafe report, but also from parent ones. So, if the sources are specified for a common parent, it is considered as it is specified for all child reports. It is important, that in this case the sources are not duplicated for all children.

Actions #13

Updated by Evgeny Novikov about 6 years ago

  • Status changed from Rejected to Open
  • Assignee set to Vladimir Gratinskiy
  • Priority changed from Normal to Urgent
  • Target version set to 3.0

Pavel Andrianov wrote:

Can't you create an intermediate reports of another type?

We do it already (see example from Vitaly), but it does not help for the problem: anyway we need to create several different unsafe reports with the same "sources.zip".

Now I catch the issue at last. First of all, I am not sure that you prepare reports in the proper way. As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?). If you violate this assumption, you should expect various issues in Bridge, e.g. comparison of verification results will not work. I am wondering, why Bridge does not reject your reports - this should be fixed.

But when you will upload reports properly, this will not help to solve the issue. Below you suggested a nice way to fix it.

But there can be files with the same names but various contents.

I hardly can imagine this. How can an archive contain two different files with the same name?

I can suggest another idea, if you liked it more. Could the Bridge get sources not only from the unsafe report, but also from parent ones. So, if the sources are specified for a common parent, it is considered as it is specified for all child reports. It is important, that in this case the sources are not duplicated for all children.

After all there should be something like:

Core report
    Component1 report
    Component2 report
        sources.zip # This is a new feature to be supported by Bridge. It should use this archive for all error traces in subtree below. 
        Component2-1 report
            Verification1 report
                Attributes1
                Log1
                Resources1
                Unsafe1 report
        Component2-2 report
            Verification2 report
                Attributes2
                Log2
                Resources2
                Unsafe2 report
                Unknown2 report

Please, not that for comparison Bridge will consider unique tuples of values of cumulative verification report attributes marked to be used for comparison. For instance, if "Verification1 report" has attribute A with value A1 and one Unsafe report, and "Verification2 report" has attribute A with value A2, one Unsafe report and one Unknown report, and attribute A is marked to be used for comparison, then Bridge will compare these verification results with other ones by constructing tuples (A1) and (A2) for which total verdicts will be "Found all unsafes" and "Found not all unsafes". If you will report verification results using another approach to have some other features, you will lose an ability to compare verification results and perhaps something else.

Actions #14

Updated by Pavel Andrianov about 6 years ago

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?)

Neither specification format, nor implementation has such restriction. Anyway, a single component (Verifier) can be launched several times in a one launch, so, it produces several verification reports. Thus I do not consider the restriction to be relevant.

After all there should be something like:

I am not against the new features as in your example. I would just say that add several lines to check the name and not to clone it is much more easier than to implement such large feature. For this particular problem it looks like "To use a sledge-hammer to crack a nut."

Actions #15

Updated by Evgeny Novikov about 6 years ago

  • Status changed from Open to Feedback

Pavel Andrianov wrote:

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?)

Neither specification format, nor implementation has such restriction. Anyway, a single component (Verifier) can be launched several times in a one launch, so, it produces several verification reports. Thus I do not consider the restriction to be relevant.

Let's wait an answer from Vladimir. For me it seems to be a bug in Bridge since it does not make corresponding jobs corrupted due to incorrect reports. And later different operations may fail unexpectedly, in particular after updating Bridge.

Klever was never intended for things you are talking about. Of course, you can do very many various ones, e.g. launch various verifiers using conditions and maybe loops, use different configurations and so on, but then you have to prepare the only verification report with all verification results as its children. According to this it looks like the original issue is not relevant for Klever at all.

BTW, if you want to describe some useful information about your complex verification strategy, you can use advanced features of attributes, in particular data attributes, and upload many files rather than one. If you dislike this approach, do not expect that you will be able to use Bridge safely.

Actions #16

Updated by Pavel Andrianov about 6 years ago

then you have to prepare the only verification report with all verification results as its children.

Then I have no possibility to describe all verifiers I used, as I have to prepare only one verification report per all launches. For example, one of the launch failed with exception, the other found several unsafes and finish with timeout and the last one suddenly proved true. Now there is a restriction, that a safe verdict can not be combined with unsafe or unknown, which is reasonable. Also, there are different resources, logs and so on. It is not correct to put all this data into different attributes of unsafe reports. Thus I still consider the restriction 'only one verification report' as irrelevant.

Actions #17

Updated by Vladimir Gratinskiy about 6 years ago

Evgeny Novikov wrote:

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?).

There is no such restriction.

Actions #18

Updated by Evgeny Novikov about 6 years ago

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?).

There is no such restriction.

Okay. Does Bridge compare verification results on the basis of verification reports? That's it calculates a total verdict for a given verification report and all its children?

Actions #19

Updated by Evgeny Novikov about 6 years ago

Pavel Andrianov wrote:

then you have to prepare the only verification report with all verification results as its children.

Then I have no possibility to describe all verifiers I used, as I have to prepare only one verification report per all launches. For example, one of the launch failed with exception, the other found several unsafes and finish with timeout and the last one suddenly proved true. Now there is a restriction, that a safe verdict can not be combined with unsafe or unknown, which is reasonable. Also, there are different resources, logs and so on. It is not correct to put all this data into different attributes of unsafe reports. Thus I still consider the restriction 'only one verification report' as irrelevant.

Of course, users are interested in verification results first of all. Thus, you need to prepare them in a standard way. For providing data that can be interested for verifier developers there are different means in Klever that I already described. I am not sure that you was aware about them.

Actions #20

Updated by Pavel Andrianov about 6 years ago

For providing data that can be interested for verifier developers there are different means

I am not speaking about developers. This is a practical issue, when verification task is prepared once and then different configurations are used (memory check with SMG, race check with CPALockator). We do not include any developer-specific features in the report.

Actions #21

Updated by Vladimir Gratinskiy about 6 years ago

Evgeny Novikov wrote:

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?).

There is no such restriction.

Okay. Does Bridge compare verification results on the basis of verification reports? That's it calculates a total verdict for a given verification report and all its children?

It calculates total verdict on the basis of attributes values. Each leaf (unsafe/safe/unknown) has its own attributes that are copied from all parents branch (including verification report).

Actions #22

Updated by Evgeny Novikov about 6 years ago

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

As far as I rememember there can be the only verification report for the same parent (Vladimir, am I right?).

There is no such restriction.

Okay. Does Bridge compare verification results on the basis of verification reports? That's it calculates a total verdict for a given verification report and all its children?

It calculates total verdict on the basis of attributes values. Each leaf (unsafe/safe/unknown) has its own attributes that are copied from all parents branch (including verification report).

I understand that. But I wanted to clarify, how comparison works. If I don't mistake, for each verification report Bridge calculates a total verdict on the basis of its child reports, e.g. "Safe" or "Found all unsafes". And then it compares total verdicts for equal values of attributes that are marked to be used for comparison. Sometimes there may be non-matching attribute values.

Actions #23

Updated by Evgeny Novikov about 6 years ago

  • Status changed from Feedback to Open

Pavel Andrianov wrote:

For providing data that can be interested for verifier developers there are different means

I am not speaking about developers. This is a practical issue, when verification task is prepared once and then different configurations are used (memory check with SMG, race check with CPALockator). We do not include any developer-specific features in the report.

It seems that you described the situation inaccurately. Indeed, you have various verification tasks since you check various specifications for target programs. And you prepare different verification reports for different verification tasks where different instances of a verifier are used for solving those tasks. This matches the common Klever workflow exactly. The only difference is that you have the same sources always while Klever usually has different sources for each verification report. The latter is the case for verification of Linux loadable kernel modules, but in other use cases this can differ. That's why I will try to suggest the general solution of this problem.

Let's consider the following example:

Core report
    sources.zip
    Component1 report
    Component2 report
        sources2.zip
        Verification1 report
            Attributes1 # For instance, "requirement" is "memory safety" in the proper form.
            Log1
            Resources1
            Unsafe1 report
        Verification2 report
            sources3.zip
            Attributes2 # For instance, "requirement" is "concurrency safety" in the proper form.
            Log2
            Resources2
            Unsafe2 report
            Unknown2 report

In this case Bridge should use sources1.zip and sources2.zip together for error traces of "Unsafe1 report", and it should use sources1.zip, sources2.zip and sources3.zip together for error traces of "Unsafe2 report". Using this feature one will be able to report common sources once. Possible caveats:
  1. There can be the same source files with the same or different contents at various levels. I suggest to use those sources that are at a more deep level in this case.
  2. Sources can be uploaded just with component or verification start reports. That's why there won't be data misses and data races when required sources for a child report was not uploaded for its parent or they are uploaded concurrently.
  3. Maybe some more tricky issues.

Do everybody agree that this is a good feature to be supported?

Actions #24

Updated by Evgeny Novikov almost 6 years ago

We suggested a better solution of this issue in a corresponding specification. Here is the blocking issue.

Actions #25

Updated by Evgeny Novikov almost 6 years ago

  • Related to Feature #9356: Do not deploy the same build bases added
Actions #26

Updated by Evgeny Novikov almost 6 years ago

  • Category deleted (Bridge)
  • Assignee changed from Vladimir Gratinskiy to Evgeny Novikov
Actions #27

Updated by Evgeny Novikov over 5 years ago

  • Status changed from Open to Resolved

Except for minor issues this is done in branch klever-3.0.

Actions #28

Updated by Evgeny Novikov about 5 years ago

  • Status changed from Resolved to Closed

In master.

Actions

Also available in: Atom PDF