Feature #7580


Download verification tasks for SV-COMP

Added by Vadim Mutilin over 7 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Immediate
Category:
Bridge
Target version:
-
Start date:
09/30/2016
Due date:
11/16/2016
% Done:

100%

Estimated time:
Published in build:

Description

For SV-COMP we need to prepare a set of verification tasks.
For that purpose we need to download verification tasks prepared to be solved by an SV-COMP tool (e.g. CPAchecker).

Now there is a possibility to download only a single verification task.

But we need to download a list of verification tasks selected by some criteria.
Examples of criteria (a selection sketch follows the list):
1. All tasks with unsafe verdicts.
2. All tasks marked by the EMG tag.
3. All tasks with safe verdicts.
4. All tasks with safe verdicts and CPU time > 100 sec.
5. All tasks with time limit verdicts.
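For illustration, a minimal sketch of this kind of selection, assuming each task is a plain record with 'verdict', 'tags' and 'cpu_time' fields (a hypothetical structure, not the actual Bridge data model):

    def select_tasks(tasks, verdict=None, tag=None, min_cpu_time=None):
        """Return the tasks matching all given criteria; None means "any"."""
        selected = []
        for task in tasks:
            if verdict is not None and task['verdict'] != verdict:
                continue
            if tag is not None and tag not in task['tags']:
                continue
            if min_cpu_time is not None and task['cpu_time'] <= min_cpu_time:
                continue
            selected.append(task)
        return selected

    # Criterion 4 above: all tasks with a safe verdict and CPU time > 100 sec.
    # 'all_tasks' is assumed to be the list of task records.
    hard_safes = select_tasks(all_tasks, verdict='safe', min_cpu_time=100)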


Files

svcomp.tar.gz (97.3 KB), Vitaly Mordan, 10/18/2016 03:21 PM
#1

Updated by Vadim Mutilin over 7 years ago

  • Assignee set to Vladimir Gratinskiy
#2

Updated by Alexey Khoroshilov over 7 years ago

  • Priority changed from High to Urgent

In the short term (a week or two?), we need at least a simple solution, for example, downloading all verification tasks or all tasks for existing filters (by verdicts and tags).
Otherwise, we will not be able to prepare benchmarks with modern environment models for the coming competition.

#3

Updated by Vladimir Gratinskiy over 7 years ago

Vadim Mutilin wrote:

For that purpose we need to download verification tasks prepared to be solved by an SV-COMP tool (e.g. CPAchecker).

What preparations need to be done?

Now there is a possibility to download only a single verification task.

Where is this possibility? Do you mean downloading tasks by service tools? These tasks don't have a "verdict" attribute, and they are deleted when the task is solved (or when the whole job is solved, if the scheduler forgot to delete the task). Also, these tasks are not tagged/marked by anything.

#4

Updated by Evgeny Novikov over 7 years ago

  • Status changed from New to Open

I think this is all about so-called input files for static verifiers (although their proper name is input files for BenchExec). They aren't the same as verification tasks in terms of SV-COMP. These files are always kept if a corresponding option is turned on when a job decision is started.

First of all I would like to know how many verification tasks should be prepared this time. If this number is several hundred, this definitely can and should be done manually. If several thousand, then a very restricted solution should be implemented (I already have an idea about it).

#5

Updated by Vadim Mutilin over 7 years ago

Vladimir Gratinskiy wrote:

Vadim Mutilin wrote:

For that purpose we need to download verification tasks prepared to be solved by an SV-COMP tool (e.g. CPAchecker).

What preparations need to be done?

I need to get the input files passed to CPAchecker (or any other verification tool) together with configuration options. Let us call this a benchmark task.

Now there is a possibility to download only a single verification task.

Where is this possibility? Do you mean downloading tasks by service tools? These tasks don't have a "verdict" attribute, and they are deleted when the task is solved (or when the whole job is solved, if the scheduler forgot to delete the task). Also, these tasks are not tagged/marked by anything.

Now you can download a single benchmark task from the CPAchecker tab, e.g.
http://ldvstore:8998/reports/component/88/686382/
But I want to download a set of benchmark tasks.
For example, I want to download all Safes for some job listed on the page, see http://ldvstore:8998/reports/component/884713/safes/
Each benchmark should have a unique and informative name, for example
bcb4a75bde38--drivers--hid--hid.ko--linux--fs--char_dev.zip
I need to know at least the module name for each benchmark task.
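For illustration, such a name could be composed from the commit hash, the module path and the rule specification; the helper below is a hypothetical sketch, not existing Bridge code:

    def benchmark_name(commit, module, rule):
        # Flatten '/' in paths with '--' so that the name stays unique
        # within one flat directory.
        flat = lambda s: s.replace('/', '--')
        return '--'.join([commit, flat(module), flat(rule)]) + '.zip'

    benchmark_name('bcb4a75bde38', 'drivers/hid/hid.ko', 'linux/fs/char_dev')
    # -> 'bcb4a75bde38--drivers--hid--hid.ko--linux--fs--char_dev.zip'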

#6

Updated by Vadim Mutilin over 7 years ago

Evgeny Novikov wrote:

I think this is all about so-called input files for static verifiers (although their proper name is input files for BenchExec). They aren't the same as verification tasks in terms of SV-COMP. These files are always kept if a corresponding option is turned on when a job decision is started.

First of all I would like to know how many verification tasks should be prepared this time. If this number is several hundred, this definitely can and should be done manually. If several thousand, then a very restricted solution should be implemented (I already have an idea about it).

We want to prepare many benchmark tasks, as usual. With the current capabilities of Klever it is hard to prepare a set of meaningful size.
The idea is to take a set of Safes, a set of Unsafes and a set of Timelimits for several specifications.

#7

Updated by Evgeny Novikov over 7 years ago

Taking it all together, I see neither a clear goal of what should be achieved nor a clear specification of what should be done. Actually, you don't need to download all safes for the validation set, since for at least 6 of 21 of them it is already well known that they aren't really good (see here). So next time you will likely ask to filter them out automatically. Some safes can be of no interest since they are very trivial, e.g. too little time and/or memory is required to find them. You aren't interested in safes marked with some tags. Coverage can be too low. Some tasks could already have been submitted, and you don't need them again. And so on... There are too many factors to take into account, some of them not even known yet, and all this will be used and refined just once a year. Implementation will require more than a week (and a couple of days each year). Downloading ~15 safes will take you less than 5 minutes. Downloading all relevant tasks this year will take half an hour.

Instead of submitting a lot of trash at the last minute, as is often done, please carefully and manually select and download really outstanding tasks using existing means, examine them, and name and reformat them in accordance with SV-COMP rules.

This issue just pushed me toward a really important task that is needed much more often than once a year and by many more people. While many use cases can already be performed, it isn't easy to get some interesting tables or even graphs, because the corresponding information isn't summarized, cached and, of course, shown. Some trivial use cases: I need an unsafe that can be obtained quickly to experiment with error trace visualization enhancements, or one needs some safes that require extremely large resources to experiment with a new analysis algorithm. A more advanced use case is to demonstrate, say, how verification time depends on module sizes, or to get graphs similar to the CPAchecker/SV-COMP ones. So I intend to think in this direction rather than to solve this issue. BTW, this will simplify several tasks mentioned here as well.

#8

Updated by Vadim Mutilin over 7 years ago

Evgeny Novikov wrote:

Taking it all together, I see neither a clear goal of what should be achieved nor a clear specification of what should be done. Actually, you don't need to download all safes for the validation set, since for at least 6 of 21 of them it is already well known that they aren't really good (see here).

That page was just taken as an example. Actually we need, for example, all 3476 safes from http://ldvstore:8998/jobs/174/

So next time you will likely ask to filter them out automatically. Some safes can be of no interest since they are very trivial, e.g. too little time and/or memory is required to find them. You aren't interested in safes marked with some tags.

For filtering we can use the CPAchecker infrastructure. It already provides nice tables and graphs with the help of which you can easily select tasks according to your criteria, e.g. only hard benchmark tasks.
What we really need is a way to get a set of benchmark tasks.

#9

Updated by Vadim Mutilin over 7 years ago

Evgeny Novikov wrote:

for at least 6 of 21 of them it is already well known that they aren't really good (see here).

Good or bad depends on your purpose.
For example, if EMG misses some function call and the benchmark is safe, then in the Klever validation set we mark it as a bad safe, but for SV-COMP the benchmark is good, because problems in EMG have nothing to do with verification tools. We will likely select such a benchmark, especially if it is hard to verify.

#10

Updated by Ilja Zakharov over 7 years ago

As I need this task implemented too, I would like to state my own requirements to compare them with the thoughts already discussed here. To make a long story short, there are two use cases I see here (a sketch follows the list):

1) Download from the web interface an archive which includes source files, property file(s) and a single XML for BenchExec. This archive contains all tasks generated for the verification tool during the job solution and for which particular verdicts have been obtained (unknown, safe, unsafe; no timeouts, as this is not a verdict!).
2) The same as stated above, but this time we additionally need to filter tasks according to marks already created for tasks of the job. Here we can decide to download only timeouts, etc.
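To make use case 1 concrete, here is a sketch of generating such a single BenchExec XML; the tool name, resource limits, property file and include pattern are assumptions for illustration, not what Bridge actually emits:

    import xml.etree.ElementTree as ET

    # Illustrative shape only; see the BenchExec documentation for the full format.
    benchmark = ET.Element('benchmark', tool='cpachecker',
                           timelimit='900', memlimit='15000')
    ET.SubElement(benchmark, 'rundefinition', name='sv-comp17')
    tasks = ET.SubElement(benchmark, 'tasks', name='DeviceDrivers')
    ET.SubElement(tasks, 'propertyfile').text = 'PropertyUnreachCall.prp'
    ET.SubElement(tasks, 'include').text = 'Unsafes/*.cil.i'
    ET.ElementTree(benchmark).write('benchmark.xml',
                                    encoding='utf-8', xml_declaration=True)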

#11

Updated by Vadim Mutilin over 7 years ago

As the deadline for submission of benchmark verification tasks is approaching (Oct. 21, 2016, http://sv-comp.sosy-lab.org/2017/dates.php), we need at least a feature to download all tasks, as mentioned by Ilja.

#12

Updated by Evgeny Novikov over 7 years ago

  • Priority changed from Urgent to Immediate

I will try to summarize the discussion.

At all pages with lists of leaf reports (Safes, Unsafes, Unknowns), either total or restricted somehow, e.g. by a tag, by a problem, or in the future by some criteria like CPU time or coverage, there should be a button to download an archive that contains all archives with files uploaded for a corresponding parent component (now just the CPAchecker verifier). If several leaf reports belong to the same parent component, just download the corresponding archive once (see the sketch below). If there are no such files, e.g. when a parent component of unknown(s) isn't a verifier, just forbid downloading the archive or make it empty. Names of archives should contain values of attributes specified for corresponding job classes. For validation jobs these should include commits, verification objects and rule specifications. For verification jobs, just the last two.
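A sketch of the deduplication rule described above; the report objects and their fields are hypothetical, not the actual Bridge models:

    def collect_archives(leaf_reports):
        """Collect the archive of each parent verifier component exactly once."""
        seen = set()
        archives = []
        for report in leaf_reports:
            parent = report.parent_component  # hypothetical attribute
            if parent.id in seen:
                continue  # this verifier run is already included
            seen.add(parent.id)
            if parent.archive is None:
                continue  # e.g. an unknown whose parent isn't a verifier
            archives.append(parent.archive)
        return archives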

All filtering will be performed using either current or future means of Bridge (the latter require much more time for careful discussion and implementation), or even some third-party means, including expert brains.

I argue against Bridge somehow merging all tasks for now, because I don't know how to do this, especially in the general case. Let somebody more clever develop a corresponding script for this, and when it is well tested, then incorporate it into Bridge.

I think the refined issue can be implemented and tested in several days.

#13

Updated by Vladimir Gratinskiy over 7 years ago

  • Due date set to 10/13/2016
  • Status changed from Open to Resolved
  • % Done changed from 0 to 100

Sorry, I didn't read the whole list of comments to this feature, but I implemented it in the branch "download_for_compet" in the way I discussed with Ilja Zakharov.

#14

Updated by Vitaly Mordan over 7 years ago

There was the following error in Bridge while downloading files for the competition:

There are no filters
Traceback (most recent call last):
  File "/home/ubuntu/klever/bridge/jobs/views.py", line 993, in download_files_for_compet
    jobtar = DownloadFilesForCompetition(job, json.loads(request.POST['filters']))
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 529, in __init__
    self.__get_archive()
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 543, in __get_archive
    raise ValueError('There are no filters')
ValueError: There are no filters

#15

Updated by Vadim Mutilin over 7 years ago

  • Status changed from Resolved to Open

The web interface shows "Unknown error".

#16

Updated by Vladimir Gratinskiy over 7 years ago

Vitaly Mordan wrote:

There was the following error in Bridge while downloading files for the competition:

There are no filters
Traceback (most recent call last):
  File "/home/ubuntu/klever/bridge/jobs/views.py", line 993, in download_files_for_compet
    jobtar = DownloadFilesForCompetition(job, json.loads(request.POST['filters']))
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 529, in __init__
    self.__get_archive()
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 543, in __get_archive
    raise ValueError('There are no filters')
ValueError: There are no filters

You must choose at least one type of report (safes/unsafes/unknowns).

#17

Updated by Vladimir Gratinskiy over 7 years ago

Vadim Mutilin wrote:

The web interface shows "Unknown error".

What does the log say?

#18

Updated by Vitaly Mordan over 7 years ago

Vladimir Gratinskiy wrote:

Vitaly Mordan wrote:

There was the following error in Bridge while downloading files for the competition:

There are no filters
Traceback (most recent call last):
  File "/home/ubuntu/klever/bridge/jobs/views.py", line 993, in download_files_for_compet
    jobtar = DownloadFilesForCompetition(job, json.loads(request.POST['filters']))
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 529, in __init__
    self.__get_archive()
  File "/home/ubuntu/klever/bridge/reports/utils.py", line 543, in __get_archive
    raise ValueError('There are no filters')
ValueError: There are no filters

You must choose at least one type of report (safes/unsafes/unknowns).

This error was shown for any type of report.

#19

Updated by Vadim Mutilin over 7 years ago

Vladimir Gratinskiy wrote:

What does the log say?

See Vitaly's message.

"Unknown error" occur for any type od download for safes, unsafes and for unknown with/or without problems selected

#20

Updated by Vladimir Gratinskiy over 7 years ago

You can't download files of lightweight jobs. They don't contain files for safes and unknowns. The next commit will disable access to downloading files of such jobs.

#21

Updated by Vitaly Mordan over 7 years ago

Non-lightweight jobs are processed well, and the prepared benchmark can be solved by BenchExec.

But there is the following warning during extraction of the downloaded archive:
tar: benchmark.xml: implausibly old time stamp 1970-01-01 03:00:00
An example archive is attached.
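The warning most likely means the files were added to the archive programmatically with the default TarInfo mtime of 0, i.e. the Unix epoch. A minimal sketch of a fix (the file name and content below are illustrative only):

    import io
    import tarfile
    import time

    def add_bytes(tar, name, data):
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        info.mtime = time.time()  # without this, tar warns about the 1970 time stamp
        tar.addfile(info, io.BytesIO(data))

    with tarfile.open('svcomp.tar.gz', 'w:gz') as tar:
        add_bytes(tar, 'benchmark.xml', b'<benchmark/>')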

#22

Updated by Ilja Zakharov over 7 years ago

There are minor issues that I observe:
1) If there are unknowns without generated tasks, downloading always fails because of them when no unknown marks are selected. I expect such tasks to be ignored.
2) I am unable to download unknown tasks without marks applied; I expect an extra option to be shown in the list of marks for download.
3) The '/' symbols from driver paths are replaced by '-', but '-' can occur in names of directories and modules, so it is better to use '---' or something else for clearer names (see the sketch below).
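A small illustration of the ambiguity with made-up module paths:

    # With '-', two different paths can collapse into the same flat name:
    a = 'drivers/usb-storage/usb.ko'.replace('/', '-')
    b = 'drivers/usb/storage-usb.ko'.replace('/', '-')
    assert a == b  # both become 'drivers-usb-storage-usb.ko'

    # With '---', the names stay distinct as long as no real name contains '---':
    assert ('drivers/usb-storage/usb.ko'.replace('/', '---')
            != 'drivers/usb/storage-usb.ko'.replace('/', '---'))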

#23

Updated by Evgeny Novikov over 7 years ago

Ilja Zakharov wrote:

There are minor issues that I observe:
...
3) The '/' symbols from driver paths are replaced by '-', but '-' can occur in names of directories and modules, so it is better to use '---' or something else for clearer names.

I wonder why we can't keep the drivers hierarchy. Is it required by SV-COMP rules to place all benchmarks as files in one directory? It would be much better to use intermediate directories.

#24

Updated by Vladimir Gratinskiy over 7 years ago

  • Due date changed from 10/13/2016 to 11/16/2016
  • Status changed from Open to Resolved

Ilja Zakharov wrote:

There are minor issues that I observe:
1) If there are unknowns without generated tasks, downloading always fails because of them when no unknown marks are selected. I expect such tasks to be ignored.

If downloading fails with the "There are no filters" error in the log, then there are no reports with files satisfying the selected filters. I can't check this before downloading.

2) I am unable to download unknown tasks without marks applied; I expect an extra option to be shown in the list of marks for download.

I've added a "Without marks" filter that will appear if there is at least one marked unknown.

3) The '/' symbols from driver paths are replaced by '-', but '-' can occur in names of directories and modules, so it is better to use '---' or something else for clearer names.

Evgeny Novikov wrote:

I wonder why we can't keep the drivers hierarchy. Is it required by SV-COMP rules to place all benchmarks as files in one directory? It would be much better to use intermediate directories.

I've added a hierarchy for files in the archive. There are some changes in names. For example:
Unknowns/f__linux-mutex__drivers-usb-core-usb.cil.i --> Unknowns/linux-mutex/drivers/usb/core/f__usb.ko.cil.i

In case there are several reports with the same rule and module:
Unsafes/u__linux-mutex__drivers-usb-core-usb__2.cil.i --> Unsafes/linux-mutex/drivers/usb/core/2__u__usb.ko.cil.i

#25

Updated by Vadim Mutilin over 7 years ago

  • Status changed from Resolved to Open

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

I wonder why we can't keep the drivers hierarchy. Is it required by SV-COMP rules to place all benchmarks as files in one directory? It would be much better to use intermediate directories.

I've added a hierarchy for files in the archive. There are some changes in names. For example:
Unknowns/f__linux-mutex__drivers-usb-core-usb.cil.i --> Unknowns/linux-mutex/drivers/usb/core/f__usb.ko.cil.i

In case there are several reports with the same rule and module:
Unsafes/u__linux-mutex__drivers-usb-core-usb__2.cil.i --> Unsafes/linux-mutex/drivers/usb/core/2__u__usb.ko.cil.i

Yes, the SV-COMP infrastructure requires benchmarks to have unique names within the same subcategory (DeviceDrivers for us). Please revert these changes, but replace '/' with '---'.

#26

Updated by Evgeny Novikov over 7 years ago

  • Status changed from Open to Resolved

Vadim Mutilin wrote:

Vladimir Gratinskiy wrote:

Evgeny Novikov wrote:

I wonder why we can't keep the drivers hierarchy. Is it required by SV-COMP rules to place all benchmarks as files in one directory? It would be much better to use intermediate directories.

I've added a hierarchy for files in the archive. There are some changes in names. For example:
Unknowns/f__linux-mutex__drivers-usb-core-usb.cil.i --> Unknowns/linux-mutex/drivers/usb/core/f__usb.ko.cil.i

In case there are several reports with the same rule and module:
Unsafes/u__linux-mutex__drivers-usb-core-usb__2.cil.i --> Unsafes/linux-mutex/drivers/usb/core/2__u__usb.ko.cil.i

Yes, the SV-COMP infrastructure requires benchmarks to have unique names within the same subcategory (DeviceDrivers for us). Please revert these changes, but replace '/' with '---'.

Today this is so; tomorrow this will change. We shouldn't fix our core infrastructure each time.

It is pretty obvious that using subdirectories looks much nicer and is much more convenient. We understood this long ago. I hope the SV-COMP people will realize it one day. For now, this conversion can and should be done trivially by external scripts.

#27

Updated by Vadim Mutilin over 7 years ago

Evgeny Novikov wrote:

It is pretty obvious that using subdirectories looks much nicer and is much more convenient. We understood this long ago. I hope the SV-COMP people will realize it one day. For now, this conversion can and should be done trivially by external scripts.

As the primary user of this feature, I cannot agree that a directory structure is much more convenient. For me, unique file names are more convenient, because I can freely move files around without conflicts. A directory structure complicates things like that.

#28

Updated by Evgeny Novikov over 7 years ago

  • Status changed from Resolved to Open

Vadim Mutilin wrote:

Evgeny Novikov wrote:

It is pretty obvious that using subdirectories looks much nicer and is much more convenient. We understood this long ago. I hope the SV-COMP people will realize it one day. For now, this conversion can and should be done trivially by external scripts.

As the primary user of this feature, I cannot agree that a directory structure is much more convenient. For me, unique file names are more convenient, because I can freely move files around without conflicts. A directory structure complicates things like that.

If so, let's revert that change and replace "/" with "---". But please never ask to change this again. If you find something more useful one day, implement it in your scripts yourself.

#29

Updated by Vladimir Gratinskiy over 7 years ago

  • Status changed from Open to Resolved

Fixed.

#30

Updated by Ilja Zakharov over 7 years ago

  • Status changed from Resolved to Closed

Merged in 40c22ee6.

#31

Updated by Vladimir Gratinskiy over 7 years ago

This branch should not have been merged into master.

#32

Updated by Ilja Zakharov over 7 years ago

It contains quite useful features which are often needed. Keeping a separate branch and updating it each time is not a perfect solution either. So if something is implemented roughly, it is better to provide a fix with refactoring.

#33

Updated by Evgeny Novikov over 7 years ago

  • Category set to Bridge
  • Status changed from Closed to Resolved

Let's keep it as resolved until the last discussion is concluded.

#34

Updated by Vladimir Gratinskiy over 7 years ago

Ilja Zakharov wrote:

It contains quite useful features which are often needed. Keeping a separate branch and updating it each time is not a perfect solution either. So if something is implemented roughly, it is better to provide a fix with refactoring.

Ok. I'm planning to refactor it in the "bridge-optimizations" branch.

#35

Updated by Evgeny Novikov over 7 years ago

  • Status changed from Resolved to Closed

So, the original issue was solved well enough. We will think about and implement really outstanding related features some day.
