-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
potentially unsound answers #8
Comments
Also, for some of these the platform of the computer I am on seems to matter. wintersteiger/mul/mul-has-no-other-solution-15884.smt2 for example returns SAT on my AMD computer but UNKNOWN on my Intel computer. I am using the same binary (which I compiled on my intel). Both computers are running the same Debian. Below the relevant extracts from /proc/cpuinfo |
Some of the issues could be resolved by not using -march=native that the CMakeLists hard-codes. However the bulk of the issued described above remain. |
The distance between any number and a NaN should be a (fixed) positve number. The original treatment is unsound Partial fix for #8
Currently, we do not support setting rounding modes. We detect such constructs in the given formula, and warn the user that soundness might be affected Related to #8
- Bug fix in how we treated NaNs in the distance function. The distance between any number and a NaN should by > 0. Currently, it's fixed to 1024. - Bug fix in how negation (`not` operator) propagated to sub-expressions. Observe de-morgan's law - Bug fix in handling logical equality and inequality. Formerly, we did not make a distinction between logical equality and FP equality which is wrong. Two separate distance functions were added to support this distinction. - Warn users about unsupported rounding modes. goSAT does not honor rounding mode spec and keeps the default mode in the underlaying machine. - Did minor code refactoring
I can confirm that all smt examples you provided can lead to unsound results. The following bug fixes have been applied:
Also, the examples exposed two current limitations which might be addressed in the future:
Additionally, the changes have passed the regression tests on the Finally, I do not think the Thanks for experimenting with goSAT! |
Thank you for the fast fixes :) I am currently re-running our benchmark suite and will let you know if anything new pops up. I think it would be good that for anything you see that you don't support (like different rounding modes in the same expression) to instantly emit "unsupported" (see page 47 of the smtlib standard) or give the "unknown" response to a check-sat query. Treating Float16 or Float128 as Float64 will not be sound, I would again encourage you to emit "unknown" at that point. I've also now compiled without -march and with -march=opteron-sse3 and both work. -march=native really does not if you build on a recent Intel box and then use the binary on an AMD machine. Many executions terminate with "illegal instruction" and others just give incorrect results. |
Results are different now, we've found 4970 unsound answers, but they are not the same ones as before. Below a reduced list, but it should cover everything.
|
We explicitly check the argument sort of Z3_OP_EQ. In case they are of sort Z3_FLOATING_POINT_SORT we treat equality as Z3_OP_FPA_EQ, otherwise code for Z3_OP_EQ will be generated as usual. Partial fix for #8
I added a few usability features which includes reporting "unsupported" upon detecting FP16 or FP128 variables. More importantly, it turns out that the parser of z3 provides For example, consider the benchmark
the parser gives I've implemented a workaround where for operation |
The Z3 parser is correct, the comparison is SMTLIB =, not FP EQ. I believe your workaround is only going to add unsoundness when benchmarks depend on subtle difference between the two. For these specific constants they are equivalent. But they are not equivalent if either of the constant is 0 or NaN. In C terms you might want to translate |
I should further note that the SMTLIB floating point standard makes this distinction, and several benchmarks depend on this. |
Sorry for the misunderstanding. I believe that z3 is doing the correct thing, however, as far as the code generator is concerned, it was unexpected. I'm interested in supporting a reasonable subset of QF_FP in a sound or, for practical reasons, may be "soundy" way. Along the way, we can learn about the limits of the stochastic search approach. So far, goSAT passes the soundness test on the |
Hi,
We've found the following answers of gosat to be probably incorrect (sat that should be unsat and vice versa).
The filenames below come from my bag of benchmarks. https://github.com/florianschanda/smtlib_schanda I've tried my best to make sure the various status are correct by mistakes are of course possible.
The text was updated successfully, but these errors were encountered: