Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness of VST for gcc and clang #566

Open
andrew-appel opened this issue Apr 21, 2022 · 3 comments
Open

Soundness of VST for gcc and clang #566

andrew-appel opened this issue Apr 21, 2022 · 3 comments

Comments

@andrew-appel
Copy link
Collaborator

VST is formally proved sound w.r.t. the operational semantics of CompCert Clight.
VST is supposed to be sound for gcc and clang, but is not formally proved sound.
In this issue I propose that we could put that statement on firmer footing.

In some cases, the Clight semantics is a refinement of the C11 semantics. See Krebbers, "The C Standard Formalized in Coq", sections 2.3-2.5 and 10.1. That means, any C program with defined behavior will execute correctly when compiled by CompCert, but some C programs with C11-undefined behavior have defined behavior in CompCert.

Consequently, it is theoretically possible that a program proved correct in VST will have correct behavior when compiled by CompCert but incorrect or undefined behavior when compiled by gcc or clang.

It is an explicit goal of the VST project that,

  • VST is proved sound for CompCert semantics
  • VST is sound (without proof) for clang and gcc.
    We achieve soundness (without proof) for clang and gcc by having stricter rules in VST's program logic (Verifiable C) than would be necessary just to prove soundness w.r.t. CompCert.

Here are the ways that soundness for CompCert does not automatically imply soundness for gcc/clang:

  • C11 considers signed integer (and signed long integer) overflow as undefined behavior, but CompCert considers it as defined (in two's complement modular arithmetic). VST handles this case by stricter "typechecking" than would be needed just for CompCert soundness. This is implemented in VST.veric.expr.tc_nobinover and other related definitions and proofs.
  • C11 allows function arguments (and particularly their side effects) to be evaluated left-to-right or right-to-left. CompCert specifies left-to-right. VST handles this by requiring all function arguments to be free of side effects. (The output of clightgen -normalize produces C programs with no side effects in function arguments.)
  • C11 has certain rules about "sequence points" that govern the order of side effects within expressions. VST handles this by prohibiting side effects within expressions. (clightgen -normalize, again)
  • C11 permits nested scopes within function bodies; addresses of those variables outside those scopes (but within the function body) is undefined in C11 but defined in CompCert. VST handles this by operating on Clight programs, which do not have nested scopes.
  • C11 requires undefined values for padding bytes between struct fields. VST handles this by VST.floyd.mapsto_memory_block.spacer, which has a Vundef at each padding byte.

Historically, signed integer overflow has been the main source of VST unsoundness. Issues #561, #500, #142, #117 are all related (in some way) to this issue. So it might be worth addressing "soundness of VST for gcc and clang" in an organized way.

Here are some possible ways forward.

  1. Careful code review, by an independent volunteer in consultation with the VST developers, of the typechecker code in veric/expr.v and veric/expr2.v
  2. Write an alternate semantics for CompCert Clight2, in which signed integer overflow is "stuck" (as is side effects inside function parameters). Prove VST sound w.r.t. this semantics. Prove that CompCert's standard Clight2 semantics is a refinement of this one.

Of course, VST can never be formally sound for clang or gcc (with mechanized proof) because those compilers are not specified by a formal semantics, and certainly are not proved correct with respect to any such semantics. Methods 1 or 2 might be the best that can be done. Method 2 might not be so very difficult, either.

@roconnor-blockstream
Copy link
Contributor

roconnor-blockstream commented Apr 21, 2022

There are a couple of other sources of arithmetical undefined behaviour:

  • negating MIN_INT causing signed integer overflow.
  • bit shifting by greater than or equal to the width of the (promoted) type.
  • bit shifting by a negative amount.
  • left shifting causing a signed integer overflow.
  • integer division/remainder by 0.

Modifying a string literal is another undefined behaviour.

I haven't investigated if VST already covers all these cases or not. (e.g. perhaps string literals can never get the writable permission.) However, I don't ever recall seeing maximum-width side-conditions on my use of bit-shifting operations.

(See also: https://nitter.net/real_or_random/status/1314596135395262467 for some fun.)

(edit: I also don't know specifically if gcc or clang take advantage of these forms of undefined behavior or, if they specifically define behavior in some of these cases.)

@andrew-appel
Copy link
Collaborator Author

Generally speaking, VST correctly handles the issues just mentioned by @roconnor-blockstream. If you don't recall seeing maximum-width side-conditions, it's because VST efficiently (and soundly) optimizes those away when the shift amount is a constant. Several of these cases are undefined behavior in CompCert, which is a good thing--it means that VST's formal soundness proof already ensures correctness. But still, these are cases that should be carefully reviewed in a code walk-through.

Read-only string literals are also handled correctly: CompCert's semantics marks them as read-only, so formal soundness is guaranteed in VST.

These points illustrate that in some cases the "walk-through" can be accomplished in the CompCert operational semantics, without having to review the corresponding parts of VST.

@QinshiWang
Copy link
Contributor

Is VST sound for traceless infinite loops? It's said loops like for (;;) can be assumed to terminate.
infinite loop

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants