Skip to content

Latest commit

 

History

History
365 lines (313 loc) · 22.6 KB

README.org

File metadata and controls

365 lines (313 loc) · 22.6 KB

The Agda Development of LambdaIFCStar

{{{surface}}}, A Gradual Security-Typed Programming Language

{{{surface}}} is an experimental gradual security-typed programming language. {{{surface}}} is gradual, in that it provides the programmer with the freedom of choice between runtime versus compile-time enforcement. {{{surface}}} is secure, because it enforces information-flow security and satisfies noninterference.

On the technical side, {{{surface}}} is the only programming language design that combines gradual typing with information-flow control (IFC) without making any sacrifices. {{{surface}}} (1) satisfies noninterference (the security guarantee), (2) satisfies the gradual guarantee, (3) enjoys type-guided classification, and (4) utilizes NSU checking to enforce implicit flows through the heap with no static analysis required. The semantics of {{{surface}}} is the first gradual security-typed language to be specified using coercion calculi (a la Henglein).

A Tale of Two Gradual Security Languages

This repository contains the Agda mechanization of {{{surface}}} as well as its sibling gradual security-typed language, {{{surface_old}}}:

  1. {{{surface_old}}} is a gradual security language that is vigilant (that is, a runtime error is triggered if inconsistent type annotations are detected) but does not enable type-guided classification (that is, type annotations do not affect the security level of runtime values). The development of {{{surface_old}}} is reported in our Arxiv draft Mechanized Noninterference for Gradual Security (Chen and Siek [2022]). The semantics of {{{surface_old}}} is given by compiling to the cast calculus {{{cc_old}}}, which utilizes type-based casts as its cast representation.
  2. {{{surface}}} is a gradual security language that enjoys both vigilance as well as type-guided classification, thus enabling type-based reasoning for both explicit and implicit information flows. The development of {{{surface}}} is reported in our upcoming PLDI 2024 paper Quest Complete: The Holy Grail of Gradual Security (Chen and Siek [2024]). The semantics of {{{surface}}} is given by compiling to another cast calculus {{{cc}}}, in which we adapt ideas from the Coercion Calculus of Henglein [1994] to IFC.

Both cast calculi, {{{cc}}} and {{{cc_old}}}, support blame-tracking. Casts are inserted on-demand, only when there is insufficient information during compilation to decide whether a security policy is enforced or not.

Quick Start Guide

Software prerequisites

Software dependencies for checking proofs:

Software dependencies for running demos:

Software dependencies for drawing simulation diagrams:

Building the project

  • To build everything, simply run make at the top level of this repository. This will build the proofs, the runnable demo, and a simulation explorer.
  • Alternatively, to check the proofs only, run make proofs. The type-checker of Agda makes sure all the proofs are correct.
  • Alternatively, to build the simulator only, run make sim.

Running the demo programs of {{{surface_old}}}

We include example programs written in {{{surface_old}}}. To get a taste of {{{surface_old}}} running in action, please build everything first and then run bin/RunDemo.

Selected Project Code Structure

All Agda source files are located in the =src/= directory and end with dot Agda. A (fairly) large part of the code-base is shared between {{{surface}}} and {{{surface_old}}}.

Top-level modules of meta-theoretical results and demo programs

There are three top-level modules in the =src/= directory:

  1. *=Proofs=*: The module sources the proofs of important meta-theoretical results in the following files:
    • Meta-theoretical results for {{{surface_old}}} and its cast calculus {{{cc_old}}}:
      • =CC.TypeSafety=: {{{cc_old}}} is type safe by satisfying progress and preservation.
      • =CC.BigStepPreservation=: The big-step semantics of {{{cc_old}}} also preserves types. This big-step semantics is used in the erasure-based noninterference proof.
      • =CC.BigStepErasedDeterministic=: The big-step evaluation of erased {{{cc_old}}} is deterministic.
      • =CC.Noninterference=: {{{cc_old}}} satisfies termination-insensitive noninterference (TINI).
      • =CC.Compile=: The compilation from {{{surface_old}}} to {{{cc_old}}} preserves types.
    • Meta-theoretical results for {{{surface}}} and its cast calculus {{{cc}}}:
      • =CC2.Progress=: {{{cc}}} satisfies progress, so that a well-typed {{{cc}}} term is either a value or a blame, which does not reduce, or the term takes one reduction step.
      • =CC2.Preservation=: The operational semantics of {{{cc}}} preserves types and the well-typedness of heap.
      • =Compile.CompilationPresTypes=: The compilation from {{{surface}}} to {{{cc}}} preserves types.
      • =Surface2.GradualGuarantee=: {{{surface}}} satisfies the gradual guarantee.
  2. *=RunDemo=*: The program runs a stepper on the following {{{surface_old}}} programs and pretty-prints their reduction sequences to command line using the Console pretty-printer backend:
    • The stepper that generates reduction sequences for {{{cc_old}}} in string format is defined in =CC.Interp=.
    • =ExamplePrograms.Demo.Example1=: This example shows that {{{surface_old}}} indeed facilitates both compile-time (static) and runtime (dynamic) information-flow control. If a {{{surface_old}}} program is fully statically-typed, the type system of {{{surface_old}}} alone guarantees security. If type information is insufficient, the runtime of {{{surface_old}}} performs security checks during program execution. The transition between static and dynamic IFC enforcement is controlled by the programmer, depending on the precision of type annotations.
    • =ExamplePrograms.Demo.Example2=: This example establishes the intuition that even if the programmer opts for dynamic IFC enforcement, {{{surface_old}}} still guards against any possible information leak through the heap.
    • =ExamplePrograms.Demo.Example3=: This example shows that moving type annotations to be less precise (more dynamic) does not change the runtime behavior of a {{{surface_old}}} program.
  3. *=RunSimulation=*: The program runs a simulator that simulates between {{{cc_old}}} terms of different precision. The output defaults to the GraphViz pretty-printer backend, which will place *.dot files that represent the simulation diagrams in the =plot/= directory.

Noteworthy technical definitions

General technical definitions in directory =Common/=

  • =Common.SecurityLabels=: Definitions of security labels as well as predicates, relations and operators on security labels.
  • =Common.Types=: Definitions of (security) types as well as predicates, relations and operators on types.
  • =Common.BlameLabels=: This module defines blame labels, which are identifiers of casts. In case a cast fails, it raises a cast error, called blame, that contains its blame label. In this way, the programmer knows which cast is causing the problem.
  • =Common.TypeBasedCast=: This module defines type-based casts between two security types. In particular, {{{cc_old}}} uses type-based casts as its cast representation.
  • =Common.Coercions=: This modules defines the coercion-based cast representation used by {{{cc}}}. In particular, it defines the security coercions on values of {{{cc}}}.

The heap model of {{{surface_old}}} and {{{surface}}} in directory =Memory/=

  • =Memory.Addr=: Definition of memory addresses.
  • =Memory.Heap=: Definition and helper methods of the split-heap model, where low and high addresses are indexed separately. For example, heap lookup has the form $μ(\ell, n) = V$, where $\ell$ is the security level of the memory cell, $n$ is the index of the part of the memory where all cells are labeled $\ell$, and $V$ is the value stored at $(\ell, n)$.
  • =Memory.HeapTyping=: Definition of heap well-typedness. It is defined point-wise. The typing judgment has the form $Σ \vdash μ$, where $Σ$ is the heap context and $μ$ is the (well-typed) heap.

Technical definitions of the surface language {{{surface_old}}} in directory =Surface/=

  • =Surface.SurfaceSyntax=: The syntax definition of {{{surface_old}}}. It uses the Abstract Binding Tree (ABT) library. For example, the term of function application has the signature sig (op-app p) = ■ ∷ ■ ∷ [], because it contains two sub-terms and introduces no binder. On the other hand, the term for $λ$ abstraction has the signature sig (op-lam pc A ℓ) = (ν ■) ∷ [], because there is one free variable in the body of a $λ$ (indicated by (ν ■)).
  • =Surface.SurfaceTyping=: The typing rules for {{{surface_old}}}. The typing judgment takes the form $Γ ; g \vdash M : A$, where $Γ$ is the typing context, $g$ is the static program counter (PC) label, $M$ is a surface language program, and $A$ is the security type that $M$ is typed at.

Technical definitions of the cast calculus {{{cc_old}}} in directory =CC/=

  • =CC.CCSyntax=: The syntax definition of {{{cc_old}}}. Again, the module uses the ABT library. There are a few terms that arise during runtime, including memory addresses, casts, PC casts (cast-pc), protection terms (prot), and runtime errors (error). The opaque term (●) is used in the erasure-based noninterference proof.
  • =CC.CCTyping=: The typing judgment is of form $Γ ; Σ ; g ; \ell \vdash M : A$. It contains 6 field: $Γ$ is the typing context; $Σ$ is the heap context; $g$ is the static PC, which can be viewed as the compile-time approximation of the runtime PC; $\ell$ is the dynamic (runtime) PC; $M$ is a {{{cc_old}}} term; $A$ is the type of $M$.
  • =CC.HeapTyping=: Lemmas about heap well-typedness for {{{cc_old}}}. These lemmas are used in the type safety proof.
  • =CC.Values=: The definition of values in {{{cc_old}}}. A value can be (1) a constant (2) an address (3) a $λ$ abstraction or (4) a value wrapped with an irreducible (Inert) cast. The opaque term is also a value in the semantics of erased {{{cc_old}}}. In this module, there are canonical-form lemmas for constants, functions, and memory addresses. For example, a value of function type must be either a $λ$ or a function proxy (a $λ$ wrapped with at least one inert function cast).
  • =CC.Reduction=: The operational semantics for {{{cc_old}}}. The relation $M \mid μ \mid \ell \longrightarrow N \mid μ’$ says that {{{cc_old}}} term $M$ reduces with heap $μ$ under PC label $\ell$ (by one step) to term $N$ and heap $μ’$.
    • The rule cast turns to the ApplyCast relation defined in the following module:
      • =CC.ApplyCast=: The cast-application relation has the form $\mathtt{ApplyCast}\;V , c \leadsto M$, where $V$ is a value and $c$ is the cast to apply; $M$ can be either a value, or a cast error (blame) if the cast application fails.
    • The rule fun-cast, assign?-cast, and assign-cast turn to the proxy-elimination helpers that are defined in the following module:
      • =CC.ProxyElimination=: The module defines two helper functions: elim-fun-proxy works on a function proxy and elim-ref-proxy works on a reference proxy. The helpers check the well-formedness of the outermost inert cast, generate proper casts that preserve types if well-formed and signal errors if ill-formed.
  • =CC.Interp=: A stepper that replies on the progress proof to generate a reduction sequence of $k$ steps for a well-typed {{{cc_old}}} term.
  • =CC.Compile=: Compilation from {{{surface_old}}} to {{{cc_old}}}. The module also contains a proof that this compilation function preserves types (compilation-preserves-type).

The noninterference proof of {{{cc_old}}} is erasure-based. It uses the following auxiliary definitions:

  • =CC.BigStep=: The big-step semantics for {{{cc_old}}}. It is a direct mechanical translation from the semantics in =CC.Reduction=.
  • =CC.Erasure=: Definition of the erasure functions for terms (erase) and heaps (erase-μ) in {{{cc_old}}}. Note that the memory cells of high security are completely erased, because the values read from those cells are always of high security and thus appear to be opaque from a low-observer’s point of view.
  • =CC.BigStepErased=: The big-step semantics for erased {{{cc_old}}}.

Technical definitions of the surface language {{{surface}}} in directory =Surface2/=

  • =Surface2.Syntax=: The syntax of {{{surface}}}. The most noteworthy difference from {{{surface_old}}} is that in {{{surface}}}, the PC annotation on a $λ$ is treated as a type annotation instead of a value annotation, which means that it is allowed to be $*$ in {{{surface}}} (but not in {{{surface_old}}}).
  • =Surface2.Typing=: The typing rules for {{{surface}}}.
  • =Surface2.Precision=: The precision rules for {{{surface}}}. The precision relation is used in the definition and the proof of the gradual guarantee.

The coercion calculus for security labels in directory =CoercionExpr/=

This directory contains the definition of and the lemmas about the coercion calculus on security labels.

  • =CoercionExpr.Coercions=: A single coercion on security labels can either be identity ($\mathbf{id}$), subtype ($↑$), injection from $\ell$ ($\ell!$), or projection to $\ell$ with blame label $p$ ($\ell?^p$).
  • =CoercionExpr.CoercionExpr=: The syntax, typing, reduction, and normal forms of coercion sequences (formed by sequencing single coercions). We only care about well-typed coercion sequences, so the coercion sequences are intrinsically typed. Coercion sequences strongly normalize (cexpr-sn) and the normalization is deterministic (det-mult), so coercion sequences can be used to model information-flow checks.
  • =CoercionExpr.SyntacComp=: Syntactical composition of coercion sequences. Sequencing and composing coercions model explicit information flow.
  • =CoercionExpr.Stamping=: The stamping operation for coercion sequences. Stamping label $\ell$ on a coercion sequence in normal form $\bar{c}$ results in a new coercion sequence in normal form whose security level is promoted by $\ell$. Stamping models implicit information flow.
  • =CoercionExpr.SecurityLevel=: The $|\bar{c}|$ operator retrieves the security level from the coercion sequence in normal form $\bar{c}$.
  • =CoercionExpr.Precision=: The precision relation between two coercion sequences takes the form $\vdash \bar{c} \sqsubseteq \bar{d}$. The gradual guarantee states that replacing type annotations with $*$ (decreasing type precision) should result in the same value for a correctly running program while adding annotations (increasing type precision) may trigger more runtime errors. The precision relation is a syntactical characterization of the runtime behaviors of programs of different type precision.

One key to the proof of the gradual guarantee is that “security is monotonic with respect to precision”, which states that if $\vdash \bar{c} \sqsubseteq \bar{d}$ then $|\bar{c}| \preccurlyeq |\bar{d}|$.

Security label expressions in directory =LabelExpr/=

Technical definitions of the cast calculus {{{cc}}} in directory =CC2/=

  • =CC2.Syntax=: As usual, the cast calculus {{{cc}}} is a statically-typed language that includes an explicit term for casts. Many of the operators in {{{cc}}} have two variants, a “static” one for when the pertinent security label is statically known (such as ref) and the “dynamic” one for when the security label is statically unknown (such as ref?). The operational semantics of the “dynamic” variants involve runtime checking.
  • =CC2.Typing=: The typing judgment is of form $Γ ; Σ ; g ; \ell \vdash M ⇐ A$. The six fields are of the same meanings as those of {{{cc_old}}}. The main difference is that the typing of {{{cc}}} stays in checking mode, so the type $A$ is considered an input of the judgment.
  • =CC2.HeapTyping=: Lemmas about heap well-typedness for {{{cc}}}. They are similar to those of {{{cc_old}}} because {{{cc}}} shares the same heap model as {{{cc_old}}}.
  • =CC2.Values=: The definition of values in {{{cc}}}. A raw value can be (1) a constant (2) an address or (3) a $λ$ abstraction. A value can be either a raw value, or a raw value wrapped with an irreducible cast. A cast is irreducible if its top-level security label coercion is in normal form and the cast is not identity.
  • =CC2.Reduction=: The operational semantics for {{{cc}}}. Similar to the one of {{{cc_old}}}, the relation $M \mid μ \mid \ell \longrightarrow N \mid μ’$ says that {{{cc}}} term $M$ reduces with heap $μ$ under PC label $\ell$ to term $N$ and heap $μ’$.
    • The reduction of PC depends on the operational semantics of label expressions.
    • =CC2.CastReduction=: The rules for cast are grouped in the relation $V \langle c \rangle \longrightarrow M$. The relation states that applying the coercion on values $c$ on value $V$ results in $M$. We use this relation in rule cast of {{{cc}}}.
    • =CC2.Stamping=: The stamp-val function defines the stamping operation on {{{cc}}} values. The lemma stamp-val-value states that stamping a value results in another value and the lemma stamp-val-wt states that stamping preserves types, so the value after stamping is well-typed.
  • =CC2.Precison=: The precision relation of {{{cc}}}.
  • =CC2.HeapContextPrecision=: The precision between two heap typing contexts has the form $Σ \sqsubseteq_m Σ’$ and is defined point-wise.
  • =CC2.HeapPrecision=: The precision relation between two heaps takes the form $Σ ; Σ’ \vdash μ \sqsubseteq μ’$. It is defined point-wise, similar to the definition of heap well-typedness.
  • =Compile.Compile=: The compilation function from {{{surface}}} to {{{cc}}}.

The dynamic extreme {{{dynifc}}} of {{{surface_old}}} and {{{surface}}} in directory =Dyn/=

This directory contains the formalization of the dynamic extreme for {{{surface_old}}} and {{{surface}}}. We call this dynamic IFC language with labeled heap (that is, each memory cell is associated with a security label) {{{dynifc}}}. {{{dynifc}}} is used in Section 11 of the Appendix of Chen and Siek [2024], where noninterference of {{{surface}}} is proved by simulation between {{{cc}}} and {{{dynifc}}}.