This document provides basic information on Soter and its web interface.
Please note:
The Soter tool is not currently actively developed. Direct any enquiry to Emanuele D'Osualdo.
If you need instructions on how to use Soter, jump to the dedicated section below. For an introduction on how Soter works read on.
If you would like more information about a particular topic, follow the links at the top of the page:
Erlang documents the fragment of Erlang handled by Soter.
Specs illustrates all the methods Soter offers for specifying properties of Erlang modules.
ACS explains the abstract model derived by Soter from the input module.
Papers is an up-to-date list of papers related to Soter. The theory behind Soter can be found in this paper.
Soter (Safety verifier fOr The ERlang language) is a fully-automatic program analyser and verifier for Erlang modules. Given an Erlang module and a safety property, Soter first extracts an abstract (approximate, sound) model and then checks if a corresponding property is satisfied using the model checker BFC.
The name Soter derives from the Ancient Greek name of the spirit of safety, preservation and deliverance from harm.
Soter is designed to accept Erlang modules (with some limitations) as input.
Originally designed to program fault-tolerant distributed systems at Ericsson in the late 80s, Erlang is now a widely used, open-sourced language with support for higher-order functions, concurrency, communication, distribution, on-the-fly code reloading, and multiple platforms (see Armstrong, Virding, and Williams 1993; Armstrong 2010). Largely because of a runtime system that offers highly efficient process creation and message-passing communication, Erlang is a natural fit for programming multicore CPUs, networked servers, distributed databases, GUIs, and monitoring, control and testing tools.
The sequential part of Erlang is a higher-order, dynamically typed, call-by-value functional language with pattern-matching algebraic data types. Following the actor model (Hewitt, Bishop, and Steiger 1973), a concurrent Erlang computation consists of a dynamic network of processes that communicate by message passing. Every process has a unique process identifier (pid), and is equipped with an unbounded mailbox. Messages are sent asynchronously in the sense that send is non-blocking. Messages are retrieved from the mailbox, not FIFO, but First-In-First-Firable-Out (FIFFO) via pattern-matching. A process may block while waiting for a message that matches a certain pattern to arrive in its mailbox. For a quick and highly readable introduction to Erlang, see Armstrong's CACM article.
See erlang.org for the complete documentation and to download the Erlang system.
BFC is a fast coverability solver for Petri nets with transfer arcs, developed by Alexander Kaiser, University of Oxford.
In phase 3, Soter generates a Petri net in BFC's format which represents the abstract model of the system (see ACS). For each property Soter needs to prove, BFC is internally called, with the abstract model and a query representing the property as input.
For more details on how BFC works, how to use it and download links, see http://www.cprover.org/bfc.
Concurrent programs are hard to write. They are just as hard to verify. In the case of Erlang programs, the inherent complexity of the verification task can be seen from several diverse sources of infinity in the state space.
(\(\infty\) 1) Function definitions are not necessarily tail-recursive, so a call-stack is needed.
(\(\infty\) 2) Higher-order functions are first-class values; closures can be passed as parameters or returned.
(\(\infty\) 3) Data domains, and hence the message space, are unbounded: functions may return, and variables may be bound to, terms of an arbitrary size.
(\(\infty\) 4) An unbounded number of processes can be spawned dynamically.
(\(\infty\) 5) Mailboxes have unbounded capacity.
The challenge of verifying Erlang programs is that one must reason about the asynchronous communication of an unbounded set of messages, across an unbounded set of Turing-powerful processes. Our goal is to verify safety properties of Erlang-like programs automatically, using a combination of static analysis and infinite-state model checking.
The starting point of our verification pathway is the abstraction of the sources of infinity (\(\infty\) 1), (\(\infty\) 2) and (\(\infty\) 3). In the style of Van Horn and Might (Van Horn and Might 2010), we devise a machine-based operational semantics of Erlang that uses store-allocated continuations. The advantage of such an indirection is that it enables the construction of a machine semantics which is "generated" from the basic domains of Time, Mailbox and Data. In the paper, we show that there is a simple notion of sound abstraction of the basic domains whereby every such abstraction gives rise to a sound abstract semantics of Erlang programs. Further if a given sound abstraction of the basic domains is finite and the associated auxiliary operations are computable, then the derived abstract semantics is finite and computable.
We study the abstract semantics derived from a particular 0-CFA-like abstraction of the basic domains. However we do not use it to verify properties of Erlang programs directly, as it is too coarse an abstraction to be useful. Rather, we show that a sound abstract model, in the form of Actor Communicating System (ACS), can be constructed in polynomial time by bootstrapping from the 0-CFA-like abstract semantics. Further, the dimension of the resulting ACS is polynomial in the length of the input Erlang program. The idea is that the the 0-CFA-like abstract (transition) semantics constitutes a sound but rough analysis of the control-flow of the program, which takes higher-order computation into account but communicating behaviour only minimally. The bootstrap construction consists in constraining these rough transitions with guards of the form "receive a message of this type" or "send a message of this type" or "spawn a process", thus resulting in a more accurate abstract model of the input Erlang program in the form of an ACS.
Soter generates an abstract model that simulates the input Erlang program.
The abstract model is an Actor Communicating System (ACS) which models the interaction of an unbounded set of communicating actor processes. An ACS has a finite set of control states, a finite set of pid classes, a finite set of messages, and a finite set of transition rules. An ACS transition rule has the shape
\[\pid : q \xrightarrow{\ell} q'\]
which means that a process of pid class \(\pid\) can transition from state \(q\) to state \(q'\) with (possible) communication side effect \(\ell\), of which there are four kinds, namely,
the process makes an internal transition
it extracts and reads a message \(msg\) from its mailbox
it sends a message \(msg\) to a process of pid class \(\pid'\)
it spawns a process of pid class \(\pid'\).
The various kinds of control states and transitions are represented pictorially as follows.
ACS models are infinite state: the mailbox of a process has unbounded capacity, and the number of processes in an ACS may grow arbitrarily large. However the set of pid classes is fixed, and processes of the same pid classes are not distinguishable.
There is a natural interpretation of ACS as vector addition systems (VAS), or equivalently Petri nets, using counter abstraction, as explained in the paper.
Soter produces both a graphical representation for the generated ACS, and an equivalent BFC model.
Soter normally goes through three phases:
In phase 3, Soter will attempt to verify each of the specified properties in turn. The outcome of such attempts can be one of the following.
Soter can be used in one of three ways:
"Analysis Only", to generate the abstract model (see ACS) which can then be viewed, or manually fed into BFC for verification.
"Verify Absence-of-Errors", to generate the abstract model and attempt to prove the absence of certain runtime errors; for details on which kinds of errors Soter can detect see [error-free].
"Verify Assertions", to generate the abstract model and attempt to prove each user-defined property specified via the methods outlined in the section Assertions.
The "Verify All" mode enables both 2 and 3.
We present here a short introduction to property specifications in Soter. For a detailed explanation refer to Specs.
A flexible and simple method for specifying properties of Soter consists in "labelling" and "constraining":
First, you determine which program locations or mailboxes your
property involves. For example, mutual exclusion could involve a program
location in the critical section of a function definition. Then you
proceed by annotating your code with the ?label("LABEL")
(for program locations) and ?label_mail("LABEL")
(for
mailboxes) functions.
Second, you add -uncoverable(PROP)
directives before
your definitions in the module; PROP
has to be a string and
it contains a sequence of constraints, separated by commas. Each
constraint is in the form "LABEL" >= N
where
"LABEL"
is one of the labels defined in step 1, and
N
is a positive natural number. Soter will then try to
prove that no state that satisfies the constraints is reached.
When specifying properties, keep in mind that the constraints describe what are the `bad states', not the good ones.
Note: Soter will ignore these annotations when in "Analysis Only" and "Verify Absence-of-Runtime-Errors" modes.
When in "Verify Absence-of-Errors" mode, Soter will try to prove that no process will ever be in a state of error, i.e. an uncaught runtime exception. There is a limited set of such errors which are currently modelled by Soter:
"I am a string" ! {msg, hi}
). Note that the
register
primitives are not supported so sending to an atom
is always seen as "illegal" by Soter;erlang:error
.One major omission is pattern-matching errors. Soter is currently ignoring them. The next release of Soter will feature a more precise abstract pattern-matching mechanism that will allow for pattern-matching errors to be detected avoiding too many false-alarms.
Please note that the exception handling features of Erlang, such as
catch
, are not currently supported by Soter. We aim to add
support for them in the next release.
Soter does have limited support for data abstraction. A term is abstracted by truncation (i.e. ignoring all its subterms) at level L. There are two independent abstractions: one for values and one for messages, controlled respectively by the two truncation levels D and M.
The default parameter for value abstraction is D=0. If the property at hand is dependent on the values of a certain parameter, increasing D improves the chances of proving property but usually requires a lot more time.
If not otherwise specified, M is set to the maximum depth of
the receive patterns plus D. You may need to manually adjust
this parameter if your property needs to distinguish more
kinds
of messages. Increasing M is not expected to
affect the performance too much.
Currently no other abstraction parameter (such as context-sensitivity k) can be adjusted.
Typically one would like to verify functions. However Soter requires
the module to have a main
function which has no arguments,
acting as entry-point. This limitation can be overcome by using the
non-deterministic choice operator soter:choice
. Its runtime
behaviour, provided by the soter.erl
module, is a random
choice between its two arguments. Soter interprets it as a
non-deterministic branch in the semantics.
We discourage the direct use of soter:choice
and favour
instead the use of the macros defined in grammars.hrl
,
described in detail in the Erlang input
chapter.
As a useful example, you can use the macro ?any_nat()
to
describe a value that ranges over all the natural numbers (to be used
with the peano
switch, see Erlang
input).
Soter is a multi-platform command-line tool written in Haskell. We currently do not provide a direct download of the tool but the binaries can be obtained from the authors.
The manual page of the command-line tool gives a detailed description of all its options.
For further details, questions or bug reports contact