----------------------------------------------------------------------
	     A Formal and Executable Model of the x86 ISA
	  and the associated x86 Machine-Code Analysis Framework

	 Shilpi Goel, Warren A. Hunt, Jr., and Matt Kaufmann

----------------------------------------------------------------------

These books contain a specification of the x86 instruction set
architecture (ISA); we characterize x86 machine instructions and model
the instruction fetch, decode, and execute process using the ACL2
theorem-proving system.  We use our x86 ISA specification to simulate
and formally verify x86 machine-code programs.

For instructions on certifying these books, you can go directly to
Section III of this file (Building the Books).  The documentation of
these books is available at
http://www.cs.utexas.edu/users/moore/acl2/manuals/current/manual/
under the topic "Software-verification".

Users are encouraged to go through this file carefully.  Contributing
authors are *strongly* encouraged to keep it up to date.

Questions/Comments? Contact Shilpi Goel (shigoel@cs.utexas.edu).

----------------------------------------------------------------------
Outline:
----------------------------------------------------------------------

* Section   I: Introduction

* Section  II: Directory Organization

* Section III: Building the Books (Book Certification)

----------------------------------------------------------------------
Section I: Introduction
----------------------------------------------------------------------

Our x86 ISA model continues to evolve, but we summarize its current
principal features below.

  1.  Out of the 64-bit and Compatibility sub-modes available in the
      IA-32e mode of Intel(R) 64 architecture, we specify only the
      former, i.e., the 64-bit mode.

  2.  Our model supports both efficient reasoning and simulation of
      x86 machine-code programs.  The ability to run real, unmodified
      x86 programs on our model enables its validation against
      physical machines, thereby providing test-based assurance that
      our model faithfully captures the semantics of an actual x86
      processor.

  3.  Consistent with Intel's memory model, our memory model is
      byte-addressable.  However, it offers sequential consistency as
      opposed to a relaxed memory model like TSO.

  4.  All addressing modes of the 64-bit mode are supported.

  5.  Our model can operate in the following modes:

	a. System-level Mode:

	   As of this writing, the system-level mode includes support
	   for IA-32e paging and segmentation.  In this mode, our
	   memory model characterizes a 2^52-byte physical address
	   space, which is the largest physical address space provided
	   by modern AMD/Intel x86-compatible implementations.  Our
	   specification includes the translation of linear addresses
	   to physical addresses.  This mode is intended for the
	   simulation and verification of x86 system software.  This
	   mode provides the same environment to a system program as
	   is provided by an x86 machine.

	b. Programmer-level Mode:

	   Some supervisor-level features are neither used nor
	   required for the analysis of application software.  In most
	   cases, a programmer cares about the correctness of their
	   application program while assuming that services like
	   paging and I/O operations are provided reliably by the
	   operating system.  The programmer-level mode of our model
	   attempts to provide the same environment for reasoning as
	   is provided by an OS for programming.  In this mode, our
	   memory model provides the 2^48-byte linear address space
	   specified for modern 64-bit Intel machines.

	   The programmer-level mode includes a specification of some
	   system calls; these specification functions give one
	   possible characterization of the service a programmer could
	   expect from an OS.  Note that these specification functions
	   abstract away many details of the OS-level routines that
	   implement system calls; these routines could themselves be
	   analyzed in the system-level mode of our model.

	   From the point of view of a programmer, system calls are
	   non-deterministic.  The programmer-level mode provides a
	   simple and flexible framework to reason about
	   non-deterministic computations.  User-level programs with
	   system calls are simulated by requesting the underlying OS
	   to provide system call service.  The execution framework
	   for system calls is validated against the framework for
	   reasoning about them, as well as against the real machine,
	   i.e., the x86 processor and OS.

	   We support system calls provided by both Linux and Darwin
	   systems.  The system calls implemented for both simulation
	   and reasoning are: read, write, open, close, and lseek.
	   System calls like dup, dup2, dup3, fadvise64, link, and
	   unlink are currently supported only for simulation.

  6.  The RDRAND instruction, first introduced in Intel's Ivy Bridge
      x86 processor, is supported by our model.  The RDRAND
      instruction is sometimes used by programs that implement
      cryptographic functions, such as key generation.
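
As a quick check of the address-space sizes quoted in item 5 above,
the following sketch computes 2^52 and 2^48 bytes and restates them in
binary units.  The helper name pow2_bytes is hypothetical, and the
sketch assumes a shell whose arithmetic is at least 64 bits wide.

```shell
#!/bin/sh
# Sizes of the address spaces quoted in item 5.
# Assumption: shell arithmetic is at least 64 bits wide, as it is on
# the usual x86-64 platforms.

# pow2_bytes N: print 2^N as a decimal number of bytes.
# (Hypothetical helper, not part of the X86ISA books.)
pow2_bytes() {
    echo $((1 << $1))
}

# System-level mode: 2^52 bytes of physical memory; 2^52 / 2^50 = 4 PiB.
echo "physical: $(pow2_bytes 52) bytes = $(( (1 << 52) >> 50 )) PiB"

# Programmer-level mode: 2^48 bytes of linear memory; 2^48 / 2^40 = 256 TiB.
echo "linear:   $(pow2_bytes 48) bytes = $(( (1 << 48) >> 40 )) TiB"
```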

Below are some features not supported by our model.  Of course, we
would like to take items in the list below (which is not exhaustive to
begin with) and move them to the list above.

  1.  I/O instructions:  IN, OUT, ...

  2.  Co-processor (SSE and some floating-point) Instructions

  3.  Interrupts

  4.  Exceptions:
      At present, when our model detects situations that lead to an
      exception, our model's status (ms) register is set to indicate
      that a behavior not yet captured by our model has occurred.  Our
      model is then in an error/halted state.

  5.  Multiple Threads/Processors

  6.  Safer Mode Extensions (SMX), TPM, and associated features

  7.  Task Management

  8.  Performance Counters

----------------------------------------------------------------------
Section II: Directory Organization
----------------------------------------------------------------------

Makefile   - For book certification with full execution support; see
	     Section III below

portcullis - X86ISA package definition and constants we plan to read
	     in using the sharp-dot reader (#.)

utils      - Definitions of constants, recognizers, accessors, and
	     updaters related to various x86 registers

machine    - Core x86 books, containing definition of the x86 state,
	     instruction semantic functions, fetch-decode-execute
	     specification, etc.

proofs     - Utilities for x86 machine-code verification and code proofs

tools      - Tools for program simulation and dynamic instrumentation

top        - Top-level book that includes all the X86ISA books and
	     their documentation

----------------------------------------------------------------------
Section III: Building the Books (Book Certification)
----------------------------------------------------------------------

The authors of these books generally prefer to use ACL2(p) for
certification, but of course, you are free to use ACL2.

Two ways of building the X86ISA books are:

1. Using the Makefile provided with the X86ISA books:

   Users of these books who wish to simulate x86 programs with
   non-deterministic computations like SYSCALL (in the
   programmer-level mode) and RDRAND should use this Makefile and run
   make with X86ISA_EXEC set to t (which is the default value).

   make JOBS=8 \
	X86ISA_EXEC=t \
	ACL2=/Users/shilpi/acl2/saved_acl2

   where the number of jobs to be run in parallel in this example is
   8. Note that we use JOBS here instead of the -j flag.

   When X86ISA_EXEC is t, some dynamic C libraries that are used in
   the model for supporting the execution of SYSCALL in the
   programmer-level mode and RDRAND will be built. Since we rely on
   the foreign function interface of Clozure CL (CCL), full execution
   support is available only if you use CCL.

   Values of X86ISA_EXEC other than t will not allow the execution of
   SYSCALL and RDRAND instructions (the same may hold when a Lisp
   other than CCL is used).  Note that reasoning about these
   instructions will still be possible.  Execution of, and reasoning
   about, all other instructions will always be possible, irrespective
   of X86ISA_EXEC or the underlying Lisp.

   IMPORTANT: You should do a "make clean" (or "make execclean" if you
   are in a hurry) if you wish to certify the books with a different
   value of X86ISA_EXEC.

2. Using the "everything" target of the ACL2 Community Books (see
   acl2/books/GNUmakefile):

   This is essentially the same as executing "cert.pl
   books/projects/x86isa/top". This will build the x86 books without
   full execution support, i.e., the effect will be the same as
   building these books with X86ISA_EXEC=nil even though the Makefile
   provided with the X86ISA books will not be used.


Suppose you certified these books previously but have since forgotten
whether you built them with X86ISA_EXEC=t.  You can check the
certified books for full execution support as follows.  Look at the
file machine/x86-syscalls.cert.out.  If this file contains the
following:

X86ISA_EXEC Warning: x86-environment-and-syscalls-raw.lsp is not
included.

then you do not have SYSCALL execution support; otherwise, you do.

Similarly, if machine/x86-other-non-det.cert.out contains the
following string, then you do not have RDRAND execution support,
either because the books were certified with X86ISA_EXEC not equal to
t or because your processor does not support the RDRAND instruction:

X86ISA_EXEC Warning: x86-other-non-det-raw.lsp is not included.
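
The two checks above can be scripted.  The following is a minimal
sketch, not part of the X86ISA books: the function names
report_exec_support and check_x86isa_exec are hypothetical, and the
sketch assumes that each warning appears in its .cert.out file with
the "X86ISA_EXEC Warning:" prefix and the raw Lisp file name on a
single line.

```shell
#!/bin/sh
# Report whether previously certified X86ISA books have SYSCALL and
# RDRAND execution support, by searching the .cert.out files for the
# warnings quoted above.

# report_exec_support LABEL CERT_OUT RAW_LSP
# Prints one status line for the feature named LABEL.
# (Hypothetical helper, not part of the X86ISA books.)
report_exec_support() {
    if [ ! -f "$2" ]; then
        echo "$1: unknown ($2 not found)"
    elif grep -qF "X86ISA_EXEC Warning: $3" "$2"; then
        echo "$1: no execution support"
    else
        echo "$1: execution support available"
    fi
}

# check_x86isa_exec [DIR]
# DIR is the top-level x86isa directory; defaults to the current one.
check_x86isa_exec() {
    dir=${1:-.}
    report_exec_support SYSCALL \
        "$dir/machine/x86-syscalls.cert.out" \
        "x86-environment-and-syscalls-raw.lsp"
    report_exec_support RDRAND \
        "$dir/machine/x86-other-non-det.cert.out" \
        "x86-other-non-det-raw.lsp"
}

check_x86isa_exec "$@"
```

Run the script from the top-level x86isa directory, or pass that
directory as its only argument.  An "unknown" result means the
corresponding .cert.out file was not found, which usually indicates
that the books have not been certified in that directory.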

----------------------------------------------------------------------