EIP3779 - Safer Control Flow for the EVM

# Abstract

This EIP specifies validation rules for some important safety properties, including

valid instructions,
valid jump destinations,
no stack underflows, and
no stack overflows without recursion.

Valid contracts will not halt with an exception unless they either run out of gas or overflow stack during a recursive subroutine call.

Code must be validated at contract creation time – not runtime – by the provided algorithm or its equivalent. Therefore, tables of valid dynamic jumps are specified in an EOF container section. This allows for a one-pass algorithm that is (and must be) linear in the size of the code plus the section, so as not to be a DoS vulnerability.

# Motivation

Validating safe control flow at creation time has a few important advantages.

Jump destination analysis does not need to be performed at runtime, thus improving performance and preventing denial of service attacks.
Jump destination validity does not always need to be checked for at runtime, improving performance.
Stack underflow does not ever need to be checked for at runtime, improving performance.

Dynamic jumps, where the destination of a JUMP or JUMPI is not known until runtime, can be an obstacle to statically proving this sort of safety in linear time, but have also been seen as necessary to implement the return jump from a subroutine. But consider this example of calling a minimal subroutine

ADD:
    RTN_ADD 
    0x02
    0x03
    ADDITION
    jump
RTN_ADD:
    jumpdest
    swap1
    jump

ADDITION:
    jumpdest
    add
    swap1
    jump

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

Note that the return address and the destination address are pushed on the stack as constants -- the JUMP instructions are in fact static, not dynamic, in the sense that they jump to the same PC on every run. So we do not need (nor typically use) dynamic jumps to implement subroutines.

Since many of the jumps we need in practice are static we can validate their safety with a static analysis of the code. And since can, we should.

Even where jumps are dynamic it is possible to tabulate valid destinations in advance, and the Ethereum Object Format gives us a place to store such tables. Providing for the safe use of dynamic jumps makes for concise and efficient implementations of language constructs like switches and virtual functions.

So again, since we can validate the safety of tabulated dynamic jumps with a static analysis of the code, we should.

# Specification

# EOF container changes

A new EOF section called vjumptable (section_kind = 4) is introduced. It contains a sequence of n tuples _(jumpsrc, jumpdest_i__, sorted in ascending lexicographic order. Each tuple represents a valid jump from one location in the code to another.
A new EOF section called vtraptable (section_kind = 5) is introduced. It contains a sequence of n tuples (jumpsrc, jumpdst_i, sorted in ascending lexicographic order. Each tuple represents a valid jump from one location in the code to another. There is one and only one trap table entry for each jumpsrc.

At runtime, a dynamic jump cause a search for a matching jumpsrc and jumpdst in the vjumptable. if found, the jump proceeds to the jumpdest. If not, the jump proceeds to the jumpdest for the matching jumpsrc in the vtraptable. In this way dynamic jumps always succeed, but it is left to the programmer to implement a default case for the trap table.

# Validity

We define safety here as avoiding exceptional halting states:

Valid contracts will not halt with an exception unless they
- run out of gas, or
- overflow stack while making a recursive subroutine call.

Attempts to create contracts that cannot be proven to be valid will fail.

# Exceptional Halting States

Execution is as defined in the Yellow Paper (opens new window) a sequence of changes to the EVM state. The conditions on valid code are preserved by state changes. At runtime, if execution of an instruction would violate a condition the execution is in an exceptional halting state. The Yellow Paper defines five such states.

Insufficient gas
More than 1024 stack items
Insufficient stack items
Invalid jump destination
Invalid instruction

We would like to consider EVM code valid iff no execution of the program can lead to an exceptional halting state, but we must be able to validate code in linear time to avoid denial of service attacks. So in practice, we can only partially meet these requirements. Our validation rules do not consider the code's data and computations, only its control flow and stack use. This means we will reject programs with any invalid code paths, even if those paths are not reachable at runtime.

# Validation Rules

This section extends the contact creation validation rules (as defined in EIP-3540.)

Every JUMP and JUMPI either
- matches at least one tuple in the vjumptable or the vtraptable, or
- is static
The stack depth is
- always positive and
- the same on every path through an opcode.
The number of items on the data stack and on the return stack is at most 1024.

A JUMP or JUMPI instruction matches a tuple in a table if the first, jumpsrc element equals that instruction's offset in the code.

We consider a JUMP or JUMPI instruction to be static if its jumpsrc argument was first placed on the stack as a valid JUMPDEST (e.g. via a PUSH…), and its value has not changed since, other than by a DUP… or SWAP….

The Yellow Paper has the stack pointer (SP) pointing just past the top item on the data stack. We define the stack base (BP) as the element that the SP addressed at the entry to the current basic block, or 0 on program entry, and the stack depth as the number of stack elements between the current SP and the current BP.

Taken together, these rules allow for code to be validated by traversing the control-flow graph, following each edge only once.

# Rationale

The alternative to checking validity at creation time is checking it at runtime. This hurts performance and is a denial of service vulnerability. Thus the above rules and accompanying one-pass validation algorithm.

Rule 1 – requiring static or previously tabulated destinations for JUMP and JUMPI – simplifies static jumps and constrains dynamic jumps.

Jump destinations are currently checked at runtime, but static jumps can be validated at creation time.
Requiring the possible destinations of dynamic jumps to be tabulated in advance allows for tractable bytecode traversal for creation-time validation: a jump table takes up space proportional to the number of jump destinations, so attempting to attack the validation algorithm with large numbers of jump destinations will also reduce the available space for code to be validated.

Rule 2 – requiring positive, consistent stack depth – ensures sufficient stack. Exceptions can be caused by some irreducible paths like jumping into loops and subroutines, and by calling subroutines with insufficient numbers of arguments.

Rule 3 – bounding the stack pointer – catches all stack overflows that occur without recursion.

# Validation Algorithm

This section specifies an algorithm for checking the above the rules. Equivalent code must be run at creation time. We assume that the validation defined in EIP-3540 and EIP-3670 has already run to check Rule 0, although in practice the algorithms can be merged.

The following is a pseudo-Go implementation of an algorithm for enforcing program validity. This algorithm is a symbolic execution of the program that recursively traverses the bytecode, following its control flow and stack use and checking for violations of the rules above. It uses a stack to track the slots that hold PUSHed constants, from which it pops the destinations to validate during the analysis.

This algorithm runs in time equal to O(vertices + edges) in the program's control-flow graph, where edges represent control flow and the vertices represent basic blocks – thus the algorithm takes time proportional to the size of the bytecode.

For simplicity's sake we assume a few helper functions.

advance_pc() advances the PC, skipping any immediate data.
imm_data() returns immediate data for an instruction.
valid_jumpdest() tests whether
- dynamic jumps match a valid (jumpsrc, jumpdest) in the vjumptable or vtraptable
- all jump destinations are JUMPDEST bytes and not in immediate data.
remove_items() returns the number of items removed from the stack by an instruction
add_items() returns the number of items added to thestack. Items are added as 0xFFFFFFFF. ThePC,PUSH…,SWAP…,DUP…,JUMP, andJUMPI` instructions are handled separately.

var code  [code_len]byte
var depth [code_len]unsigned
var stack [1024]int = { -1 }  // -1 marks non-constant 
var sp := 1023                // stack grows down
var bp := 1023

func validate(pc := 0, depth := 0) boolean {

   for ; pc < code_len; pc = advance_pc(pc) {

      // successful termination
      switch instruction {
      case STOP    return true
      case RETURN  return true
      case SUICIDE return true
      }

      // check for stack underflow and overflow
      depth := bp - sp
      if depth < 0 || sp < 0 || 1024 < sp {
         return false
      }

      // if stack depth for `pc` is non-zero we have been here before 
      // so return to break cycle in control flow graph
      if depth[pc] != 0 {
          if depth[pc] != depth {
             return false
          }
          return true
      }
      depth[pc] = depth

      // track constants on the stack
      if (instruction == PC) {
         stack[sp++] == pc
         continue
      }
      if (PUSH1 <= instruction && instruction <= PUSH16) {
         stack[sp++] = imm_data(pc)
         continue
      }
      if (DUP1 <= instruction && instruction <= DUP16) {
         n := instruction - DUP1 + 1
         stack[sp + 1] = stack[n + 1]
         continue
      }
      if (SWAP1 <= instruction && instruction <= SWAP16) {
         n := instruction - SWAP1 + 1
         swap := stack[n]
         stack[n] = stack[sp + 1]
         stack[sp + 1] = swap
         continue
      }

      if (instruction == JUMP) {

         // check for valid destination
         jumpdest = const_stack[sp++]
         if !valid_jumpdest(jumpdest) {
            return false
         }

         // will enter basic block at destination
         bp = sp

         // reset pc to destination of jump 
         pc = jumpdest
         continue
      }
      if (instruction == JUMPI {

         // check for valid destination
         jumpdest = stack[sp++]
         if !valid_jumpdest(dest) {
            return false
         }

         // recurse to jump to code to validate
         if !validate(jumpdest) {
            return false
         }
         // false side of conditional

         // enter basic block 
         bp = sp
         continue
      }
      if (instruction == JUMPDEST) {

         // enter basic block 
         bp = sp
         continue
      }

      // apply other instructions to stack
      sp += remove_items(pc)
      sp -= add_items(pc)
   }

   // successful termination
   return true
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103

# Backwards Compatibility

These changes affect the semantics of EVM code – the use of JUMP, JUMPI, and the stack are restricted, such that some code that would otherwise run correctly will nonetheless be invalid EVM code.

These changes and associated security guarantees are compatible with the valid use of the control-flow operations introduced by EIP-2315: Simple Subroutines for the EVM.

# Security Considerations

This EIP is intended to ensure a minimal level of safety for EVM code deployed on the blockchain.