EIP4573 - Entry Points and Procedures for EVM Code Sections

# Abstract

EOF sections and EVM instructions are introduced to provide procedural entry points to code sections.

# Motivation

Currently the Ethereum Object Format supports a single code section, with no further means of structuring the code.

This proposal adds a layer of structure by supporting additional code sections. How to access these sections? This proposal specifies a table of entry points for each additional section. Each entry point corresponds to a procedure within a code section.

We define procedures as blocks of code that can be entered only at their entry point, and at other points can call other procedures and subroutines and return to the procedure that called them.

# Specification

# EOF Sections

Code sections after the first must be immediately preceded by an entry point section. These sections can only be entered at one of the specified entry points in the code section, given as an offset relative to the beginning of the code section.

# Entry Point Section

description		length
section_kind		1-byte	Encoded as a 8-bit unsigned number.
section_size		2-bytes	Encoded as a 16-bit unsigned big-endian number.
[]entry_point
	entry_point	2-bytes	Encoded as a 16-bit unsigned big-endian number.
	...	...	...

# Call Frame Stack

These instructions make use of a frame_stack to hold arguments, locals, and return values for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses.

Memory can be addressed relative to the frame pointer FP or by absolute address. FP starts at 0, and moves to point to the start of the input arguments for each CALLPROC, and to the start of the returned output values for each RETURNPROC.

So on entry to a procedure the call frame stack looks like this (FP <= 0):

     0-> . . .
         . . .
         previous frame
         . . .
    FP-> inputs
         . . .
         locals
         . . .

1
2
3
4
5
6
7
8

And on exit from the procedure the call frame stack looks like this:

     0-> . . .
         . . .
    FP-> previous frame
         . . .
         outputs
         . . .

1
2
3
4
5
6

Note: In order to save space the inputs and outputs overlap. When the inputs are used up in the course of producing the outputs this should not be a problem.

# Instructions

# ENTERPROC (0x??) n_inputs: uint2, n_locals: uint2, n_outputs: uint2

Marks the entry point to a procedure taking n_inputs arguments, reserving n_locals variables, and returning n_outputs values. Procedures can only be entered via a CALLPROC to their entry point.

Execution of ENTERPROC is invalid.

# LEAVEPROC (0x??)

Marks the end of a procedure. Each ENTERPROC requires a closing LEAVEPROC.

Execution of LEAVEPROC from within the procedure it ends is equivalent to RETURNPROC. Otherwise it is invalid.

# CALLPROC (0x??) code_id: byte, entry_point: byte

  frame_stack.push(FP)
  FP += n_locals + (n_inputs of callee)
  return_stack.push(PC + 1)
  PC = (address of code_id section) + (entry_point offset)

1
2
3
4

Prepare to return to the current procedure, then call the new procedure. The start of the input addressed by FP.

The required n_inputs words are known (at validation time) to be available on the frame stack.

Note: code_id=0 indexes the current section, other ids index each section in order, from 1. The entry_point indexes procedures in order of their ENTERPROC, from 1. entry_point=0 is equivalent entering the first procedure with no arguments on the data stack or the frame stack.

# RETURNPROC (0x??)

   FP = frame_stack.pop()
   PC = return_stack.pop()

1
2

Return to the calling produre, with PC set to one past its previous value, and FP restored to its previous value. The start of the output addressed by FP.

The promised n_outputs words are known (at validation time) to be available on the frame stack.

# MSTOREFP (0x??) offset: int2

*(FP + offset) = *SP--

Pop a word from the top of the data stack to memory at FP + offset and decrease stack size by one word.

# MLOADFP (0x??) offset: int2, n_words: uint2

*SP++ = *(FP + offset)

Push a word to the top of the data stack from memory at FP + offset and increase stack size by one word.

# Memory Costs

Presently, MSTORE is defined as

$μ′{m}[μ{s}[0] . . . (μ_{s}[0] + 31)] ≡ μ_{s}[1]$ $μ′{i} ≡ max(μ{i},⌈(μ_{s}[0]+32)÷32⌉)$

We propose to treat memory addresses as signed, so the formula for MSTOREFP needs to use an absolute value

$μ′{i} ≡ max(abs(μ{i}),⌈(μ_{s}[0]+32)÷32⌉)$

TBD: Details!

# Rationale

This proposal uses the EIP-2315 return stack to manage calls and returns, and steals and simplifies ideas from EIP-615, EIP-3336 and EIP-3337. ENTERPROC is modeled on BEGINSUB, and per EIP-3336 and EIP-3337 it moves call frames from the data stack to a new in-memory stack. The FP register automatically tracks call-frame addresses as procedures are entered and left, and cannot be otherwise modified TBD: The aliasing of the frame stack with ordinary memory supports languages that provide pointers to variables on the stack

# Backwards Compatibility

This proposal adds new EOF sections and EVM opcodes. It doesn't remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.

# Security

Safe use of these constructs will be checked completely at validation time, in linear time and space, so there should be no security issues.