EIP4573 - Entry Points and Procedures for EVM Code Sections
# Abstract
EOF sections and EVM instructions are introduced to provide procedural entry points to code sections.
# Motivation
Currently the Ethereum Object Format supports a single code section, with no further means of structuring the code.
This proposal adds a layer of structure by supporting additional code sections. How to access these sections? This proposal specifies a table of entry points for each additional section. Each entry point corresponds to a procedure within a code section.
We define procedures as blocks of code that can be entered only at their entry point, and at other points can call other procedures and subroutines and return to the procedure that called them.
# Specification
# EOF Sections
Code sections after the first must be immediately preceded by an entry point section. These sections can only be entered at one of the specified entry points in the code section, given as an offset relative to the beginning of the code section.
# Entry Point Section
description | length | ||
---|---|---|---|
section_kind | 1-byte | Encoded as a 8-bit unsigned number. | |
section_size | 2-bytes | Encoded as a 16-bit unsigned big-endian number. | |
[]entry_point | |||
entry_point | 2-bytes | Encoded as a 16-bit unsigned big-endian number. | |
... | ... | ... |
# Call Frame Stack
These instructions make use of a frame_stack
to hold arguments, locals, and return values for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses.
Memory can be addressed relative to the frame pointer FP
or by absolute address. FP
starts at 0, and moves to point to the start of the input arguments for each CALLPROC
, and to the start of the returned output values for each RETURNPROC
.
So on entry to a procedure the call frame stack looks like this (FP <= 0)
:
0-> . . .
. . .
previous frame
. . .
FP-> inputs
. . .
locals
. . .
2
3
4
5
6
7
8
And on exit from the procedure the call frame stack looks like this:
0-> . . .
. . .
FP-> previous frame
. . .
outputs
. . .
2
3
4
5
6
Note: In order to save space the inputs and outputs overlap. When the inputs are used up in the course of producing the outputs this should not be a problem.
# Instructions
# ENTERPROC (0x??) n_inputs: uint2, n_locals: uint2, n_outputs: uint2
Marks the entry point to a procedure taking
n_inputs
arguments, reservingn_locals
variables, and returningn_outputs
values. Procedures can only be entered via aCALLPROC
to their entry point.Execution of
ENTERPROC
is invalid.
# LEAVEPROC (0x??)
Marks the end of a procedure. Each
ENTERPROC
requires a closingLEAVEPROC
.Execution of
LEAVEPROC
from within the procedure it ends is equivalent toRETURNPROC
. Otherwise it is invalid.
# CALLPROC (0x??) code_id: byte, entry_point: byte
frame_stack.push(FP)
FP += n_locals + (n_inputs of callee)
return_stack.push(PC + 1)
PC = (address of code_id section) + (entry_point offset)
2
3
4
Prepare to return to the current procedure, then call the new procedure. The start of the input addressed by
FP
.
The required n_inputs
words are known (at validation time) to be available on the frame stack.
Note:
code_id=0
indexes the current section, other ids index each section in order, from 1. Theentry_point
indexes procedures in order of theirENTERPROC
, from 1.entry_point=0
is equivalent entering the first procedure with no arguments on thedata stack
or theframe stack
.
# RETURNPROC (0x??)
FP = frame_stack.pop()
PC = return_stack.pop()
2
Return to the calling produre, with PC set to one past its previous value, and FP restored to its previous value. The start of the output addressed by FP.
The promised n_outputs
words are known (at validation time) to be available on the frame stack.
# MSTOREFP (0x??) offset: int2
*(FP + offset) = *SP--
Pop a word from the top of the data stack to memory at
FP + offset
and decrease stack size by one word.
# MLOADFP (0x??) offset: int2, n_words: uint2
*SP++ = *(FP + offset)
Push a word to the top of the data stack from memory at
FP + offset
and increase stack size by one word.
# Memory Costs
Presently, MSTORE
is defined as
$μ′{m}[μ{s}[0] . . . (μ_{s}[0] + 31)] ≡ μ_{s}[1]$ $μ′{i} ≡ max(μ{i},⌈(μ_{s}[0]+32)÷32⌉)$
We propose to treat memory addresses as signed, so the formula for MSTOREFP
needs to use an absolute value
$μ′{i} ≡ max(abs(μ{i}),⌈(μ_{s}[0]+32)÷32⌉)$
TBD: Details!
# Rationale
This proposal uses the EIP-2315 return stack to manage calls and returns, and steals and simplifies ideas from EIP-615, EIP-3336 and EIP-3337. ENTERPROC
is modeled on BEGINSUB
, and per EIP-3336 and EIP-3337 it moves call frames from the data stack to a new in-memory stack. The FP
register automatically tracks call-frame addresses as procedures are entered and left, and cannot be otherwise modified
TBD: The aliasing of the frame stack with ordinary memory supports languages that provide pointers to variables on the stack
# Backwards Compatibility
This proposal adds new EOF sections and EVM opcodes. It doesn't remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.
# Security
Safe use of these constructs will be checked completely at validation time, in linear time and space, so there should be no security issues.
# Copyright
Copyright and related rights waived via CC0 (opens new window).