EIP3668 - Durin: Secure offchain data retrieval

# Simple Summary

Durin provides a secure mechanism for fetching offchain data without additional trust assumptions.

# Abstract

Contracts wishing to support lookup of data from offchain sources may, instead of returning the data directly, revert using OffchainLookup(string url, bytes prefix). Clients supporting this spec then repeat the original call as a POST request to the URL provided by the contract, which returns an opaque byte string. Finally, clients call the original contract with the byte string, after verifying it starts with the provided prefix. The original contract then decodes and verifies the returned data using an implementation-specific method and returns it to the caller.

This mechanism allows for offchain lookups of data in a way that is transparent to clients, and allows contract authors to implement whatever validation is necessary; in many cases this can be provided without any additional trust assumptions over and above those required if data is stored onchain.

# Motivation

Minimising storage and transaction costs on Ethereum has driven contract authors to adopt a variety of techniques for moving data offchain, including hashing, recursive hashing (eg Merkle Trees/Tries) and L2 solutions. While each solution has unique constraints and parameters, they all share in common the fact that enough information is stored onchain to validate the externally stored data when required.

Thus far, applications have tended to devise bespoke solutions rather than trying to define a universal standard. This is practical - although inefficient - when a single offchain data storage solution suffices, but rapidly becomes impractical in a system where multiple end-users may wish to make use of different data storage and availability solutions based on what suits their needs.

By defining a common specification for fetching and validating offchain data using smart contracts, we facilitate writing clients that are entirely agnostic to the storage solution being used, which enables new applications that can operate without knowing about the underlying storage details of the contracts they interact with.

Examples of this include:

Interacting with 'airdrop' contracts that store a list of recipients offchain in a merkle trie.
Viewing token information for tokens stored on an L2 solution as if they were native L1 tokens.
Allowing delegation of data such as ENS domains to various L2 solutions, without requiring clients to support each solution individually.

# Specification

# Overview

Answering a query via Durin takes place in three steps:

Querying the original contract
Querying the gateway
Validating the gateway response with the original contract

In step 1, a standard blockchain call operation is made to the contract. The contract reverts with an error that specifies the data to complete the call can be found offchain, and provides the url to a service that can provide the answer, along with a required prefix.

In step 2, the client calls the gateway service with the original call data. The gateway responds with an answer, whose content is opaque to the client but must start with the prefix specified by the contract.

In step 3, the client calls the original contract with the exact data provided by the gateway. The contract decodes the gateway's return data and uses it to validate the response, returning the final result if successful, or reverting with an error if validation fails.

# Contract interface

A Durin-enabled contract must revert with the following error whenever a function that requires offchain data is called:

error OffchainLookup(string url, bytes data, bytes prefix)

url specifies the URL to a service (known as the gateway) that implements the Durin protocol and can formulate an answer to the query. url can be the empty string '', in which case a client-specific default should be used.

data specifies the data to call that service with; for simple cases this can be the same as the original call data.

prefix specifies a prefix that the data returned by the gateway at url must have. Clients will check this value is a prefix of the data returned by the gateway, and cause the call to fail if it does not match.

The contract must also implement a callback method for decoding and validating the data returned by the gateway. The name and signature of this method are implementation-specific and must be agreed upon between the contract and the gateway. If the client successfully calls the gateway, this contract function will be invoked by the client, with the callback data returned from the gateway. The contract must interpret this data and use it to validate the original request, returning the result of that request if successful, or reverting if validation fails.

prefix provides a critical security measure against malicious gateways; by requiring the response returned by the gateway to start with a specific prefix, the contract can prevent the gateway from causing the client to call arbitrary methods on the contract, and require that the answer pertains to the original query, by including query parameters in the required prefix.

For example, suppose a contract has the following method:

function balanceOf(address addr) public view returns(uint balance);

Data for these queries is stored offchain in some kind of hashed data structure, the details of which are not important for this example. The contract author wants the gateway to fetch the proof information for this query and call the following function with it:

function balanceOfWithProof(address addr, Proof proof) public view returns(uint balance);

If the contract provides no required prefix, a malicious gateway can return a call to any function on the contract, not just the desired one, so the prefix returned by balanceOf must at least be the 4 byte function identifier for balanceOfWithProof(address,Proof). If only the function identifier is provided, however, a malicious gateway can return any valid proof at all - it could answer a query for the balance of account A with the balance of account B. As a result, the prefix must also include all relevant information about the original query - in this case, the address addr that the query is for.

One example of a valid implementation of balanceOf would thus be:

function balanceOf(address addr) public view returns(uint balance) {
    revert OffchainLookup(url, msg.data, abi.encodeWithSelector(Contract.balanceOfWithProof.selector, addr));
}

1
2
3

Due to the way ABI encoding functions, a prefix is sufficient to encode the function selector and initial fixed-length arguments to a function. However, dynamic-length arguments (eg, bytes, string or arrays) cannot be enforced with a prefix. Where these must be committed to, we recommend hashing them and including the hash as an initial argument committed to in the prefix.

# Gateway Interface

The URL returned by a contract may be of any schema, but this specification only defines how clients should handle HTTPS URLs. Compliant gateways must implement JSON-RPC, and support the durin_call method. Clients must implement the semantics described in the Ethereum JSON-RPC specification (opens new window) for handling of binary data as hexadecimal strings.

# `durin_call`

Accepts calldata for a smart contract call, and returns a message (encoded as a call to the same smart contract) that can be used to validate and return the response.

# Parameters

Object - The call object

to: DATA, 20 Bytes - The address of the contract being called.
data: DATA - The call data for the contract invocation - this must be the value of the data field from the OffchainLookup error.
abi: DATA - A JSON format ABI for the function being invoked, as specified in the Solidity ABI encoding specification (opens new window).

# Returns

DATA - encoded data with which to call the contract at to in order to retrieve the final result.

# Example

// Request
curl -X POST --data '{"jsonrpc":"2.0","method":"durin_call","params":[{see above]]","id":1}'

// Result
{
      "id": 1,
      "jsonrpc": "2.0",
      "result": "0x"
}

1
2
3
4
5
6
7
8
9

# Client Lookup Protocol

The lookup protocol for a client is described with the following pseudocode:

def durin_call(address, abi, args, default_gateway):
    calldata = abi_encode_call(abi, args)
    try:
        return address.call(calldata)
    except OffchainLookup as e:
        url, data, prefix = e.url, e.data, e.prefix
        response = json_rpc_call(url or default_gateway, 'durin_call', {'to': address, 'data': data, 'abi': abi})
        if not response.result.startswith(prefix):
            throw Error("Invalid response prefix")
        return address.call(response.result)

1
2
3
4
5
6
7
8
9
10

Where:

abi_encode_call(abi, args) is a function that encodes args using the JSON-format ABI in abi, including 4-byte function selector.
address.call(data) makes an eth_call to address with the provided encoded data and returns the result, or raises an exception if the call reverts.
json_rpc_call(url, method, data) makes a JSON-RPC call to url with the specified method and data.

If the function being called is a standard contract function, the process terminates after the original call, returning the same result as for a regular eth_call. Otherwise, the gateway at url (or default_gateway if no URL is returned) is called with the original call data, and is expected to return a valid response, which is first checked against prefix and then used to call the original contract to produce the final result of the operation.

# Use of Durin for transactions

While the specification above is for read-only contract calls (eg, eth_call), it is simple to use this method for sending transactions (eg, eth_sendTransaction or eth_sendRawTransaction) that require offchain data. While 'preflighting' a transaction using eth_estimateGas or eth_call, a client that receives an OffchainLookup revert can follow the procedure described above in "Client lookup protocol", substituting a transaction for the call in the last step. This functionality is ideal for applications such as making onchain claims supported by offchain proof data.

# Rationale

# Use of `revert` to convey call information

For offchain data lookup to function as desired, clients must either have some way to know that a function depends on this specification for functionality - such as a specifier in the ABI for the function - or else there must be a way for the contract to signal to the client that data needs to be fetched from elsewhere.

While specifying the call type in the ABI is a possible solution, this makes retrofitting existing interfaces to support offchain data awkward, and either results in contracts with the same name and arguments as the original specification, but with different return data - which will cause decoding errors for clients that do not expect this - or duplicating every function that needs support for offchain data with a different name (eg, balanceOf -> offchainBalanceOf). Neither solutions is particularly satisfactory.

Using a revert, and conveying the url and prefix in the revert data, allows any function to be retrofitted to support lookups via Durin so long as the client understands the specification, and so facilitates translation of existing specificatiosn to use offchain data.

# Passing address and abi data to `durin_call`

Both of these parameters are required to be passed to the gateway in order to facilitate the writing of generic gateways, thus reducing the burden on contract authors to provide their own gateway implementations.

Supplying address allows the gateway to perform lookups to the original contract for information needed to assist with resolution, making it possible to operate one gateway for any number of identical contracts.

Supplying abi allows a gateway to function even if it does not know, a-priori, the functions that will be called on it, by allowing it to dynamically decode and interpret the call data. This also allows the gateway to synthesize the return data dynamically - for example, a gateway could support any method on a contract by specifying that it will create a callback to a method of the same name with WithProof appended and the relevant proof data added to the call arguments.

# Requirement for prefix matching on gateway return data

The requirement that contracts return a prefix that must match the data returned by the gateway serves as a guard against malicious gateways. Without a required prefix, a gateway can cause the client to call any method on the original contract, or produce a call that validates as correct but answers a different query to the one that the client provided. Allowing the contract to require that the response start with critical information computed from the call data of the original call prevents this attack vector.

More sophisticated conditions could be required - for example, the contract could return a regular expression that must match the returned data. However, this adds significant complexity to the proposal, and due to the nature of ABI encoding, a prefix is sufficient to match a function selector and the first fixed-length arguments of a contract function.

# Use of JSON-RPC

JSON-RPC is already heavily used in the Ethereum ecosystem as the primary means for communicating with Ethereum nodes. We have adopted it here in order to avoid introducing new RPC specifications and dependencies unnecessarily.

# Glossary

Client: A process, such as JavaScript executing in a web browser, or a backend service, that wishes to query a blockchain for data. The client understands how to fetch data using Durin.
Contract: A smart contract existing on Ethereum or another blockchain.
Gateway: A service that answers application-specific Durin queries, usually over HTTPS.

# Backwards Compatibility

Existing contracts that do not wish to use this specification are unaffected. Clients can add support for Durin to all contract calls without introducing any new overhead or incompatibilities.

Contracts that require Durin will not function in conjunction with clients that do not implement this specification. Attempts to call these contracts from non-compliant clients will result in the contract throwing an exception that is propagaged to the user.

# Reference Implementation

TBD.

# Security Considerations

In order to prevent a malicious gateway from causing unintended side-effects or faulty results, contracts MUST specify a prefix that is sufficient to ensure the gateway's honesty. For more details, see "Requirement for prefix matching on gateway return data" in the Justification section, and the main specification.

Contracts must also implement sufficient validation of the data returned by the gateway to ensure it is valid. The validation required is application-specific and cannot be specified on a global basis.

Client libraries implementing Durin may want to ensure that callers have to explicitly enable its use on a contract, so users are not surprised when their client library begins making HTTPS calls to external services.