Title: Bounded Looping Operations
Type: Standards
Layer: Consensus
Maintainer: Jason Dreyzehner
Status: Draft
Initial Publication Date: 2021-05-28
Latest Revision Date: 2021-05-28
This proposal includes two new BCH virtual machine (VM) operations, OP_BEGIN
and OP_UNTIL
, which enable a variety of loop constructions in BCH contracts without increasing the processing or memory requirements of the VM.
Deployment of this specification is proposed for the May 2022 upgrade.
This proposal requires CHIP: Targeted Virtual Machine Limits
, or equivalent limits on the processing and memory usage of the VM.
The Bitcoin Cash VM is strictly limited to prevent maliciously-designed transactions from requiring excessive resources during transaction validation.
Loops were originally excluded from this design on the mistaken assumption that they risk consuming excessive validation resources. In reality, a small number of VM operations make up the majority of resource usage during transaction validation, and any implementation of bounded loops can be many orders of magnitude faster than the most pathological VM bytecode constructions.
Introducing this basic control flow structure would make BCH contracts significantly more powerful and efficient.
Looping operations enable many use cases where specifying a precise procedure using OP_IF
is impractical or impossible, and they also significantly reduce the bytecode length of many contracts with repeated procedures.
The following costs and risks have been assessed.
Without adequate limits, looping operations have the potential to increase worst-case transaction validation costs and expose VM implementations to Denial of Service (DOS) attacks.
Mitigations: this proposal includes strict limits which ensure loops do not increase the maximum executable bytecode length (MAX_SCRIPT_SIZE
, 10,000 bytes). This altogether avoids increasing the available surface area for Denial of Service (DOS) attacks.
This proposal affects consensus – all fully-validating node software must implement these VM changes to remain in consensus.
These VM changes are backwards-compatible: all past and currently-possible transactions remain valid under these new rules.
Because this proposal only affects internals of the VM, standard wallets, block explorers, and other services will not require software modifications for these changes. Only software which offers VM evaluation (e.g. Bitauth IDE) will be required to upgrade.
Wallets and other services may optionally upgrade to add support for new contracts which will become possible after deployment of this proposal.
Two new VM operations are added: OP_BEGIN
(0x65
/101
) and OP_UNTIL
(0x66
/102
).
Note: these codepoints were previously used by
OP_VERIF
(0x65
) andOP_VERNOTIF
(0x66
) in early versions of Bitcoin, but those operations were disabled when it was realized that they created a fork-risk between different versions of the Bitcoin software. Both codepoints currently render the transaction invalid (even when found in an unexecuted OP_IF branch), so they are effectively equivalent to any codepoints in the undefined ranges. They are particularly well-suited for this application because they are the only two available, adjacent codepoints within the control-flow range of operations.
To support these operations, the control stack is modified to support integer values, and a new unsigned integer counter Repeated Bytes
is added to the VM state.
This proposal modifies the control stack (A.K.A. ExecStack
or ConditionStack
) to support integer values. When testing for operation execution, integer values must be ignored.
Prior to this proposal, the control stack (A.K.A.
ExecStack
orConditionStack
) is conceptually an array of boolean values: when a flow control operation (likeOP_IF
) is encountered, atrue
is pushed to the stack if the branch is to be executed, andfalse
if not. (OP_ELSE
toggles the top value on this stack.) Non-flow control operations are only evaluated if the top of the stack istrue
.
A new unsigned integer counter, Repeated Bytes
, is added to the VM state, initialized to 0
at the beginning of evaluation.
Repeated Bytes
is only used byOP_UNTIL
to prevent excessive use of loops.
OP_BEGIN
pushes the current instruction pointer index to the control stack (A.K.A. ConditionStack
).
This operation sets a pointer to which the evaluation will return when the matching
OP_UNTIL
is encountered. If a matchingOP_UNTIL
is never encountered, the evaluation returns an error.
OP_UNTIL
pops the top item from the control stack:
- if this control value is not an integer, error.
- The difference between the current instruction pointer and the control value is added to
Repeated Bytes
; ifRepeated Bytes
+script.size()
is greater thanMAX_SCRIPT_SIZE
, error. - The top item is popped from the stack, if the value is
0
(the same test asOP_NOTIF
) the instruction pointer is set to theOP_BEGIN
instruction (otherwise, evaluation continues past theOP_UNTIL
).
OP_UNTIL
has two possible error conditions: 1) a mismatchedOP_BEGIN
(eitherOP_BEGIN
is missing or the loop contains anOP_IF
without a matchingOP_ENDIF
), or 2) executing the loop would cause the evaluation to exceedMAX_SCRIPT_SIZE
. If neither of these occur, and the top stack item is0
, evaluation returns to theOP_BEGIN
(and that instruction pointer index is again pushed to the control stack).
This section documents design decisions made in this specification.
This proposal is influenced by the BEGIN ... UNTIL
indefinite loop construction in the Forth programming language (upon which the rest of the Bitcoin Cash VM is based). This particular construction offers maximal flexibility: definite loops can be easily emulated with indefinite loops, but indefinite loops cannot be emulated with definite loops.
One alternative to the proposed OP_BEGIN ... OP_UNTIL
construction could be a single OP_GOTO
operation which (with certain limits) jumps to the instruction pointer at the top of the stack.
While such an operation could help to further increase the efficiency and expressiveness of the BCH instruction set, a GOTO
instruction also makes modeling execution more difficult. This is particularly limiting for covenant applications – it is notably more difficult for covenants to keep track of expected instruction pointer indexes (to reconstruct contracts with hypothetical OP_GOTO
operations) than it is for them to use this proposal's construction.
The higher-level OP_BEGIN ... OP_UNTIL
construction also better matches the semantics of the existing VM instruction set (e.g. OP_IF ... OP_ENDIF
).
The following examples demonstrate usage of OP_BEGIN
and OP_UNTIL
.
Bounded loops reduce wasted bytes in repeated procedures like merkle proof validation.
Merkle trees offer a versatile data structure for storing state in BCH contracts. Below is one possible strategy for validating a merkle proof (shown validating the left-most element in a tree of 16 elements):
<sibling_tier_4_hash>
<sibling_tier_3_hash>
<sibling_tier_2_hash>
<sibling_leaf_hash>
<value>
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
OP_FROMALTSTACK // check if leaf requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160
OP_FROMALTSTACK // check if tier 2 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160
OP_FROMALTSTACK // check if tier 3 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160
OP_FROMALTSTACK // check if tier 4 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160
<root> OP_EQUALVERIFY
With bounded loops, this procedure can be simplified to:
<sibling_tier_4_hash>
<sibling_tier_3_hash>
<sibling_tier_2_hash>
<sibling_leaf_hash>
<value>
<0> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1>
OP_BEGIN
OP_IF OP_SWAP OP_ENDIF // check if leaf requires swap
OP_CAT OP_HASH160
OP_FROMALTSTACK
OP_UNTIL
<root> OP_EQUALVERIFY
Without bounded loops, contracts are forced to emulate aggregation by re-specifying a particular aggregation procedure for each possible set size.
For example, to aggregate the UTXO value of each transaction input without bounded loops (for up to 4 inputs using transaction introspection operations):
<0> // aggregated value
<4> // the count of values to aggregate
// locking bytecode
OP_DUP OP_TXINPUTCOUNT OP_LESSTHANOREQUAL OP_VERIFY // prevent malleability
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
OP_SWAP
<0> OP_UTXOVALUE
OP_ADD
OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
OP_SWAP
<1> OP_UTXOVALUE
OP_ADD
OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
OP_SWAP
<2> OP_UTXOVALUE
OP_ADD
OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
OP_SWAP
<3> OP_UTXOVALUE
OP_ADD
OP_SWAP
OP_ENDIF
OP_DROP // drop remaining index, leaving aggregated value on top
With bounded loops, this aggregation can be simplified to:
<0> <0>
OP_BEGIN // loop (re)starts with index, value
OP_OVER OP_UTXOVALUE OP_ADD // Add UTXO value to total
OP_SWAP OP_1ADD OP_SWAP // increment index
OP_OVER OP_TXINPUTCOUNT OP_EQUAL
OP_UNTIL // loop until next index would exceed final index
OP_NIP // drop index, leaving sum of UTXO values
Please see the following reference implementations for additional examples and test vectors:
[TODO: after initial public feedback]
[TODO]
This document is placed in the public domain.