Cretonne Language Reference

The Cretonne intermediate language (IL) has two equivalent representations: an in-memory data structure used by the code generator library, and a text format used for test cases and debug output. Files containing Cretonne textual IL have the .cton filename extension.

This reference uses the text format to describe IL semantics but glosses over the finer details of the lexical and syntactic structure of the format.

Overall structure

Cretonne compiles functions independently. A .cton IL file may contain multiple functions, and the programmatic API can create multiple function handles at the same time, but the functions don’t share any data or reference each other directly.

This is a simple C function that computes the average of an array of floats:

float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}

Here is the same function compiled into Cretonne IL:

function average(i32, i32) -> f32 {
    ss1 = stack_slot 8            ; Stack slot for sum.

ebb1(v1: i32, v2: i32):
    v3 = f64const 0x0.0
    stack_store v3, ss1
    brz v2, ebb3                  ; Handle count == 0.
    v4 = iconst.i32 0
    jump ebb2(v4)

ebb2(v5: i32):
    v6 = imul_imm v5, 4
    v7 = iadd v1, v6
    v8 = heap_load.f32 v7         ; array[i]
    v9 = fpromote.f64 v8
    v10 = stack_load.f64 ss1
    v11 = fadd v9, v10
    stack_store v11, ss1
    v12 = iadd_imm v5, 1
    v13 = icmp ult v12, v2
    brnz v13, ebb2(v12)           ; Loop backedge.
    v14 = stack_load.f64 ss1
    v15 = fcvt_from_uint.f64 v2
    v16 = fdiv v14, v15
    v17 = fdemote.f32 v16
    return v17

ebb3:
    v100 = f32const +NaN
    return v100
}

The first line of a function definition provides the function name and the function signature which declares the argument and return types. Then follows the function preamble which declares a number of entities that can be referenced inside the function. In the example above, the preamble declares a single local variable, ss1.

After the preamble follows the function body which consists of extended basic blocks, the first of which is the entry block. Every EBB ends with a terminator instruction, so execution can never fall through to the next EBB without an explicit branch.

A .cton file consists of a sequence of independent function definitions:

function_list ::=  { function }
function      ::=  function_spec "{" preamble function_body "}"
function_spec ::=  "function" function_name signature
preamble      ::=  { preamble_decl }
function_body ::=  { extended_basic_block }

Static single assignment form

The instructions in the function body use and produce values in SSA form. This means that every value is defined exactly once, and every use of a value must be dominated by the definition.

Cretonne does not have phi instructions but uses EBB arguments instead. An EBB can be defined with a list of typed arguments. Whenever control is transferred to the EBB, values for the arguments must be provided. When entering a function, the incoming function arguments are passed as arguments to the entry EBB.

Instructions define zero, one, or more result values. All SSA values are either EBB arguments or instruction results.

In the example above, the loop induction variable i is represented as three SSA values: In the entry block, v4 is the initial value. In the loop block ebb2, the EBB argument v5 represents the value of the induction variable during each iteration. Finally, v12 is computed as the induction variable value for the next iteration.
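To make the role of EBB arguments concrete, the following C sketch mirrors the block structure of the IL example, using one label per EBB. It is a hand translation for illustration only, not the output of any tool. The name average_ebb and the use of a native C pointer in place of the i32 address are assumptions.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hand translation of the `average` IL example.
 * Each label corresponds to one EBB; v5 plays the role of the
 * EBB argument of ebb2 and is assigned before every jump to it. */
float average_ebb(const float *array, uint32_t count)
{
    double sum = 0.0;   /* ss1: the stack slot holding `sum` */
    uint32_t v5;        /* ebb2 argument: induction variable */
    uint32_t v12;       /* induction variable for the next iteration */

    assert(array != NULL || count == 0);

    /* ebb1 (entry block) */
    if (count == 0)
        goto ebb3;      /* brz v2, ebb3 */
    v5 = 0;             /* jump ebb2(v4), with v4 = 0 */

ebb2:
    sum += (double)array[v5];
    v12 = v5 + 1;
    if (v12 < count) {  /* icmp ult, then brnz back to ebb2(v12) */
        v5 = v12;
        goto ebb2;
    }
    return (float)(sum / (double)count);

ebb3:
    return NAN;         /* the NaN constant returned for count == 0 */
}
```

Note how the single C variable v5 is written on both edges into ebb2, once with the initial value and once with v12. In SSA form those two writes become the two values passed as the EBB argument.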

It can be difficult to generate correct SSA form if the program being converted into Cretonne IL contains multiple assignments to the same variables. Such variables can be presented to Cretonne as stack slots instead. Stack slots are accessed with the stack_store and stack_load instructions which behave more like variable accesses in a typical programming language. Cretonne can perform the necessary data-flow analysis to convert stack slots to SSA form.

Value types

All SSA values have a type which determines the size and shape (for SIMD vectors) of the value. Many instructions are polymorphic – they can operate on different types.

Boolean types

Boolean values are either true or false. While this only requires a single bit to represent, more bits are often used when holding a boolean value in a register or in memory. The b1 type represents an abstract boolean value. It can only exist as an SSA value; it can’t be stored in memory or converted to another type. The larger boolean types can be stored in memory.

Todo

Clarify the representation of larger boolean types.

The multi-bit boolean types can be interpreted in different ways. We could declare that zero means false and non-zero means true. This may require unwanted normalization code in some places.

We could specify a fixed encoding like all ones for true. This would then lead to undefined behavior if untrusted code uses the multibit booleans incorrectly.

Something like this:

  • External code is not allowed to load/store multi-bit booleans or otherwise expose the representation.
  • Each target specifies the exact representation of a multi-bit boolean.
b1

A boolean value that is either true or false.

Bytes: Can’t be stored in memory
b8

A boolean type with 8 bits.

Bytes: 1
b16

A boolean type with 16 bits.

Bytes: 2
b32

A boolean type with 32 bits.

Bytes: 4
b64

A boolean type with 64 bits.

Bytes: 8

Integer types

Integer values have a fixed size and can be interpreted as either signed or unsigned. Some instructions will interpret an operand as a signed or unsigned number, others don’t care.

i8

An integer type with 8 bits.

Bytes: 1
i16

An integer type with 16 bits.

Bytes: 2
i32

An integer type with 32 bits.

Bytes: 4
i64

An integer type with 64 bits.

Bytes: 8

Floating point types

The floating point types have the IEEE semantics that are supported by most hardware. There is no support for higher-precision types like quads or double-double formats.

f32

A 32-bit floating point type represented in the IEEE 754-2008 binary32 interchange format. This corresponds to the float type in most C implementations.

Bytes: 4
f64

A 64-bit floating point type represented in the IEEE 754-2008 binary64 interchange format. This corresponds to the double type in most C implementations.

Bytes: 8

SIMD vector types

A SIMD vector type represents a vector of values from one of the scalar types (boolean, integer, and floating point). Each scalar value in a SIMD type is called a lane. The number of lanes must be a power of two in the range 2-256.

iBxN

A SIMD vector of integers. The lane type iB is one of the integer types i8 ... i64.

Some concrete integer vector types are i32x4, i64x8, and i16x4.

The size of a SIMD integer vector in memory is \(N B\over 8\) bytes.

f32xN

A SIMD vector of single precision floating point numbers.

Some concrete f32 vector types are: f32x2, f32x4, and f32x8.

The size of a f32 vector in memory is \(4N\) bytes.

f64xN

A SIMD vector of double precision floating point numbers.

Some concrete f64 vector types are: f64x2, f64x4, and f64x8.

The size of a f64 vector in memory is \(8N\) bytes.

b1xN

A boolean SIMD vector.

Boolean vectors are used when comparing SIMD vectors. For example, comparing two i32x4 values would produce a b1x4 result.

Like the b1 type, a boolean vector cannot be stored in memory.
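The lane-count rule and the memory size formulas above can be checked with a small C sketch. The function names valid_lane_count and simd_bytes are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* True if n is a legal lane count: a power of two in the range 2-256. */
bool valid_lane_count(unsigned n)
{
    return n >= 2 && n <= 256 && (n & (n - 1)) == 0;
}

/* Size in bytes of a vector with `lanes` lanes of `lane_bits` bits each.
 * Only lane widths with a memory representation are accepted, so the
 * b1 lanes of boolean vectors are rejected by the first assertion. */
unsigned simd_bytes(unsigned lane_bits, unsigned lanes)
{
    assert(lane_bits == 8 || lane_bits == 16 ||
           lane_bits == 32 || lane_bits == 64);
    assert(valid_lane_count(lanes));
    return lanes * lane_bits / 8;
}
```

Under these assumptions, simd_bytes(32, 4) gives 16 for i32x4 and simd_bytes(64, 8) gives 64 for i64x8.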

Pseudo-types and type classes

These are not concrete types, but convenient names used to refer to real types in this reference.

iPtr

A pointer-sized integer.

This is either i32, or i64, depending on whether the target platform has 32-bit or 64-bit pointers.

iB

Any of the scalar integer types i8 ... i64.

Int

Any scalar or vector integer type: iB or iBxN.

fB

Either of the floating point scalar types: f32 or f64.

Float

Any scalar or vector floating point type: fB or fBxN.

TxN

Any SIMD vector type.

Mem

Any type that can be stored in memory: Int or Float.

Logic

Either b1 or b1xN.

Testable

Either b1 or iB.

Immediate operand types

These types are not part of the normal SSA type system. They are used to indicate the different kinds of immediate operands on an instruction.

imm64

A 64-bit immediate integer. The value of this operand is interpreted as a signed two’s complement integer. Instruction encodings may limit the valid range.

In the textual format, imm64 immediates appear as decimal or hexadecimal literals using the same syntax as C.

offset32

A signed 32-bit immediate address offset.

In the textual format, offset32 immediates always have an explicit sign, and a 0 offset may be omitted.

ieee32

A 32-bit immediate floating point number in the IEEE 754-2008 binary32 interchange format. All bit patterns are allowed.

ieee64

A 64-bit immediate floating point number in the IEEE 754-2008 binary64 interchange format. All bit patterns are allowed.

intcc

An integer condition code. See the icmp instruction for details.

floatcc

A floating point condition code. See the fcmp instruction for details.

The two IEEE floating point immediate types ieee32 and ieee64 are displayed as hexadecimal floating point literals in the textual IL format. Decimal floating point literals are not allowed because some computer systems can round differently when converting to binary. The hexadecimal floating point format is mostly the same as the one used by C99, but extended to represent all NaN bit patterns:

Normal numbers
Compatible with C99: -0x1.Tpe where T are the trailing significand bits encoded as hexadecimal, and e is the unbiased exponent as a decimal number. ieee32 has 23 trailing significand bits. They are padded with an extra LSB to produce 6 hexadecimal digits. This is not necessary for ieee64 which has 52 trailing significand bits forming 13 hexadecimal digits with no padding.
Zeros
Positive and negative zero are displayed as 0.0 and -0.0 respectively.
Subnormal numbers
Compatible with C99: -0x0.Tpemin where T are the trailing significand bits encoded as hexadecimal, and emin is the minimum exponent as a decimal number.
Infinities
Either -Inf or Inf.
Quiet NaNs
Quiet NaNs have the MSB of the trailing significand set. If the remaining bits of the trailing significand are all zero, the value is displayed as -NaN or NaN. Otherwise, -NaN:0xT where T are the trailing significand bits encoded as hexadecimal.
Signaling NaNs
Displayed as -sNaN:0xT.
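As a rough illustration of the rules above, the following C sketch formats an ieee32 bit pattern for the normal, zero, subnormal, and infinity cases. It is not Cretonne's implementation: the function names are hypothetical, NaNs are deliberately left out (the function returns -1 for them), and the wrapper assumes that the C float type is IEEE binary32.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the textual form of an ieee32 bit pattern
 * into buf. Returns 0 on success, or -1 for NaNs, which this sketch
 * does not format. */
int format_ieee32_bits(uint32_t bits, char *buf, size_t len)
{
    const char *sign = (bits >> 31) ? "-" : "";
    uint32_t biased = (bits >> 23) & 0xffu;  /* 8-bit biased exponent */
    uint32_t t = bits & 0x7fffffu;           /* 23 trailing significand bits */
    uint32_t digits = t << 1;                /* extra LSB: 24 bits, 6 hex digits */

    assert(buf != NULL && len > 0);

    if (biased == 0xffu) {
        if (t != 0)
            return -1;                       /* NaN: not handled here */
        snprintf(buf, len, "%sInf", sign);
    } else if (biased == 0 && t == 0) {
        snprintf(buf, len, "%s0.0", sign);
    } else if (biased == 0) {
        /* Subnormal: fixed minimum exponent, -126 for binary32. */
        snprintf(buf, len, "%s0x0.%06" PRIx32 "p-126", sign, digits);
    } else {
        /* Normal: unbiased exponent printed in decimal. */
        snprintf(buf, len, "%s0x1.%06" PRIx32 "p%d", sign, digits,
                 (int)biased - 127);
    }
    return 0;
}

/* Convenience wrapper taking a float; memcpy reinterprets its bits. */
int format_ieee32(float x, char *buf, size_t len)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return format_ieee32_bits(bits, buf, len);
}
```

With this sketch, 1.0 is written as 0x1.000000p0 and 1.5 as 0x1.800000p0. The ieee64 case would follow the same pattern with 52 trailing significand bits, 13 hexadecimal digits, and no padding.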

Control flow

Branches transfer control to a new EBB and provide values for the target EBB’s arguments, if it has any. Conditional branches only take the branch if their condition is satisfied, otherwise execution continues at the following instruction in the EBB.

jump EBB(args...)

Jump.

Unconditionally jump to an extended basic block, passing the specified EBB arguments. The number and types of arguments must match the destination EBB.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
fallthrough EBB(args...)

Fall through to the next EBB.

This is the same as jump, except the destination EBB must be the next one in the layout.

Jumps are turned into fall-through instructions by the branch relaxation pass. There is no reason to use this instruction outside that pass.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
brz c, EBB(args...)

Branch when zero.

If c is a b1 value, take the branch when c is false. If c is an integer value, take the branch when c = 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
brnz c, EBB(args...)

Branch when non-zero.

If c is a b1 value, take the branch when c is true. If c is an integer value, take the branch when c != 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
br_icmp Cond, x, y, EBB(args...)

Compare scalar integers and branch.

Compare x and y in the same way as the icmp instruction and take the branch if the condition is true:

br_icmp ugt v1, v2, ebb4(v5, v6)

is semantically equivalent to:

v10 = icmp ugt v1, v2
brnz v10, ebb4(v5, v6)

Some RISC architectures like MIPS and RISC-V provide instructions that implement all or some of the condition codes. The instruction can also be used to represent macro-op fusion on architectures like Intel’s.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • iB – inferred from x
br_table x, JT

Indirect branch via jump table.

Use x as an unsigned index into the jump table JT. If a jump table entry is found, branch to the corresponding EBB. If no entry is found, fall through to the next instruction.

Note that this branch instruction can’t pass arguments to the targeted blocks. Split critical edges as needed to work around this.

Arguments:
  • x (iB) – index into jump table
  • JT (jump_table) – A jump table.
Type Variables:
  • iB – inferred from x
JT = jump_table EBB0, EBB1, ..., EBBn

Declare a jump table in the function preamble.

This declares a jump table for use by the br_table indirect branch instruction. Entries in the table are either EBB names, or 0 which indicates an absent entry.

The EBBs listed must belong to the current function, and they can’t have any arguments.

Arguments:
  • EBB0 – Target EBB when x = 0.
  • EBB1 – Target EBB when x = 1.
  • EBBn – Target EBB when x = n.
Result:

A jump table identifier. (Not an SSA value).
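The lookup performed by br_table can be modelled in C as an array indexed by an unsigned value. This is only a sketch: the names br_table_target and JT_ABSENT are hypothetical, EBBs are represented by their numbers, and treating an index past the end of the table as "no entry found" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JT_ABSENT (-1)  /* models the `0` marker for an absent entry */

/* Returns the EBB number selected by index x, or JT_ABSENT when the
 * branch is not taken and execution falls through. */
int br_table_target(uint64_t x, const int *table, size_t len)
{
    assert(table != NULL || len == 0);
    if (x >= len)
        return JT_ABSENT;   /* assumption: out-of-range index = no entry */
    return table[x];
}
```

Because x is unsigned, there is no negative index to handle. A caller that receives JT_ABSENT simply continues with the instruction after the br_table.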

Traps stop the program because something went wrong. The exact behavior depends on the target instruction set architecture and operating system. There are explicit trap instructions defined below, but some instructions may also cause traps for certain input values. For example, udiv traps when the divisor is zero.

trap

Terminate execution unconditionally.

trapz c

Trap when zero.

If c is non-zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
Type Variables:
  • Testable – inferred from c
trapnz c

Trap when non-zero.

If c is zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
Type Variables:
  • Testable – inferred from c

Function calls

A function call needs a target function and a function signature. The target function may be determined dynamically at runtime, but the signature must be known when the function call is compiled. The function signature describes how to call the function, including arguments, return values, and the calling convention:

signature  ::=  "(" [arglist] ")" ["->" retlist] [call_conv]
arglist    ::=  arg { "," arg }
retlist    ::=  arglist
arg        ::=  type [argext] [argspecial]
argext     ::=  "uext" | "sext"
argspecial ::=  "sret" | "link" | "fp" | "csr"
call_conv  ::=  string

Arguments and return values have flags whose meaning is mostly target dependent. They make it possible to call native functions on the target platform. When calling other Cretonne functions, the flags are not necessary.

Functions that are called directly must be declared in the function preamble:

FN = function NAME signature

Declare a function so it can be called directly.

Arguments:
  • NAME – Name of the function, passed to the linker for resolution.
  • signature – Function signature. See the signature grammar above.
Results:
  • FN – A function identifier that can be used with call.
rvals = call FN(args...)

Direct function call.

Call a function which has been declared in the preamble. The argument types must match the function’s signature.

Arguments:
  • FN (func_ref) – function to call, declared by function
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
return rvals...

Return from the function.

Unconditionally transfer control to the calling function, passing the provided return values. The list of return values must match the function signature’s return types.

Arguments:
  • rvals (variable_args) – return values

This simple example illustrates direct function calls and signatures:

function gcd(i32 uext, i32 uext) -> i32 uext "C" {
    fn1 = function divmod(i32 uext, i32 uext) -> i32 uext, i32 uext

ebb1(v1: i32, v2: i32):
    brz v2, ebb2
    v3, v4 = call fn1(v1, v2)
    jump ebb1(v2, v4)

ebb2:
    return v1
}
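For readers more familiar with C, the control flow of the gcd example can be sketched as follows. The struct divmod_result models the two return values of the call. The body of divmod is an assumption, since the IL only declares its signature: it is taken to return the quotient first and the remainder second.

```c
#include <assert.h>
#include <stdint.h>

/* Two return values, modelling `-> i32 uext, i32 uext`. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} divmod_result;

/* Assumed behaviour of the external divmod: quotient, then remainder. */
divmod_result divmod(uint32_t a, uint32_t b)
{
    divmod_result r;
    assert(b != 0);
    r.quot = a / b;
    r.rem = a % b;
    return r;
}

uint32_t gcd(uint32_t v1, uint32_t v2)
{
    for (;;) {                      /* ebb1(v1, v2) */
        divmod_result r;
        uint32_t next_v1;
        if (v2 == 0)
            return v1;              /* brz v2, ebb2; ebb2 returns v1 */
        r = divmod(v1, v2);         /* v3, v4 = call fn1(v1, v2) */
        next_v1 = v2;               /* back edge to ebb1 passes (v2, v4) */
        v2 = r.rem;
        v1 = next_v1;
    }
}
```

The back edge to ebb1 supplies new values for both EBB arguments at once, which is why the C version computes next_v1 before overwriting v2.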

Indirect function calls use a signature declared in the preamble.

SIG = signature signature

Declare a function signature for use with indirect calls.

Arguments:
  • signature – Function signature. See signature.
Results:
  • SIG – A signature identifier.
rvals = call_indirect SIG, callee(args...)

Indirect function call.

Call the function pointed to by callee with the given arguments. The called function must match the specified signature.

Arguments:
  • SIG (sig_ref) – function signature
  • callee (iAddr) – address of function to call
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
Type Variables:
  • iAddr – inferred from callee

Todo

Define safe indirect function calls.

The call_indirect instruction is dangerous to use in a sandboxed environment since it is not easy to verify the callee address. We need a table-driven indirect call instruction, similar to br_table.

Memory

Cretonne provides fully general load and store instructions for accessing memory. However, it can be very complicated to verify the safety of general loads and stores when compiling code for a sandboxed environment, so Cretonne also provides more restricted memory operations that are always safe.

a = load Flags, p, Offset

Load from memory at p + Offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
  • iAddr – from input operand
store Flags, x, p, Offset

Store x to memory at p + Offset.

This is a polymorphic instruction that can store any value type with a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (Mem) – Value to be stored
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Type Variables:
  • Mem – inferred from x
  • iAddr – from input operand

Loads and stores are misaligned if the resultant address is not a multiple of the expected alignment. Depending on the target architecture, misaligned memory accesses may trap, or they may work. Sometimes, operating systems catch alignment traps and emulate the misaligned memory access.
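The misalignment condition can be written as a one-line check. In this sketch the expected alignment is passed in as a parameter because the text above does not fix it per type, and the name is_misaligned is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an access at address p + offset is misaligned for the given
 * alignment in bytes. The alignment must be a power of two. */
bool is_misaligned(uint64_t p, int32_t offset, uint64_t align)
{
    uint64_t addr;
    assert(align != 0 && (align & (align - 1)) == 0);
    addr = p + (uint64_t)(int64_t)offset;   /* wrapping address arithmetic */
    return (addr & (align - 1)) != 0;
}
```

Whether a misaligned access then traps, works, or is emulated is up to the target and operating system, so the sketch only reports the condition.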

Extending loads and truncating stores

Most ISAs provide instructions that load an integer value smaller than a register and extend it to the width of the register. Similarly, store instructions that only write the low bits of an integer register are common.

Cretonne provides extending loads and truncating stores for 8, 16, and 32-bit memory accesses.

a = uload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i8 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
a = sload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i8 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
istore8 Flags, x, p, Offset

Store the low 8 bits of x to memory at p + Offset.

This is equivalent to ireduce.i8 followed by store.i8.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt8) – An integer type with more than 8 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Type Variables:
  • iExt8 – inferred from x
  • iAddr – from input operand
a = uload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i16 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
a = sload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i16 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
istore16 Flags, x, p, Offset

Store the low 16 bits of x to memory at p + Offset.

This is equivalent to ireduce.i16 followed by store.i16.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt16) – An integer type with more than 16 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Type Variables:
  • iExt16 – inferred from x
  • iAddr – from input operand
a = uload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i32 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iExt32 – explicitly provided
  • iAddr – from input operand
a = sload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i32 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iExt32 – explicitly provided
  • iAddr – from input operand
istore32 Flags, x, p, Offset

Store the low 32 bits of x to memory at p + Offset.

This is equivalent to ireduce.i32 followed by store.i32.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt32) – An integer type with more than 32 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Signed offset added to the address p
Type Variables:
  • iExt32 – inferred from x
  • iAddr – from input operand
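The difference between zero-extending and sign-extending loads, and the effect of a truncating store, can be sketched in C for the 8-bit case with a 32-bit register value. The function names are hypothetical, and memory is modelled as a plain byte array addressed through a C pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Models uload8 with a 32-bit result: load one byte and zero-extend. */
uint32_t uload8_i32(const uint8_t *p, int32_t offset)
{
    assert(p != NULL);
    return (uint32_t)p[offset];
}

/* Models sload8 with a 32-bit result: load one byte and sign-extend. */
int32_t sload8_i32(const uint8_t *p, int32_t offset)
{
    int32_t v;
    assert(p != NULL);
    v = p[offset];
    return v >= 0x80 ? v - 0x100 : v;   /* bit 7 set means negative */
}

/* Models istore8 from a 32-bit value: store only the low 8 bits of x. */
void istore8_i32(uint32_t x, uint8_t *p, int32_t offset)
{
    assert(p != NULL);
    p[offset] = (uint8_t)(x & 0xffu);
}
```

For the byte 0x80, the zero-extending load yields 128 while the sign-extending load yields -128. The 16-bit and 32-bit variants differ only in the access width.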

Local variables

One set of restricted memory operations accesses the current function’s stack frame. The stack frame is divided into fixed-size stack slots that are allocated in the function preamble. Stack slots are not typed; they simply represent a contiguous sequence of bytes in the stack frame.

SS = stack_slot Bytes, Flags...

Allocate a stack slot in the preamble.

If no alignment is specified, Cretonne will pick an appropriate alignment for the stack slot based on its size and access patterns.

Arguments:
  • Bytes – Stack slot size in bytes.
Flags:
  • align(N) – Request at least N bytes alignment.
Results:
  • SS – Stack slot index.
a = stack_load SS, Offset

Load a value from a stack slot at a constant offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(a) + Offset <= sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
stack_store x, SS, Offset

Store a value to a stack slot at a constant offset.

This is a polymorphic instruction that can store any value type with a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(x) + Offset <= sizeof(SS)\).

Arguments:
  • x (Mem) – Value to be stored
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Type Variables:
  • Mem – inferred from x

The dedicated stack access instructions are easy for the compiler to reason about because stack slots and offsets are fixed at compile time. For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.

It can be necessary to escape from the safety of the restricted instructions by taking the address of a stack slot.

addr = stack_addr SS, Offset

Get the address of a stack slot.

Compute the absolute address of a byte in a stack slot. The offset must refer to a byte inside the stack slot: \(0 <= Offset < sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

The stack_addr instruction can be used to macro-expand the stack access instructions before instruction selection:

v1 = stack_load.f64 ss3, 16
; Expands to:
v9 = stack_addr ss3, 16
v1 = load.f64 v9
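The in-bounds rules for stack accesses and the expansion shown above can be modelled in C, with a stack slot represented as a byte array. This is a sketch under stated assumptions: all function names are hypothetical, negative offsets are treated as out of bounds, and a failed check is an assertion failure because the real check happens at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* stack_load / stack_store rule: sizeof(value) + Offset <= sizeof(SS).
 * Assumption: negative offsets are never in bounds. */
bool stack_access_in_bounds(size_t value_size, int32_t offset, size_t slot_size)
{
    return offset >= 0 && value_size <= slot_size &&
           (size_t)offset <= slot_size - value_size;
}

/* stack_addr rule: 0 <= Offset < sizeof(SS). */
bool stack_addr_in_bounds(int32_t offset, size_t slot_size)
{
    return offset >= 0 && (size_t)offset < slot_size;
}

/* Models `stack_addr SS, Offset`: address of one byte inside the slot. */
uint8_t *stack_addr(uint8_t *slot, size_t slot_size, int32_t offset)
{
    assert(stack_addr_in_bounds(offset, slot_size));
    return slot + offset;
}

/* Models `stack_load.f64 SS, Offset` expanded into stack_addr + load. */
double stack_load_f64(uint8_t *slot, size_t slot_size, int32_t offset)
{
    double value;
    assert(stack_access_in_bounds(sizeof value, offset, slot_size));
    memcpy(&value, stack_addr(slot, slot_size, offset), sizeof value);
    return value;
}
```

For a 24-byte slot, an f64 load at offset 16 is the last one that fits, while offset 17 fails the first check even though stack_addr alone would accept it.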

Heaps

Code compiled from WebAssembly or asm.js runs in a sandbox where it can’t access all process memory. Instead, it is given a small set of memory areas to work in, and all accesses are bounds checked. Cretonne models this through the concept of heaps.

A heap is declared in the function preamble and can be accessed with restricted instructions that trap on out-of-bounds accesses. Heap addresses can be smaller than the native pointer size, for example unsigned i32 offsets on a 64-bit architecture.

H = heap Name

Declare a heap in the function preamble.

This doesn’t allocate memory; it just retrieves a handle to a sandbox from the runtime environment.

Arguments:
  • Name – String identifying the heap in the runtime environment.
Results:
  • H – Heap identifier.
a = heap_load p, Offset

Load a value at the address \(p + Offset\) in the heap H.

Trap if the heap access would be out of bounds.

Arguments:
  • p (iAddr) – An integer address type
  • Offset (uoffset32) – Unsigned offset to effective address
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
  • iAddr – from input operand
heap_store x, p, Offset

Store a value at the address \(p + Offset\) in the heap H.

Trap if the heap access would be out of bounds.

Arguments:
  • x (Mem) – Value to be stored
  • p (iAddr) – An integer address type
  • Offset (uoffset32) – Unsigned offset to effective address
Type Variables:
  • Mem – inferred from x
  • iAddr – from input operand

When optimizing heap accesses, Cretonne may separate the heap bounds checking and address computations from the memory accesses.

addr = heap_addr p, Offset

Bounds check and compute absolute address of heap memory.

Verify that the address range p .. p + Size - 1 is valid in the heap H, and trap if not.

Convert the heap-relative address in p to a real absolute address and return it.

Arguments:
  • p (iAddr) – An integer address type
  • Offset (uoffset32) – Unsigned offset to effective address
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – inferred from p

A small example using heaps:

function vdup(i32, i32) {
    h1 = heap "main"

ebb1(v1: i32, v2: i32):
    v3 = heap_load.i32x4 h1, v1, 0
    v4 = heap_addr h1, v2, 32      ; Shared range check for two stores.
    store v3, v4, 0
    store v3, v4, 16
    return
}

The final expansion of the heap_addr range check and address conversion depends on the runtime environment.
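One possible expansion, shown here purely as an illustration, is a comparison against the heap size followed by an addition to the heap base. The heap struct, its fields, and the use of a NULL return to stand in for a trap are all assumptions of this sketch, not part of Cretonne.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runtime description of one heap. */
typedef struct {
    uint8_t *base;   /* absolute address of heap byte 0 */
    uint64_t size;   /* heap size in bytes */
} heap;

/* Check that the range p .. p + size - 1 lies inside the heap and
 * return the absolute address of p. NULL stands in for a trap. */
uint8_t *heap_addr(const heap *h, uint32_t p, uint32_t size)
{
    assert(h != NULL && h->base != NULL);
    if ((uint64_t)p + (uint64_t)size > h->size)
        return NULL;                 /* out of bounds: trap */
    return h->base + p;
}
```

Widening to 64 bits before the addition keeps the check from wrapping around for large 32-bit heap addresses. In the vdup example, one such check over 32 bytes covers both 16-byte stores at offsets 0 and 16.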

Operations

The remaining instruction set is mostly arithmetic.

A few instructions have variants that take immediate operands (e.g., band / band_imm), but in general an instruction is required to load a constant into an SSA value.

a = select c, x, y

Conditional select.

This instruction selects whole values. Use vselect for lane-wise selection.

Arguments:
  • c (Testable) – Controlling value to test
  • x (Any) – Value to use when c is true
  • y (Any) – Value to use when c is false
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
  • Testable – from input operand

Constant materialization

a = iconst N

Integer constant.

Create a scalar integer SSA value with an immediate constant value, or an integer vector where all the lanes have the same value.

Arguments:
  • N (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A constant integer scalar or vector value
Type Variables:
  • Int – explicitly provided
a = f32const N

Floating point constant.

Create a f32 SSA value with an immediate constant value, or a floating point vector where all the lanes have the same value.

Arguments:
  • N (ieee32) – A 32-bit immediate floating point number.
Results:
  • a (f32) – A constant floating point value
a = f64const N

Floating point constant.

Create a f64 SSA value with an immediate constant value, or a floating point vector where all the lanes have the same value.

Arguments:
  • N (ieee64) – A 64-bit immediate floating point number.
Results:
  • a (f64) – A constant floating point value

Live range splitting

Cretonne’s register allocator assigns each SSA value to a register or a spill slot on the stack for its entire live range. Since the live range of an SSA value can be quite large, it is sometimes beneficial to split the live range into smaller parts.

A live range is split by creating new SSA values that are copies of the original value or each other. The copies are created by inserting copy, spill, or fill instructions, depending on whether the values are assigned to registers or stack slots.

This approach permits SSA form to be preserved throughout the register allocation pass and beyond.

a = copy x

Register-register copy.

This instruction copies its input, preserving the value type.

A pure SSA-form program does not need to copy values, but this instruction is useful for representing intermediate stages during instruction transformations, and the register allocator needs a way of representing register copies.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = spill x

Spill a register value to a stack slot.

This instruction behaves exactly like copy, but the result value is assigned to a spill slot.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = fill x

Load a register value from a stack slot.

This instruction behaves exactly like copy, but creates a new SSA value for the spilled input value.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x

Vector operations

lo, hi = vsplit x

Split a vector into two halves.

Split the vector x into two separate values, each containing half of the lanes from x. The result may be two scalars if x only had two lanes.

Arguments:
  • x (TxN) – Vector to split
Results:
  • lo (half_vector(TxN)) – Low-numbered lanes of x
  • hi (half_vector(TxN)) – High-numbered lanes of x
Type Variables:
  • TxN – inferred from x
a = vconcat x, y

Vector concatenation.

Return a vector formed by concatenating x and y. The resulting vector type has twice as many lanes as each of the inputs. The lanes of x appear as the low-numbered lanes, and the lanes of y become the high-numbered lanes of a.

It is possible to form a vector by concatenating two scalars.

Arguments:
  • x (Any128) – Low-numbered lanes
  • y (Any128) – High-numbered lanes
Results:
  • a (double_vector(Any128)) – Concatenation of x and y
Type Variables:
  • Any128 – inferred from x
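
The lane ordering of vsplit and vconcat can be sketched in Python. This is only an illustrative model, not Cretonne code: a vector is assumed to be a Python list with lane 0 first, a one-lane half is modeled as a plain scalar, and the helper names simply mirror the instruction names.

```python
def vsplit(x):
    """Return (lo, hi), each holding half of the lanes of x."""
    assert len(x) >= 2 and len(x) % 2 == 0
    half = len(x) // 2
    lo, hi = x[:half], x[half:]
    if half == 1:
        # A two-lane vector splits into two scalars.
        return lo[0], hi[0]
    return lo, hi


def vconcat(x, y):
    """Lanes of x become the low lanes, lanes of y the high lanes."""
    xs = x if isinstance(x, list) else [x]  # scalars act as one-lane vectors
    ys = y if isinstance(y, list) else [y]
    assert len(xs) == len(ys)
    return xs + ys
```

In this model, vconcat applied to the two results of vsplit reproduces the original vector.
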
a = vselect c, x, y

Vector lane select.

Select lanes from x or y controlled by the lanes of the boolean vector c.

Arguments:
  • c (as_bool(TxN)) – Controlling vector
  • x (TxN) – Value to use where c is true
  • y (TxN) – Value to use where c is false
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
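
As a rough Python model of the lane-wise selection (vectors are assumed to be lists, and the boolean vector a list of Python bools; this is a sketch, not the code generator's implementation):

```python
def vselect(c, x, y):
    """Pick x[i] where c[i] is true, otherwise y[i]."""
    assert len(c) == len(x) == len(y)
    return [xi if ci else yi for ci, xi, yi in zip(c, x, y)]
```
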
a = splat x

Vector splat.

Return a vector whose lanes are all x.

Arguments:
  • x (lane_of(TxN)) – Scalar value to replicate into every lane
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – explicitly provided
a = insertlane x, Idx, y

Insert y as lane Idx in x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – SIMD vector to modify
  • Idx (uimm8) – Lane index
  • y (lane_of(TxN)) – New lane value
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
a = extractlane x, Idx

Extract lane Idx from x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – A SIMD vector type
  • Idx (uimm8) – Lane index
Results:
  • a (lane_of(TxN)) – Value of lane Idx of x
Type Variables:
  • TxN – inferred from x
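
The three lane instructions splat, insertlane, and extractlane can be modeled in a few lines of Python. This is a hedged sketch: vectors are assumed to be lists, the lanes parameter of splat stands in for the explicitly provided vector type, and an invalid lane index is reported with a ValueError purely for illustration.

```python
def splat(x, lanes):
    """Vector of the given lane count whose lanes are all x."""
    return [x] * lanes


def insertlane(x, idx, y):
    """Copy of x with lane idx replaced by y; x itself is unchanged (SSA)."""
    if not 0 <= idx < len(x):
        raise ValueError("lane index out of range for this vector type")
    a = list(x)
    a[idx] = y
    return a


def extractlane(x, idx):
    """Value of lane idx of x."""
    if not 0 <= idx < len(x):
        raise ValueError("lane index out of range for this vector type")
    return x[idx]
```
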

Integer operations

a = icmp Cond, x, y

Integer comparison.

The condition code determines if the operands are interpreted as signed or unsigned integers.

Signed  Unsigned  Condition
eq      eq        Equal
ne      ne        Not equal
slt     ult       Less than
sge     uge       Greater than or equal
sgt     ugt       Greater than
sle     ule       Less than or equal

When this instruction compares integer vectors, it returns a boolean vector of lane-wise comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (as_bool(Int)) – Boolean scalar or vector holding the comparison result
Type Variables:
  • Int – inferred from x
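
The effect of the condition code on the interpretation of the operands can be illustrated with a small Python model. It assumes scalar operands given as raw bit patterns of a hypothetical width bits (default 32); the helper names are not part of Cretonne.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def icmp(cond, x, y, bits=32):
    """Compare two bit patterns under an intcc condition code."""
    mask = (1 << bits) - 1
    x &= mask
    y &= mask
    if cond in ("eq", "ne"):
        return (x == y) == (cond == "eq")
    if cond[0] == "s":
        # Signed codes reinterpret both operands; unsigned codes do not.
        x, y = to_signed(x, bits), to_signed(y, bits)
    return {"lt": x < y, "ge": x >= y, "gt": x > y, "le": x <= y}[cond[1:]]
```

The same pair of bit patterns can therefore compare differently: 1 is below 0xFFFFFFFF under ult, but above it under sgt, where 0xFFFFFFFF means -1.
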
a = icmp_imm Cond, x, Y

Compare scalar integer to a constant.

This is the same as the icmp instruction, except one operand is an immediate constant.

This instruction can only compare scalars. Use icmp for lane-wise vector comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (b1) – A boolean value that is either true or false.
Type Variables:
  • iB – inferred from x
a = iadd x, y

Wrapping integer addition: \(a := x + y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = iadd_imm x, Y

Add immediate integer.

Same as iadd, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = iadd_cin x, y, c_in

Add integers with carry in.

Same as iadd with an additional carry input. Computes:

\[a = x + y + c_{in} \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, c_out = iadd_cout x, y

Add integers with carry out.

Same as iadd with an additional carry output.

\[\begin{split}a &= x + y \pmod{2^B} \\ c_{out} &= x + y \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from x
a, c_out = iadd_carry x, y, c_in

Add integers with carry in and out.

Same as iadd with an additional carry input and output.

\[\begin{split}a &= x + y + c_{in} \pmod{2^B} \\ c_{out} &= x + y + c_{in} \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from y
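
One typical use of the carry instructions is chaining narrow additions into a wider one. The following Python sketch models iadd_cout and iadd_carry for an assumed width of B = 32 and uses them to add two 64-bit numbers held as (low, high) halves; add64 is a hypothetical helper, not an instruction.

```python
B = 32
MASK = (1 << B) - 1


def iadd_cout(x, y):
    """Wrapping sum and carry out, following the formulas above."""
    total = x + y
    return total & MASK, total > MASK


def iadd_carry(x, y, c_in):
    """Wrapping sum with carry in and carry out."""
    total = x + y + int(c_in)
    return total & MASK, total > MASK


def add64(x_lo, x_hi, y_lo, y_hi):
    """Add two 64-bit values split into 32-bit halves."""
    lo, carry = iadd_cout(x_lo, y_lo)
    hi, _ = iadd_carry(x_hi, y_hi, carry)  # carry out of the top half is dropped
    return lo, hi
```
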
a = isub x, y

Wrapping integer subtraction: \(a := x - y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = irsub_imm x, Y

Immediate reverse wrapping subtraction: \(a := Y - x \pmod{2^B}\).

Also works as integer negation when \(Y = 0\). To compute \(x - Y\) instead, use iadd_imm with the negated immediate \(-Y\).

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = isub_bin x, y, b_in

Subtract integers with borrow in.

Same as isub with an additional borrow flag input. Computes:

\[a = x - (y + b_{in}) \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, b_out = isub_bout x, y

Subtract integers with borrow out.

Same as isub with an additional borrow flag output.

\[\begin{split}a &= x - y \pmod{2^B} \\ b_{out} &= x < y\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from x
a, b_out = isub_borrow x, y, b_in

Subtract integers with borrow in and out.

Same as isub with an additional borrow flag input and output.

\[\begin{split}a &= x - (y + b_{in}) \pmod{2^B} \\ b_{out} &= x < y + b_{in}\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from y
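
The borrow formulas can be checked with a short Python model. It assumes unsigned operands in the range of a hypothetical width bits (default 32) and evaluates the comparison \(x < y + b_{in}\) without wrapping, as the formula is written.

```python
def isub_borrow(x, y, b_in, bits=32):
    """Wrapping difference and borrow out for unsigned bit patterns."""
    mask = (1 << bits) - 1
    rhs = y + int(b_in)  # not wrapped: may equal 2**bits
    return (x - rhs) & mask, x < rhs
```
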
a = imul x, y

Wrapping integer multiplication: \(a := x y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Polymorphic over all integer types (vector and scalar).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = imul_imm x, Y

Integer multiplication by immediate constant.

Polymorphic over all scalar integer types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Larger multiplication results.

For example, smulx which multiplies i32 operands to produce an i64 result. Alternatively, smulhi and smullo pairs.

a = udiv x, y

Unsigned integer division: \(a := \lfloor {x \over y} \rfloor\).

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = udiv_imm x, Y

Unsigned integer division by an immediate constant.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = sdiv x, y

Signed integer division rounded toward zero: \(a := sign(xy) \lfloor {|x| \over |y|}\rfloor\).

This operation traps if the divisor is zero, or if the result is not representable in \(B\) bits two’s complement. This only happens when \(x = -2^{B-1}, y = -1\).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
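
The rounding direction and the two trapping cases of sdiv can be illustrated in Python. This is a sketch under assumptions: operands are Python integers already in the signed range of a hypothetical width bits, and a trap is modeled as a Trap exception defined here for the example.

```python
class Trap(Exception):
    """Stands in for a Cretonne trap in this model."""


def sdiv(x, y, bits=32):
    """Signed division rounded toward zero, trapping as described above."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    assert lo <= x <= hi and lo <= y <= hi
    if y == 0:
        raise Trap("division by zero")
    q = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        q = -q
    if q > hi:
        # Only reachable for x = -2**(bits-1), y = -1.
        raise Trap("result not representable")
    return q
```

Note that Python's own // operator rounds toward negative infinity, so -7 // 2 is -4 while this model returns -3.
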
a = sdiv_imm x, Y

Signed integer division by an immediate constant.

This instruction never traps because a divisor of -1 or 0 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = urem x, y

Unsigned integer remainder.

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = urem_imm x, Y

Unsigned integer remainder with immediate divisor.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = srem x, y

Signed integer remainder.

This operation traps if the divisor is zero.

Todo

Integer remainder vs modulus.

Clarify whether the result has the sign of the divisor or the dividend. Should we add a smod instruction for the case where the result has the same sign as the divisor?

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = srem_imm x, Y

Signed integer remainder with immediate divisor.

This instruction never traps because a divisor of 0 or -1 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Minimum / maximum.

NEON has smin, smax, umin, and umax instructions. We should replicate those for both scalar and vector integer types. Even if the target ISA doesn’t have scalar operations, these are good pattern matching targets.

Todo

Saturating arithmetic.

Mostly for SIMD use, but again these are good patterns for contraction. Something like usatadd, usatsub, ssatadd, and ssatsub is a good start.

Bitwise operations

The bitwise operations operate on any value type: integers, floating point numbers, and booleans. When operating on integer or floating point types, the bitwise operations work on the binary representation of the values. When operating on boolean values, the bitwise operations work as logical operators.

a = band x, y

Bitwise and.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = band_imm x, Y

Bitwise and with immediate.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bor x, y

Bitwise or.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bor_imm x, Y

Bitwise or with immediate.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bxor x, y

Bitwise xor.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bxor_imm x, Y

Bitwise xor with immediate.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bnot x

Bitwise not.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x

Todo

Redundant bitwise operators.

ARM has instructions like bic(x,y) = x & ~y, orn(x,y) = x | ~y, and eon(x,y) = x ^ ~y.

The shift and rotate operations only work on integer types (scalar and vector). The shift amount does not have to be the same type as the value being shifted. When shifting a B-bit value, the shift amount is reduced modulo B, so only its low \(\log_2 B\) bits are significant.

When operating on an integer vector type, the shift amount is still a scalar type, and all the lanes are shifted the same amount. The shift amount is masked to the number of bits in a lane, not the full size of the vector type.

a = rotl x, y

Rotate left.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotl_imm x, Y

Rotate left by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = rotr x, y

Rotate right.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotr_imm x, Y

Rotate right by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
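
A minimal Python model of the rotate instructions, assuming a scalar bit pattern of a hypothetical width bits and a rotate amount that is reduced modulo that width like the shift amounts described above:

```python
def rotl(x, y, bits=32):
    """Rotate the low `bits` bits of x left by y places."""
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    return ((x << s) | (x >> (bits - s))) & mask


def rotr(x, y, bits=32):
    """Rotate right, expressed as a complementary left rotation."""
    return rotl(x, bits - (y % bits), bits)
```
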
a = ishl x, y

Integer shift left. Shift the bits in x towards the MSB by y places. Shift in zero bits to the LSB.

The shift amount is masked to the size of x.

When shifting a B-bit integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= x \cdot 2^s \pmod{2^B}.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ishl_imm x, Y

Integer shift left by immediate.

The shift amount is masked to the size of x.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = ushr x, y

Unsigned shift right. Shift bits in x towards the LSB by y places, shifting in zero bits to the MSB. Also called a logical shift.

The shift amount is masked to the size of the register.

When shifting a B-bit integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= \lfloor x \cdot 2^{-s} \rfloor.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ushr_imm x, Y

Unsigned shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = sshr x, y

Signed shift right. Shift bits in x towards the LSB by y places, shifting in sign bits to the MSB. Also called an arithmetic shift.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = sshr_imm x, Y

Signed shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
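
The three shifts differ only in which bits are shifted in. The Python sketch below follows the formulas given for ishl and ushr and models sshr by reinterpreting the operand as a signed number first; it assumes scalar bit patterns of a hypothetical width bits (default 32).

```python
def ishl(x, y, bits=32):
    """Shift left, filling with zeros; the amount is taken modulo bits."""
    mask = (1 << bits) - 1
    return (x << (y % bits)) & mask


def ushr(x, y, bits=32):
    """Logical shift right: zeros enter at the MSB."""
    mask = (1 << bits) - 1
    return (x & mask) >> (y % bits)


def sshr(x, y, bits=32):
    """Arithmetic shift right: copies of the sign bit enter at the MSB."""
    mask = (1 << bits) - 1
    x &= mask
    signed = x - (1 << bits) if x >> (bits - 1) else x
    return (signed >> (y % bits)) & mask
```
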

The bit-counting instructions below are scalar only.

a = clz x

Count leading zero bits.

Starting from the MSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (i8) – An integer type with 8 bits.
Type Variables:
  • iB – inferred from x
a = cls x

Count leading sign bits.

Starting from the MSB after the sign bit in x, count the number of consecutive bits identical to the sign bit. When x is 0 or -1, returns one less than the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (i8) – An integer type with 8 bits.
Type Variables:
  • iB – inferred from x
a = ctz x

Count trailing zeros.

Starting from the LSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (i8) – An integer type with 8 bits.
Type Variables:
  • iB – inferred from x
a = popcnt x

Population count.

Count the number of one bits in x.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (i8) – An integer type with 8 bits.
Type Variables:
  • iB – inferred from x
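
The edge cases of the bit-counting instructions (zero inputs, and 0 or -1 for cls) can be checked against a small Python model. It assumes scalar bit patterns of a hypothetical width bits; the helpers mirror the instruction names but are not Cretonne APIs.

```python
def clz(x, bits=32):
    """Leading zero bits; returns bits when x is zero."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def ctz(x, bits=32):
    """Trailing zero bits; returns bits when x is zero."""
    x &= (1 << bits) - 1
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1  # isolate the lowest set bit


def cls(x, bits=32):
    """Bits after the sign bit that are identical to the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):
        x = ~x & mask  # make the sign bit and its copies zero
    return clz(x, bits) - 1  # exclude the sign bit itself


def popcnt(x, bits=32):
    """Number of one bits."""
    return bin(x & ((1 << bits) - 1)).count("1")
```
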

Floating point operations

These operations generally follow IEEE 754-2008 semantics.

a = fcmp Cond, x, y

Floating point comparison.

Two IEEE 754-2008 floating point numbers, x and y, relate to each other in exactly one of four ways:

UN  Unordered, when one or both numbers are NaN.
EQ  When \(x = y\). (And \(0.0 = -0.0\)).
LT  When \(x < y\).
GT  When \(x > y\).

The 14 floatcc condition codes each correspond to a subset of the four relations, except for the empty set which would always be false, and the full set which would always be true.

The condition codes are divided into 7 ‘ordered’ conditions which don’t include UN, and 7 unordered conditions which all include UN.

Ordered              Unordered              Condition
ord   EQ | LT | GT   uno   UN               NaNs absent / present.
eq    EQ             ueq   UN | EQ          Equal
one   LT | GT        ne    UN | LT | GT     Not equal
lt    LT             ult   UN | LT          Less than
le    LT | EQ        ule   UN | LT | EQ     Less than or equal
gt    GT             ugt   UN | GT          Greater than
ge    GT | EQ        uge   UN | GT | EQ     Greater than or equal

The standard C comparison operators, <, <=, >, >=, are all ordered, so they are false if either operand is NaN. The C equality operator, ==, is also ordered; because != is defined as its logical inverse, != is unordered and is true if either operand is NaN. They map to the floatcc condition codes as follows:

C    Cond  Subset
==   eq    EQ
!=   ne    UN | LT | GT
<    lt    LT
<=   le    LT | EQ
>    gt    GT
>=   ge    GT | EQ

This subset of condition codes also corresponds to the WebAssembly floating point comparisons of the same name.

When this instruction compares floating point vectors, it returns a boolean vector with the results of lane-wise comparisons.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (as_bool(Float)) – Boolean scalar or vector holding the comparison result
Type Variables:
  • Float – inferred from x
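
The subset view of the condition codes translates directly into a lookup table. The Python sketch below is an illustrative model for scalar operands only: relation classifies a pair of Python floats into one of the four relations, and fcmp tests membership in the subset listed for each code.

```python
import math

# Each floatcc code is the set of relations for which it is true.
SUBSETS = {
    "ord": {"EQ", "LT", "GT"}, "uno": {"UN"},
    "eq": {"EQ"},              "ueq": {"UN", "EQ"},
    "one": {"LT", "GT"},       "ne": {"UN", "LT", "GT"},
    "lt": {"LT"},              "ult": {"UN", "LT"},
    "le": {"LT", "EQ"},        "ule": {"UN", "LT", "EQ"},
    "gt": {"GT"},              "ugt": {"UN", "GT"},
    "ge": {"GT", "EQ"},        "uge": {"UN", "GT", "EQ"},
}


def relation(x, y):
    """Exactly one of UN, EQ, LT, GT holds for any pair of floats."""
    if math.isnan(x) or math.isnan(y):
        return "UN"
    if x == y:
        return "EQ"
    return "LT" if x < y else "GT"


def fcmp(cond, x, y):
    return relation(x, y) in SUBSETS[cond]
```
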
a = fadd x, y

Floating point addition.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fsub x, y

Floating point subtraction.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fmul x, y

Floating point multiplication.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fdiv x, y

Floating point division.

Unlike the integer division instructions sdiv and udiv, this can’t trap. Division by zero is infinity or NaN, depending on the dividend.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = sqrt x

Floating point square root.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fma x, y, z

Floating point fused multiply-and-add.

Computes \(a := xy+z\) without any intermediate rounding of the product.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
  • z (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from y
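
The difference between a fused and an unfused multiply-add shows up when the product needs more precision than the format offers. The Python sketch below emulates a single final rounding with exact rational arithmetic. It is only a model for finite f64 inputs: it does not handle NaN, infinities, or the sign of a zero result, and this fma helper is hypothetical.

```python
from fractions import Fraction


def fma(x, y, z):
    """x*y + z computed exactly, then rounded once to a double."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))


x = 1.0 + 2.0**-30
y = 1.0 - 2.0**-30
z = -1.0

unfused = x * y + z   # the product 1 - 2**-60 rounds to 1.0 first
fused = fma(x, y, z)  # the exact product survives until the final rounding
```

Here the unfused expression yields 0.0, while the fused result is \(-2^{-60}\).
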

Sign bit manipulations

The sign manipulating instructions work as bitwise operations, so they don’t have special behavior for signaling NaN operands. The exponent and trailing significand bits are always preserved.

a = fneg x

Floating point negation.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit inverted
Type Variables:
  • Float – inferred from x
a = fabs x

Floating point absolute value.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit cleared
Type Variables:
  • Float – inferred from x
a = fcopysign x, y

Floating point copy sign.

Note that this is a pure bitwise operation. The sign bit from y is copied to the sign bit of x.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit changed to that of y
Type Variables:
  • Float – inferred from x
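
Because these instructions only touch the sign bit, they can be modeled as integer operations on the encoding. The Python sketch below does this for f64 values using the struct module; the helper names bits and from_bits are assumptions made for the example.

```python
import struct

SIGN = 1 << 63  # sign bit of an f64 encoding


def bits(x):
    """Raw 64-bit encoding of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b):
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def fneg(x):
    return from_bits(bits(x) ^ SIGN)


def fabs(x):
    return from_bits(bits(x) & ~SIGN)


def fcopysign(x, y):
    return from_bits((bits(x) & ~SIGN) | (bits(y) & SIGN))
```
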

Minimum and maximum

These instructions return the larger or smaller of their operands. They differ in their handling of quiet NaN inputs. Note that signaling NaN operands always cause a NaN result.

When comparing zeroes, these instructions behave as if \(-0.0 < 0.0\).

a = fmin x, y

Floating point minimum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The smaller of x and y
Type Variables:
  • Float – inferred from x
a = fminnum x, y

Floating point minimum, suppressing quiet NaNs.

If either operand is a quiet NaN, the other operand is returned. If either operand is a signaling NaN, NaN is returned.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The smaller of x and y
Type Variables:
  • Float – inferred from x
a = fmax x, y

Floating point maximum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The larger of x and y
Type Variables:
  • Float – inferred from x
a = fmaxnum x, y

Floating point maximum, suppressing quiet NaNs.

If either operand is a quiet NaN, the other operand is returned. If either operand is a signaling NaN, NaN is returned.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The larger of x and y
Type Variables:
  • Float – inferred from x
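
The two NaN policies can be compared in a short Python model of the minimum instructions (the maximum instructions are symmetric). This sketch covers quiet NaNs only, since Python floats do not expose signaling NaNs directly, and it treats -0.0 as smaller than 0.0 as stated above.

```python
import math


def _lt(x, y):
    """x < y, with -0.0 ordered below 0.0."""
    if x == y == 0.0:
        return math.copysign(1.0, x) < math.copysign(1.0, y)
    return x < y


def fmin(x, y):
    """Minimum that propagates NaNs."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    return x if _lt(x, y) else y


def fminnum(x, y):
    """Minimum that returns the other operand for a quiet NaN."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return x if _lt(x, y) else y
```
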

Rounding

These instructions round their argument to a nearby integral value, still represented as a floating point number.

a = ceil x

Round floating point to integral, towards positive infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = floor x

Round floating point to integral, towards negative infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = trunc x

Round floating point to integral, towards zero.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = nearest x

Round floating point to integral, towards nearest with ties to even.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
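
The four rounding directions can be compared side by side in Python. This is a rough sketch for finite inputs only: it goes through Python integers, so it does not model NaN or infinite inputs and loses the sign of a zero result.

```python
import math


def ceil(x):
    return float(math.ceil(x))   # towards positive infinity


def floor(x):
    return float(math.floor(x))  # towards negative infinity


def trunc(x):
    return float(math.trunc(x))  # towards zero


def nearest(x):
    return float(round(x))       # Python's round() breaks ties to even
```
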

Conversion operations

a = bitcast x

Reinterpret the bits in x as a different type.

The input and output types must be storable to memory and of the same size. A bitcast is equivalent to storing one type and loading the other type from the same address.

Arguments:
  • x (Mem) – Any type that can be stored in memory
Results:
  • a (MemTo) – Bits of x reinterpreted
Type Variables:
  • MemTo – explicitly provided
  • Mem – from input operand
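
The store-then-load description maps directly onto Python's struct module, which packs a value into bytes and unpacks the same bytes as another type. The sketch below shows this for the f32 and i32 pair; the function names are hypothetical, and little-endian byte order is an arbitrary choice that cancels out.

```python
import struct


def bitcast_f32_to_i32(x):
    """Store x as an f32, then load the same 4 bytes as an unsigned i32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bitcast_i32_to_f32(b):
    """Store b as an i32, then load the same 4 bytes as an f32."""
    return struct.unpack("<f", struct.pack("<I", b))[0]
```
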
a = breduce x

Convert x to a smaller boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A smaller boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bextend x

Convert x to a larger boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A larger boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bint x

Convert x to an integer.

True maps to 1 and false maps to 0. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
a = bmask x

Convert x to an integer mask.

True maps to all 1s and false maps to all 0s. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
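
The difference between bint and bmask matters when the result feeds bitwise operations. In the Python sketch below, scalars only and with an assumed result width bits, the mask form is used for a branch-free select; select is a hypothetical helper shown for illustration.

```python
def bint(x):
    """True maps to 1, false maps to 0."""
    return 1 if x else 0


def bmask(x, bits=32):
    """True maps to all 1s, false maps to all 0s."""
    return (1 << bits) - 1 if x else 0


def select(c, a, b, bits=32):
    """Pick a when c is true, else b, using only bitwise operations."""
    m = bmask(c, bits)
    full = (1 << bits) - 1
    return (m & a) | ((m ^ full) & b)
```
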
a = ireduce x

Convert x to a smaller integer type by dropping high bits.

Each lane in x is converted to a smaller integer type by discarding the most significant bits. This is the same as reducing modulo \(2^n\).

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A smaller integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = uextend x

Convert x to a larger integer type by zero-extending.

Each lane in x is converted to a larger integer type by adding zeroes. The result has the same numerical value as x when both are interpreted as unsigned integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = sextend x

Convert x to a larger integer type by sign-extending.

Each lane in x is converted to a larger integer type by replicating the sign bit. The result has the same numerical value as x when both are interpreted as signed integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
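
The three integer width conversions can be modeled on raw bit patterns in Python. This sketch handles scalars only; the from_bits and to_bits parameters are assumptions standing in for the input and result types.

```python
def ireduce(x, to_bits):
    """Keep the low to_bits bits, i.e. reduce modulo 2**to_bits."""
    return x & ((1 << to_bits) - 1)


def uextend(x, from_bits):
    """Zero-extend: the new high bits are zero, so the pattern is unchanged."""
    return x & ((1 << from_bits) - 1)


def sextend(x, from_bits, to_bits):
    """Sign-extend: replicate bit from_bits - 1 into the new high bits."""
    x &= (1 << from_bits) - 1
    if x >> (from_bits - 1):
        x -= 1 << from_bits  # reinterpret as a negative number
    return x & ((1 << to_bits) - 1)
```
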
a = fpromote x

Convert x to a larger floating point format.

Each lane in x is converted to the destination floating point format. This is an exact operation.

Since Cretonne currently only supports two floating point formats, this instruction always converts f32 to f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must be larger than the input lanes.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
a = fdemote x

Convert x to a smaller floating point format.

Each lane in x is converted to the destination floating point format by rounding to nearest, ties to even.

Since Cretonne currently only supports two floating point formats, this instruction always converts f64 to f32. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must be smaller than the input lanes.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
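The difference between the exact fpromote and the rounding fdemote can be sketched with C's float and double. The sketch assumes a C implementation that uses IEEE 754 binary32 and binary64 with the default round-to-nearest, ties-to-even mode; the function names are hypothetical.

```c
#include <stdbool.h>

/* Model of fpromote.f64: every f32 value is exactly representable as f64. */
double fpromote_f64(float x)
{
    return (double)x;
}

/* Model of fdemote.f32: the f64 value is rounded to the nearest f32.
 * Assumes the default rounding mode (to nearest, ties to even). */
float fdemote_f32(double x)
{
    return (float)x;
}

/* Promoting and then demoting returns the original value for any
 * non-NaN input, because the promotion step loses no information. */
bool roundtrip_is_exact(float x)
{
    return fdemote_f32(fpromote_f64(x)) == x;
}
```

For example, the f64 value 1 + 2^-24 lies exactly halfway between the adjacent f32 values 1 and 1 + 2^-23, so under these assumptions fdemote_f32 returns 1.0f, the neighbour with an even significand.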
a = fcvt_to_uint x

Convert floating point to unsigned integer.

Each lane in x is converted to an unsigned integer by rounding towards zero. If x is NaN or if the unsigned integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – A scalar or vector integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
a = fcvt_to_sint x

Convert floating point to signed integer.

Each lane in x is converted to a signed integer by rounding towards zero. If x is NaN or if the signed integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – A scalar or vector integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
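The trapping conditions of fcvt_to_uint and fcvt_to_sint can be sketched as range checks in C. This is a model under stated assumptions, not Cretonne's implementation: it handles a single f64 lane converted to a 32-bit integer, and it reports a trap by returning false. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of fcvt_to_uint.i32 on a single f64 lane.
 * Returns false where the instruction would trap; otherwise stores
 * the converted value in *out and returns true. */
bool fcvt_to_uint_i32(double x, uint32_t *out)
{
    /* After rounding towards zero the value must lie in [0, 2^32 - 1],
     * so x itself must lie strictly between -1 and 2^32. NaN fails
     * both comparisons and is therefore rejected here as well. */
    if (!(x > -1.0 && x < 4294967296.0))
        return false;
    *out = (uint32_t)x;             /* The C cast rounds towards zero. */
    return true;
}

/* Hypothetical model of fcvt_to_sint.i32 on a single f64 lane. */
bool fcvt_to_sint_i32(double x, int32_t *out)
{
    /* The truncated value must lie in [-2^31, 2^31 - 1]. */
    if (!(x > -2147483649.0 && x < 2147483648.0))
        return false;
    *out = (int32_t)x;
    return true;
}
```

In this model an input such as -0.5 does not trap in the unsigned conversion, because rounding towards zero yields 0, which is representable. An input of -1.0 does trap.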
a = fcvt_from_uint x

Convert unsigned integer to floating point.

Each lane in x is interpreted as an unsigned integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand
a = fcvt_from_sint x

Convert signed integer to floating point.

Each lane in x is interpreted as a signed integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand
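Integer to floating point conversion is inexact when the integer needs more significant bits than the destination format provides. The sketch below models the two instructions for a 32-bit lane converted to f32. It assumes a C implementation that follows IEEE 754 with the default round-to-nearest, ties-to-even mode; the function names are hypothetical.

```c
#include <stdint.h>

/* Model of fcvt_from_uint.f32: the lane is read as an unsigned integer. */
float fcvt_from_uint_f32(uint32_t x)
{
    return (float)x;
}

/* Model of fcvt_from_sint.f32: the lane is read as a signed integer. */
float fcvt_from_sint_f32(int32_t x)
{
    return (float)x;
}
```

An f32 has a 24-bit significand, so 2^24 + 1 = 16777217 is not representable. It lies halfway between 16777216 and 16777218, and under these assumptions the conversion yields 16777216, the neighbour with an even significand. The all-ones 32-bit pattern converts to 2^32 when read as unsigned and to -1 when read as signed.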

Legalization operations

These instructions are used as helpers when legalizing types and operations for the target ISA.

lo, hi = isplit x

Split an integer into low and high parts.

Vectors of integers are split lane-wise, so the results have the same number of lanes as the input, but the lanes are half the size.

Returns the low half of x and the high half of x as two independent values.

Arguments:
  • x (WideInt) – An integer type with lanes from i16 upwards
Results:
  • lo (half_width(WideInt)) – The low bits of x
  • hi (half_width(WideInt)) – The high bits of x
Type Variables:
  • WideInt – inferred from x
a = iconcat lo, hi

Concatenate low and high bits to form a larger integer type.

Vectors of integers are concatenated lane-wise such that the result has the same number of lanes as the inputs, but the lanes are twice the size.

Arguments:
  • lo (NarrowInt) – An integer type with lanes up to i32
  • hi (NarrowInt) – An integer type with lanes up to i32
Results:
  • a (double_width(NarrowInt)) – The concatenation of lo and hi
Type Variables:
  • NarrowInt – inferred from lo
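For a single scalar lane, isplit and iconcat behave like the shift-and-mask idiom below. The sketch assumes an i64 value split into two i32 halves; the function names are hypothetical, and the C out-parameters stand in for the instruction's two results.

```c
#include <stdint.h>

/* Model of `lo, hi = isplit x` for an i64 value. */
void isplit_i64(uint64_t x, uint32_t *lo, uint32_t *hi)
{
    *lo = (uint32_t)x;              /* The low 32 bits of x. */
    *hi = (uint32_t)(x >> 32);      /* The high 32 bits of x. */
}

/* Model of `a = iconcat lo, hi` producing an i64 value. */
uint64_t iconcat_i32(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

The two functions are inverses: splitting 0x0123456789ABCDEF gives lo = 0x89ABCDEF and hi = 0x01234567, and concatenating those halves restores the original value.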

Implementation limits

Cretonne’s intermediate representation imposes some limits on the size of functions and the number of entities allowed. If these limits are exceeded, the implementation will panic.

Number of instructions in a function
At most \(2^{31} - 1\).
Number of EBBs in a function

At most \(2^{31} - 1\).

This limit follows from the limit on the number of instructions, since every EBB contains at least a terminator instruction.

Number of secondary values in a function

At most \(2^{31} - 1\).

Secondary values are any SSA values that are not the first result of an instruction.

Other entities declared in the preamble

At most \(2^{32} - 1\).

This covers entities such as stack slots, jump tables, external functions, and function signatures.

Number of arguments to an EBB
At most \(2^{16}\).
Number of arguments to a function

At most \(2^{16}\).

This follows from the limit on arguments to the entry EBB. Note that Cretonne may add a handful of ABI register arguments as function signatures are lowered. This is for representing things like the link register, the incoming frame pointer, and callee-saved registers that are saved in the prologue.

Size of function call arguments on the stack

At most \(2^{32} - 1\) bytes.

This is probably not possible to achieve given the limit on the number of arguments, except by requiring extremely large offsets for stack arguments.
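A frontend that wants to avoid the panic could check its own counts against these limits before building a function. The following is a minimal sketch of such a check; the struct, its field names, and the function within_limits are hypothetical and do not correspond to any Cretonne API. Only the numeric bounds are taken from the list above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-function counts gathered by a frontend. */
struct func_counts {
    uint64_t insts;             /* Number of instructions. */
    uint64_t ebbs;              /* Number of EBBs. */
    uint64_t secondary_values;  /* SSA values that are not a first result. */
    uint64_t max_ebb_args;      /* Largest argument count of any EBB. */
};

/* Returns true if the counts respect the limits listed above. */
bool within_limits(const struct func_counts *c)
{
    const uint64_t max_31 = (UINT64_C(1) << 31) - 1;   /* 2^31 - 1 */
    const uint64_t max_args = UINT64_C(1) << 16;       /* 2^16 */

    return c->insts <= max_31
        && c->ebbs <= max_31
        && c->secondary_values <= max_31
        && c->max_ebb_args <= max_args;
}
```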

Glossary

intermediate language
IL
The language used to describe functions to Cretonne. This reference describes the syntax and semantics of the Cretonne IL. The IL has two forms: a textual format and an in-memory intermediate representation (IR).
intermediate representation
IR
The in-memory representation of IL. The data structures Cretonne uses to represent a program internally are called the intermediate representation. Cretonne’s IR can be converted to text losslessly.
function signature

A function signature describes how to call a function. It consists of:

  • The calling convention.
  • The number of arguments and return values. (Functions can return multiple values.)
  • Type and flags of each argument.
  • Type and flags of each return value.

Not all function attributes are part of the signature. For example, a function that never returns could be marked as noreturn, but that is not necessary to know when calling it, so it is just an attribute, and not part of the signature.

function preamble

A list of declarations of entities that are used by the function body. Some of the entities that can be declared in the preamble are:

  • Local variables.
  • Functions that are called directly.
  • Function signatures for indirect function calls.
  • Function flags and attributes that are not part of the signature.
function body
The extended basic blocks which contain all the executable code in a function. The function body follows the function preamble.
basic block
A maximal sequence of instructions that can only be entered from the top, and that contains no branch or terminator instructions except for the last instruction.
extended basic block
EBB

A maximal sequence of instructions that can only be entered from the top, and that contains no terminator instructions except for the last one. An EBB can contain conditional branches that can fall through to the following instructions in the block, but only the first instruction in the EBB can be a branch target.

The last instruction in an EBB must be a terminator instruction, so execution cannot flow through to the next EBB in the function. (But there may be a branch to the next EBB.)

Note that some textbooks define an EBB as a maximal subtree in the control flow graph where only the root can be a join node. This definition is not equivalent to Cretonne EBBs.

terminator instruction

A control flow instruction that unconditionally directs the flow of execution somewhere else. Execution never continues at the instruction following a terminator instruction.

The basic terminator instructions are jump, return, and trap. Conditional branches and instructions that trap conditionally are not terminator instructions.

entry block
The EBB that is executed first in a function. Currently, a Cretonne function must have exactly one entry block which must be the first block in the function. The types of the entry block arguments must match the types of arguments in the function signature.
stack slot
A fixed size memory allocation in the current function’s activation frame. Also called a local variable.