Cretonne Language Reference

The Cretonne intermediate language (IL) has two equivalent representations: an in-memory data structure used by the code generator library, and a text format used for test cases and debug output. Files containing Cretonne textual IL have the .cton filename extension.

This reference uses the text format to describe IL semantics but glosses over the finer details of the lexical and syntactic structure of the format.

Overall structure

Cretonne compiles functions independently. A .cton IL file may contain multiple functions, and the programmatic API can create multiple function handles at the same time, but the functions don’t share any data or reference each other directly.

This is a simple C function that computes the average of an array of floats:

#include <stddef.h>

float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}

Here is the same function compiled into Cretonne IL:

function %average(i32, i32) -> f32 native {
    ss1 = local 8            ; Stack slot for ``sum``.

ebb1(v1: i32, v2: i32):
    v3 = f64const 0x0.0
    stack_store v3, ss1
    brz v2, ebb3                  ; Handle count == 0.
    v4 = iconst.i32 0
    jump ebb2(v4)

ebb2(v5: i32):
    v6 = imul_imm v5, 4
    v7 = iadd v1, v6
    v8 = load.f32 v7              ; array[i]
    v9 = fpromote.f64 v8
    v10 = stack_load.f64 ss1
    v11 = fadd v9, v10
    stack_store v11, ss1
    v12 = iadd_imm v5, 1
    v13 = icmp ult v12, v2
    brnz v13, ebb2(v12)           ; Loop backedge.
    v14 = stack_load.f64 ss1
    v15 = fcvt_from_uint.f64 v2
    v16 = fdiv v14, v15
    v17 = fdemote.f32 v16
    return v17

ebb3:
    v100 = f32const +NaN
    return v100
}

The first line of a function definition provides the function name and the function signature which declares the parameter and return types. Then follows the function preamble which declares a number of entities that can be referenced inside the function. In the example above, the preamble declares a single local variable, ss1.

After the preamble follows the function body which consists of extended basic blocks (EBBs), the first of which is the entry block. Every EBB ends with a terminator instruction, so execution can never fall through to the next EBB without an explicit branch.

A .cton file consists of a sequence of independent function definitions:

function_list ::=  { function }
function      ::=  function_spec "{" preamble function_body "}"
function_spec ::=  "function" function_name signature
preamble      ::=  { preamble_decl }
function_body ::=  { extended_basic_block }

Static single assignment form

The instructions in the function body use and produce values in SSA form. This means that every value is defined exactly once, and every use of a value must be dominated by the definition.

Cretonne does not have phi instructions but uses EBB parameters instead. An EBB can be defined with a list of typed parameters. Whenever control is transferred to the EBB, argument values for the parameters must be provided. When entering a function, the incoming function parameters are passed as arguments to the entry EBB’s parameters.

Instructions define zero, one, or more result values. All SSA values are either EBB parameters or instruction results.

In the example above, the loop induction variable i is represented as three SSA values: In the entry block, v4 is the initial value. In the loop block ebb2, the EBB parameter v5 represents the value of the induction variable during each iteration. Finally, v12 is computed as the induction variable value for the next iteration.
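As a rough illustration of how EBB parameters take the place of phi instructions, the following Python sketch mirrors the %average example block by block. It is a model written for this reference, not Cretonne output: the helper to_f32 and the dictionary standing in for the stack slot ss1 are assumptions made for the illustration.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def average(array, count):
    # ebb1(v1, v2): the entry EBB receives the function parameters.
    slots = {"ss1": 0.0}                      # v3 = f64const; stack_store v3, ss1
    if count == 0:                            # brz v2, ebb3
        return math.nan                       # ebb3: return NaN
    v5 = 0                                    # v4 = iconst.i32 0; jump ebb2(v4)
    while True:
        # ebb2(v5): v5 is the EBB parameter holding the induction variable.
        v9 = to_f32(array[v5])                # load.f32, then fpromote.f64
        slots["ss1"] = v9 + slots["ss1"]      # stack_load, fadd, stack_store
        v12 = v5 + 1                          # iadd_imm v5, 1
        if v12 < count:                       # icmp ult v12, v2
            v5 = v12                          # brnz v13, ebb2(v12): v12 is the argument
            continue
        break
    return to_f32(slots["ss1"] / count)       # fdiv, then fdemote.f32
```

Each trip around the loop rebinds v5 to the argument passed by the backedge, which is the job a phi instruction would do in other SSA representations.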

The cton_frontend crate contains utilities for translating from programs containing multiple assignments to the same variables into SSA form for Cretonne IL.

Such variables can also be presented to Cretonne as stack slots. Stack slots are accessed with the stack_store and stack_load instructions, and can have their address taken with stack_addr, which supports C-like programming languages where local variables can have their address taken.

Value types

All SSA values have a type which determines the size and shape (for SIMD vectors) of the value. Many instructions are polymorphic – they can operate on different types.

Boolean types

Boolean values are either true or false. While this only requires a single bit to represent, more bits are often used when holding a boolean value in a register or in memory. The b1 type represents an abstract boolean value. It can only exist as an SSA value; it can’t be stored in memory or converted to another type. The larger boolean types can be stored in memory. They are represented as either all zero bits or all one bits.

b1

A boolean type with 1 bit.

Bytes: Can’t be stored in memory

b8

A boolean type with 8 bits.

Bytes: 1

b16

A boolean type with 16 bits.

Bytes: 2

b32

A boolean type with 32 bits.

Bytes: 4

b64

A boolean type with 64 bits.

Bytes: 8

Integer types

Integer values have a fixed size and can be interpreted as either signed or unsigned. Some instructions will interpret an operand as a signed or unsigned number, others don’t care.

i8

An integer type with 8 bits.

Bytes: 1

i16

An integer type with 16 bits.

Bytes: 2

i32

An integer type with 32 bits.

Bytes: 4

i64

An integer type with 64 bits.

Bytes: 8

Floating point types

The floating point types have the IEEE 754 semantics that are supported by most hardware, except that non-default rounding modes, unmasked exceptions, and exception flags are not currently supported.

There is currently no support for higher-precision types like quad-precision, double-double, or extended-precision, nor for narrower-precision types like half-precision.

NaNs are encoded following the IEEE 754-2008 recommendation, with quiet NaN being encoded with the MSB of the trailing significand set to 1, and signaling NaNs being indicated by the MSB of the trailing significand set to 0.
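The quiet/signaling distinction can be sketched for binary32 bit patterns as follows. This is a minimal illustration under the encoding described above; the function name and the constants are hypothetical and not part of Cretonne.

```python
F32_EXP_MASK = 0x7F80_0000    # the 8 exponent bits
F32_FRAC_MASK = 0x007F_FFFF   # the 23 trailing significand bits
F32_QUIET_BIT = 1 << 22       # MSB of the trailing significand


def classify_f32(bits):
    """Return 'qnan', 'snan', or 'number' for a 32-bit pattern."""
    is_nan = (bits & F32_EXP_MASK) == F32_EXP_MASK and (bits & F32_FRAC_MASK) != 0
    if not is_nan:
        return "number"       # finite value or infinity
    return "qnan" if bits & F32_QUIET_BIT else "snan"
```

For binary64 the same test would apply with an 11-bit exponent field and bit 51 as the MSB of the trailing significand.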

Except for bitwise and memory instructions, NaNs returned from arithmetic instructions are encoded as follows:

  • If all NaN inputs to an instruction are quiet NaNs with all bits of the trailing significand other than the MSB set to 0, the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to 0.
  • Otherwise the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to nondeterministic values.

f32

A 32-bit floating point type represented in the IEEE 754-2008 binary32 interchange format. This corresponds to the float type in most C implementations.

Bytes: 4

f64

A 64-bit floating point type represented in the IEEE 754-2008 binary64 interchange format. This corresponds to the double type in most C implementations.

Bytes: 8

CPU flags types

Some target ISAs use CPU flags to represent the result of a comparison. These CPU flags are represented as two value types depending on the type of values compared.

Since some ISAs don’t have CPU flags, these value types should not be used until the legalization phase of compilation where the code is adapted to fit the target ISA. Use instructions like icmp instead.

The CPU flags types are also restricted such that two flags values cannot be live at the same time. After legalization, some instruction encodings will clobber the flags, and flags values are not allowed to be live across such instructions either. The verifier enforces these rules.

iflags

CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.

Bytes: Can’t be stored in memory

fflags

CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.

Bytes: Can’t be stored in memory

SIMD vector types

A SIMD vector type represents a vector of values from one of the scalar types (boolean, integer, and floating point). Each scalar value in a SIMD type is called a lane. The number of lanes must be a power of two in the range 2-256.

iBxN

A SIMD vector of integers. The lane type iB is one of the integer types i8 through i64.

Some concrete integer vector types are i32x4, i64x8, and i16x4.

The size of a SIMD integer vector in memory is \(N B\over 8\) bytes.

f32xN

A SIMD vector of single precision floating point numbers.

Some concrete f32 vector types are: f32x2, f32x4, and f32x8.

The size of a f32 vector in memory is \(4N\) bytes.

f64xN

A SIMD vector of double precision floating point numbers.

Some concrete f64 vector types are: f64x2, f64x4, and f64x8.

The size of a f64 vector in memory is \(8N\) bytes.

b1xN

A boolean SIMD vector.

Boolean vectors are used when comparing SIMD vectors. For example, comparing two i32x4 values would produce a b1x4 result.

Like the b1 type, a boolean vector cannot be stored in memory.
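The lane-count rule and the memory size formula for integer vectors can be checked with a short sketch. The function names are hypothetical and only restate the rules given in this section.

```python
def is_valid_lane_count(n):
    """Lane counts must be a power of two in the range 2-256."""
    return 2 <= n <= 256 and (n & (n - 1)) == 0


def int_vector_bytes(lane_bits, lanes):
    """Size in memory of an iBxN vector: N * B / 8 bytes."""
    if lane_bits not in (8, 16, 32, 64) or not is_valid_lane_count(lanes):
        raise ValueError("not a valid integer vector type")
    return lanes * lane_bits // 8
```

For example, i32x4 occupies 16 bytes and i16x4 occupies 8 bytes under this formula.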

Pseudo-types and type classes

These are not concrete types, but convenient names used to refer to real types in this reference.

iAddr

A pointer-sized integer representing an address.

This is either i32 or i64, depending on whether the target platform has 32-bit or 64-bit pointers.

iB

Any of the scalar integer types i8 through i64.

Int

Any scalar or vector integer type: iB or iBxN.

fB

Either of the floating point scalar types: f32 or f64.

Float

Any scalar or vector floating point type: fB or fBxN.

TxN

Any SIMD vector type.

Mem

Any type that can be stored in memory: Int or Float.

Testable

Either b1 or iB.

Immediate operand types

These types are not part of the normal SSA type system. They are used to indicate the different kinds of immediate operands on an instruction.

imm64

A 64-bit immediate integer. The value of this operand is interpreted as a signed two’s complement integer. Instruction encodings may limit the valid range.

In the textual format, imm64 immediates appear as decimal or hexadecimal literals using the same syntax as C.

offset32

A signed 32-bit immediate address offset.

In the textual format, offset32 immediates always have an explicit sign, and a 0 offset may be omitted.

ieee32

A 32-bit immediate floating point number in the IEEE 754-2008 binary32 interchange format. All bit patterns are allowed.

ieee64

A 64-bit immediate floating point number in the IEEE 754-2008 binary64 interchange format. All bit patterns are allowed.

bool

A boolean immediate value, either false or true.

In the textual format, bool immediates appear as ‘false’ and ‘true’.

intcc

An integer condition code. See the icmp instruction for details.

floatcc

A floating point condition code. See the fcmp instruction for details.

The two IEEE floating point immediate types ieee32 and ieee64 are displayed as hexadecimal floating point literals in the textual IL format. Decimal floating point literals are not allowed because some computer systems can round differently when converting to binary. The hexadecimal floating point format is mostly the same as the one used by C99, but extended to represent all NaN bit patterns:

Normal numbers
Compatible with C99: -0x1.Tpe where T are the trailing significand bits encoded as hexadecimal, and e is the unbiased exponent as a decimal number. ieee32 has 23 trailing significand bits. They are padded with an extra LSB to produce 6 hexadecimal digits. This is not necessary for ieee64 which has 52 trailing significand bits forming 13 hexadecimal digits with no padding.
Zeros
Positive and negative zero are displayed as 0.0 and -0.0 respectively.
Subnormal numbers
Compatible with C99: -0x0.Tpemin where T are the trailing significand bits encoded as hexadecimal, and emin is the minimum exponent as a decimal number.
Infinities
Either -Inf or Inf.
Quiet NaNs
Quiet NaNs have the MSB of the trailing significand set. If the remaining bits of the trailing significand are all zero, the value is displayed as -NaN or NaN. Otherwise, -NaN:0xT where T are the trailing significand bits encoded as hexadecimal.
Signaling NaNs
Displayed as -sNaN:0xT.
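The rules for ieee32 can be sketched as a small formatter. This is an illustration derived only from the description above, not the code Cretonne uses; the function name is hypothetical, and NaNs carrying a payload are deliberately left out because the exact digits printed for T are not spelled out here.

```python
def format_ieee32(bits):
    """Format a 32-bit pattern in the hexadecimal style described above."""
    sign = "-" if bits >> 31 else ""
    biased_exp = (bits >> 23) & 0xFF
    trailing = bits & 0x7F_FFFF           # 23 trailing significand bits
    digits = trailing << 1                # pad with one extra LSB: 24 bits, 6 hex digits
    if biased_exp == 0xFF:
        if trailing == 0:
            return sign + "Inf"
        if trailing == 0x40_0000:         # quiet bit set, remaining bits zero
            return sign + "NaN"
        raise ValueError("NaN payloads are outside the scope of this sketch")
    if biased_exp == 0:
        if trailing == 0:
            return sign + "0.0"
        return f"{sign}0x0.{digits:06x}p-126"    # subnormal: emin is -126 for binary32
    return f"{sign}0x1.{digits:06x}p{biased_exp - 127}"
```

The left shift by one bit is the padding step mentioned for ieee32: it aligns the 23 trailing significand bits to six whole hexadecimal digits.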

Control flow

Branches transfer control to a new EBB and provide values for the target EBB’s arguments, if it has any. Conditional branches only take the branch if their condition is satisfied, otherwise execution continues at the following instruction in the EBB.

jump EBB(args…)

Jump.

Unconditionally jump to an extended basic block, passing the specified EBB arguments. The number and types of arguments must match the destination EBB.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
fallthrough EBB(args…)

Fall through to the next EBB.

This is the same as jump, except the destination EBB must be the next one in the layout.

Jumps are turned into fall-through instructions by the branch relaxation pass. There is no reason to use this instruction outside that pass.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
brz c, EBB(args…)

Branch when zero.

If c is a b1 value, take the branch when c is false. If c is an integer value, take the branch when c = 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
brnz c, EBB(args…)

Branch when non-zero.

If c is a b1 value, take the branch when c is true. If c is an integer value, take the branch when c != 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
br_icmp Cond, x, y, EBB(args…)

Compare scalar integers and branch.

Compare x and y in the same way as the icmp instruction and take the branch if the condition is true:

br_icmp ugt v1, v2, ebb4(v5, v6)

is semantically equivalent to:

v10 = icmp ugt v1, v2
brnz v10, ebb4(v5, v6)

Some RISC architectures like MIPS and RISC-V provide instructions that implement all or some of the condition codes. The instruction can also be used to represent macro-op fusion on architectures like Intel’s.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • iB – inferred from x
brif Cond, f, EBB(args…)

Branch when condition is true in integer CPU flags.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
brff Cond, f, EBB(args…)

Branch when condition is true in floating point CPU flags.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
br_table x, JT

Indirect branch via jump table.

Use x as an unsigned index into the jump table JT. If a jump table entry is found, branch to the corresponding EBB. If no entry is found, fall through to the next instruction.

Note that this branch instruction can’t pass arguments to the targeted blocks. Split critical edges as needed to work around this.

Arguments:
  • x (iB) – index into jump table
  • JT (jump_table) – A jump table.
Type Variables:
  • iB – inferred from x
JT = jump_table EBB0, EBB1, …, EBBn

Declare a jump table in the function preamble.

This declares a jump table for use by the br_table indirect branch instruction. Entries in the table are either EBB names, or 0 which indicates an absent entry.

The EBBs listed must belong to the current function, and they can’t have any arguments.

Arguments:
  • EBB0 – Target EBB when x = 0.
  • EBB1 – Target EBB when x = 1.
  • EBBn – Target EBB when x = n.
Result:

A jump table identifier. (Not an SSA value).
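The lookup performed by br_table can be modeled in a few lines. This is a sketch under two assumptions that are not stated explicitly above: an index past the end of the table is treated like an absent entry, and None stands in for the 0 entry. The function name mirrors the instruction but is otherwise hypothetical.

```python
def br_table(x, jump_table):
    """Return the EBB to branch to, or None to fall through.

    `jump_table` is a list of EBB names; None models the `0` entry that
    marks an absent slot. `x` is treated as an unsigned index.
    """
    if x < 0:
        raise ValueError("x is an unsigned index")
    if x < len(jump_table) and jump_table[x] is not None:
        return jump_table[x]
    return None
```

A None result corresponds to execution continuing at the instruction after br_table.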

Traps stop the program because something went wrong. The exact behavior depends on the target instruction set architecture and operating system. There are explicit trap instructions defined below, but some instructions may also cause traps for certain input values. For example, udiv traps when the divisor is zero.

trap code

Terminate execution unconditionally.

Arguments:
  • code (trapcode) – A trap reason code.
trapz c, code

Trap when zero.

If c is non-zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
  • code (trapcode) – A trap reason code.
Type Variables:
  • Testable – inferred from c
trapnz c, code

Trap when non-zero.

If c is zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
  • code (trapcode) – A trap reason code.
Type Variables:
  • Testable – inferred from c

Function calls

A function call needs a target function and a function signature. The target function may be determined dynamically at runtime, but the signature must be known when the function call is compiled. The function signature describes how to call the function, including parameters, return values, and the calling convention:

signature    ::=  "(" [paramlist] ")" ["->" retlist] [callconv]
paramlist    ::=  param { "," param }
retlist      ::=  paramlist
param        ::=  type [paramext] [paramspecial]
paramext     ::=  "uext" | "sext"
paramspecial ::=  "sret" | "link" | "fp" | "csr" | "vmctx"
callconv     ::=  "native" | "spiderwasm"

Parameters and return values have flags whose meaning is mostly target dependent. They make it possible to call native functions on the target platform. When calling other Cretonne functions, the flags are not necessary.

Functions that are called directly must be declared in the function preamble:

FN = function NAME signature

Declare a function so it can be called directly.

Arguments:
  • NAME – Name of the function, passed to the linker for resolution.
  • signature – Function signature. See above.
Results:
  • FN – A function identifier that can be used with call.
rvals = call FN(args…)

Direct function call.

Call a function which has been declared in the preamble. The argument types must match the function’s signature.

Arguments:
  • FN (func_ref) – function to call, declared by function
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
return rvals…

Return from the function.

Unconditionally transfer control to the calling function, passing the provided return values. The list of return values must match the function signature’s return types.

Arguments:
  • rvals (variable_args) – return values

This simple example illustrates direct function calls and signatures:

function %gcd(i32 uext, i32 uext) -> i32 uext native {
    fn1 = function %divmod(i32 uext, i32 uext) -> i32 uext, i32 uext

ebb1(v1: i32, v2: i32):
    brz v2, ebb2
    v3, v4 = call fn1(v1, v2)
    return v3

ebb2:
    return v1
}

Indirect function calls use a signature declared in the preamble.

rvals = call_indirect SIG, callee(args…)

Indirect function call.

Call the function pointed to by callee with the given arguments. The called function must match the specified signature.

Arguments:
  • SIG (sig_ref) – function signature
  • callee (iAddr) – address of function to call
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
Type Variables:
  • iAddr – inferred from callee
addr = func_addr FN

Get the address of a function.

Compute the absolute address of a function declared in the preamble. The returned address can be used as a callee argument to call_indirect. This is also a method for calling functions that are too far away to be addressable by a direct call instruction.

Arguments:
  • FN (func_ref) – function to call, declared by function
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

Memory

Cretonne provides fully general load and store instructions for accessing memory, as well as extending loads and truncating stores.

If the memory at the given address is not addressable, the behavior of these instructions is undefined. If it is addressable but not accessible, they trap.

a = load Flags, p, Offset

Load from memory at p + Offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
  • iAddr – from input operand
store Flags, x, p, Offset

Store x to memory at p + Offset.

This is a polymorphic instruction that can store any value type with a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (Mem) – Value to be stored
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • Mem – inferred from x
  • iAddr – from input operand

There are also more restricted operations for accessing specific types of memory objects.

Memory operation flags

Loads and stores can have flags that loosen their semantics in order to enable optimizations.

Flag      Description
notrap    Memory is assumed to be accessible.
aligned   Trapping allowed for misaligned accesses.

When the notrap flag is set, the behavior is undefined if the memory is not accessible.

Loads and stores are misaligned if the resultant address is not a multiple of the expected alignment. By default, misaligned loads and stores are allowed, but when the aligned flag is set, a misaligned memory access is allowed to trap.

Local variables

One set of restricted memory operations accesses the current function’s stack frame. The stack frame is divided into fixed-size stack slots that are allocated in the function preamble. Stack slots are not typed; they simply represent a contiguous sequence of accessible bytes in the stack frame.

SS = local Bytes, Flags…

Allocate a stack slot for a local variable in the preamble.

If no alignment is specified, Cretonne will pick an appropriate alignment for the stack slot based on its size and access patterns.

Arguments:
  • Bytes – Stack slot size in bytes.
Flags:
  • align(N) – Request at least N bytes alignment.
Results:
  • SS – Stack slot index.
a = stack_load SS, Offset

Load a value from a stack slot at the constant offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(a) + Offset <= sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
stack_store x, SS, Offset

Store a value to a stack slot at a constant offset.

This is a polymorphic instruction that can store any value type with a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(x) + Offset <= sizeof(SS)\).

Arguments:
  • x (Mem) – Value to be stored
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Type Variables:
  • Mem – inferred from x

The dedicated stack access instructions are easy for the compiler to reason about because stack slots and offsets are fixed at compile time. For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.
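Both compile-time checks can be sketched directly. The function names are hypothetical, and the alignment rule shown is one plausible way to do the inference, assuming slot alignments are powers of two and that a slot starts at a multiple of its alignment; it is not taken from Cretonne's implementation.

```python
import math


def access_in_bounds(access_size, offset, slot_size):
    """Check sizeof(value) + Offset <= sizeof(SS) for a stack access."""
    return offset >= 0 and access_size + offset <= slot_size


def inferred_alignment(slot_align, offset):
    """Largest power of two known to divide the address of the access.

    Assumes slot_align is a power of two and the slot starts at an
    address that is a multiple of slot_align.
    """
    return math.gcd(slot_align, offset)
```

With an 8-byte aligned slot, an access at offset 4 is only known to be 4-byte aligned, while an access at offset 0 or 16 keeps the full 8-byte alignment.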

It’s also possible to obtain the address of a stack slot, which can be used in unrestricted loads and stores.

addr = stack_addr SS, Offset

Get the address of a stack slot.

Compute the absolute address of a byte in a stack slot. The offset must refer to a byte inside the stack slot: \(0 <= Offset < sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

The stack_addr instruction can be used to macro-expand the stack access instructions before instruction selection:

v1 = stack_load.f64 ss3, 16
; Expands to:
v9 = stack_addr ss3, 16
v1 = load.f64 v9

Global variables

A global variable is an accessible object in memory whose address is not known at compile time. The address is computed at runtime by global_addr, possibly using information provided by the linker via relocations. There are multiple kinds of global variables using different methods for determining their address. Cretonne does not track the type or even the size of global variables; they are just pointers to non-stack memory.

When Cretonne is generating code for a virtual machine environment, globals can be used to access data structures in the VM’s runtime. This requires functions to have access to a VM context pointer which is used as the base address. Typically, the VM context pointer is passed as a hidden function argument to Cretonne functions.

GV = vmctx+Offset

Declare a global variable in the VM context struct.

This declares a global variable whose address is a constant offset from the VM context pointer which is passed as a hidden argument to all functions JIT-compiled for the VM.

Typically, the VM context is a C struct, and the declared global variable is a member of the struct.

Arguments:
  • Offset – Byte offset from the VM context pointer to the global variable.
Results:
  • GV – Global variable.

The address of a global variable can also be derived by treating another global variable as a struct pointer. This makes it possible to chase pointers into VM runtime data structures.

GV = deref(BaseGV)+Offset

Declare a global variable in a struct pointed to by BaseGV.

The address of GV can be computed by first loading a pointer from BaseGV and adding Offset to it.

It is assumed that BaseGV resides in readable memory with the appropriate alignment for storing a pointer.

Chains of deref global variables are possible, but cycles are not allowed. They will be caught by the IL verifier.

Arguments:
  • BaseGV – Global variable containing the base pointer.
  • Offset – Byte offset from the loaded base pointer to the global variable.
Results:
  • GV – Global variable.
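The address computation for vmctx and deref globals, including the rejection of cycles, might be modeled as below. The tuple encoding of declarations, the load_ptr callback, and the function name are all assumptions made for this sketch; symbolic globals are not covered.

```python
def global_addr(gv, decls, vmctx, load_ptr, _seen=()):
    """Compute the address of global variable `gv`.

    `decls` maps names to ("vmctx", offset) or ("deref", base_gv, offset).
    `load_ptr(addr)` reads a pointer from memory. Both are hypothetical
    stand-ins for the preamble declarations and the target's memory.
    """
    if gv in _seen:
        raise ValueError("cycle in deref chain: " + gv)
    decl = decls[gv]
    if decl[0] == "vmctx":
        return vmctx + decl[1]
    if decl[0] == "deref":
        base_addr = global_addr(decl[1], decls, vmctx, load_ptr, _seen + (gv,))
        return load_ptr(base_addr) + decl[2]
    raise ValueError("unknown global variable kind: " + decl[0])
```

Note that the pointer is loaded from the address of BaseGV first and the offset is added afterwards, matching the order given in the description.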
GV = globalsym name

Declare a global variable at a symbolic address.

The address of GV is symbolic and will be assigned a relocation, so that it can be resolved by a later linking phase.

Arguments:
  • name – External name.
Results:
  • GV – Global variable.
addr = global_addr GV

Compute the address of global variable GV.

Arguments:
  • GV (global_var) – A global variable.
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

Heaps

Code compiled from WebAssembly or asm.js runs in a sandbox where it can’t access all process memory. Instead, it is given a small set of memory areas to work in, and all accesses are bounds checked. Cretonne models this through the concept of heaps.

A heap is declared in the function preamble and can be accessed with the heap_addr instruction that traps on out-of-bounds accesses or returns a pointer that is guaranteed to trap. Heap addresses can be smaller than the native pointer size, for example unsigned i32 offsets on a 64-bit architecture.

Figure: Heap address space layout (mapped pages | unmapped pages | guard pages)

A heap appears as three consecutive ranges of address space:

  1. The mapped pages are the accessible memory range in the heap. A heap may have a minimum guaranteed size which means that some mapped pages are always present.
  2. The unmapped pages are a possibly empty range of address space that may be mapped in the future when the heap is grown. They are addressable but not accessible.
  3. The guard pages are a range of address space that is guaranteed to cause a trap when accessed. They are used to optimize bounds checking for heap accesses with a shared base pointer. They are addressable but not accessible.

The heap bound is the total size of the mapped and unmapped pages. This is the bound that heap_addr checks against. Memory accesses inside the heap bounds can trap if they hit an unmapped page (which is not accessible).

addr = heap_addr H, p, Size

Bounds check and compute absolute address of heap memory.

Verify that the offset range p .. p + Size - 1 is in bounds for the heap H, and generate an absolute address that is safe to dereference.

  1. If p + Size is not greater than the heap bound, return an absolute address corresponding to a byte offset of p from the heap’s base address.
  2. If p + Size is greater than the heap bound, generate a trap.
Arguments:
  • H (heap) – A heap.
  • p (HeapOffset) – An unsigned heap offset
  • Size (uimm32) – Size in bytes
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided
  • HeapOffset – from input operand
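The two cases of the bounds check can be written out as a small model. It covers only the trapping behavior; the alternative of returning a pointer that is guaranteed to trap is not modeled. The Trap class and the explicit base and bound parameters are assumptions made for this sketch.

```python
class Trap(Exception):
    """Models a trap raised by heap_addr; the name is hypothetical."""


def heap_addr(base, bound, p, size):
    """Bounds-check offsets p .. p + size - 1 and return an absolute address."""
    if p + size > bound:
        raise Trap("heap access out of bounds")
    return base + p
```

Because p is unsigned, the single comparison against the heap bound covers both ends of the accessed range.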

Two styles of heaps are supported, static and dynamic. They behave differently when resized.

Static heaps

A static heap starts out with all the address space it will ever need, so it never moves to a different address. At the base address is a number of mapped pages corresponding to the heap’s current size. Then follows a number of unmapped pages where the heap can grow up to its maximum size. After the unmapped pages follow the guard pages which are also guaranteed to generate a trap when accessed.

H = static Base, min MinBytes, bound BoundBytes, guard GuardBytes

Declare a static heap in the preamble.

Arguments:
  • Base – Global variable holding the heap’s base address or reserved_reg.
  • MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
  • BoundBytes – Fixed heap bound in bytes. This defines the amount of address space reserved for the heap, not including the guard pages.
  • GuardBytes – Size of the guard pages in bytes.

Dynamic heaps

A dynamic heap can be relocated to a different base address when it is resized, and its bound can move dynamically. The guard pages move when the heap is resized. The bound of a dynamic heap is stored in a global variable.

H = dynamic Base, min MinBytes, bound BoundGV, guard GuardBytes

Declare a dynamic heap in the preamble.

Arguments:
  • Base – Global variable holding the heap’s base address or reserved_reg.
  • MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
  • BoundGV – Global variable containing the current heap bound in bytes.
  • GuardBytes – Size of the guard pages in bytes.

Heap examples

The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of guard pages when running WebAssembly code on 64-bit CPUs. The combination of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be generated for bounds checks at all:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x1_0000_0000, guard 0x8000_0000

ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}
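The arithmetic behind that claim can be checked with a short sketch. The function names are hypothetical; the reasoning assumes the heap offset is an unsigned value of the given bit width, as with the i32 offset v0 in the example.

```python
def bounds_check_needed(offset_bits, size, bound):
    """True if heap_addr could trap for some unsigned offset of this width."""
    max_offset = (1 << offset_bits) - 1
    return max_offset + size > bound


def stays_in_reserved_range(offset_bits, load_offset, load_size, bound, guard):
    """True if the furthest byte touched lies in mapped, unmapped, or guard pages."""
    max_offset = (1 << offset_bits) - 1
    return max_offset + load_offset + load_size <= bound + guard
```

With a 32-bit offset, a Size of 1, and a 4 GB bound, the first function returns False, so the check can never fail. The loads at +16 and +20 can overshoot the bound by at most 23 bytes, which the second function confirms still falls inside the 2 GB of guard pages.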

A static heap can also be used for 32-bit code when the WebAssembly module declares a small upper bound on its memory. A 1 MB static bound with a single 4 KB guard page still has opportunities for sharing bounds checking code:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x10_0000, guard 0x1000

ebb0(v0: i32):
    v1 = heap_addr.i32 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}

If the upper bound on the heap size is too large, a dynamic heap is required instead.

Finally, a runtime environment that simply allocates a heap with malloc() may not have any guard pages at all. In that case, full bounds checking is required for each access:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    gv1 = vmctx+72
    heap0 = dynamic gv0, min 0x1000, bound gv1, guard 0

ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 20
    v2 = load.f32 v1+16
    v3 = heap_addr.i64 heap0, v0, 24
    v4 = load.f32 v3+20
    v5 = fadd v2, v4
    return v5
}
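
The reasoning behind the three examples above can be sketched in Python. This is a hypothetical model, not Cretonne's implementation: the helper names `static_check_needed` and `dynamic_heap_addr` are invented for illustration, and the sketch assumes that an access of `access_size` bytes at `index` is in bounds when `index + access_size` does not exceed the heap bound.

```python
def static_check_needed(index_bits, bound_bytes, access_size):
    """Return True if a static heap needs an explicit bounds check.

    Assumption: the access is in bounds when
    index + access_size <= bound_bytes.
    """
    max_index = (1 << index_bits) - 1
    return max_index + access_size > bound_bytes


def dynamic_heap_addr(index, access_size, current_bound):
    """Model a fully checked access on a dynamic heap with no guard pages."""
    if index + access_size > current_bound:
        raise MemoryError("heap access out of bounds")
    return index  # Offset from the heap base.


# 32-bit index, 4 GB bound, 1-byte check: no check code is needed.
print(static_check_needed(32, 0x1_0000_0000, 1))  # False
# 32-bit index, 1 MB bound: a check is still required.
print(static_check_needed(32, 0x10_0000, 1))      # True
```

In the first case the loads at offsets 16 and 20 can reach past the bound by at most 24 bytes, which stays well inside the 2 GB of guard pages.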

Operations

a = select c, x, y

Conditional select.

This instruction selects whole values. Use vselect for lane-wise selection.

Arguments:
  • c (Testable) – Controlling value to test
  • x (Any) – Value to use when c is true
  • y (Any) – Value to use when c is false
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
  • Testable – from input operand

Constant materialization

A few instructions have variants that take immediate operands (e.g., band / band_imm), but in general an instruction is required to load a constant into an SSA value.

a = iconst N

Integer constant.

Create a scalar integer SSA value with an immediate constant value, or an integer vector where all the lanes have the same value.

Arguments:
  • N (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A constant integer scalar or vector value
Type Variables:
  • Int – explicitly provided
a = f32const N

Floating point constant.

Create a f32 SSA value with an immediate constant value.

Arguments:
  • N (ieee32) – A 32-bit immediate floating point number.
Results:
  • a (f32) – A constant f32 scalar value
a = f64const N

Floating point constant.

Create a f64 SSA value with an immediate constant value.

Arguments:
  • N (ieee64) – A 64-bit immediate floating point number.
Results:
  • a (f64) – A constant f64 scalar value
a = bconst N

Boolean constant.

Create a scalar boolean SSA value with an immediate constant value, or a boolean vector where all the lanes have the same value.

Arguments:
  • N (bool) – An immediate boolean.
Results:
  • a (Bool) – A constant boolean scalar or vector value
Type Variables:
  • Bool – explicitly provided

Live range splitting

Cretonne’s register allocator assigns each SSA value to a register or a spill slot on the stack for its entire live range. Since the live range of an SSA value can be quite large, it is sometimes beneficial to split the live range into smaller parts.

A live range is split by creating new SSA values that are copies of the original value or of each other. The copies are created by inserting copy, spill, or fill instructions, depending on whether the values are assigned to registers or stack slots.

This approach permits SSA form to be preserved throughout the register allocation pass and beyond.

a = copy x

Register-register copy.

This instruction copies its input, preserving the value type.

A pure SSA-form program does not need to copy values, but this instruction is useful for representing intermediate stages during instruction transformations, and the register allocator needs a way of representing register copies.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = spill x

Spill a register value to a stack slot.

This instruction behaves exactly like copy, but the result value is assigned to a spill slot.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = fill x

Load a register value from a stack slot.

This instruction behaves exactly like copy, but creates a new SSA value for the spilled input value.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x

Register values can be temporarily diverted to other registers by the regmove instruction, and to and from stack slots by regspill and regfill.

regmove x, src, dst

Temporarily divert x from src to dst.

This instruction moves the location of a value from one register to another without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

The register diversions created by this instruction must be undone before the value leaves the EBB. At the entry to a new EBB, all live values must be in their originally assigned registers.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • src (regunit) – A register unit in the target ISA
  • dst (regunit) – A register unit in the target ISA
Type Variables:
  • Any – inferred from x
regspill x, src, SS

Temporarily divert x from src to SS.

This instruction moves the location of a value from a register to a stack slot without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • src (regunit) – A register unit in the target ISA
  • SS (stack_slot) – A stack slot.
Type Variables:
  • Any – inferred from x
regfill x, SS, dst

Temporarily divert x from SS to dst.

This instruction moves the location of a value from a stack slot to a register without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • SS (stack_slot) – A stack slot.
  • dst (regunit) – A register unit in the target ISA
Type Variables:
  • Any – inferred from x

Vector operations

lo, hi = vsplit x

Split a vector into two halves.

Split the vector x into two separate values, each containing half of the lanes from x. The result may be two scalars if x has only two lanes.

Arguments:
  • x (TxN) – Vector to split
Results:
  • lo (half_vector(TxN)) – Low-numbered lanes of x
  • hi (half_vector(TxN)) – High-numbered lanes of x
Type Variables:
  • TxN – inferred from x
a = vconcat x, y

Vector concatenation.

Return a vector formed by concatenating x and y. The resulting vector type has twice as many lanes as each of the inputs. The lanes of x appear as the low-numbered lanes, and the lanes of y become the high-numbered lanes of a.

It is possible to form a vector by concatenating two scalars.

Arguments:
  • x (Any128) – Low-numbered lanes
  • y (Any128) – High-numbered lanes
Results:
  • a (double_vector(Any128)) – Concatenation of x and y
Type Variables:
  • Any128 – inferred from x
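
As a rough illustration of how vsplit and vconcat relate, the sketch below models a vector as a Python list of lanes and a scalar as a bare value. The function names mirror the instructions, but the list representation is an assumption made only for this example.

```python
def vsplit(x):
    """Split a vector (list of lanes) into its low and high halves."""
    assert len(x) >= 2 and len(x) % 2 == 0
    half = len(x) // 2
    lo, hi = x[:half], x[half:]
    if half == 1:
        # A two-lane vector splits into two scalars.
        return lo[0], hi[0]
    return lo, hi


def vconcat(x, y):
    """Concatenate two vectors, or two scalars, into one vector."""
    lanes_x = x if isinstance(x, list) else [x]
    lanes_y = y if isinstance(y, list) else [y]
    return lanes_x + lanes_y


lo, hi = vsplit([1, 2, 3, 4])
print(lo, hi)            # [1, 2] [3, 4]
print(vconcat(lo, hi))   # [1, 2, 3, 4]
```

In this model, concatenating the two results of a split reproduces the original vector.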
a = vselect c, x, y

Vector lane select.

Select lanes from x or y controlled by the lanes of the boolean vector c.

Arguments:
  • c (as_bool(TxN)) – Controlling vector
  • x (TxN) – Value to use where c is true
  • y (TxN) – Value to use where c is false
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
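
The difference between select and vselect can be sketched as follows, again modelling vectors as Python lists (an assumption for illustration only). select picks one whole value, while vselect decides lane by lane.

```python
def select(c, x, y):
    """Whole-value select: the single condition c picks x or y."""
    return x if c else y


def vselect(c, x, y):
    """Lane-wise select: each lane of c picks from x or y."""
    assert len(c) == len(x) == len(y)
    return [xi if ci else yi for ci, xi, yi in zip(c, x, y)]


print(select(True, [1, 2, 3, 4], [9, 9, 9, 9]))
# [1, 2, 3, 4]
print(vselect([True, False, True, False], [1, 2, 3, 4], [9, 9, 9, 9]))
# [1, 9, 3, 9]
```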
a = splat x

Vector splat.

Return a vector whose lanes are all x.

Arguments:
  • x (lane_of(TxN)) – Value to replicate into every lane
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – explicitly provided
a = insertlane x, Idx, y

Insert y as lane Idx in x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – SIMD vector to modify
  • Idx (uimm8) – Lane index
  • y (lane_of(TxN)) – New lane value
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
a = extractlane x, Idx

Extract lane Idx from x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – A SIMD vector type
  • Idx (uimm8) – Lane index
Results:
  • a (lane_of(TxN)) – The value of lane Idx
Type Variables:
  • TxN – inferred from x
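
A minimal sketch of splat, insertlane, and extractlane, with vectors modelled as Python lists. The `lanes` parameter of `splat` stands in for the explicitly provided result type and is an assumption of this model. Because values are in SSA form, `insertlane` returns a new vector and leaves its input unchanged.

```python
def splat(x, lanes):
    """Return a vector with `lanes` lanes that all hold x."""
    return [x] * lanes


def insertlane(x, idx, y):
    """Return a copy of vector x with lane idx replaced by y."""
    if not 0 <= idx < len(x):
        raise IndexError("invalid lane index")
    a = list(x)
    a[idx] = y
    return a


def extractlane(x, idx):
    """Return lane idx of vector x."""
    if not 0 <= idx < len(x):
        raise IndexError("invalid lane index")
    return x[idx]


v = splat(0, 4)
w = insertlane(v, 2, 7)
print(v, w, extractlane(w, 2))  # [0, 0, 0, 0] [0, 0, 7, 0] 7
```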

Integer operations

a = icmp Cond, x, y

Integer comparison.

The condition code determines if the operands are interpreted as signed or unsigned integers.

Signed   Unsigned   Condition
eq       eq         Equal
ne       ne         Not equal
slt      ult        Less than
sge      uge        Greater than or equal
sgt      ugt        Greater than
sle      ule        Less than or equal

When this instruction compares integer vectors, it returns a boolean vector of lane-wise comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (as_bool(Int)) – Boolean scalar or vector holding the comparison result
Type Variables:
  • Int – inferred from x
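
The effect of the condition code on how operands are interpreted can be sketched in Python. The helper `to_signed` and the `bits` parameter are assumptions of this model; the sketch covers scalar operands only.

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def icmp(cond, x, y, bits=32):
    """Evaluate an intcc condition code on two B-bit integers."""
    mask = (1 << bits) - 1
    ux, uy = x & mask, y & mask
    sx, sy = to_signed(x, bits), to_signed(y, bits)
    results = {
        "eq": ux == uy, "ne": ux != uy,
        "slt": sx < sy, "sge": sx >= sy, "sgt": sx > sy, "sle": sx <= sy,
        "ult": ux < uy, "uge": ux >= uy, "ugt": ux > uy, "ule": ux <= uy,
    }
    return results[cond]


# The bit pattern 0xFFFF_FFFF is -1 when signed and 4294967295 when unsigned.
print(icmp("slt", 0xFFFF_FFFF, 1))  # True
print(icmp("ult", 0xFFFF_FFFF, 1))  # False
```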
a = icmp_imm Cond, x, Y

Compare scalar integer to a constant.

This is the same as the icmp instruction, except one operand is an immediate constant.

This instruction can only compare scalars. Use icmp for lane-wise vector comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (b1) – A boolean type with 1 bit.
Type Variables:
  • iB – inferred from x
f = ifcmp x, y

Compare scalar integers and return flags.

Compare two scalar integer values and return integer CPU flags representing the result.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Type Variables:
  • iB – inferred from x
f = ifcmp_imm x, Y

Compare scalar integer to a constant and return flags.

Like icmp_imm, but returns integer CPU flags instead of testing a specific condition code.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Type Variables:
  • iB – inferred from x
a = iadd x, y

Wrapping integer addition: \(a := x + y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = iadd_imm x, Y

Add immediate integer.

Same as iadd, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = iadd_cin x, y, c_in

Add integers with carry in.

Same as iadd with an additional carry input. Computes:

\[a = x + y + c_{in} \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, c_out = iadd_cout x, y

Add integers with carry out.

Same as iadd with an additional carry output.

\[\begin{split}a &= x + y \pmod{2^B} \\ c_{out} &= x + y \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from x
a, c_out = iadd_carry x, y, c_in

Add integers with carry in and out.

Same as iadd with an additional carry input and output.

\[\begin{split}a &= x + y + c_{in} \pmod{2^B} \\ c_{out} &= x + y + c_{in} \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from y
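
One use of the carry instructions is wide addition built from narrower pieces. The sketch below models iadd_cout and iadd_carry on unsigned Python integers and chains them into a 64-bit add. The helper `add64_with_i32` is hypothetical and only illustrates the chaining.

```python
def iadd_cout(x, y, bits=32):
    """Return (a, c_out) for x + y on B-bit unsigned operands."""
    total = x + y
    return total & ((1 << bits) - 1), total >= (1 << bits)


def iadd_carry(x, y, c_in, bits=32):
    """Return (a, c_out) for x + y + c_in on B-bit unsigned operands."""
    total = x + y + int(c_in)
    return total & ((1 << bits) - 1), total >= (1 << bits)


def add64_with_i32(x, y):
    """Add two 64-bit values using only 32-bit additions."""
    mask = 0xFFFF_FFFF
    lo, carry = iadd_cout(x & mask, y & mask)
    hi, _ = iadd_carry(x >> 32, y >> 32, carry)
    return (hi << 32) | lo


print(hex(add64_with_i32(0xFFFF_FFFF, 1)))  # 0x100000000
```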
a = isub x, y

Wrapping integer subtraction: \(a := x - y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = irsub_imm x, Y

Immediate reverse wrapping subtraction: \(a := Y - x \pmod{2^B}\).

Also works as integer negation when \(Y = 0\). To subtract an immediate from x (\(x - Y\)), use iadd_imm with a negative immediate operand instead.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = isub_bin x, y, b_in

Subtract integers with borrow in.

Same as isub with an additional borrow flag input. Computes:

\[a = x - (y + b_{in}) \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, b_out = isub_bout x, y

Subtract integers with borrow out.

Same as isub with an additional borrow flag output.

\[\begin{split}a &= x - y \pmod{2^B} \\ b_{out} &= x < y\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from x
a, b_out = isub_borrow x, y, b_in

Subtract integers with borrow in and out.

Same as isub with an additional borrow flag input and output.

\[\begin{split}a &= x - (y + b_{in}) \pmod{2^B} \\ b_{out} &= x < y + b_{in}\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from y
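
The borrow instructions chain in the same way as the carry instructions. The sketch below models isub_bout and isub_borrow on unsigned Python integers, following the formulas above. The helper `sub64_with_i32` is hypothetical.

```python
def isub_bout(x, y, bits=32):
    """Return (a, b_out) for x - y on B-bit unsigned operands."""
    return (x - y) & ((1 << bits) - 1), x < y


def isub_borrow(x, y, b_in, bits=32):
    """Return (a, b_out) for x - (y + b_in) on B-bit unsigned operands."""
    subtrahend = y + int(b_in)
    return (x - subtrahend) & ((1 << bits) - 1), x < subtrahend


def sub64_with_i32(x, y):
    """Subtract two 64-bit values using only 32-bit subtractions."""
    mask = 0xFFFF_FFFF
    lo, borrow = isub_bout(x & mask, y & mask)
    hi, _ = isub_borrow(x >> 32, y >> 32, borrow)
    return (hi << 32) | lo


print(hex(sub64_with_i32(0x1_0000_0000, 1)))  # 0xffffffff
```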

Todo

Add and subtract with signed overflow.

For example, see llvm.sadd.with.overflow.* and llvm.ssub.with.overflow.* in LLVM.

a = imul x, y

Wrapping integer multiplication: \(a := x y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Polymorphic over all integer types (vector and scalar).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = imul_imm x, Y

Integer multiplication by immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Larger multiplication results.

For example, smulx which multiplies i32 operands to produce an i64 result. Alternatively, smulhi and smullo pairs.

a = udiv x, y

Unsigned integer division: \(a := \lfloor {x \over y} \rfloor\).

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = udiv_imm x, Y

Unsigned integer division by an immediate constant.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = sdiv x, y

Signed integer division rounded toward zero: \(a := \operatorname{sign}(xy) \lfloor {|x| \over |y|}\rfloor\).

This operation traps if the divisor is zero, or if the result is not representable in \(B\) bits two’s complement. This only happens when \(x = -2^{B-1}, y = -1\).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
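
A small Python model makes the rounding and trapping rules concrete. It takes operands as signed Python integers already in range for the type. The `Trap` exception class is an assumption of the sketch and stands in for a Cretonne trap. Note that Python's own `//` operator rounds toward negative infinity, so it cannot be used directly.

```python
class Trap(Exception):
    """Stands in for a Cretonne trap in this model."""


def sdiv(x, y, bits=32):
    """Signed division rounded toward zero, with the two trap cases."""
    if y == 0:
        raise Trap("division by zero")
    if x == -(1 << (bits - 1)) and y == -1:
        raise Trap("result not representable")
    quotient = abs(x) // abs(y)
    return -quotient if (x < 0) != (y < 0) else quotient


print(sdiv(-7, 2))  # -3
print(-7 // 2)      # -4, Python floors instead
```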
a = sdiv_imm x, Y

Signed integer division by an immediate constant.

This instruction never traps because a divisor of -1 or 0 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = urem x, y

Unsigned integer remainder.

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = urem_imm x, Y

Unsigned integer remainder with immediate divisor.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = srem x, y

Signed integer remainder. The result has the sign of the dividend.

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
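
The sign rule for srem differs from Python's `%` operator, whose result takes the sign of the divisor. A minimal sketch, taking signed Python integers as operands and using `ZeroDivisionError` in place of a trap (an assumption of this model):

```python
def srem(x, y):
    """Signed remainder whose result has the sign of the dividend x."""
    if y == 0:
        raise ZeroDivisionError("srem traps on a zero divisor")
    remainder = abs(x) % abs(y)
    return -remainder if x < 0 else remainder


print(srem(-7, 2))  # -1
print(-7 % 2)       # 1, Python follows the divisor's sign
print(srem(7, -2))  # 1
```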
a = srem_imm x, Y

Signed integer remainder with immediate divisor.

This instruction never traps because a divisor of 0 or -1 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Integer minimum / maximum.

NEON has smin, smax, umin, and umax instructions. We should replicate those for both scalar and vector integer types. Even if the target ISA doesn’t have scalar operations, these are good pattern matching targets.

Todo

Saturating arithmetic.

Mostly for SIMD use, but again these are good patterns for contraction. Something like usatadd, usatsub, ssatadd, and ssatsub is a good start.

Bitwise operations

The bitwise operations operate on any value type: integers, floating point numbers, and booleans. When operating on integer or floating point types, the bitwise operations work on the binary representation of the values. When operating on boolean values, the bitwise operations work as logical operators.

a = band x, y

Bitwise and.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = band_imm x, Y

Bitwise and with immediate.

Same as band, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bor x, y

Bitwise or.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bor_imm x, Y

Bitwise or with immediate.

Same as bor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bxor x, y

Bitwise xor.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bxor_imm x, Y

Bitwise xor with immediate.

Same as bxor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bnot x

Bitwise not.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = band_not x, y

Bitwise and not.

Computes x & ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bor_not x, y

Bitwise or not.

Computes x | ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bxor_not x, y

Bitwise xor not.

Computes x ^ ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
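
The three "not" variants can be modelled on B-bit integers as below. The `bits` parameter and the masking are assumptions of the sketch, needed because Python integers are unbounded.

```python
def band_not(x, y, bits=32):
    """Compute x & ~y on B-bit values."""
    return x & ~y & ((1 << bits) - 1)


def bor_not(x, y, bits=32):
    """Compute x | ~y on B-bit values."""
    return (x | ~y) & ((1 << bits) - 1)


def bxor_not(x, y, bits=32):
    """Compute x ^ ~y on B-bit values."""
    return (x ^ ~y) & ((1 << bits) - 1)


print(format(band_not(0b1100, 0b1010, bits=4), "04b"))  # 0100
print(format(bor_not(0b1100, 0b1010, bits=4), "04b"))   # 1101
print(format(bxor_not(0b1100, 0b1010, bits=4), "04b"))  # 1001
```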

The shift and rotate operations only work on integer types (scalar and vector). The shift amount does not have to be the same type as the value being shifted. When shifting a B-bit type, the shift amount is reduced modulo B, so only its low \(\log_2 B\) bits are significant.

When operating on an integer vector type, the shift amount is still a scalar type, and all the lanes are shifted the same amount. The shift amount is masked to the number of bits in a lane, not the full size of the vector type.

a = rotl x, y

Rotate left.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotl_imm x, Y

Rotate left by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = rotr x, y

Rotate right.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotr_imm x, Y

Rotate right by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
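
A sketch of the rotate semantics on scalar B-bit integers. The `bits` parameter is an assumption of the model, and the rotate amount is reduced modulo B, as it is for shifts.

```python
def rotl(x, y, bits=32):
    """Rotate the low `bits` bits of x left by y places."""
    mask = (1 << bits) - 1
    x &= mask
    s = y % bits
    return ((x << s) | (x >> (bits - s))) & mask


def rotr(x, y, bits=32):
    """Rotate right, expressed as a left rotate by the complement."""
    return rotl(x, bits - (y % bits), bits)


print(hex(rotl(0x8000_0001, 1)))  # 0x3
print(hex(rotr(0x8000_0001, 1)))  # 0xc0000000
```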
a = ishl x, y

Integer shift left. Shift the bits in x towards the MSB by y places. Shift in zero bits to the LSB.

The shift amount is masked to the size of x.

When shifting a B-bits integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= x \cdot 2^s \pmod{2^B}.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ishl_imm x, Y

Integer shift left by immediate.

The shift amount is masked to the size of x.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = ushr x, y

Unsigned shift right. Shift bits in x towards the LSB by y places, shifting in zero bits to the MSB. Also called a logical shift.

The shift amount is masked to the size of the register.

When shifting a B-bits integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= \lfloor x \cdot 2^{-s} \rfloor.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ushr_imm x, Y

Unsigned shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = sshr x, y

Signed shift right. Shift bits in x towards the LSB by y places, shifting in sign bits to the MSB. Also called an arithmetic shift.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = sshr_imm x, Y

Signed shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
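
The three shifts can be modelled on scalar B-bit integers as follows. The `bits` parameter is an assumption of the sketch. It relies on Python's `>>` being an arithmetic shift for negative integers.

```python
def ishl(x, y, bits=32):
    """Shift left, filling with zero bits."""
    s = y % bits
    return (x << s) & ((1 << bits) - 1)


def ushr(x, y, bits=32):
    """Logical shift right, filling with zero bits."""
    s = y % bits
    return (x & ((1 << bits) - 1)) >> s


def sshr(x, y, bits=32):
    """Arithmetic shift right, filling with copies of the sign bit."""
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    signed = x - (1 << bits) if x >> (bits - 1) else x
    return (signed >> s) & mask


print(ishl(1, 33))                   # 2, the amount 33 is masked to 1
print(hex(ushr(0x8000_0000, 31)))    # 0x1
print(hex(sshr(0x8000_0000, 31)))    # 0xffffffff
```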

The bit-counting instructions below are scalar only.

a = clz x

Count leading zero bits.

Starting from the MSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = cls x

Count leading sign bits.

Starting from the MSB after the sign bit in x, count the number of consecutive bits identical to the sign bit. When x is 0 or -1, returns one less than the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = ctz x

Count trailing zeros.

Starting from the LSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = popcnt x

Population count.

Count the number of one bits in x.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
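
The edge cases of the bit-counting instructions are easiest to see in a small model. The sketch below works on scalar B-bit integers; the `bits` parameter is an assumption of the model.

```python
def clz(x, bits=32):
    """Count leading zero bits; returns `bits` when x is zero."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def ctz(x, bits=32):
    """Count trailing zero bits; returns `bits` when x is zero."""
    x &= (1 << bits) - 1
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1


def cls(x, bits=32):
    """Count bits after the sign bit that are equal to the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):
        x = ~x & mask  # Flip so that sign-equal bits become zeros.
    return clz(x, bits) - 1


def popcnt(x, bits=32):
    """Count the one bits in x."""
    return bin(x & ((1 << bits) - 1)).count("1")


print(clz(0), ctz(0), cls(0), cls(0xFFFF_FFFF))  # 32 32 31 31
print(clz(1), ctz(8), cls(1), popcnt(0xFF))      # 31 3 30 8
```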

Floating point operations

These operations generally follow IEEE 754-2008 semantics.

a = fcmp Cond, x, y

Floating point comparison.

Two IEEE 754-2008 floating point numbers, x and y, relate to each other in exactly one of four ways:

UN   Unordered, when one or both numbers are NaN.
EQ   When \(x = y\). (And \(0.0 = -0.0\)).
LT   When \(x < y\).
GT   When \(x > y\).

The 14 floatcc condition codes each correspond to a subset of the four relations, except for the empty set which would always be false, and the full set which would always be true.

The condition codes are divided into 7 ‘ordered’ conditions which don’t include UN, and 7 unordered conditions which all include UN.

Ordered                Unordered               Condition
ord   EQ | LT | GT     uno   UN                NaNs absent / present.
eq    EQ               ueq   UN | EQ           Equal
one   LT | GT          ne    UN | LT | GT      Not equal
lt    LT               ult   UN | LT           Less than
le    LT | EQ          ule   UN | LT | EQ      Less than or equal
gt    GT               ugt   UN | GT           Greater than
ge    GT | EQ          uge   UN | GT | EQ      Greater than or equal

The standard C comparison operators, <, <=, >, >=, are all ordered, so they are false if either operand is NaN. The C equality operator, ==, is ordered, and since inequality is defined as the logical inverse it is unordered. They map to the floatcc condition codes as follows:

C     Cond   Subset
==    eq     EQ
!=    ne     UN | LT | GT
<     lt     LT
<=    le     LT | EQ
>     gt     GT
>=    ge     GT | EQ

This subset of condition codes also corresponds to the WebAssembly floating point comparisons of the same name.

When this instruction compares floating point vectors, it returns a boolean vector with the results of lane-wise comparisons.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (as_bool(Float)) – Boolean scalar or vector holding the comparison result
Type Variables:
  • Float – inferred from x
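
The relation-subset view of the condition codes translates directly into a lookup table. The sketch below is a scalar model using Python floats. The names `relation` and `FLOATCC` are invented for illustration.

```python
import math


def relation(x, y):
    """Classify x and y into exactly one of UN, EQ, LT, GT."""
    if math.isnan(x) or math.isnan(y):
        return "UN"
    if x == y:
        return "EQ"
    return "LT" if x < y else "GT"


# Each floatcc condition code is the set of relations for which it is true.
FLOATCC = {
    "ord": {"EQ", "LT", "GT"}, "uno": {"UN"},
    "eq": {"EQ"},              "ueq": {"UN", "EQ"},
    "one": {"LT", "GT"},       "ne": {"UN", "LT", "GT"},
    "lt": {"LT"},              "ult": {"UN", "LT"},
    "le": {"LT", "EQ"},        "ule": {"UN", "LT", "EQ"},
    "gt": {"GT"},              "ugt": {"UN", "GT"},
    "ge": {"GT", "EQ"},        "uge": {"UN", "GT", "EQ"},
}


def fcmp(cond, x, y):
    """Evaluate a floatcc condition code on two scalar floats."""
    return relation(x, y) in FLOATCC[cond]


nan = float("nan")
print(fcmp("ne", nan, 1.0))   # True, ne is unordered
print(fcmp("one", nan, 1.0))  # False, one is ordered
print(fcmp("eq", 0.0, -0.0))  # True
```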
f = ffcmp x, y

Floating point comparison returning flags.

Compares two numbers like fcmp, but returns floating point CPU flags instead of testing a specific condition.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
Type Variables:
  • Float – inferred from x
a = fadd x, y

Floating point addition.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fsub x, y

Floating point subtraction.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fmul x, y

Floating point multiplication.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fdiv x, y

Floating point division.

Unlike the integer division instructions sdiv and udiv, this can’t trap. Division by zero yields infinity or NaN, depending on the dividend.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = sqrt x

Floating point square root.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fma x, y, z

Floating point fused multiply-and-add.

Computes \(a := xy+z\) without any intermediate rounding of the product.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
  • z (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from y
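
The effect of skipping the intermediate rounding can be demonstrated with exact rational arithmetic. The sketch below is a model for finite f64 inputs whose result is in range, not an implementation strategy: it computes the exact value with `fractions.Fraction` and rounds once when converting back to a float.

```python
from fractions import Fraction


def fma(x, y, z):
    """Model x*y + z with a single rounding (finite inputs only)."""
    return float(Fraction(x) * Fraction(y) + Fraction(z))


x = 1.0 + 2.0 ** -27
z = -(1.0 + 2.0 ** -26)

# The exact product is 1 + 2^-26 + 2^-54; rounding it first loses 2^-54.
print(x * x + z)            # 0.0
print(fma(x, x, z) == 2.0 ** -54)  # True
```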

Sign bit manipulations

The sign manipulating instructions work as bitwise operations, so they don’t have special behavior for signaling NaN operands. The exponent and trailing significand bits are always preserved.

a = fneg x

Floating point negation.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit inverted
Type Variables:
  • Float – inferred from x
a = fabs x

Floating point absolute value.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit cleared
Type Variables:
  • Float – inferred from x
a = fcopysign x, y

Floating point copy sign.

Note that this is a pure bitwise operation. The sign bit from y is copied to the sign bit of x.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit changed to that of y
Type Variables:
  • Float – inferred from x
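
As an illustration of the bitwise behavior described in this section, the three instructions can be modeled on the 64-bit encoding of an f64 value. This is a sketch in Python, not Cretonne code; the helpers f64_bits and f64_from_bits are hypothetical names introduced for the example.

```python
import struct

SIGN_BIT = 1 << 63  # sign bit of an IEEE 754 binary64 value


def f64_bits(x: float) -> int:
    """Return the 64-bit pattern of x (hypothetical helper)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def f64_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an f64 (hypothetical helper)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def fneg(x: float) -> float:
    # Invert the sign bit; exponent and significand are untouched.
    return f64_from_bits(f64_bits(x) ^ SIGN_BIT)


def fabs(x: float) -> float:
    # Clear the sign bit.
    return f64_from_bits(f64_bits(x) & ~SIGN_BIT)


def fcopysign(x: float, y: float) -> float:
    # Keep everything from x except the sign bit, which comes from y.
    return f64_from_bits((f64_bits(x) & ~SIGN_BIT) | (f64_bits(y) & SIGN_BIT))
```

Because only the sign bit changes, fneg(0.0) produces -0.0 and a NaN operand stays a NaN with its sign flipped.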

Minimum and maximum

These instructions return the larger or smaller of their operands. Note that unlike the IEEE 754-2008 minNum and maxNum operations, these instructions return NaN when either input is NaN.

When comparing zeroes, these instructions behave as if \(-0.0 < 0.0\).

a = fmin x, y

Floating point minimum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The smaller of x and y
Type Variables:
  • Float – inferred from x
a = fmax x, y

Floating point maximum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The larger of x and y
Type Variables:
  • Float – inferred from x
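
The two rules stated above, NaN propagation and the ordering of zeroes, can be sketched for scalar values as follows. This is an illustrative Python model under the assumption that any NaN is an acceptable result; it does not describe which NaN payload an implementation returns.

```python
import math


def fmin(x: float, y: float) -> float:
    if math.isnan(x) or math.isnan(y):
        return math.nan  # NaN in, NaN out
    if x == y == 0.0:
        # Behave as if -0.0 < 0.0: prefer the negative zero.
        return x if math.copysign(1.0, x) < 0 else y
    return x if x < y else y


def fmax(x: float, y: float) -> float:
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == y == 0.0:
        # Behave as if -0.0 < 0.0: prefer the positive zero.
        return x if math.copysign(1.0, x) > 0 else y
    return x if x > y else y
```

Unlike minNum and maxNum, fmin(1.0, NaN) is NaN here rather than 1.0.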

Rounding

These instructions round their argument to a nearby integral value, still represented as a floating point number.

a = ceil x

Round floating point to integral, towards positive infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = floor x

Round floating point to integral, towards negative infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = trunc x

Round floating point to integral, towards zero.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = nearest x

Round floating point to integral, towards nearest with ties to even.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
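
The four rounding directions can be compared side by side with a small Python model. This is a sketch for scalar values; it assumes that infinities and NaNs pass through unchanged and that the sign of the input is kept when the result is zero, neither of which is spelled out above.

```python
import math


def _integral(x: float, rounded: int) -> float:
    # The result stays a float; copysign keeps -0.0 for inputs like -0.4.
    return math.copysign(float(rounded), x)


def ceil(x: float) -> float:
    return x if not math.isfinite(x) else _integral(x, math.ceil(x))


def floor(x: float) -> float:
    return x if not math.isfinite(x) else _integral(x, math.floor(x))


def trunc(x: float) -> float:
    return x if not math.isfinite(x) else _integral(x, math.trunc(x))


def nearest(x: float) -> float:
    # Python's round() on a float rounds half to even.
    return x if not math.isfinite(x) else _integral(x, round(x))
```

For example, nearest maps both 1.5 and 2.5 to 2.0, while trunc maps -1.7 to -1.0 and floor maps it to -2.0.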

CPU flag operations

a = trueif Cond, f

Test integer CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Results:
  • a (b1) – A boolean type with 1 bit.
a = trueff Cond, f

Test floating point CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
Results:
  • a (b1) – A boolean type with 1 bit.

Conversion operations

a = bitcast x

Reinterpret the bits in x as a different type.

The input and output types must be storable to memory and of the same size. A bitcast is equivalent to storing one type and loading the other type from the same address.

Arguments:
  • x (Mem) – Any type that can be stored in memory
Results:
  • a (MemTo) – Bits of x reinterpreted
Type Variables:
  • MemTo – explicitly provided
  • Mem – from input operand
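
The store-then-load equivalence can be mimicked in Python with a byte buffer standing in for memory. This is a sketch for the f32 and i32 pair only; the function names are hypothetical and the little-endian byte order is an assumption made for the example.

```python
import struct


def bitcast_f32_to_i32(x: float) -> int:
    memory = struct.pack("<f", x)          # "store" the value as an f32
    return struct.unpack("<I", memory)[0]  # "load" the same bytes as an integer


def bitcast_i32_to_f32(bits: int) -> float:
    memory = struct.pack("<I", bits)       # "store" the 32-bit pattern
    return struct.unpack("<f", memory)[0]  # "load" the same bytes as an f32
```

The value 1.0 has the f32 encoding 0x3f800000, so bitcast_f32_to_i32(1.0) yields that integer and the reverse bitcast recovers 1.0.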
a = breduce x

Convert x to a smaller boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A smaller boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bextend x

Convert x to a larger boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A larger boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bint x

Convert x to an integer.

True maps to 1 and false maps to 0. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
a = bmask x

Convert x to an integer mask.

True maps to all 1s and false maps to all 0s. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
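
The difference between bint and bmask is easiest to see on a single lane. The following Python sketch models both for a scalar boolean; the bits parameter and the select_with_mask helper are hypothetical additions that show one typical use of a mask.

```python
def bint(x: bool) -> int:
    # True maps to 1 and false maps to 0.
    return 1 if x else 0


def bmask(x: bool, bits: int = 32) -> int:
    # True maps to all 1s and false maps to all 0s.
    return (1 << bits) - 1 if x else 0


def select_with_mask(cond: bool, a: int, b: int, bits: int = 32) -> int:
    """Pick a or b without branching, using the mask (hypothetical helper)."""
    all_ones = (1 << bits) - 1
    mask = bmask(cond, bits)
    return (a & mask) | (b & ~mask & all_ones)
```

With an 8-bit result type, bint(True) is 1 while bmask(True, 8) is 0xff.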
a = ireduce x

Convert x to a smaller integer type by dropping high bits.

Each lane in x is converted to a smaller integer type by discarding the most significant bits. This is the same as reducing modulo \(2^n\).

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A smaller integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = uextend x

Convert x to a larger integer type by zero-extending.

Each lane in x is converted to a larger integer type by adding zeroes. The result has the same numerical value as x when both are interpreted as unsigned integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = sextend x

Convert x to a larger integer type by sign-extending.

Each lane in x is converted to a larger integer type by replicating the sign bit. The result has the same numerical value as x when both are interpreted as signed integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
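
The three integer width conversions can be modeled on unsigned bit patterns. This Python sketch treats each lane as a non-negative integer holding the raw bits; the explicit width parameters are hypothetical stand-ins for the input and result types.

```python
def _mask(bits: int) -> int:
    return (1 << bits) - 1


def ireduce(x: int, to_bits: int) -> int:
    # Dropping the high bits is the same as reducing modulo 2**to_bits.
    return x & _mask(to_bits)


def uextend(x: int, from_bits: int) -> int:
    # The new high bits are zero, so the bit pattern is simply unchanged.
    return x & _mask(from_bits)


def sextend(x: int, from_bits: int, to_bits: int) -> int:
    x &= _mask(from_bits)
    if x >> (from_bits - 1):
        # Sign bit set: replicate it into every new high bit.
        x |= _mask(to_bits) ^ _mask(from_bits)
    return x
```

Extending the 8-bit pattern 0xff to 32 bits gives 0x000000ff with uextend (255 unsigned) and 0xffffffff with sextend (-1 signed).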
a = fpromote x

Convert x to a larger floating point format.

Each lane in x is converted to the destination floating point format. This is an exact operation.

Cretonne currently supports only two floating point formats: f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
a = fdemote x

Convert x to a smaller floating point format.

Each lane in x is converted to the destination floating point format by rounding to nearest, ties to even.

Cretonne currently supports only two floating point formats: f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
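
The asymmetry between the two conversions (fpromote is exact, fdemote rounds) can be observed with a Python sketch. It assumes that the struct module's conversion to a 4-byte float rounds to nearest with ties to even, which is the behavior of CPython on common platforms; unlike a real fdemote, struct.pack raises OverflowError for values beyond the f32 range.

```python
import struct


def fdemote(x: float) -> float:
    """Round an f64 to the nearest f32 value (hypothetical model)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fpromote(x: float) -> float:
    """Every f32 value is exactly representable as an f64, so nothing changes."""
    return x


tenth = fdemote(0.1)   # 0.1 is not representable, so the nearest f32 is used
tie = 1.0 + 2.0 ** -24  # exactly halfway between two adjacent f32 values
```

Demoting the tie value gives 1.0, the neighbor with an even significand, and promoting any demoted value back to f64 reproduces it exactly.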
a = fcvt_to_uint x

Convert floating point to unsigned integer.

Each lane in x is converted to an unsigned integer by rounding towards zero. If x is NaN or if the unsigned integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
a = fcvt_to_sint x

Convert floating point to signed integer.

Each lane in x is converted to a signed integer by rounding towards zero. If x is NaN or if the signed integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
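
The trapping conditions of fcvt_to_uint and fcvt_to_sint can be sketched in Python for a single lane. The Trap exception, the traps helper, and the bits parameter are hypothetical devices for the example; a real trap terminates execution rather than raising an exception.

```python
import math


class Trap(Exception):
    """Stands in for a Cretonne trap (hypothetical)."""


def fcvt_to_uint(x: float, bits: int = 32) -> int:
    if math.isnan(x) or math.isinf(x):
        raise Trap("not a finite number")
    value = math.trunc(x)  # round towards zero
    if not 0 <= value < (1 << bits):
        raise Trap("out of range")
    return value


def fcvt_to_sint(x: float, bits: int = 32) -> int:
    if math.isnan(x) or math.isinf(x):
        raise Trap("not a finite number")
    value = math.trunc(x)  # round towards zero
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise Trap("out of range")
    return value


def traps(op, *args) -> bool:
    """Report whether calling op(*args) traps (hypothetical helper)."""
    try:
        op(*args)
    except Trap:
        return True
    return False
```

Note that the range check applies after rounding, so in this model -0.5 converts to the unsigned value 0 instead of trapping.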
a = fcvt_from_uint x

Convert unsigned integer to floating point.

Each lane in x is interpreted as an unsigned integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand
a = fcvt_from_sint x

Convert signed integer to floating point.

Each lane in x is interpreted as a signed integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand

Todo

Saturating fcvt_to_sint and fcvt_to_uint.

For example, these appear in Rust and WebAssembly.

Legalization operations

These instructions are used as helpers when legalizing types and operations for the target ISA.

lo, hi = isplit x

Split an integer into low and high parts.

Vectors of integers are split lane-wise, so the results have the same number of lanes as the input, but the lanes are half the size.

Returns the low half of x and the high half of x as two independent values.

Arguments:
  • x (WideInt) – An integer type with lanes from i16 upwards
Results:
  • lo (half_width(WideInt)) – The low bits of x
  • hi (half_width(WideInt)) – The high bits of x
Type Variables:
  • WideInt – inferred from x
a = iconcat lo, hi

Concatenate low and high bits to form a larger integer type.

Vectors of integers are concatenated lane-wise such that the result has the same number of lanes as the inputs, but the lanes are twice the size.

Arguments:
  • lo (NarrowInt) – An integer type with lanes up to i32
  • hi (NarrowInt) – An integer type with lanes up to i32
Results:
  • a (double_width(NarrowInt)) – The concatenation of lo and hi
Type Variables:
  • NarrowInt – inferred from lo
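
isplit and iconcat are inverses of each other, which a short Python model on unsigned bit patterns makes visible. This is a scalar sketch; the bits and half_bits parameters are hypothetical stand-ins for the operand types.

```python
def isplit(x: int, bits: int = 64) -> tuple[int, int]:
    half = bits // 2
    mask = (1 << half) - 1
    lo = x & mask            # low half of x
    hi = (x >> half) & mask  # high half of x
    return lo, hi


def iconcat(lo: int, hi: int, half_bits: int = 32) -> int:
    # hi supplies the upper bits of the result, lo the lower bits.
    return (hi << half_bits) | lo
```

Splitting the i64 value 0x1122334455667788 gives lo = 0x55667788 and hi = 0x11223344, and concatenating those two halves restores the original value.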

Extending loads and truncating stores

Most ISAs provide instructions that load an integer value smaller than a register and extend it to the width of the register. Similarly, store instructions that only write the low bits of an integer register are common.

In addition to the normal load and store instructions, Cretonne provides extending loads and truncating stores for 8, 16, and 32-bit memory accesses.

These instructions succeed, trap, or have undefined behavior, under the same conditions as normal loads and stores.

a = uload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i8 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
a = sload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i8 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
istore8 Flags, x, p, Offset

Store the low 8 bits of x to memory at p + Offset.

This is equivalent to ireduce.i8 followed by store.i8.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt8) – An integer type with more than 8 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt8 – inferred from x
  • iAddr – from input operand
a = uload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i16 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
a = sload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i16 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
istore16 Flags, x, p, Offset

Store the low 16 bits of x to memory at p + Offset.

This is equivalent to ireduce.i16 followed by store.i16.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt16) – An integer type with more than 16 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt16 – inferred from x
  • iAddr – from input operand
a = uload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i32 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iAddr – inferred from p
a = sload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i32 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iAddr – inferred from p
istore32 Flags, x, p, Offset

Store the low 32 bits of x to memory at p + Offset.

This is equivalent to ireduce.i32 followed by store.i32.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt32) – An integer type with more than 32 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt32 – inferred from x
  • iAddr – from input operand
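
The extending loads and truncating stores in this section differ only in access size and in how the loaded bits are extended. The Python sketch below models them over a bytearray; the generic uload, sload, and istore helpers with a size parameter are hypothetical, the little-endian byte order is an assumption, and memory flags and traps are not modeled.

```python
def uload(memory: bytearray, p: int, offset: int, size: int) -> int:
    addr = p + offset
    # Zero-extend: interpret the loaded bytes as an unsigned integer.
    return int.from_bytes(memory[addr:addr + size], "little", signed=False)


def sload(memory: bytearray, p: int, offset: int, size: int) -> int:
    addr = p + offset
    # Sign-extend: interpret the loaded bytes as a signed integer.
    return int.from_bytes(memory[addr:addr + size], "little", signed=True)


def istore(memory: bytearray, x: int, p: int, offset: int, size: int) -> None:
    addr = p + offset
    # Only the low 8 * size bits of x are written.
    low_bits = x & ((1 << (8 * size)) - 1)
    memory[addr:addr + size] = low_bits.to_bytes(size, "little")


memory = bytearray(b"\x80\xff\x00\x00")
zero_extended = uload(memory, 0, 0, 1)  # models uload8
sign_extended = sload(memory, 0, 0, 1)  # models sload8
istore(memory, 0x12345678, 0, 2, 2)     # models istore16 at offset 2
```

The byte 0x80 loads as 128 with uload8 and as -128 with sload8, and the 16-bit store writes only the bytes 0x78 and 0x56.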

ISA-specific instructions

Target ISAs can define supplemental instructions that do not make sense to support generally.

Intel

Instructions that can only be used by the Intel target ISA.

q, r = x86_sdivmodx nlo, nhi, d

Extended signed division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as a signed number and divide by the signed denominator d. Trap when d is zero or if the quotient is outside the range of the output.

Return both quotient and remainder.

Arguments:
  • nlo (iWord) – Low part of numerator
  • nhi (iWord) – High part of numerator
  • d (iWord) – Denominator
Results:
  • q (iWord) – Quotient
  • r (iWord) – Remainder
Type Variables:
  • iWord – inferred from nhi
q, r = x86_udivmodx nlo, nhi, d

Extended unsigned division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as an unsigned number and divide by the unsigned denominator d. Trap when d is zero or if the quotient is larger than the range of the output.

Return both quotient and remainder.

Arguments:
  • nlo (iWord) – Low part of numerator
  • nhi (iWord) – High part of numerator
  • d (iWord) – Denominator
Results:
  • q (iWord) – Quotient
  • r (iWord) – Remainder
Type Variables:
  • iWord – inferred from nhi
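
The two extended division instructions can be modeled on unsigned bit patterns as follows. This Python sketch makes several assumptions that are not stated above: the Trap exception and traps helper are hypothetical, bits stands in for the iWord width, and the signed variant truncates the quotient towards zero with the remainder taking the sign of the numerator, as the x86 IDIV instruction does.

```python
class Trap(Exception):
    """Stands in for a Cretonne trap (hypothetical)."""


def _signed(value: int, bits: int) -> int:
    # Reinterpret an unsigned bit pattern as a two's complement number.
    return value - (1 << bits) if value >> (bits - 1) else value


def x86_udivmodx(nlo: int, nhi: int, d: int, bits: int = 32):
    if d == 0:
        raise Trap("division by zero")
    q, r = divmod((nhi << bits) | nlo, d)
    if q >> bits:
        raise Trap("quotient does not fit in the output")
    return q, r


def x86_sdivmodx(nlo: int, nhi: int, d: int, bits: int = 32):
    n = _signed((nhi << bits) | nlo, 2 * bits)
    d = _signed(d, bits)
    if d == 0:
        raise Trap("division by zero")
    q = abs(n) // abs(d)          # truncate towards zero (assumption)
    if (n < 0) != (d < 0):
        q = -q
    r = n - q * d
    if not -(1 << (bits - 1)) <= q < (1 << (bits - 1)):
        raise Trap("quotient does not fit in the output")
    mask = (1 << bits) - 1
    return q & mask, r & mask     # back to unsigned bit patterns


def traps(op, *args) -> bool:
    try:
        op(*args)
    except Trap:
        return True
    return False
```

For example, with 32-bit words the numerator nhi = 1, nlo = 0 represents 2^32, so dividing by 2 succeeds with quotient 0x80000000, while the numerator 2^33 divided by 2 traps because the quotient needs 33 bits.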
a = x86_cvtt2si x

Convert with truncation floating point to signed integer.

The source floating point operand is converted to a signed integer by rounding towards zero. If the result can’t be represented in the output type, returns the smallest signed value the output type can represent.

This instruction does not trap.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
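
In contrast to fcvt_to_sint, this instruction signals failure through its result value. A scalar Python sketch, with a hypothetical bits parameter standing in for the result type:

```python
import math


def x86_cvtt2si(x: float, bits: int = 32) -> int:
    int_min = -(1 << (bits - 1))
    if math.isnan(x) or math.isinf(x):
        return int_min  # no representable result
    value = math.trunc(x)  # round towards zero
    if not int_min <= value < -int_min:
        return int_min  # out of range
    return value
```

A caller cannot tell a failed conversion from a genuine conversion of the smallest signed value by looking at the result alone, so code that needs trapping behavior has to check the input separately.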
a = x86_fmin x, y

Floating point minimum with Intel semantics.

This is equivalent to the C ternary operator x < y ? x : y which differs from fmin when either operand is NaN or when comparing +0.0 to -0.0.

When the two operands don’t compare as LT, y is returned unchanged, even if it is a signalling NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – A scalar or vector floating point number
Type Variables:
  • Float – inferred from x
a = x86_fmax x, y

Floating point maximum with Intel semantics.

This is equivalent to the C ternary operator x > y ? x : y which differs from fmax when either operand is NaN or when comparing +0.0 to -0.0.

When the two operands don’t compare as GT, y is returned unchanged, even if it is a signalling NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – A scalar or vector floating point number
Type Variables:
  • Float – inferred from x
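
The ternary-operator definitions translate directly into Python, which makes the differences from fmin and fmax easy to check. This is a scalar sketch of the stated semantics only:

```python
import math


def x86_fmin(x: float, y: float) -> float:
    # Same as the C expression x < y ? x : y.
    return x if x < y else y


def x86_fmax(x: float, y: float) -> float:
    # Same as the C expression x > y ? x : y.
    return x if x > y else y


first_nan = x86_fmin(math.nan, 1.0)   # comparison is false, so y is returned
second_nan = x86_fmin(1.0, math.nan)  # comparison is false, so y is returned
zeroes = x86_fmin(-0.0, 0.0)          # -0.0 < 0.0 is false, so y is returned
```

The result depends on operand order: a NaN in x is dropped while a NaN in y is returned, and comparing -0.0 with 0.0 returns whichever zero is passed as y.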

Implementation limits

Cretonne’s intermediate representation imposes some limits on the size of functions and the number of entities allowed. If these limits are exceeded, the implementation will panic.

Number of instructions in a function
At most \(2^{31} - 1\).
Number of EBBs in a function

At most \(2^{31} - 1\).

This follows from the limit on instructions, since every EBB needs at least a terminator instruction.

Number of secondary values in a function

At most \(2^{31} - 1\).

Secondary values are any SSA values that are not the first result of an instruction.

Other entities declared in the preamble

At most \(2^{32} - 1\).

This covers entities such as stack slots, jump tables, external functions, and function signatures.

Number of arguments to an EBB
At most \(2^{16}\).
Number of arguments to a function

At most \(2^{16}\).

This follows from the limit on arguments to the entry EBB. Note that Cretonne may add a handful of ABI register arguments as function signatures are lowered. This is for representing things like the link register, the incoming frame pointer, and callee-saved registers that are saved in the prologue.

Size of function call arguments on the stack

At most \(2^{32} - 1\) bytes.

This is probably not possible to achieve given the limit on the number of arguments, except by requiring extremely large offsets for stack arguments.

Glossary

addressable
Memory in which loads and stores have defined behavior. They either succeed or trap, depending on whether the memory is accessible.
accessible
Addressable memory in which loads and stores always succeed without trapping, except where specified otherwise (e.g. with the aligned flag). Heaps, globals, and the stack may contain accessible, merely addressable, and outright unaddressable regions. There may also be additional regions of addressable and/or accessible memory not explicitly declared.
basic block
A maximal sequence of instructions that can only be entered from the top, and that contains no branch or terminator instructions except for the last instruction.
entry block
The EBB that is executed first in a function. Currently, a Cretonne function must have exactly one entry block which must be the first block in the function. The types of the entry block arguments must match the types of arguments in the function signature.
extended basic block
EBB

A maximal sequence of instructions that can only be entered from the top, and that contains no terminator instructions except for the last one. An EBB can contain conditional branches that can fall through to the following instructions in the block, but only the first instruction in the EBB can be a branch target.

The last instruction in an EBB must be a terminator instruction, so execution cannot flow through to the next EBB in the function. (But there may be a branch to the next EBB.)

Note that some textbooks define an EBB as a maximal subtree in the control flow graph where only the root can be a join node. This definition is not equivalent to Cretonne EBBs.

EBB parameter
A formal parameter for an EBB is an SSA value that dominates everything in the EBB. For each parameter declared by an EBB, a corresponding argument value must be passed when branching to the EBB. The function’s entry EBB has parameters that correspond to the function’s parameters.
EBB argument
Similar to function arguments, EBB arguments must be provided when branching to an EBB that declares formal parameters. When execution begins at the top of an EBB, the formal parameters have the values of the arguments passed in the branch.
function signature

A function signature describes how to call a function. It consists of:

  • The calling convention.
  • The number of arguments and return values. (Functions can return multiple values.)
  • Type and flags of each argument.
  • Type and flags of each return value.

Not all function attributes are part of the signature. For example, a function that never returns could be marked as noreturn, but that is not necessary to know when calling it, so it is just an attribute, and not part of the signature.

function preamble

A list of declarations of entities that are used by the function body. Some of the entities that can be declared in the preamble are:

  • Local variables.
  • Functions that are called directly.
  • Function signatures for indirect function calls.
  • Function flags and attributes that are not part of the signature.
function body
The extended basic blocks which contain all the executable code in a function. The function body follows the function preamble.
intermediate language
IL
The language used to describe functions to Cretonne. This reference describes the syntax and semantics of the Cretonne IL. The IL has two forms: a textual form and an in-memory intermediate representation (IR).
intermediate representation
IR
The in-memory representation of IL. The data structures Cretonne uses to represent a program internally are called the intermediate representation. Cretonne’s IR can be converted to text losslessly.
stack slot
A fixed size memory allocation in the current function’s activation frame. Also called a local variable.
terminator instruction

A control flow instruction that unconditionally directs the flow of execution somewhere else. Execution never continues at the instruction following a terminator instruction.

The basic terminator instructions are jump, return, and trap. Conditional branches and instructions that trap conditionally are not terminator instructions.

trap
traps
trapping
Terminates execution of the current thread. The specific behavior after a trap depends on the underlying OS. For example, a common behavior is delivery of a signal, with the specific signal depending on the event that triggered it.