Cretonne Language Reference

The Cretonne intermediate language (IL) has two equivalent representations: an in-memory data structure that the code generator library uses, and a text format that is used for test cases and debug output. Files containing Cretonne textual IL have the .cton filename extension.

This reference uses the text format to describe IL semantics but glosses over the finer details of the lexical and syntactic structure of the format.

Overall structure

Cretonne compiles functions independently. A .cton IL file may contain multiple functions, and the programmatic API can create multiple function handles at the same time, but the functions don’t share any data or reference each other directly.

This is a simple C function that computes the average of an array of floats:

float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}

Here is the same function compiled into Cretonne IL:

function %average(i32, i32) -> f32 native {
    ss1 = local 8                 ; Stack slot for sum.

ebb1(v1: i32, v2: i32):
    v3 = f64const 0x0.0
    stack_store v3, ss1
    brz v2, ebb3                  ; Handle count == 0.
    v4 = iconst.i32 0
    jump ebb2(v4)

ebb2(v5: i32):
    v6 = imul_imm v5, 4
    v7 = iadd v1, v6
    v8 = load.f32 v7              ; array[i]
    v9 = fpromote.f64 v8
    v10 = stack_load.f64 ss1
    v11 = fadd v9, v10
    stack_store v11, ss1
    v12 = iadd_imm v5, 1
    v13 = icmp ult v12, v2
    brnz v13, ebb2(v12)           ; Loop backedge.
    v14 = stack_load.f64 ss1
    v15 = fcvt_from_uint.f64 v2
    v16 = fdiv v14, v15
    v17 = fdemote.f32 v16
    return v17

ebb3:
    v100 = f32const +NaN
    return v100
}

The first line of a function definition provides the function name and the function signature which declares the argument and return types. Then follows the function preamble which declares a number of entities that can be referenced inside the function. In the example above, the preamble declares a single local variable, ss1.

After the preamble follows the function body which consists of extended basic blocks (EBBs), the first of which is the entry block. Every EBB ends with a terminator instruction, so execution can never fall through to the next EBB without an explicit branch.

A .cton file consists of a sequence of independent function definitions:

function_list ::=  { function }
function      ::=  function_spec "{" preamble function_body "}"
function_spec ::=  "function" function_name signature
preamble      ::=  { preamble_decl }
function_body ::=  { extended_basic_block }

Static single assignment form

The instructions in the function body use and produce values in SSA form. This means that every value is defined exactly once, and every use of a value must be dominated by the definition.

Cretonne does not have phi instructions but uses EBB arguments instead. An EBB can be defined with a list of typed arguments. Whenever control is transferred to the EBB, values for the arguments must be provided. When entering a function, the incoming function arguments are passed as arguments to the entry EBB.

Instructions define zero, one, or more result values. All SSA values are either EBB arguments or instruction results.

In the example above, the loop induction variable i is represented as three SSA values: In the entry block, v4 is the initial value. In the loop block ebb2, the EBB argument v5 represents the value of the induction variable during each iteration. Finally, v12 is computed as the induction variable value for the next iteration.
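The correspondence between the EBB arguments and a source-level loop variable can be traced in a short Python sketch that mirrors the control flow of the IL example. It is an illustration only: the `to_f32` helper is an assumption used to imitate f32 rounding, and the Python variable `v5` is reassigned where the IL would instead pass a new EBB argument.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def average(array, count):
    """Mirror the control flow of the %average IL example."""
    total = 0.0                      # ss1: stack slot holding sum (f64)
    if count == 0:                   # brz v2, ebb3
        return float("nan")          # ebb3: return NaN
    v5 = 0                           # v4 = iconst.i32 0; jump ebb2(v4)
    while True:                      # ebb2(v5: i32)
        total += to_f32(array[v5])   # load.f32, fpromote.f64, fadd
        v12 = v5 + 1                 # induction value for the next iteration
        if not v12 < count:          # icmp ult v12, v2
            break
        v5 = v12                     # brnz v13, ebb2(v12)
    return to_f32(total / count)     # fdiv, fdemote.f32
```

Each assignment to `v5` in the sketch corresponds to one of the two places where the IL transfers control to ebb2 with an argument.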

It can be difficult to generate correct SSA form if the program being converted into Cretonne IL contains multiple assignments to the same variables. Such variables can be presented to Cretonne as stack slots instead. Stack slots are accessed with the stack_store and stack_load instructions which behave more like variable accesses in a typical programming language. Cretonne can perform the necessary data-flow analysis to convert stack slots to SSA form.

Value types

All SSA values have a type which determines the size and shape (for SIMD vectors) of the value. Many instructions are polymorphic – they can operate on different types.

Boolean types

Boolean values are either true or false. While this only requires a single bit to represent, more bits are often used when holding a boolean value in a register or in memory. The b1 type represents an abstract boolean value. It can only exist as an SSA value; it can’t be stored in memory or converted to another type. The larger boolean types can be stored in memory.

Todo

Clarify the representation of larger boolean types.

The multi-bit boolean types can be interpreted in different ways. We could declare that zero means false and non-zero means true. This may require unwanted normalization code in some places.

We could specify a fixed encoding like all ones for true. This would then lead to undefined behavior if untrusted code uses the multibit booleans incorrectly.

Something like this:

  • External code is not allowed to load/store multi-bit booleans or otherwise expose the representation.
  • Each target specifies the exact representation of a multi-bit boolean.
b1

A boolean type with 1 bit.

Bytes:Can’t be stored in memory
b8

A boolean type with 8 bits.

Bytes:1
b16

A boolean type with 16 bits.

Bytes:2
b32

A boolean type with 32 bits.

Bytes:4
b64

A boolean type with 64 bits.

Bytes:8

Integer types

Integer values have a fixed size and can be interpreted as either signed or unsigned. Some instructions will interpret an operand as a signed or unsigned number, others don’t care.

i8

An integer type with 8 bits.

Bytes:1
i16

An integer type with 16 bits.

Bytes:2
i32

An integer type with 32 bits.

Bytes:4
i64

An integer type with 64 bits.

Bytes:8

Floating point types

The floating point types have the IEEE semantics that are supported by most hardware. There is no support for higher-precision types like quads or double-double formats.

f32

A 32-bit floating point type represented in the IEEE 754-2008 binary32 interchange format. This corresponds to the float type in most C implementations.

Bytes:4
f64

A 64-bit floating point type represented in the IEEE 754-2008 binary64 interchange format. This corresponds to the double type in most C implementations.

Bytes:8

SIMD vector types

A SIMD vector type represents a vector of values from one of the scalar types (boolean, integer, and floating point). Each scalar value in a SIMD type is called a lane. The number of lanes must be a power of two in the range 2-256.

iBxN

A SIMD vector of integers. The lane type iB is one of the integer types i8 ... i64.

Some concrete integer vector types are i32x4, i64x8, and i16x4.

The size of a SIMD integer vector in memory is \(N B\over 8\) bytes.

f32xN

A SIMD vector of single precision floating point numbers.

Some concrete f32 vector types are: f32x2, f32x4, and f32x8.

The size of a f32 vector in memory is \(4N\) bytes.

f64xN

A SIMD vector of double precision floating point numbers.

Some concrete f64 vector types are: f64x2, f64x4, and f64x8.

The size of a f64 vector in memory is \(8N\) bytes.

b1xN

A boolean SIMD vector.

Boolean vectors are used when comparing SIMD vectors. For example, comparing two i32x4 values would produce a b1x4 result.

Like the b1 type, a boolean vector cannot be stored in memory.
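The size rules for vector types can be checked with a small sketch. The helper name `vector_size_bytes` and its regular expression are illustrative assumptions, not part of Cretonne; the function only models the lane count rule and the byte sizes given in this section.

```python
import re

# Hypothetical helper for illustration; not a Cretonne API.
_VECTOR_TYPE = re.compile(r"^(i8|i16|i32|i64|f32|f64|b1)x(\d+)$")


def vector_size_bytes(type_name):
    """Return the in-memory size in bytes of a SIMD vector type."""
    match = _VECTOR_TYPE.match(type_name)
    if match is None:
        raise ValueError("not a SIMD vector type: %s" % type_name)
    lane_type, lanes = match.group(1), int(match.group(2))
    # The number of lanes must be a power of two in the range 2-256.
    if lanes < 2 or lanes > 256 or (lanes & (lanes - 1)) != 0:
        raise ValueError("invalid lane count: %d" % lanes)
    if lane_type == "b1":
        # Like b1, a boolean vector cannot be stored in memory.
        raise ValueError("%s has no memory representation" % type_name)
    lane_bits = int(lane_type[1:])
    return lanes * lane_bits // 8
```

For example, `vector_size_bytes("i32x4")` evaluates to 16, in agreement with the formula for integer vectors.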

Pseudo-types and type classes

These are not concrete types, but convenient names used to refer to real types in this reference.

iPtr

A pointer-sized integer.

This is either i32, or i64, depending on whether the target platform has 32-bit or 64-bit pointers.

iB

Any of the scalar integer types i8 ... i64.

Int

Any scalar or vector integer type: iB or iBxN.

fB

Either of the floating point scalar types: f32 or f64.

Float

Any scalar or vector floating point type: fB or fBxN.

TxN

Any SIMD vector type.

Mem

Any type that can be stored in memory: Int or Float.

Logic

Either b1 or b1xN.

Testable

Either b1 or iN.

Immediate operand types

These types are not part of the normal SSA type system. They are used to indicate the different kinds of immediate operands on an instruction.

imm64

A 64-bit immediate integer. The value of this operand is interpreted as a signed two’s complement integer. Instruction encodings may limit the valid range.

In the textual format, imm64 immediates appear as decimal or hexadecimal literals using the same syntax as C.
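As a minimal sketch of the signed interpretation, the following Python function maps a 64-bit pattern to the two's complement value it denotes. The function name is an assumption made for this illustration.

```python
def imm64_as_signed(bits):
    """Interpret the low 64 bits of `bits` as a two's complement integer."""
    bits &= (1 << 64) - 1
    if bits & (1 << 63):
        # The sign bit is set, so the value is negative.
        return bits - (1 << 64)
    return bits
```

Under this reading, the hexadecimal literal 0xffff_ffff_ffff_ffff and the decimal literal -1 denote the same imm64 value.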

offset32

A signed 32-bit immediate address offset.

In the textual format, offset32 immediates always have an explicit sign, and a 0 offset may be omitted.

ieee32

A 32-bit immediate floating point number in the IEEE 754-2008 binary32 interchange format. All bit patterns are allowed.

ieee64

A 64-bit immediate floating point number in the IEEE 754-2008 binary64 interchange format. All bit patterns are allowed.

bool

A boolean immediate value, either false or true.

In the textual format, bool immediates appear as ‘false’ and ‘true’.

intcc

An integer condition code. See the icmp instruction for details.

floatcc

A floating point condition code. See the fcmp instruction for details.

The two IEEE floating point immediate types ieee32 and ieee64 are displayed as hexadecimal floating point literals in the textual IL format. Decimal floating point literals are not allowed because some computer systems can round differently when converting to binary. The hexadecimal floating point format is mostly the same as the one used by C99, but extended to represent all NaN bit patterns:

Normal numbers
Compatible with C99: -0x1.Tpe where T are the trailing significand bits encoded as hexadecimal, and e is the unbiased exponent as a decimal number. ieee32 has 23 trailing significand bits. They are padded with an extra LSB to produce 6 hexadecimal digits. This is not necessary for ieee64 which has 52 trailing significand bits forming 13 hexadecimal digits with no padding.
Zeros
Positive and negative zero are displayed as 0.0 and -0.0 respectively.
Subnormal numbers
Compatible with C99: -0x0.Tpemin where T are the trailing significand bits encoded as hexadecimal, and emin is the minimum exponent as a decimal number.
Infinities
Either -Inf or Inf.
Quiet NaNs
Quiet NaNs have the MSB of the trailing significand set. If the remaining bits of the trailing significand are all zero, the value is displayed as -NaN or NaN. Otherwise, -NaN:0xT where T are the trailing significand bits encoded as hexadecimal.
Signaling NaNs
Displayed as -sNaN:0xT.
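The formatting rules for ieee32 can be sketched in Python as follows. This is an illustration of the rules as stated here, not Cretonne's own printer; the function names are assumptions. NaN payloads are deliberately left out, because the text above does not pin down which bits T covers for a quiet NaN.

```python
import struct


def f32_bits(x):
    """Return the binary32 bit pattern of a Python float."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def format_ieee32(bits):
    """Format a 32-bit pattern using the textual rules described above."""
    sign = "-" if bits >> 31 else ""
    e_bits = (bits >> 23) & 0xFF      # biased exponent
    t_bits = bits & 0x7FFFFF          # 23 trailing significand bits
    # Pad the 23 bits with one extra LSB to get 6 hexadecimal digits.
    digits = "%06x" % (t_bits << 1)
    if e_bits == 0:
        if t_bits == 0:
            return sign + "0.0"                        # zeros
        return "%s0x0.%sp%d" % (sign, digits, -126)    # subnormal, emin = -126
    if e_bits == 0xFF:
        if t_bits == 0:
            return sign + "Inf"
        if t_bits == 0x400000:
            return sign + "NaN"                        # quiet NaN, no payload
        raise ValueError("NaN payloads are not covered by this sketch")
    return "%s0x1.%sp%d" % (sign, digits, e_bits - 127)  # normal number
```

With these definitions, `format_ieee32(f32_bits(1.5))` yields `0x1.800000p0`: the single set trailing significand bit becomes the leading hexadecimal digit after padding.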

Control flow

Branches transfer control to a new EBB and provide values for the target EBB’s arguments, if it has any. Conditional branches only take the branch if their condition is satisfied, otherwise execution continues at the following instruction in the EBB.

jump EBB(args...)

Jump.

Unconditionally jump to an extended basic block, passing the specified EBB arguments. The number and types of arguments must match the destination EBB.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
fallthrough EBB(args...)

Fall through to the next EBB.

This is the same as jump, except the destination EBB must be the next one in the layout.

Jumps are turned into fall-through instructions by the branch relaxation pass. There is no reason to use this instruction outside that pass.

Arguments:
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
brz c, EBB(args...)

Branch when zero.

If c is a b1 value, take the branch when c is false. If c is an integer value, take the branch when c = 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
brnz c, EBB(args...)

Branch when non-zero.

If c is a b1 value, take the branch when c is true. If c is an integer value, take the branch when c != 0.

Arguments:
  • c (Testable) – Controlling value to test
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • Testable – inferred from c
br_icmp Cond, x, y, EBB(args...)

Compare scalar integers and branch.

Compare x and y in the same way as the icmp instruction and take the branch if the condition is true:

br_icmp ugt v1, v2, ebb4(v5, v6)

is semantically equivalent to:

v10 = icmp ugt v1, v2
brnz v10, ebb4(v5, v6)

Some RISC architectures like MIPS and RISC-V provide instructions that implement all or some of the condition codes. The instruction can also be used to represent macro-op fusion on architectures like Intel’s.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
Type Variables:
  • iB – inferred from x
brif Cond, f, EBB(args...)

Branch when condition is true in integer CPU flags.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
brff Cond, f, EBB(args...)

Branch when condition is true in floating point CPU flags.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
  • EBB (ebb) – Destination extended basic block
  • args (variable_args) – EBB arguments
br_table x, JT

Indirect branch via jump table.

Use x as an unsigned index into the jump table JT. If a jump table entry is found, branch to the corresponding EBB. If no entry is found, fall through to the next instruction.

Note that this branch instruction can’t pass arguments to the targeted blocks. Split critical edges as needed to work around this.

Arguments:
  • x (iB) – index into jump table
  • JT (jump_table) – A jump table.
Type Variables:
  • iB – inferred from x
JT = jump_table EBB0, EBB1, ..., EBBn

Declare a jump table in the function preamble.

This declares a jump table for use by the br_table indirect branch instruction. Entries in the table are either EBB names, or 0 which indicates an absent entry.

The EBBs listed must belong to the current function, and they can’t have any arguments.

Arguments:
  • EBB0 – Target EBB when x = 0.
  • EBB1 – Target EBB when x = 1.
  • EBBn – Target EBB when x = n.
Result:

A jump table identifier. (Not an SSA value).
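The lookup performed by br_table can be modelled with a few lines of Python. This is a sketch under stated assumptions: the table is a Python list, absent entries are the integer 0 as in the declaration syntax, and an index past the end of the table is treated the same as an absent entry.

```python
def br_table(x, jump_table, bits=32):
    """Return the target EBB name, or None to fall through."""
    index = x & ((1 << bits) - 1)    # interpret x as an unsigned integer
    if index < len(jump_table) and jump_table[index] != 0:
        return jump_table[index]
    return None                      # no entry: continue at next instruction


# Hypothetical table with an absent entry at index 1.
jt0 = ["ebb1", 0, "ebb3"]
```

Here `br_table(2, jt0)` selects `"ebb3"`, while indices 1 and 3 both fall through.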

Traps stop the program because something went wrong. The exact behavior depends on the target instruction set architecture and operating system. There are explicit trap instructions defined below, but some instructions may also cause traps for certain input values. For example, udiv traps when the divisor is zero.

trap code

Terminate execution unconditionally.

Arguments:
  • code (trapcode) – A trap reason code.
trapz c, code

Trap when zero.

If c is non-zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
  • code (trapcode) – A trap reason code.
Type Variables:
  • Testable – inferred from c
trapnz c, code

Trap when non-zero.

If c is zero, execution continues at the following instruction.

Arguments:
  • c (Testable) – Controlling value to test
  • code (trapcode) – A trap reason code.
Type Variables:
  • Testable – inferred from c

Function calls

A function call needs a target function and a function signature. The target function may be determined dynamically at runtime, but the signature must be known when the function call is compiled. The function signature describes how to call the function, including arguments, return values, and the calling convention:

signature  ::=  "(" [arglist] ")" ["->" retlist] [call_conv]
arglist    ::=  arg { "," arg }
retlist    ::=  arglist
arg        ::=  type [argext] [argspecial]
argext     ::=  "uext" | "sext"
argspecial ::=  "sret" | "link" | "fp" | "csr" | "vmctx"
call_conv  ::=  string

Arguments and return values have flags whose meaning is mostly target dependent. They make it possible to call native functions on the target platform. When calling other Cretonne functions, the flags are not necessary.

Functions that are called directly must be declared in the function preamble:

FN = function NAME signature

Declare a function so it can be called directly.

Arguments:
  • NAME – Name of the function, passed to the linker for resolution.
  • signature – Function signature. See above.
Results:
  • FN – A function identifier that can be used with call.
rvals = call FN(args...)

Direct function call.

Call a function which has been declared in the preamble. The argument types must match the function’s signature.

Arguments:
  • FN (func_ref) – function to call, declared by function
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
return rvals...

Return from the function.

Unconditionally transfer control to the calling function, passing the provided return values. The list of return values must match the function signature’s return types.

Arguments:
  • rvals (variable_args) – return values

This simple example illustrates direct function calls and signatures:

function %gcd(i32 uext, i32 uext) -> i32 uext "C" {
    fn1 = function %divmod(i32 uext, i32 uext) -> i32 uext, i32 uext

ebb1(v1: i32, v2: i32):
    brz v2, ebb2
    v3, v4 = call fn1(v1, v2)
    jump ebb1(v2, v4)

ebb2:
    return v1
}

Indirect function calls use a signature declared in the preamble.

rvals = call_indirect SIG, callee(args...)

Indirect function call.

Call the function pointed to by callee with the given arguments. The called function must match the specified signature.

Arguments:
  • SIG (sig_ref) – function signature
  • callee (iAddr) – address of function to call
  • args (variable_args) – call arguments
Results:
  • rvals (variable_args) – return values
Type Variables:
  • iAddr – inferred from callee
addr = func_addr FN

Get the address of a function.

Compute the absolute address of a function declared in the preamble. The returned address can be used as a callee argument to call_indirect. This is also a method for calling functions that are too far away to be addressable by a direct call instruction.

Arguments:
  • FN (func_ref) – function to call, declared by function
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

Memory

Cretonne provides fully general load and store instructions for accessing memory, as well as extending loads and truncating stores. There are also more restricted operations for accessing specific types of memory objects.

a = load Flags, p, Offset

Load from memory at p + Offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
  • iAddr – from input operand
store Flags, x, p, Offset

Store x to memory at p + Offset.

This is a polymorphic instruction that can store any value type with a memory representation.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (Mem) – Value to be stored
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • Mem – inferred from x
  • iAddr – from input operand

Memory operation flags

Loads and stores can have flags that loosen their semantics in order to enable optimizations.

Flag      Description
notrap    Trapping is not required.
aligned   Trapping allowed for misaligned accesses.

Trapping is part of the semantics of memory accesses. The operating system may have configured parts of the address space to cause a trap when read and/or written, and Cretonne’s memory instructions respect that. When the notrap flag is set, the trapping behavior is optional. This allows the optimizer to delete loads whose results are not used.

Loads and stores are misaligned if the resultant address is not a multiple of the expected alignment. By default, misaligned loads and stores are allowed, but when the aligned flag is set, a misaligned memory access is allowed to trap.
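The effect of the aligned flag can be sketched as a predicate. The function names are assumptions, and the expected alignment is passed in as a parameter because this section does not define how it is derived for a given access.

```python
def is_misaligned(address, alignment):
    """True if address is not a multiple of the expected alignment."""
    return address % alignment != 0


def alignment_trap_allowed(address, alignment, aligned=False):
    """True if the access is permitted to trap because it is misaligned."""
    # Without the aligned flag, misaligned accesses are allowed to proceed.
    return aligned and is_misaligned(address, alignment)
```

Note that the flag only permits a trap; the sketch says nothing about whether a particular target actually traps.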

Local variables

One set of restricted memory operations accesses the current function’s stack frame. The stack frame is divided into fixed-size stack slots that are allocated in the function preamble. Stack slots are not typed; they simply represent a contiguous sequence of bytes in the stack frame.

SS = local Bytes, Flags...

Allocate a stack slot for a local variable in the preamble.

If no alignment is specified, Cretonne will pick an appropriate alignment for the stack slot based on its size and access patterns.

Arguments:
  • Bytes – Stack slot size in bytes.
Flags:
  • align(N) – Request at least N bytes alignment.
Results:
  • SS – Stack slot index.
a = stack_load SS, Offset

Load a value from a stack slot at the constant offset.

This is a polymorphic instruction that can load any value type which has a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(a) + Offset <= sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • a (Mem) – Value loaded
Type Variables:
  • Mem – explicitly provided
stack_store x, SS, Offset

Store a value to a stack slot at a constant offset.

This is a polymorphic instruction that can store any value type with a memory representation.

The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(sizeof(x) + Offset <= sizeof(SS)\).

Arguments:
  • x (Mem) – Value to be stored
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Type Variables:
  • Mem – inferred from x

The dedicated stack access instructions are easy for the compiler to reason about because stack slots and offsets are fixed at compile time. For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.
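Because sizes and offsets are compile-time constants, the in-bounds conditions can be checked statically. The sketch below uses hypothetical function names, and the lower bound of zero on load and store offsets is an assumption; the stated inequality only gives the upper bound.

```python
def stack_access_in_bounds(value_bytes, offset, slot_bytes):
    """Check sizeof(value) + Offset <= sizeof(SS) for stack_load/stack_store."""
    # Assumption: offsets into a stack slot are never negative.
    return 0 <= offset and value_bytes + offset <= slot_bytes


def stack_addr_in_bounds(offset, slot_bytes):
    """Check 0 <= Offset < sizeof(SS) for stack_addr."""
    return 0 <= offset < slot_bytes
```

For instance, an f64 access at offset 16 needs a slot of at least 24 bytes, so `stack_access_in_bounds(8, 16, 24)` holds while `stack_access_in_bounds(8, 16, 16)` does not.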

It can be necessary to escape from the safety of the restricted instructions by taking the address of a stack slot.

addr = stack_addr SS, Offset

Get the address of a stack slot.

Compute the absolute address of a byte in a stack slot. The offset must refer to a byte inside the stack slot: \(0 <= Offset < sizeof(SS)\).

Arguments:
  • SS (stack_slot) – A stack slot.
  • Offset (offset32) – In-bounds offset into stack slot
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided

The stack_addr instruction can be used to macro-expand the stack access instructions before instruction selection:

v1 = stack_load.f64 ss3, 16
; Expands to:
v9 = stack_addr ss3, 16
v1 = load.f64 v9

Global variables

A global variable is an object in memory whose address is not known at compile time. The address is computed at runtime by global_addr, possibly using information provided by the linker via relocations. There are multiple kinds of global variables using different methods for determining their address. Cretonne does not track the type or even the size of global variables; they are just pointers to non-stack memory.

When Cretonne is generating code for a virtual machine environment, globals can be used to access data structures in the VM’s runtime. This requires functions to have access to a VM context pointer which is used as the base address. Typically, the VM context pointer is passed as a hidden function argument to Cretonne functions.

GV = vmctx+Offset

Declare a global variable in the VM context struct.

This declares a global variable whose address is a constant offset from the VM context pointer which is passed as a hidden argument to all functions JIT-compiled for the VM.

Typically, the VM context is a C struct, and the declared global variable is a member of the struct.

Arguments:
  • Offset – Byte offset from the VM context pointer to the global variable.
Results:
  • GV – Global variable.

The address of a global variable can also be derived by treating another global variable as a struct pointer. This makes it possible to chase pointers into VM runtime data structures.

GV = deref(BaseGV)+Offset

Declare a global variable in a struct pointed to by BaseGV.

The address of GV can be computed by first loading a pointer from BaseGV and adding Offset to it.

It is assumed that BaseGV resides in readable memory with the appropriate alignment for storing a pointer.

Chains of deref global variables are possible, but cycles are not allowed. They will be caught by the IL verifier.

Arguments:
  • BaseGV – Global variable containing the base pointer.
  • Offset – Byte offset from the loaded base pointer to the global variable.
Results:
  • GV – Global variable.
addr = global_addr GV

Compute the address of global variable GV.

Arguments:
  • GV (global_var) – A global variable.
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided
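The two kinds of global variable declarations, and the way global_addr resolves them, can be modelled in Python. Everything in this sketch is hypothetical scaffolding: declarations are tuples in a dictionary, `load_ptr` stands in for reading a pointer from memory, and the cycle check imitates what the IL verifier is said to reject.

```python
def global_addr(name, decls, vmctx, load_ptr, _seen=None):
    """Compute the address of global variable `name`.

    decls maps names to ("vmctx", offset) or ("deref", base_name, offset).
    """
    seen = set() if _seen is None else _seen
    if name in seen:
        raise ValueError("cycle in deref chain at " + name)
    seen.add(name)
    decl = decls[name]
    if decl[0] == "vmctx":
        # GV = vmctx+Offset
        return vmctx + decl[1]
    # GV = deref(BaseGV)+Offset: load a pointer from BaseGV, then add Offset.
    _, base_name, offset = decl
    base_addr = global_addr(base_name, decls, vmctx, load_ptr, seen)
    return load_ptr(base_addr) + offset


# Hypothetical declarations: gv1 is a field of a struct pointed to by gv0.
decls = {"gv0": ("vmctx", 64), "gv1": ("deref", "gv0", 8)}
memory = {0x1000 + 64: 0x2000}   # pointer stored in the VM context
```

With a VM context pointer of 0x1000, `global_addr("gv1", decls, 0x1000, memory.__getitem__)` evaluates to 0x2008.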

Heaps

Code compiled from WebAssembly or asm.js runs in a sandbox where it can’t access all process memory. Instead, it is given a small set of memory areas to work in, and all accesses are bounds checked. Cretonne models this through the concept of heaps.

A heap is declared in the function preamble and can be accessed with the heap_addr instruction that traps on out-of-bounds accesses or returns a pointer that is guaranteed to trap. Heap addresses can be smaller than the native pointer size, for example unsigned i32 offsets on a 64-bit architecture.

Figure: Heap address space layout (mapped pages | unmapped pages | guard pages)

A heap appears as three consecutive ranges of address space:

  1. The mapped pages are the usable memory range in the heap. Loads and stores to this range won’t trap. A heap may have a minimum guaranteed size which means that some mapped pages are always present.
  2. The unmapped pages is a possibly empty range of address space that may be mapped in the future when the heap is grown.
  3. The guard pages is a range of address space that is guaranteed to cause a trap when accessed. It is used to optimize bounds checking for heap accesses with a shared base pointer.

The heap bound is the total size of the mapped and unmapped pages. This is the bound that heap_addr checks against. Memory accesses inside the heap bounds can trap if they hit an unmapped page.

addr = heap_addr H, p, Size

Bounds check and compute absolute address of heap memory.

Verify that the offset range p .. p + Size - 1 is in bounds for the heap H, and generate an absolute address that is safe to dereference.

  1. If p + Size is not greater than the heap bound, return an absolute address corresponding to a byte offset of p from the heap’s base address.
  2. If p + Size is greater than the heap bound, generate a trap.
Arguments:
  • H (heap) – A heap.
  • p (HeapOffset) – An unsigned heap offset
  • Size (uimm32) – Size in bytes
Results:
  • addr (iAddr) – An integer address type
Type Variables:
  • iAddr – explicitly provided
  • HeapOffset – from input operand
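The two cases of heap_addr can be written down as a minimal Python model. The `Trap` exception and the explicit `base` and `bound` parameters are assumptions of the sketch. It models only the variant that traps immediately, not the alternative of returning a pointer that is guaranteed to trap when dereferenced.

```python
class Trap(Exception):
    """Stands in for a hardware or runtime trap."""


def heap_addr(base, bound, p, size):
    """Bounds check offsets p .. p + size - 1 and return an absolute address."""
    if p + size > bound:
        # Case 2: the range extends past the heap bound.
        raise Trap("heap access out of bounds")
    # Case 1: the range is in bounds.
    return base + p
```

For a heap with base 0x10000 and bound 0x1000, `heap_addr(0x10000, 0x1000, 0xFFC, 4)` returns 0x10FFC, while a 4-byte access at offset 0xFFD raises `Trap`.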

Two styles of heaps are supported, static and dynamic. They behave differently when resized.

Static heaps

A static heap starts out with all the address space it will ever need, so it never moves to a different address. At the base address is a number of mapped pages corresponding to the heap’s current size. Then follows a number of unmapped pages where the heap can grow up to its maximum size. After the unmapped pages follow the guard pages which are also guaranteed to generate a trap when accessed.

H = static Base, min MinBytes, bound BoundBytes, guard GuardBytes

Declare a static heap in the preamble.

Arguments:
  • Base – Global variable holding the heap’s base address or reserved_reg.
  • MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
  • BoundBytes – Fixed heap bound in bytes. This defines the amount of address space reserved for the heap, not including the guard pages.
  • GuardBytes – Size of the guard pages in bytes.

Dynamic heaps

A dynamic heap can be relocated to a different base address when it is resized, and its bound can move dynamically. The guard pages move when the heap is resized. The bound of a dynamic heap is stored in a global variable.

H = dynamic Base, min MinBytes, bound BoundGV, guard GuardBytes

Declare a dynamic heap in the preamble.

Arguments:
  • Base – Global variable holding the heap’s base address or reserved_reg.
  • MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
  • BoundGV – Global variable containing the current heap bound in bytes.
  • GuardBytes – Size of the guard pages in bytes.

Heap examples

The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of guard pages when running WebAssembly code on 64-bit CPUs. The combination of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be generated for bounds checks at all:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x1_0000_0000, guard 0x8000_0000

ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}
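The claim that no bounds check code is needed follows from simple arithmetic, sketched here in Python. The helper names are assumptions; the constants are taken from the example above.

```python
BOUND = 0x1_0000_0000   # 4 GB static bound
GUARD = 0x8000_0000     # 2 GB of guard pages


def bounds_check_needed(offset_bits, size, bound):
    """True if some unsigned offset p can make p + size exceed the bound."""
    max_p = (1 << offset_bits) - 1
    return max_p + size > bound


def furthest_byte(offset_bits, load_offset, access_bytes):
    """Largest heap-relative byte offset an access can touch."""
    max_p = (1 << offset_bits) - 1
    return max_p + load_offset + access_bytes - 1
```

With i32 offsets and a 1-byte check, `bounds_check_needed(32, 1, BOUND)` is false, so the check can never fail. The furthest byte touched by the load at v1+20 is `furthest_byte(32, 20, 4)`, which stays below `BOUND + GUARD` and therefore lands in mapped, unmapped, or guard pages.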

A static heap can also be used for 32-bit code when the WebAssembly module declares a small upper bound on its memory. A 1 MB static bound with a single 4 KB guard page still has opportunities for sharing bounds checking code:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x10_0000, guard 0x1000

ebb0(v0: i32):
    v1 = heap_addr.i32 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}

If the upper bound on the heap size is too large, a dynamic heap is required instead.

Finally, a runtime environment that simply allocates a heap with malloc() may not have any guard pages at all. In that case, full bounds checking is required for each access:

function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    gv1 = vmctx+72
    heap0 = dynamic gv0, min 0x1000, bound gv1, guard 0

ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 20
    v2 = load.f32 v1+16
    v3 = heap_addr.i64 heap0, v0, 24
    v4 = load.f32 v3+20
    v5 = fadd v2, v4
    return v5
}
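The Size operands used across the three examples can be related by one rule of thumb, sketched below. The rule is an interpretation rather than something this reference states: a 1-byte check is taken to be sufficient whenever the load offset plus the access width fits inside the guard pages, and otherwise the check must cover the whole accessed range.

```python
def heap_check_size(load_offset, access_bytes, guard_bytes):
    """Pick the Size operand for heap_addr (conservative sketch)."""
    end = load_offset + access_bytes
    if end <= guard_bytes:
        # Any overshoot past the bound is absorbed by the guard pages.
        return 1
    # No guard pages to rely on: check the full range up to the last byte.
    return end
```

Without guard pages, the f32 loads at offsets 16 and 20 give sizes 20 and 24, as in the last example; with a 4 KB guard region both reduce to 1, which lets the two loads share one heap_addr.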

Operations

A few instructions have variants that take immediate operands (e.g., band / band_imm), but in general an instruction is required to load a constant into an SSA value.

a = select c, x, y

Conditional select.

This instruction selects whole values. Use vselect for lane-wise selection.

Arguments:
  • c (Testable) – Controlling value to test
  • x (Any) – Value to use when c is true
  • y (Any) – Value to use when c is false
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
  • Testable – from input operand

Constant materialization

a = iconst N

Integer constant.

Create a scalar integer SSA value with an immediate constant value, or an integer vector where all the lanes have the same value.

Arguments:
  • N (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A constant integer scalar or vector value
Type Variables:
  • Int – explicitly provided
a = f32const N

Floating point constant.

Create an f32 SSA value with an immediate constant value, or a floating point vector where all the lanes have the same value.

Arguments:
  • N (ieee32) – A 32-bit immediate floating point number.
Results:
  • a (f32) – A constant floating point scalar or vector value
a = f64const N

Floating point constant.

Create an f64 SSA value with an immediate constant value, or a floating point vector where all the lanes have the same value.

Arguments:
  • N (ieee64) – A 64-bit immediate floating point number.
Results:
  • a (f64) – A constant floating point scalar or vector value
a = bconst N

Boolean constant.

Create a scalar boolean SSA value with an immediate constant value, or a boolean vector where all the lanes have the same value.

Arguments:
  • N (bool) – An immediate boolean.
Results:
  • a (Bool) – A constant boolean scalar or vector value
Type Variables:
  • Bool – explicitly provided

Live range splitting

Cretonne’s register allocator assigns each SSA value to a register or a spill slot on the stack for its entire live range. Since the live range of an SSA value can be quite large, it is sometimes beneficial to split the live range into smaller parts.

A live range is split by creating new SSA values that are copies of the original value or of each other. The copies are created by inserting copy, spill, or fill instructions, depending on whether the values are assigned to registers or stack slots.

This approach permits SSA form to be preserved throughout the register allocation pass and beyond.

a = copy x

Register-register copy.

This instruction copies its input, preserving the value type.

A pure SSA-form program does not need to copy values, but this instruction is useful for representing intermediate stages during instruction transformations, and the register allocator needs a way of representing register copies.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = spill x

Spill a register value to a stack slot.

This instruction behaves exactly like copy, but the result value is assigned to a spill slot.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x
a = fill x

Load a register value from a stack slot.

This instruction behaves exactly like copy, but creates a new SSA value for the spilled input value.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
Results:
  • a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • Any – inferred from x

Register values can be temporarily diverted to other registers by the regmove instruction, and to and from stack slots by regspill and regfill.

regmove x, src, dst

Temporarily divert x from src to dst.

This instruction moves the location of a value from one register to another without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

The register diversions created by this instruction must be undone before the value leaves the EBB. At the entry to a new EBB, all live values must be in their originally assigned registers.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • src (regunit) – A register unit in the target ISA
  • dst (regunit) – A register unit in the target ISA
Type Variables:
  • Any – inferred from x
regspill x, src, SS

Temporarily divert x from src to SS.

This instruction moves the location of a value from a register to a stack slot without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • src (regunit) – A register unit in the target ISA
  • SS (stack_slot) – A stack slot.
Type Variables:
  • Any – inferred from x
regfill x, SS, dst

Temporarily divert x from SS to dst.

This instruction moves the location of a value from a stack slot to a register without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.

See also regmove.

Arguments:
  • x (Any) – Any integer, float, or boolean scalar or vector type
  • SS (stack_slot) – A stack slot.
  • dst (regunit) – A register unit in the target ISA
Type Variables:
  • Any – inferred from x

Vector operations

lo, hi = vsplit x

Split a vector into two halves.

Split the vector x into two separate values, each containing half of the lanes from x. The result may be two scalars if x only had two lanes.

Arguments:
  • x (TxN) – Vector to split
Results:
  • lo (half_vector(TxN)) – Low-numbered lanes of x
  • hi (half_vector(TxN)) – High-numbered lanes of x
Type Variables:
  • TxN – inferred from x
a = vconcat x, y

Vector concatenation.

Return a vector formed by concatenating x and y. The resulting vector type has twice as many lanes as each of the inputs. The lanes of x appear as the low-numbered lanes, and the lanes of y become the high-numbered lanes of a.

It is possible to form a vector by concatenating two scalars.

Arguments:
  • x (Any128) – Low-numbered lanes
  • y (Any128) – High-numbered lanes
Results:
  • a (double_vector(Any128)) – Concatenation of x and y
Type Variables:
  • Any128 – inferred from x
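
As an informal model of the lane ordering described for vsplit and vconcat, the sketch below represents a vector as a Python list with lane 0 first. The function names mirror the instructions, but this is an illustration only, and a one-lane list stands in for the scalar case.

```python
def vsplit(x):
    """Split a list of lanes into (low-numbered half, high-numbered half)."""
    if len(x) < 2 or len(x) % 2 != 0:
        raise ValueError("vsplit needs an even, non-zero number of lanes")
    half = len(x) // 2
    return x[:half], x[half:]


def vconcat(x, y):
    """Concatenate two equally sized lane lists; x supplies the low lanes."""
    if len(x) != len(y):
        raise ValueError("vconcat operands must have the same number of lanes")
    return x + y


v = [10, 11, 12, 13]
lo, hi = vsplit(v)          # lo holds lanes 0-1, hi holds lanes 2-3
roundtrip = vconcat(lo, hi)  # concatenating the halves restores v
```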
a = vselect c, x, y

Vector lane select.

Select lanes from x or y controlled by the lanes of the boolean vector c.

Arguments:
  • c (as_bool(TxN)) – Controlling vector
  • x (TxN) – Value to use where c is true
  • y (TxN) – Value to use where c is false
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
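
The difference between select and vselect can be sketched with Python lists standing in for vectors. This is a hedged illustration, not Cretonne code: select picks one whole operand, while vselect decides independently for each lane.

```python
def select(c, x, y):
    """Whole-value select: return x if c is true, otherwise y."""
    return x if c else y


def vselect(c, x, y):
    """Lane-wise select: lane i comes from x where c[i] is true, else from y."""
    if not (len(c) == len(x) == len(y)):
        raise ValueError("all operands must have the same number of lanes")
    return [xi if ci else yi for ci, xi, yi in zip(c, x, y)]


mask = [True, False, True, False]
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
mixed = vselect(mask, a, b)   # lanes 0 and 2 from a, lanes 1 and 3 from b
whole = select(True, a, b)    # all of a
```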
a = splat x

Vector splat.

Return a vector whose lanes are all x.

Arguments:
  • x (lane_of(TxN)) – Scalar value to replicate in every lane
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – explicitly provided
a = insertlane x, Idx, y

Insert y as lane Idx in x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – SIMD vector to modify
  • Idx (uimm8) – Lane index
  • y (lane_of(TxN)) – New lane value
Results:
  • a (TxN) – A SIMD vector type
Type Variables:
  • TxN – inferred from x
a = extractlane x, Idx

Extract lane Idx from x.

The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.

Arguments:
  • x (TxN) – A SIMD vector type
  • Idx (uimm8) – Lane index
Results:
  • a (lane_of(TxN)) – The value of lane Idx in x
Type Variables:
  • TxN – inferred from x
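
The lane instructions above can be modeled with lists, as in the following sketch. The `lanes` parameter of `splat` is an assumption of the model; in the IL the lane count comes from the explicitly provided vector type. Note that `insertlane` returns a new value and leaves its input untouched, in keeping with SSA form.

```python
def splat(x, lanes):
    """Return a vector whose lanes are all x."""
    return [x] * lanes


def insertlane(x, idx, y):
    """Return a copy of x with lane idx replaced by y."""
    if not 0 <= idx < len(x):
        raise IndexError("lane index out of range for this vector type")
    result = list(x)
    result[idx] = y
    return result


def extractlane(x, idx):
    """Return lane idx of x."""
    if not 0 <= idx < len(x):
        raise IndexError("lane index out of range for this vector type")
    return x[idx]


zeros = splat(0, 4)
patched = insertlane(zeros, 2, 7)
```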

Integer operations

a = icmp Cond, x, y

Integer comparison.

The condition code determines if the operands are interpreted as signed or unsigned integers.

Signed  Unsigned  Condition
eq      eq        Equal
ne      ne        Not equal
slt     ult       Less than
sge     uge       Greater than or equal
sgt     ugt       Greater than
sle     ule       Less than or equal

When this instruction compares integer vectors, it returns a boolean vector of lane-wise comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (as_bool(Int)) – Boolean result of the comparison, lane-wise for vectors
Type Variables:
  • Int – inferred from x
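
To see why the condition code matters, the sketch below evaluates the same bit patterns under the signed and unsigned interpretations. It is an illustrative model for scalar values only; `to_signed` is a hypothetical helper and the 32-bit default width is an arbitrary choice.

```python
def to_signed(value, bits):
    """Reinterpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def icmp(cond, x, y, bits=32):
    """Evaluate an intcc condition on two B-bit patterns."""
    mask = (1 << bits) - 1
    ux, uy = x & mask, y & mask
    sx, sy = to_signed(ux, bits), to_signed(uy, bits)
    results = {
        "eq": ux == uy, "ne": ux != uy,
        "slt": sx < sy, "sge": sx >= sy, "sgt": sx > sy, "sle": sx <= sy,
        "ult": ux < uy, "uge": ux >= uy, "ugt": ux > uy, "ule": ux <= uy,
    }
    return results[cond]


# 0xFFFF_FFFF is -1 when read as signed, but the largest value when unsigned.
signed_less = icmp("slt", 0xFFFF_FFFF, 1)
unsigned_less = icmp("ult", 0xFFFF_FFFF, 1)
```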
a = icmp_imm Cond, x, Y

Compare scalar integer to a constant.

This is the same as the icmp instruction, except one operand is an immediate constant.

This instruction can only compare scalars. Use icmp for lane-wise vector comparisons.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (b1) – A boolean type with 1 bit.
Type Variables:
  • iB – inferred from x
f = ifcmp x, y

Compare scalar integers and return flags.

Compare two scalar integer values and return integer CPU flags representing the result.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Type Variables:
  • iB – inferred from x
f = ifcmp_imm x, Y

Compare scalar integer to a constant and return flags.

Like icmp_imm, but returns integer CPU flags instead of testing a specific condition code.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Type Variables:
  • iB – inferred from x
a = iadd x, y

Wrapping integer addition: \(a := x + y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = iadd_imm x, Y

Add immediate integer.

Same as iadd, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = iadd_cin x, y, c_in

Add integers with carry in.

Same as iadd with an additional carry input. Computes:

\[a = x + y + c_{in} \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, c_out = iadd_cout x, y

Add integers with carry out.

Same as iadd with an additional carry output.

\[\begin{split}a &= x + y \pmod{2^B} \\ c_{out} &= x + y \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from x
a, c_out = iadd_carry x, y, c_in

Add integers with carry in and out.

Same as iadd with an additional carry input and output.

\[\begin{split}a &= x + y + c_{in} \pmod{2^B} \\ c_{out} &= x + y + c_{in} \geq 2^B\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • c_in (b1) – Input carry flag
Results:
  • a (iB) – A scalar integer type
  • c_out (b1) – Output carry flag
Type Variables:
  • iB – inferred from y
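
The carry instructions are typically chained to add integers wider than one register. The sketch below follows the formulas given for iadd_carry; `add_wide` is a hypothetical helper, words are stored least significant first, and all inputs are assumed to already lie in the range 0 to 2^B - 1.

```python
def iadd_carry(x, y, c_in, bits=32):
    """Return (a, c_out) for B-bit unsigned x and y plus a carry-in flag."""
    total = x + y + int(c_in)
    return total & ((1 << bits) - 1), total >= (1 << bits)


def add_wide(x_words, y_words, bits=32):
    """Add two multi-word integers, least significant word first."""
    carry = False
    out = []
    for x, y in zip(x_words, y_words):
        word, carry = iadd_carry(x, y, carry, bits)
        out.append(word)
    return out, carry


# 0x0000_0000_FFFF_FFFF + 1 as two 32-bit words: the carry ripples upward.
words, carry_out = add_wide([0xFFFF_FFFF, 0], [1, 0])
```

In this model the first step of the chain, which has no incoming carry, corresponds to iadd_cout, and a final step whose carry is discarded corresponds to iadd_cin.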
a = isub x, y

Wrapping integer subtraction: \(a := x - y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = irsub_imm x, Y

Immediate reverse wrapping subtraction: \(a := Y - x \pmod{2^B}\).

Also works as integer negation when \(Y = 0\). Use iadd_imm with a negative immediate operand for the reverse immediate subtraction.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = isub_bin x, y, b_in

Subtract integers with borrow in.

Same as isub with an additional borrow flag input. Computes:

\[a = x - (y + b_{in}) \pmod{2^B}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from y
a, b_out = isub_bout x, y

Subtract integers with borrow out.

Same as isub with an additional borrow flag output.

\[\begin{split}a &= x - y \pmod{2^B} \\ b_{out} &= x < y\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from x
a, b_out = isub_borrow x, y, b_in

Subtract integers with borrow in and out.

Same as isub with an additional borrow flag input and output.

\[\begin{split}a &= x - (y + b_{in}) \pmod{2^B} \\ b_{out} &= x < y + b_{in}\end{split}\]

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • y (iB) – A scalar integer type
  • b_in (b1) – Input borrow flag
Results:
  • a (iB) – A scalar integer type
  • b_out (b1) – Output borrow flag
Type Variables:
  • iB – inferred from y
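
The borrow formulas can be checked with a small model. This sketch treats x and y as unsigned B-bit values already in range, which is an assumption of the model; `sub_wide` is a hypothetical helper that chains the borrow across words stored least significant first.

```python
def isub_borrow(x, y, b_in, bits=32):
    """Return (a, b_out) where a = x - (y + b_in) mod 2^B and b_out = x < y + b_in."""
    rhs = y + int(b_in)
    return (x - rhs) & ((1 << bits) - 1), x < rhs


def sub_wide(x_words, y_words, bits=32):
    """Subtract multi-word integers, least significant word first."""
    borrow = False
    out = []
    for x, y in zip(x_words, y_words):
        word, borrow = isub_borrow(x, y, borrow, bits)
        out.append(word)
    return out, borrow


# 0x0000_0001_0000_0000 - 1: the low word borrows from the high word.
words, borrow_out = sub_wide([0, 1], [1, 0])
```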
a = imul x, y

Wrapping integer multiplication: \(a := x y \pmod{2^B}\).

This instruction does not depend on the signed/unsigned interpretation of the operands.

Polymorphic over all integer types (vector and scalar).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = imul_imm x, Y

Integer multiplication by immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Larger multiplication results.

For example, smulx which multiplies i32 operands to produce an i64 result. Alternatively, smulhi and smullo pairs.

a = udiv x, y

Unsigned integer division: \(a := \lfloor {x \over y} \rfloor\).

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = udiv_imm x, Y

Unsigned integer division by an immediate constant.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = sdiv x, y

Signed integer division rounded toward zero: \(a := \operatorname{sign}(xy) \lfloor {|x| \over |y|}\rfloor\).

This operation traps if the divisor is zero, or if the result is not representable in \(B\) bits two’s complement. This only happens when \(x = -2^{B-1}, y = -1\).

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
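
The rounding rule and the two trap conditions of sdiv can be modeled as follows. This is a sketch for scalar values only: x and y are assumed to be Python integers already within the signed B-bit range, and `DivisionTrap` is a hypothetical exception standing in for a trap.

```python
class DivisionTrap(Exception):
    """Stands in for the trap raised by integer division in this sketch."""


def sdiv(x, y, bits=32):
    """Signed division rounded toward zero, trapping as described for sdiv."""
    if y == 0:
        raise DivisionTrap("division by zero")
    if x == -(1 << (bits - 1)) and y == -1:
        raise DivisionTrap("result not representable in B bits")
    quotient = abs(x) // abs(y)
    return -quotient if (x < 0) != (y < 0) else quotient


toward_zero = sdiv(-7, 2)   # rounds toward zero
python_floor = -7 // 2      # Python's // rounds toward negative infinity
```

The comparison with Python's own `//` operator shows that the two rounding modes differ whenever the exact quotient is negative and not an integer.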
a = sdiv_imm x, Y

Signed integer division by an immediate constant.

This instruction never traps because a divisor of -1 or 0 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = urem x, y

Unsigned integer remainder.

This operation traps if the divisor is zero.

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = urem_imm x, Y

Unsigned integer remainder with immediate divisor.

This instruction never traps because a divisor of zero is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = srem x, y

Signed integer remainder. The result has the sign of the dividend.

This operation traps if the divisor is zero.

Todo

Integer remainder vs modulus.

Should we add a smod instruction for the case where the result has the same sign as the divisor?

Arguments:
  • x (Int) – A scalar or vector integer type
  • y (Int) – A scalar or vector integer type
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
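
The distinction raised in the todo note above can be made concrete. In this sketch, `srem` follows the rule that the result takes the sign of the dividend, while Python's built-in `%` operator happens to behave like the hypothetical smod, taking the sign of the divisor. The use of `ZeroDivisionError` to represent the trap is an assumption of the model.

```python
def srem(x, y):
    """Signed remainder whose result has the sign of the dividend x."""
    if y == 0:
        raise ZeroDivisionError("srem traps when the divisor is zero")
    remainder = abs(x) % abs(y)
    return -remainder if x < 0 else remainder


pairs = [(-7, 2), (7, -2), (-7, -2), (7, 2)]
rem_results = [srem(x, y) for x, y in pairs]  # sign follows the dividend
mod_results = [x % y for x, y in pairs]       # sign follows the divisor
```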
a = srem_imm x, Y

Signed integer remainder with immediate divisor.

This instruction never traps because a divisor of 0 or -1 is not allowed.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x

Todo

Minimum / maximum.

NEON has smin, smax, umin, and umax instructions. We should replicate those for both scalar and vector integer types. Even if the target ISA doesn’t have scalar operations, these are good pattern matching targets.

Todo

Saturating arithmetic.

Mostly for SIMD use, but again these are good patterns for contraction. Something like usatadd, usatsub, ssatadd, and ssatsub is a good start.

Bitwise operations

The bitwise operations operate on any value type: integers, floating point numbers, and booleans. When operating on integer or floating point types, the bitwise operations work on the binary representation of the values. When operating on boolean values, the bitwise operations work as logical operators.

a = band x, y

Bitwise and.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = band_imm x, Y

Bitwise and with immediate.

Same as band, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bor x, y

Bitwise or.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bor_imm x, Y

Bitwise or with immediate.

Same as bor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bxor x, y

Bitwise xor.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bxor_imm x, Y

Bitwise xor with immediate.

Same as bxor, but one operand is an immediate constant.

Polymorphic over all scalar integer types, but does not support vector types.

Arguments:
  • x (iB) – A scalar integer type
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = bnot x

Bitwise not.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = band_not x, y

Bitwise and not.

Computes x & ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bor_not x, y

Bitwise or not.

Computes x | ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
a = bxor_not x, y

Bitwise xor not.

Computes x ^ ~y.

Arguments:
  • x (bits) – Any integer, float, or boolean scalar or vector type
  • y (bits) – Any integer, float, or boolean scalar or vector type
Results:
  • a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:
  • bits – inferred from x
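
The three "not" variants invert the second operand before combining. The following sketch models them on 8-bit integer patterns; the explicit mask is needed only because Python integers are unbounded, and the 8-bit width is an arbitrary choice for the illustration.

```python
def band_not(x, y, bits=8):
    """x & ~y on a B-bit pattern."""
    return x & ~y & ((1 << bits) - 1)


def bor_not(x, y, bits=8):
    """x | ~y on a B-bit pattern."""
    return (x | ~y) & ((1 << bits) - 1)


def bxor_not(x, y, bits=8):
    """x ^ ~y on a B-bit pattern."""
    return (x ^ ~y) & ((1 << bits) - 1)


x, y = 0b1100_1010, 0b1010_0110
cleared = band_not(x, y)   # bits of x with the bits set in y cleared
xnor = bxor_not(x, y)      # set wherever x and y agree
```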

The shift and rotate operations only work on integer types (scalar and vector). The shift amount does not have to be the same type as the value being shifted. When shifting or rotating a B-bit value, the shift amount is reduced modulo B, so only its low \(\log_2 B\) bits are significant.

When operating on an integer vector type, the shift amount is still a scalar type, and all the lanes are shifted the same amount. The shift amount is masked to the number of bits in a lane, not the full size of the vector type.

a = rotl x, y

Rotate left.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotl_imm x, Y

Rotate left by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = rotr x, y

Rotate right.

Rotate the bits in x by y places.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = rotr_imm x, Y

Rotate right by immediate.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
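
A rotate can be modeled as two shifts whose results are combined. The sketch below is an illustration for scalar values; it assumes the rotate amount is reduced modulo the bit width in the same way as the shift amount described above.

```python
def rotl(x, y, bits=32):
    """Rotate the low `bits` bits of x left by y places."""
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    return ((x << s) | (x >> (bits - s))) & mask


def rotr(x, y, bits=32):
    """Rotate the low `bits` bits of x right by y places."""
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    return ((x >> s) | (x << (bits - s))) & mask


left = rotl(0x8000_0001, 1)   # the MSB wraps around to the LSB
right = rotr(left, 1)         # rotating back restores the input
```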
a = ishl x, y

Integer shift left. Shift the bits in x towards the MSB by y places. Shift in zero bits to the LSB.

The shift amount is masked to the size of x.

When shifting a B-bits integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= x \cdot 2^s \pmod{2^B}.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ishl_imm x, Y

Integer shift left by immediate.

The shift amount is masked to the size of x.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = ushr x, y

Unsigned shift right. Shift bits in x towards the LSB by y places, shifting in zero bits to the MSB. Also called a logical shift.

The shift amount is masked to the size of the register.

When shifting a B-bits integer type, this instruction computes:

\[\begin{split}s &:= y \pmod B, \\ a &:= \lfloor x \cdot 2^{-s} \rfloor.\end{split}\]
Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = ushr_imm x, Y

Unsigned shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
a = sshr x, y

Signed shift right. Shift bits in x towards the LSB by y places, shifting in sign bits to the MSB. Also called an arithmetic shift.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • y (iB) – Number of bits to shift
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
  • iB – from input operand
a = sshr_imm x, Y

Signed shift right by immediate.

The shift amount is masked to the size of the register.

Arguments:
  • x (Int) – Scalar or vector value to shift
  • Y (imm64) – A 64-bit immediate integer.
Results:
  • a (Int) – A scalar or vector integer type
Type Variables:
  • Int – inferred from x
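
The three shifts differ only in direction and in which bits are shifted in. This sketch models them for scalar B-bit patterns, with the shift amount reduced modulo B as in the formulas above; results are returned as unsigned bit patterns.

```python
def ishl(x, y, bits=32):
    """Shift left, filling the low bits with zeros."""
    return (x << (y % bits)) & ((1 << bits) - 1)


def ushr(x, y, bits=32):
    """Logical shift right, filling the high bits with zeros."""
    return (x & ((1 << bits) - 1)) >> (y % bits)


def sshr(x, y, bits=32):
    """Arithmetic shift right, filling the high bits with copies of the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    signed = x - (1 << bits) if x >> (bits - 1) else x
    return (signed >> (y % bits)) & mask


logical = ushr(0x8000_0000, 31)      # zero bits enter from the MSB side
arithmetic = sshr(0x8000_0000, 31)   # sign bits enter from the MSB side
wrapped = ishl(1, 33)                # 33 mod 32 = 1, so this shifts by one place
```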

The bit-counting instructions below are scalar only.

a = clz x

Count leading zero bits.

Starting from the MSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = cls x

Count leading sign bits.

Starting from the MSB after the sign bit in x, count the number of consecutive bits identical to the sign bit. When x is 0 or -1, returns one less than the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = ctz x

Count trailing zeros.

Starting from the LSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
a = popcnt x

Population count.

Count the number of one bits in x.

Arguments:
  • x (iB) – A scalar integer type
Results:
  • a (iB) – A scalar integer type
Type Variables:
  • iB – inferred from x
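
The edge cases stated for the bit-counting instructions (zero inputs, and 0 or -1 for cls) can be checked against a small model. The sketch below works on unsigned B-bit patterns and is an illustration only; it relies on Python's `int.bit_length` method.

```python
def clz(x, bits=32):
    """Count zero bits above the most significant one bit."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def ctz(x, bits=32):
    """Count zero bits below the least significant one bit."""
    x &= (1 << bits) - 1
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1


def cls(x, bits=32):
    """Count bits after the sign bit that are identical to the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):
        x = ~x & mask  # invert so that copies of the sign bit become zeros
    return clz(x, bits) - 1


def popcnt(x, bits=32):
    """Count the one bits."""
    return bin(x & ((1 << bits) - 1)).count("1")
```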

Floating point operations

These operations generally follow IEEE 754-2008 semantics.

a = fcmp Cond, x, y

Floating point comparison.

Two IEEE 754-2008 floating point numbers, x and y, relate to each other in exactly one of four ways:

UN  Unordered when one or both numbers are NaN.
EQ  When \(x = y\). (And \(0.0 = -0.0\)).
LT  When \(x < y\).
GT  When \(x > y\).

The 14 floatcc condition codes each correspond to a subset of the four relations, except for the empty set which would always be false, and the full set which would always be true.

The condition codes are divided into 7 ‘ordered’ conditions which don’t include UN, and 7 unordered conditions which all include UN.

Ordered              Unordered              Condition
ord  EQ | LT | GT    uno  UN                NaNs absent / present.
eq   EQ              ueq  UN | EQ           Equal
one  LT | GT         ne   UN | LT | GT      Not equal
lt   LT              ult  UN | LT           Less than
le   LT | EQ         ule  UN | LT | EQ      Less than or equal
gt   GT              ugt  UN | GT           Greater than
ge   GT | EQ         uge  UN | GT | EQ      Greater than or equal

The standard C comparison operators, <, <=, >, >=, are all ordered, so they are false if either operand is NaN. The C equality operator, ==, is ordered, and since inequality is defined as the logical inverse it is unordered. They map to the floatcc condition codes as follows:

C    Cond  Subset
==   eq    EQ
!=   ne    UN | LT | GT
<    lt    LT
<=   le    LT | EQ
>    gt    GT
>=   ge    GT | EQ

This subset of condition codes also corresponds to the WebAssembly floating point comparisons of the same name.

When this instruction compares floating point vectors, it returns a boolean vector with the results of lane-wise comparisons.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (as_bool(Float)) – Boolean result of the comparison, lane-wise for vectors
Type Variables:
  • Float – inferred from x
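
The relationship between the four relations and the 14 condition codes can be expressed as a lookup table. The sketch below is a scalar model built from the tables above; the names `relation` and `FLOATCC` are hypothetical and not part of any Cretonne API.

```python
import math


def relation(x, y):
    """Classify x and y into exactly one of UN, EQ, LT, GT."""
    if math.isnan(x) or math.isnan(y):
        return "UN"
    if x == y:
        return "EQ"
    return "LT" if x < y else "GT"


# Each condition code is the set of relations for which it is true.
FLOATCC = {
    "ord": {"EQ", "LT", "GT"}, "uno": {"UN"},
    "eq": {"EQ"},              "ueq": {"UN", "EQ"},
    "one": {"LT", "GT"},       "ne": {"UN", "LT", "GT"},
    "lt": {"LT"},              "ult": {"UN", "LT"},
    "le": {"LT", "EQ"},        "ule": {"UN", "LT", "EQ"},
    "gt": {"GT"},              "ugt": {"UN", "GT"},
    "ge": {"GT", "EQ"},        "uge": {"UN", "GT", "EQ"},
}


def fcmp(cond, x, y):
    """Evaluate a floatcc condition code on two scalar floats."""
    return relation(x, y) in FLOATCC[cond]
```

With a NaN operand, every ordered condition is false and every unordered condition is true, which is why `ne` holds for a NaN compared with itself.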
f = ffcmp x, y

Floating point comparison returning flags.

Compares two numbers like fcmp, but returns floating point CPU flags instead of testing a specific condition.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
Type Variables:
  • Float – inferred from x
a = fadd x, y

Floating point addition.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fsub x, y

Floating point subtraction.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fmul x, y

Floating point multiplication.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fdiv x, y

Floating point division.

Unlike the integer division instructions sdiv and udiv, this can’t trap. Division by zero is infinity or NaN, depending on the dividend.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = sqrt x

Floating point square root.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from x
a = fma x, y, z

Floating point fused multiply-and-add.

Computes \(a := xy+z\) without any intermediate rounding of the product.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
  • z (Float) – A scalar or vector floating point number
Results:
  • a (Float) – Result of applying operator to each lane
Type Variables:
  • Float – inferred from y

Sign bit manipulations

The sign manipulating instructions work as bitwise operations, so they don’t have special behavior for signaling NaN operands. The exponent and trailing significand bits are always preserved.

a = fneg x

Floating point negation.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit inverted
Type Variables:
  • Float – inferred from x
a = fabs x

Floating point absolute value.

Note that this is a pure bitwise operation.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit cleared
Type Variables:
  • Float – inferred from x
a = fcopysign x, y

Floating point copy sign.

Note that this is a pure bitwise operation. The sign bit from y is copied to the sign bit of x.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x with its sign bit changed to that of y
Type Variables:
  • Float – inferred from x
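
Because these instructions only touch the sign bit, they can be modeled on the raw bit pattern. The sketch below does this for f64 values using Python's `struct` module; `to_bits` and `from_bits` are hypothetical helpers introduced for the illustration.

```python
import struct

SIGN = 1 << 63  # sign bit of an IEEE 754 binary64 value


def to_bits(x):
    """Return the 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(b):
    """Return the Python float with the given 64-bit pattern."""
    return struct.unpack("<d", struct.pack("<Q", b))[0]


def fneg(x):
    """Invert the sign bit."""
    return from_bits(to_bits(x) ^ SIGN)


def fabs(x):
    """Clear the sign bit."""
    return from_bits(to_bits(x) & (SIGN - 1))


def fcopysign(x, y):
    """Replace the sign bit of x with the sign bit of y."""
    return from_bits((to_bits(x) & (SIGN - 1)) | (to_bits(y) & SIGN))
```

In this model, negating a NaN flips its sign bit and leaves the exponent and significand bits unchanged.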

Minimum and maximum

These instructions return the larger or smaller of their operands. They differ in their handling of quiet NaN inputs. Note that signaling NaN operands always cause a NaN result.

When comparing zeroes, these instructions behave as if \(-0.0 < 0.0\).

a = fmin x, y

Floating point minimum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The smaller of x and y
Type Variables:
  • Float – inferred from x
a = fminnum x, y

Floating point minimum, suppressing quiet NaNs.

If either operand is a quiet NaN, the other operand is returned. If either operand is a signaling NaN, NaN is returned.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The smaller of x and y
Type Variables:
  • Float – inferred from x
a = fmax x, y

Floating point maximum, propagating NaNs.

If either operand is NaN, this returns a NaN.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The larger of x and y
Type Variables:
  • Float – inferred from x
a = fmaxnum x, y

Floating point maximum, suppressing quiet NaNs.

If either operand is a quiet NaN, the other operand is returned. If either operand is a signaling NaN, NaN is returned.

Arguments:
  • x (Float) – A scalar or vector floating point number
  • y (Float) – A scalar or vector floating point number
Results:
  • a (Float) – The larger of x and y
Type Variables:
  • Float – inferred from x
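
The difference between the NaN-propagating and NaN-suppressing variants can be sketched in Python as below. This is a simplified model, not Cretonne code: it works on scalars only and treats every NaN as quiet, so the signaling NaN case described above is not modelled. The helper _lt is hypothetical.

```python
import math


def _lt(x: float, y: float) -> bool:
    """Hypothetical ordering for this sketch: like <, but with -0.0 < +0.0."""
    if x == 0.0 and y == 0.0:
        return math.copysign(1.0, x) < math.copysign(1.0, y)
    return x < y


def fmin(x: float, y: float) -> float:
    if math.isnan(x) or math.isnan(y):
        return math.nan                # NaN-propagating
    return x if _lt(x, y) else y


def fmax(x: float, y: float) -> float:
    if math.isnan(x) or math.isnan(y):
        return math.nan                # NaN-propagating
    return y if _lt(x, y) else x


def fminnum(x: float, y: float) -> float:
    if math.isnan(x):
        return y                       # quiet NaN suppressed
    if math.isnan(y):
        return x
    return x if _lt(x, y) else y


def fmaxnum(x: float, y: float) -> float:
    if math.isnan(x):
        return y                       # quiet NaN suppressed
    if math.isnan(y):
        return x
    return y if _lt(x, y) else x
```

With these definitions, fmin(1.0, NaN) is NaN while fminnum(1.0, NaN) is 1.0, and fmin(0.0, -0.0) is -0.0.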

Rounding

These instructions round their argument to a nearby integral value, still represented as a floating point number.

a = ceil x

Round floating point to integral, towards positive infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = floor x

Round floating point to integral, towards negative infinity.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = trunc x

Round floating point to integral, towards zero.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
a = nearest x

Round floating point to integral, towards nearest with ties to even.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (Float) – x rounded to integral value
Type Variables:
  • Float – inferred from x
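
The four rounding modes can be compared with a small Python model. This is only a sketch under assumptions: it handles scalar f64 values, and it assumes that NaN and infinite inputs pass through unchanged and that the sign of the input is kept on a zero result, as in IEEE 754 round-to-integral operations. The helper _integral is hypothetical.

```python
import math


def _integral(x: float, rounder) -> float:
    """Apply an int-returning rounder and return the result as a float."""
    if not math.isfinite(x):
        return x                                   # NaN and infinities pass through
    return math.copysign(float(rounder(x)), x)     # keeps -0.0 for e.g. ceil(-0.5)


def ceil(x: float) -> float:
    return _integral(x, math.ceil)     # towards positive infinity


def floor(x: float) -> float:
    return _integral(x, math.floor)    # towards negative infinity


def trunc(x: float) -> float:
    return _integral(x, math.trunc)    # towards zero


def nearest(x: float) -> float:
    return _integral(x, round)         # Python's round() uses ties to even
```

For example, nearest(2.5) gives 2.0 and nearest(3.5) gives 4.0, while trunc(-1.7) gives -1.0 and floor(-1.2) gives -2.0.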

CPU flag operations

a = trueif Cond, f

Test integer CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments:
  • Cond (intcc) – An integer comparison condition code.
  • f (iflags) – CPU flags representing the result of an integer comparison. These flags can be tested with an intcc condition code.
Results:
  • a (b1) – A boolean type with 1 bit.
a = trueff Cond, f

Test floating point CPU flags for a specific condition.

Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.

Arguments:
  • Cond (floatcc) – A floating point comparison condition code.
  • f (fflags) – CPU flags representing the result of a floating point comparison. These flags can be tested with a floatcc condition code.
Results:
  • a (b1) – A boolean type with 1 bit.

Conversion operations

a = bitcast x

Reinterpret the bits in x as a different type.

The input and output types must be storable to memory and of the same size. A bitcast is equivalent to storing one type and loading the other type from the same address.

Arguments:
  • x (Mem) – Any type that can be stored in memory
Results:
  • a (MemTo) – Bits of x reinterpreted
Type Variables:
  • MemTo – explicitly provided
  • Mem – from input operand
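
The store-then-load equivalence can be illustrated in Python with the struct module, which writes a value to a byte buffer as one type and reads the same bytes back as another. The function names below are hypothetical, and the sketch covers only the f32/i32 pair with little-endian byte order assumed.

```python
import struct


def bitcast_f32_to_i32(x: float) -> int:
    """Store x as an f32, then load the same four bytes as a 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bitcast_i32_to_f32(bits: int) -> float:
    """Store a 32-bit pattern, then load the same four bytes as an f32."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]
```

Here bitcast_f32_to_i32(1.0) yields 0x3F800000, the IEEE 754 encoding of 1.0 in single precision, and applying the two functions in sequence returns the original bits.
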
a = breduce x

Convert x to a smaller boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A smaller boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bextend x

Convert x to a larger boolean type in the platform-defined way.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (BoolTo) – A larger boolean type with the same number of lanes
Type Variables:
  • BoolTo – explicitly provided
  • Bool – from input operand
a = bint x

Convert x to an integer.

True maps to 1 and false maps to 0. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
a = bmask x

Convert x to an integer mask.

True maps to all 1s and false maps to all 0s. The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Bool) – A scalar or vector boolean type
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Bool – from input operand
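
The two boolean-to-integer mappings differ only in the value used for true. A minimal lane-wise sketch in Python, where a vector is modelled as a list of booleans and the bits parameter (an assumption of this example) gives the width of each result lane:

```python
def bint(lanes):
    """True maps to 1, false maps to 0, lane by lane."""
    return [1 if b else 0 for b in lanes]


def bmask(lanes, bits):
    """True maps to all 1s in a lane of the given width, false maps to 0."""
    all_ones = (1 << bits) - 1
    return [all_ones if b else 0 for b in lanes]
```

For the input [True, False], bint returns [1, 0] and bmask with 8-bit lanes returns [0xFF, 0]. Both results have the same number of lanes as the input.
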
a = ireduce x

Convert x to a smaller integer type by dropping high bits.

Each lane in x is converted to a smaller integer type by discarding the most significant bits. This is the same as reducing modulo \(2^n\).

The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A smaller integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = uextend x

Convert x to a larger integer type by zero-extending.

Each lane in x is converted to a larger integer type by adding zeroes. The result has the same numerical value as x when both are interpreted as unsigned integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
a = sextend x

Convert x to a larger integer type by sign-extending.

Each lane in x is converted to a larger integer type by replicating the sign bit. The result has the same numerical value as x when both are interpreted as signed integers.

The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (IntTo) – A larger integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Int – from input operand
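
The integer width conversions ireduce, uextend, and sextend can be sketched in Python on unsigned bit patterns. The explicit from_bits and to_bits width parameters are assumptions of this example; in Cretonne the widths come from the value types.

```python
def ireduce(x: int, to_bits: int) -> int:
    """Drop the high bits; the same as reducing modulo 2**to_bits."""
    return x & ((1 << to_bits) - 1)


def uextend(x: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend: the new high bits are all zero."""
    assert to_bits >= from_bits
    return x & ((1 << from_bits) - 1)


def sextend(x: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend: the new high bits replicate the sign bit of x."""
    assert to_bits >= from_bits
    x &= (1 << from_bits) - 1
    if x >> (from_bits - 1):                       # sign bit set
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x
```

Extending the 8-bit pattern 0x80 to 32 bits gives 0x00000080 with uextend and 0xFFFFFF80 with sextend, and ireduce(0x1234, 8) gives 0x34.
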
a = fpromote x

Convert x to a larger floating point format.

Each lane in x is converted to the destination floating point format. This is an exact operation.

Cretonne currently supports only two floating point formats: f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
a = fdemote x

Convert x to a smaller floating point format.

Each lane in x is converted to the destination floating point format by rounding to nearest, ties to even.

Cretonne currently supports only two floating point formats: f32 and f64. This may change in the future.

The result type must have the same number of vector lanes as the input, and the result lanes must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Float – from input operand
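
The following Python sketch illustrates why promotion is exact while demotion rounds. It represents an f32 by its 32-bit pattern and an f64 by a Python float; the function signatures are assumptions of this example. It relies on struct packing a double into a float with round to nearest, ties to even, and note that struct raises OverflowError for finite values outside the f32 range, which this sketch does not handle.

```python
import struct


def fdemote(x: float) -> int:
    """f64 -> f32: round to the nearest f32 and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fpromote(bits: int) -> float:
    """f32 -> f64: every f32 value is exactly representable as an f64."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

For example, fdemote(0.1) returns 0x3DCCCCCD, and promoting that pattern gives 0.10000000149011612 rather than 0.1, which shows the rounding step. Demoting a promoted value always returns the original f32 pattern.
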
a = fcvt_to_uint x

Convert floating point to unsigned integer.

Each lane in x is converted to an unsigned integer by rounding towards zero. If x is NaN or if the unsigned integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
a = fcvt_to_sint x

Convert floating point to signed integer.

Each lane in x is converted to a signed integer by rounding towards zero. If x is NaN or if the signed integral value cannot be represented in the result type, this instruction traps.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Float) – A scalar or vector floating point number
Results:
  • a (IntTo) – An integer type with the same number of lanes
Type Variables:
  • IntTo – explicitly provided
  • Float – from input operand
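
The trapping behavior of the float-to-integer conversions can be modelled in Python as follows. The Trap exception class and the bits parameter are hypothetical stand-ins for a Cretonne trap and the result type width. The sketch assumes that representability is checked after rounding towards zero.

```python
import math


class Trap(Exception):
    """Stands in for a Cretonne trap in this sketch."""


def fcvt_to_uint(x: float, bits: int) -> int:
    if not math.isfinite(x):
        raise Trap("NaN or infinite input")
    n = math.trunc(x)                          # round towards zero
    if not 0 <= n < (1 << bits):
        raise Trap("value not representable as unsigned")
    return n


def fcvt_to_sint(x: float, bits: int) -> int:
    if not math.isfinite(x):
        raise Trap("NaN or infinite input")
    n = math.trunc(x)                          # round towards zero
    if not -(1 << (bits - 1)) <= n < (1 << (bits - 1)):
        raise Trap("value not representable as signed")
    return n
```

With an 8-bit result, fcvt_to_uint(3.9, 8) gives 3 and fcvt_to_sint(-3.9, 8) gives -3, while fcvt_to_uint(256.0, 8) and any NaN input raise Trap.
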
a = fcvt_from_uint x

Convert unsigned integer to floating point.

Each lane in x is interpreted as an unsigned integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand
a = fcvt_from_sint x

Convert signed integer to floating point.

Each lane in x is interpreted as a signed integer and converted to floating point using round to nearest, ties to even.

The result type must have the same number of vector lanes as the input.

Arguments:
  • x (Int) – A scalar or vector integer type
Results:
  • a (FloatTo) – A scalar or vector floating point number
Type Variables:
  • FloatTo – explicitly provided
  • Int – from input operand
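
The same bit pattern converts to different floating point values depending on whether it is read as unsigned or signed. A Python sketch, assuming an f64 result (a Python float) and an explicit bits parameter for the input width, both of which are choices made for this example:

```python
def fcvt_from_uint(x: int, bits: int) -> float:
    """Interpret the low `bits` bits of x as unsigned and convert to f64."""
    return float(x & ((1 << bits) - 1))


def fcvt_from_sint(x: int, bits: int) -> float:
    """Interpret the low `bits` bits of x as two's complement and convert to f64."""
    x &= (1 << bits) - 1
    if x >> (bits - 1):          # sign bit set
        x -= 1 << bits
    return float(x)
```

The 8-bit pattern 0xFF becomes 255.0 with fcvt_from_uint and -1.0 with fcvt_from_sint. Python's int to float conversion rounds to nearest with ties to even, so the 64-bit input 2**53 + 1, which an f64 cannot represent, converts to 2**53.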

Legalization operations

These instructions are used as helpers when legalizing types and operations for the target ISA.

lo, hi = isplit x

Split an integer into low and high parts.

Vectors of integers are split lane-wise, so the results have the same number of lanes as the input, but the lanes are half the size.

Returns the low half of x and the high half of x as two independent values.

Arguments:
  • x (WideInt) – An integer type with lanes from i16 upwards
Results:
  • lo (half_width(WideInt)) – The low bits of x
  • hi (half_width(WideInt)) – The high bits of x
Type Variables:
  • WideInt – inferred from x
a = iconcat lo, hi

Concatenate low and high bits to form a larger integer type.

Vectors of integers are concatenated lane-wise such that the result has the same number of lanes as the inputs, but the lanes are twice the size.

Arguments:
  • lo (NarrowInt) – An integer type with lanes up to i32
  • hi (NarrowInt) – An integer type with lanes up to i32
Results:
  • a (double_width(NarrowInt)) – The concatenation of lo and hi
Type Variables:
  • NarrowInt – inferred from lo
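
Splitting and concatenating are inverse operations on bit patterns. A scalar Python sketch, where the bits parameter (an assumption of this example) is the width of the wide value for isplit and the width of each narrow input for iconcat:

```python
def isplit(x: int, bits: int):
    """Split a `bits`-wide value into (lo, hi) halves of bits // 2 each."""
    half = bits // 2
    mask = (1 << half) - 1
    return x & mask, (x >> half) & mask


def iconcat(lo: int, hi: int, bits: int) -> int:
    """Concatenate two `bits`-wide values into one value of 2 * bits."""
    mask = (1 << bits) - 1
    return ((hi & mask) << bits) | (lo & mask)
```

Splitting the 32-bit value 0x12345678 gives lo = 0x5678 and hi = 0x1234, and iconcat(0x5678, 0x1234, 16) rebuilds 0x12345678.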

Extending loads and truncating stores

Most ISAs provide instructions that load an integer value smaller than a register and extend it to the width of the register. Similarly, store instructions that only write the low bits of an integer register are common.

In addition to the normal load and store instructions, Cretonne provides extending loads and truncating stores for 8, 16, and 32-bit memory accesses.

a = uload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i8 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
a = sload8 Flags, p, Offset

Load 8 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i8 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt8) – An integer type with more than 8 bits
Type Variables:
  • iExt8 – explicitly provided
  • iAddr – from input operand
istore8 Flags, x, p, Offset

Store the low 8 bits of x to memory at p + Offset.

This is equivalent to ireduce.i8 followed by store.i8.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt8) – An integer type with more than 8 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt8 – inferred from x
  • iAddr – from input operand
a = uload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i16 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
a = sload16 Flags, p, Offset

Load 16 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i16 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt16) – An integer type with more than 16 bits
Type Variables:
  • iExt16 – explicitly provided
  • iAddr – from input operand
istore16 Flags, x, p, Offset

Store the low 16 bits of x to memory at p + Offset.

This is equivalent to ireduce.i16 followed by store.i16.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt16) – An integer type with more than 16 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt16 – inferred from x
  • iAddr – from input operand
a = uload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and zero-extend.

This is equivalent to load.i32 followed by uextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iAddr – inferred from p
a = sload32 Flags, p, Offset

Load 32 bits from memory at p + Offset and sign-extend.

This is equivalent to load.i32 followed by sextend.

Arguments:
  • Flags (memflags) – Memory operation flags
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Results:
  • a (iExt32) – An integer type with more than 32 bits
Type Variables:
  • iAddr – inferred from p
istore32 Flags, x, p, Offset

Store the low 32 bits of x to memory at p + Offset.

This is equivalent to ireduce.i32 followed by store.i32.

Arguments:
  • Flags (memflags) – Memory operation flags
  • x (iExt32) – An integer type with more than 32 bits
  • p (iAddr) – An integer address type
  • Offset (offset32) – Byte offset from base address
Type Variables:
  • iExt32 – inferred from x
  • iAddr – from input operand
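
The extending loads and truncating stores can be modelled in Python on a flat byte array. This is a sketch under assumptions: memory is a bytearray, byte order is little-endian, the Flags operand is ignored, and a single size_bits parameter (8, 16, or 32) replaces the separate instruction names. Sign-extended results are returned as negative Python integers.

```python
def uload(mem: bytearray, p: int, offset: int, size_bits: int) -> int:
    """Load size_bits from mem at p + offset and zero-extend."""
    n = size_bits // 8
    return int.from_bytes(mem[p + offset:p + offset + n], "little")


def sload(mem: bytearray, p: int, offset: int, size_bits: int) -> int:
    """Load size_bits from mem at p + offset and sign-extend."""
    n = size_bits // 8
    return int.from_bytes(mem[p + offset:p + offset + n], "little", signed=True)


def istore(mem: bytearray, x: int, p: int, offset: int, size_bits: int) -> None:
    """Store the low size_bits of x to mem at p + offset."""
    n = size_bits // 8
    low = x & ((1 << size_bits) - 1)           # same as ireduce
    mem[p + offset:p + offset + n] = low.to_bytes(n, "little")
```

After istore(mem, 0x1234FF80, 0, 2, 8), only the byte 0x80 is written at address 2. Reading it back gives 0x80 with uload and -128 with sload.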

ISA-specific instructions

Target ISAs can define supplemental instructions that do not make sense to support generally.

Intel

Instructions that can only be used by the Intel target ISA.

q, r = x86_sdivmodx nlo, nhi, d

Extended signed division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as a signed number and divide by the signed denominator d. Trap when d is zero or if the quotient is outside the range of the output.

Return both quotient and remainder.

Arguments:
  • nlo (iWord) – Low part of numerator
  • nhi (iWord) – High part of numerator
  • d (iWord) – Denominator
Results:
  • q (iWord) – Quotient
  • r (iWord) – Remainder
Type Variables:
  • iWord – inferred from nhi
q, r = x86_udivmodx nlo, nhi, d

Extended unsigned division.

Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as an unsigned number and divide by the unsigned denominator d. Trap when d is zero or if the quotient is larger than the range of the output.

Return both quotient and remainder.

Arguments:
  • nlo (iWord) – Low part of numerator
  • nhi (iWord) – High part of numerator
  • d (iWord) – Denominator
Results:
  • q (iWord) – Quotient
  • r (iWord) – Remainder
Type Variables:
  • iWord – inferred from nhi
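
The extended division instructions can be sketched in Python on bit patterns. The Trap class and the bits parameter (the width of iWord) are hypothetical. The signed variant assumes that the quotient is truncated towards zero, as the x86 IDIV instruction does; the text above does not state the rounding direction.

```python
class Trap(Exception):
    """Stands in for a Cretonne trap in this sketch."""


def _signed(v: int, width: int) -> int:
    """Hypothetical helper: read a bit pattern as two's complement."""
    v &= (1 << width) - 1
    return v - (1 << width) if v >> (width - 1) else v


def x86_udivmodx(nlo: int, nhi: int, d: int, bits: int):
    mask = (1 << bits) - 1
    d &= mask
    if d == 0:
        raise Trap("division by zero")
    n = ((nhi & mask) << bits) | (nlo & mask)      # numerator nhi:nlo
    q, r = divmod(n, d)
    if q > mask:
        raise Trap("quotient too large")
    return q, r


def x86_sdivmodx(nlo: int, nhi: int, d: int, bits: int):
    mask = (1 << bits) - 1
    ds = _signed(d, bits)
    if ds == 0:
        raise Trap("division by zero")
    n = _signed(((nhi & mask) << bits) | (nlo & mask), 2 * bits)
    q = abs(n) // abs(ds)                          # truncate towards zero (assumed)
    if (n < 0) != (ds < 0):
        q = -q
    r = n - q * ds
    if not -(1 << (bits - 1)) <= q < (1 << (bits - 1)):
        raise Trap("quotient out of range")
    return q & mask, r & mask                      # results as bit patterns
```

With 32-bit words, x86_udivmodx(0, 1, 2, 32) divides 2**32 by 2 and returns (0x80000000, 0), while x86_udivmodx(0, 2, 2, 32) raises Trap because the quotient 2**32 does not fit in 32 bits.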

Implementation limits

Cretonne’s intermediate representation imposes some limits on the size of functions and the number of entities allowed. If these limits are exceeded, the implementation will panic.

Number of instructions in a function
At most \(2^{31} - 1\).
Number of EBBs in a function

At most \(2^{31} - 1\).

Every EBB needs at least a terminator instruction anyway.

Number of secondary values in a function

At most \(2^{31} - 1\).

Secondary values are any SSA values that are not the first result of an instruction.

Other entities declared in the preamble

At most \(2^{32} - 1\).

This covers entities such as stack slots, jump tables, external functions, and function signatures.

Number of arguments to an EBB
At most \(2^{16}\).
Number of arguments to a function

At most \(2^{16}\).

This follows from the limit on arguments to the entry EBB. Note that Cretonne may add a handful of ABI register arguments as function signatures are lowered. This is for representing things like the link register, the incoming frame pointer, and callee-saved registers that are saved in the prologue.

Size of function call arguments on the stack

At most \(2^{32} - 1\) bytes.

This is probably not possible to achieve given the limit on the number of arguments, except by requiring extremely large offsets for stack arguments.

Glossary

intermediate language
IL
The language used to describe functions to Cretonne. This reference describes the syntax and semantics of the Cretonne IL. The IL has two forms: a textual form and an in-memory intermediate representation (IR).
intermediate representation
IR
The in-memory representation of IL. The data structures Cretonne uses to represent a program internally are called the intermediate representation. Cretonne’s IR can be converted to text losslessly.
function signature

A function signature describes how to call a function. It consists of:

  • The calling convention.
  • The number of arguments and return values. (Functions can return multiple values.)
  • Type and flags of each argument.
  • Type and flags of each return value.

Not all function attributes are part of the signature. For example, a function that never returns could be marked as noreturn, but that is not necessary to know when calling it, so it is just an attribute, and not part of the signature.

function preamble

A list of declarations of entities that are used by the function body. Some of the entities that can be declared in the preamble are:

  • Local variables.
  • Functions that are called directly.
  • Function signatures for indirect function calls.
  • Function flags and attributes that are not part of the signature.
function body
The extended basic blocks which contain all the executable code in a function. The function body follows the function preamble.
basic block
A maximal sequence of instructions that can only be entered from the top, and that contains no branch or terminator instructions except for the last instruction.
extended basic block
EBB

A maximal sequence of instructions that can only be entered from the top, and that contains no terminator instructions except for the last one. An EBB can contain conditional branches that can fall through to the following instructions in the block, but only the first instruction in the EBB can be a branch target.

The last instruction in an EBB must be a terminator instruction, so execution cannot flow through to the next EBB in the function. (But there may be a branch to the next EBB.)

Note that some textbooks define an EBB as a maximal subtree in the control flow graph where only the root can be a join node. This definition is not equivalent to Cretonne EBBs.

terminator instruction

A control flow instruction that unconditionally directs the flow of execution somewhere else. Execution never continues at the instruction following a terminator instruction.

The basic terminator instructions are jump, return, and trap. Conditional branches and instructions that trap conditionally are not terminator instructions.

entry block
The EBB that is executed first in a function. Currently, a Cretonne function must have exactly one entry block which must be the first block in the function. The types of the entry block arguments must match the types of arguments in the function signature.
stack slot
A fixed size memory allocation in the current function’s activation frame. Also called a local variable.