Cretonne Language Reference¶
The Cretonne intermediate language (IL) has two equivalent
representations: an in-memory data structure that the code generator library
uses, and a text format which is used for test cases and debug output.
Files containing Cretonne textual IL have the .cton filename extension.
This reference uses the text format to describe IL semantics but glosses over the finer details of the lexical and syntactic structure of the format.
Overall structure¶
Cretonne compiles functions independently. A .cton IL file may contain
multiple functions, and the programmatic API can create multiple function
handles at the same time, but the functions don’t share any data or reference
each other directly.
This is a simple C function that computes the average of an array of floats:
float
average(const float *array, size_t count)
{
    double sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += array[i];
    return sum / count;
}
Here is the same function compiled into Cretonne IL:
function %average(i32, i32) -> f32 native {
    ss1 = local 8                  ; Stack slot for ``sum``.
ebb1(v1: i32, v2: i32):
    v3 = f64const 0x0.0
    stack_store v3, ss1
    brz v2, ebb3                   ; Handle count == 0.
    v4 = iconst.i32 0
    jump ebb2(v4)
ebb2(v5: i32):
    v6 = imul_imm v5, 4
    v7 = iadd v1, v6
    v8 = load.f32 v7               ; array[i]
    v9 = fpromote.f64 v8
    v10 = stack_load.f64 ss1
    v11 = fadd v9, v10
    stack_store v11, ss1
    v12 = iadd_imm v5, 1
    v13 = icmp ult v12, v2
    brnz v13, ebb2(v12)            ; Loop backedge.
    v14 = stack_load.f64 ss1
    v15 = fcvt_from_uint.f64 v2
    v16 = fdiv v14, v15
    v17 = fdemote.f32 v16
    return v17
ebb3:
    v100 = f32const +NaN
    return v100
}
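As a cross-check on the control flow, the following Python sketch mirrors the IL above step by step. It is only an illustration: the helper `to_f32` and the use of the `struct` module to model binary32 rounding are assumptions of this sketch, not part of Cretonne.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def average(array, count):
    total = 0.0                            # v3, stored in stack slot ss1
    if count == 0:                         # brz v2, ebb3
        return math.nan                    # ebb3: f32const +NaN
    i = 0                                  # v4
    while True:                            # ebb2(v5)
        total = to_f32(array[i]) + total   # load.f32, fpromote.f64, fadd
        i += 1                             # v12 = iadd_imm v5, 1
        if not i < count:                  # icmp ult, brnz (loop backedge)
            break
    return to_f32(total / count)           # fdiv, then fdemote.f32
```

Note that the loop body runs before the exit test, exactly as in `ebb2`, which is why the `count == 0` case needs its own branch.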
The first line of a function definition provides the function name and
the function signature, which declares the parameter and return types.
Then follows the function preamble, which declares a number of entities
that can be referenced inside the function. In the example above, the preamble
declares a single local variable, ss1.
After the preamble follows the function body which consists of extended basic blocks (EBBs), the first of which is the entry block. Every EBB ends with a terminator instruction, so execution can never fall through to the next EBB without an explicit branch.
A .cton file consists of a sequence of independent function definitions:
function_list ::= { function }
function ::= function_spec "{" preamble function_body "}"
function_spec ::= "function" function_name signature
preamble ::= { preamble_decl }
function_body ::= { extended_basic_block }
Static single assignment form¶
The instructions in the function body use and produce values in SSA form. This means that every value is defined exactly once, and every use of a value must be dominated by the definition.
Cretonne does not have phi instructions but uses EBB parameters instead. An EBB can be defined with a list of typed parameters. Whenever control is transferred to the EBB, argument values for the parameters must be provided. When entering a function, the incoming function parameters are passed as arguments to the entry EBB’s parameters.
Instructions define zero, one, or more result values. All SSA values are either EBB parameters or instruction results.
In the example above, the loop induction variable i is represented as three
SSA values: In the entry block, v4 is the initial value. In the loop block
ebb2, the EBB parameter v5 represents the value of the induction
variable during each iteration. Finally, v12 is computed as the induction
variable value for the next iteration.
The cton_frontend crate contains utilities for translating from programs containing multiple assignments to the same variables into SSA form for Cretonne IL.
Such variables can also be presented to Cretonne as stack slots.
Stack slots are accessed with the stack_store and stack_load
instructions, and can have their address taken with stack_addr, which
supports C-like programming languages where local variables can have their
address taken.
Value types¶
All SSA values have a type which determines the size and shape (for SIMD vectors) of the value. Many instructions are polymorphic – they can operate on different types.
Boolean types¶
Boolean values are either true or false. While this only requires a single bit
to represent, more bits are often used when holding a boolean value in a
register or in memory. The b1 type represents an abstract boolean
value. It can only exist as an SSA value; it can’t be stored in memory or
converted to another type. The larger boolean types can be stored in memory.
They are represented as either all zero bits or all one bits.
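The all-zeros or all-ones representation can be sketched in a few lines of Python. The function names `encode_bool` and `decode_bool` are hypothetical helpers for this illustration, not Cretonne APIs.

```python
def encode_bool(value, bits):
    """Return the in-memory bit pattern of a bN boolean (N is 8, 16, 32 or 64)."""
    if bits not in (8, 16, 32, 64):
        # b1 is abstract and has no memory representation.
        raise ValueError("only b8, b16, b32 and b64 can be stored in memory")
    return (1 << bits) - 1 if value else 0


def decode_bool(pattern, bits):
    """Inverse of encode_bool; rejects patterns that are neither all 0 nor all 1."""
    all_ones = (1 << bits) - 1
    if pattern not in (0, all_ones):
        raise ValueError("not a valid boolean bit pattern")
    return pattern == all_ones
```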

b1
¶ A boolean type with 1 bit.
Bytes: Can’t be stored in memory

b8
¶ A boolean type with 8 bits.
Bytes: 1

b16
¶ A boolean type with 16 bits.
Bytes: 2

b32
¶ A boolean type with 32 bits.
Bytes: 4

b64
¶ A boolean type with 64 bits.
Bytes: 8
Integer types¶
Integer values have a fixed size and can be interpreted as either signed or unsigned. Some instructions will interpret an operand as a signed or unsigned number, others don’t care.

i8
¶ An integer type with 8 bits.
Bytes: 1

i16
¶ An integer type with 16 bits.
Bytes: 2

i32
¶ An integer type with 32 bits.
Bytes: 4

i64
¶ An integer type with 64 bits.
Bytes: 8
Floating point types¶
The floating point types have the IEEE 754 semantics that are supported by most hardware, except that non-default rounding modes, unmasked exceptions, and exception flags are not currently supported.
There is currently no support for higher-precision types like quad-precision, double-double, or extended-precision, nor for narrower-precision types like half-precision.
NaNs are encoded following the IEEE 754-2008 recommendation, with quiet NaNs being encoded with the MSB of the trailing significand set to 1, and signaling NaNs being indicated by the MSB of the trailing significand set to 0.
Except for bitwise and memory instructions, NaNs returned from arithmetic instructions are encoded as follows:
 If all NaN inputs to an instruction are quiet NaNs with all bits of the trailing significand other than the MSB set to 0, the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to 0.
 Otherwise the result is a quiet NaN with a nondeterministic sign bit and all bits of the trailing significand other than the MSB set to nondeterministic values.
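The two rules above can be checked on binary32 bit patterns with a short Python sketch. The function names and the string labels are assumptions made for this illustration.

```python
F32_QUIET_BIT = 1 << 22            # MSB of the 23-bit trailing significand
F32_PAYLOAD_MASK = (1 << 22) - 1   # the remaining trailing significand bits


def classify_f32_nan(bits):
    """Classify a 32-bit pattern according to the NaN encoding described above."""
    exponent = (bits >> 23) & 0xFF
    trailing = bits & 0x7FFFFF
    if exponent != 0xFF or trailing == 0:
        return "not a NaN"
    if not trailing & F32_QUIET_BIT:
        return "signaling"
    if trailing & F32_PAYLOAD_MASK:
        return "quiet, with payload"
    return "quiet, canonical"


def result_payload_is_nondeterministic(nan_inputs):
    """True unless every NaN input is a quiet NaN with an all-zero payload."""
    return any(classify_f32_nan(b) != "quiet, canonical" for b in nan_inputs)
```

The sign bit is ignored by both functions, since the rules above leave it nondeterministic in either case.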

f32
¶ A 32-bit floating point type represented in the IEEE 754-2008 binary32 interchange format. This corresponds to the float type in most C implementations.
Bytes: 4

f64
¶ A 64-bit floating point type represented in the IEEE 754-2008 binary64 interchange format. This corresponds to the double type in most C implementations.
Bytes: 8
CPU flags types¶
Some target ISAs use CPU flags to represent the result of a comparison. These CPU flags are represented as two value types depending on the type of values compared.
Since some ISAs don’t have CPU flags, these value types should not be used
until the legalization phase of compilation where the code is adapted to fit
the target ISA. Use instructions like icmp instead.
The CPU flags types are also restricted such that two flags values can not be live at the same time. After legalization, some instruction encodings will clobber the flags, and flags values are not allowed to be live across such instructions either. The verifier enforces these rules.
SIMD vector types¶
A SIMD vector type represents a vector of values from one of the scalar types (boolean, integer, and floating point). Each scalar value in a SIMD type is called a lane. The number of lanes must be a power of two in the range 2-256.

iBxN¶ A SIMD vector of integers. The lane type iB is one of the integer types i8 … i64.
Some concrete integer vector types are i32x4, i64x8, and i16x4.
The size of a SIMD integer vector in memory is \(\frac{N B}{8}\) bytes.

f32xN¶ A SIMD vector of single precision floating point numbers.
Some concrete f32 vector types are: f32x2, f32x4, and f32x8.
The size of a f32 vector in memory is \(4N\) bytes.
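Both size formulas are instances of the same rule, lanes times lane width in bytes. A minimal Python sketch follows; the function name is hypothetical, and restricting lane widths to 8, 16, 32 and 64 bits is an assumption based on the scalar types listed above.

```python
def simd_vector_bytes(lane_bits, lanes):
    """Size in memory of a SIMD vector with `lanes` lanes of `lane_bits` bits each."""
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("unsupported lane width")
    # The lane count must be a power of two in the range 2-256.
    if lanes < 2 or lanes > 256 or lanes & (lanes - 1):
        raise ValueError("lane count must be a power of two in the range 2-256")
    return lanes * lane_bits // 8
```

For example, i32x4 and f32x4 both occupy 16 bytes, and i64x8 occupies 64 bytes.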
Pseudotypes and type classes¶
These are not concrete types, but convenient names used to refer to real types in this reference.

iAddr
¶ A pointer-sized integer representing an address.
This is either i32 or i64, depending on whether the target platform has 32-bit or 64-bit pointers.

TxN¶ Any SIMD vector type.
Immediate operand types¶
These types are not part of the normal SSA type system. They are used to indicate the different kinds of immediate operands on an instruction.

imm64
¶ A 64-bit immediate integer. The value of this operand is interpreted as a signed two’s complement integer. Instruction encodings may limit the valid range.
In the textual format, imm64 immediates appear as decimal or hexadecimal literals using the same syntax as C.

offset32
¶ A signed 32-bit immediate address offset.
In the textual format, offset32 immediates always have an explicit sign, and a 0 offset may be omitted.

ieee32
¶ A 32-bit immediate floating point number in the IEEE 754-2008 binary32 interchange format. All bit patterns are allowed.

ieee64
¶ A 64-bit immediate floating point number in the IEEE 754-2008 binary64 interchange format. All bit patterns are allowed.

bool
¶ A boolean immediate value, either false or true.
In the textual format, bool immediates appear as ‘false’ and ‘true’.
The two IEEE floating point immediate types ieee32 and ieee64
are displayed as hexadecimal floating point literals in the textual IL
format. Decimal floating point literals are not allowed because some computer
systems can round differently when converting to binary. The hexadecimal
floating point format is mostly the same as the one used by C99, but extended
to represent all NaN bit patterns:
 Normal numbers
 Compatible with C99: 0x1.Tpe where T are the trailing significand bits encoded as hexadecimal, and e is the unbiased exponent as a decimal number. ieee32 has 23 trailing significand bits. They are padded with an extra LSB to produce 6 hexadecimal digits. This is not necessary for ieee64 which has 52 trailing significand bits forming 13 hexadecimal digits with no padding.
 Zeros
 Positive and negative zero are displayed as 0.0 and -0.0 respectively.
 Subnormal numbers
 Compatible with C99: 0x0.Tpemin where T are the trailing significand bits encoded as hexadecimal, and emin is the minimum exponent as a decimal number.
 Infinities
 Either Inf or -Inf.
 Quiet NaNs
 Quiet NaNs have the MSB of the trailing significand set. If the remaining bits of the trailing significand are all zero, the value is displayed as NaN or -NaN. Otherwise, NaN:0xT where T are the trailing significand bits encoded as hexadecimal.
 Signaling NaNs
 Displayed as sNaN:0xT.
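The categories above can be turned into a small formatter for ieee32 bit patterns. This Python sketch is not Cretonne's printer. Several details are assumptions that the text does not pin down: the exponent is printed without a leading "+", NaN payloads are printed without the quiet bit, and negative values get a leading "-".

```python
def format_ieee32(bits):
    """Render a 32-bit pattern in the hexadecimal notation described above."""
    if not 0 <= bits < 1 << 32:
        raise ValueError("expected a 32-bit pattern")
    sign = "-" if bits >> 31 else ""
    exponent = (bits >> 23) & 0xFF       # biased exponent field
    trailing = bits & 0x7FFFFF           # 23 trailing significand bits
    digits = "%06x" % (trailing << 1)    # pad with an extra LSB: 6 hex digits

    if exponent == 0:
        if trailing == 0:
            return sign + "0.0"                      # zeros
        return "%s0x0.%sp-126" % (sign, digits)      # subnormals, emin = -126
    if exponent == 0xFF:
        if trailing == 0:
            return sign + "Inf"                      # infinities
        payload = trailing & 0x3FFFFF                # bits below the quiet bit
        if trailing & 0x400000:                      # quiet bit set
            return sign + ("NaN" if payload == 0 else "NaN:0x%x" % payload)
        return "%ssNaN:0x%x" % (sign, payload)       # signaling NaNs
    return "%s0x1.%sp%d" % (sign, digits, exponent - 127)   # normal numbers
```

Working from the bit pattern rather than from a host float keeps every NaN payload intact, which is the reason the textual format extends C99 in the first place.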
Control flow¶
Branches transfer control to a new EBB and provide values for the target EBB’s arguments, if it has any. Conditional branches only take the branch if their condition is satisfied, otherwise execution continues at the following instruction in the EBB.

jump
EBB(args…)¶ Jump.
Unconditionally jump to an extended basic block, passing the specified EBB arguments. The number and types of arguments must match the destination EBB.
Arguments:  EBB (ebb) – Destination extended basic block
 args (variable_args) – EBB arguments

fallthrough
EBB(args…)¶ Fall through to the next EBB.
This is the same as jump, except the destination EBB must be the next one in the layout.
Jumps are turned into fallthrough instructions by the branch relaxation pass. There is no reason to use this instruction outside that pass.
Arguments:  EBB (ebb) – Destination extended basic block
 args (variable_args) – EBB arguments

brz
c, EBB(args…)¶ Branch when zero.
If c is a b1 value, take the branch when c is false. If c is an integer value, take the branch when c = 0.
Arguments:  c (Testable) – Controlling value to test
 EBB (ebb) – Destination extended basic block
 args (variable_args) – EBB arguments
Type Variables:  Testable – inferred from c

brnz
c, EBB(args…)¶ Branch when nonzero.
If c is a b1 value, take the branch when c is true. If c is an integer value, take the branch when c != 0.
Arguments:  c (Testable) – Controlling value to test
 EBB (ebb) – Destination extended basic block
 args (variable_args) – EBB arguments
Type Variables:  Testable – inferred from c

br_icmp
Cond, x, y, EBB(args…)¶ Compare scalar integers and branch.
Compare x and y in the same way as the icmp instruction and take the branch if the condition is true:
br_icmp ugt v1, v2, ebb4(v5, v6)
is semantically equivalent to:
v10 = icmp ugt v1, v2
brnz v10, ebb4(v5, v6)
Some RISC architectures like MIPS and RISC-V provide instructions that implement all or some of the condition codes. The instruction can also be used to represent macro-op fusion on architectures like Intel’s.
Arguments:
Type Variables:  iB – inferred from x

brif
Cond, f, EBB(args…)¶ Branch when condition is true in integer CPU flags.
Arguments:

brff
Cond, f, EBB(args…)¶ Branch when condition is true in floating point CPU flags.
Arguments:

br_table
x, JT¶ Indirect branch via jump table.
Use x as an unsigned index into the jump table JT. If a jump table entry is found, branch to the corresponding EBB. If no entry is found, fall through to the next instruction.
Note that this branch instruction can’t pass arguments to the targeted blocks. Split critical edges as needed to work around this.
Arguments:  x (iB) – index into jump table
 JT (jump_table) – A jump table.
Type Variables:  iB – inferred from x

JT =
jump_table
EBB0, EBB1, …, EBBn¶ Declare a jump table in the function preamble.
This declares a jump table for use by the br_table indirect branch instruction. Entries in the table are either EBB names, or 0 which indicates an absent entry.
The EBBs listed must belong to the current function, and they can’t have any arguments.
Arguments:  EBB0 – Target EBB when x = 0.
 EBB1 – Target EBB when x = 1.
 EBBn – Target EBB when x = n.
Result: A jump table identifier. (Not an SSA value).
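The lookup performed by br_table, including the two ways of falling through, can be sketched in Python. Representing the table as a list of EBB names with the integer 0 for absent entries is an assumption of this sketch, as is the function name.

```python
def br_table_target(jump_table, x):
    """Return the EBB selected by index x, or None to fall through."""
    if x < 0:
        raise ValueError("x is interpreted as an unsigned index")
    # Fall through when the index is past the end of the table
    # or when the entry is 0 (absent).
    if x < len(jump_table) and jump_table[x] != 0:
        return jump_table[x]
    return None
```

With the table `["ebb3", 0, "ebb5"]`, index 0 selects `ebb3`, while index 1 (absent entry) and index 7 (out of range) both fall through.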
Traps stop the program because something went wrong. The exact behavior depends
on the target instruction set architecture and operating system. There are
explicit trap instructions defined below, but some instructions may also cause
traps for certain input values. For example, udiv traps when the divisor
is zero.

trap
code¶ Terminate execution unconditionally.
Arguments:  code (trapcode) – A trap reason code.
Function calls¶
A function call needs a target function and a function signature. The target function may be determined dynamically at runtime, but the signature must be known when the function call is compiled. The function signature describes how to call the function, including parameters, return values, and the calling convention:
signature ::= "(" [paramlist] ")" ["->" retlist] [callconv]
paramlist ::= param { "," param }
retlist ::= paramlist
param ::= type [paramext] [paramspecial]
paramext ::= "uext" | "sext"
paramspecial ::= "sret" | "link" | "fp" | "csr" | "vmctx"
callconv ::= "native" | "spiderwasm"
Parameters and return values have flags whose meaning is mostly target dependent. They make it possible to call native functions on the target platform. When calling other Cretonne functions, the flags are not necessary.
Functions that are called directly must be declared in the function preamble:

FN =
function
NAME signature¶ Declare a function so it can be called directly.
Arguments:  NAME – Name of the function, passed to the linker for resolution.
 signature – Function signature. See above.
Results:  FN – A function identifier that can be used with call.

rvals =
call
FN(args…)¶ Direct function call.
Call a function which has been declared in the preamble. The argument types must match the function’s signature.
Arguments:  FN (func_ref) – function to call, declared by function
 args (variable_args) – call arguments
Results:  rvals (variable_args) – return values

return
rvals…¶ Return from the function.
Unconditionally transfer control to the calling function, passing the provided return values. The list of return values must match the function signature’s return types.
Arguments:  rvals (variable_args) – return values
This simple example illustrates direct function calls and signatures:
function %gcd(i32 uext, i32 uext) -> i32 uext native {
    fn1 = function %divmod(i32 uext, i32 uext) -> i32 uext, i32 uext
ebb1(v1: i32, v2: i32):
    brz v2, ebb2
    v3, v4 = call fn1(v1, v2)
    return v3
ebb2:
    return v1
}
Indirect function calls use a signature declared in the preamble.

rvals =
call_indirect
SIG, callee(args…)¶ Indirect function call.
Call the function pointed to by callee with the given arguments. The called function must match the specified signature.
Arguments:  SIG (sig_ref) – function signature
 callee (iAddr) – address of function to call
 args (variable_args) – call arguments
Results:  rvals (variable_args) – return values
Type Variables:  iAddr – inferred from callee

addr =
func_addr
FN¶ Get the address of a function.
Compute the absolute address of a function declared in the preamble. The returned address can be used as a callee argument to call_indirect. This is also a method for calling functions that are too far away to be addressable by a direct call instruction.
Arguments:  FN (func_ref) – function to call, declared by function
Results:  addr (iAddr) – An integer address type
Type Variables:  iAddr – explicitly provided
Memory¶
Cretonne provides fully general load and store instructions for
accessing memory, as well as extending loads and truncating stores.
If the memory at the given address is not addressable, the behavior of these instructions is undefined. If it is addressable but not accessible, they trap.

a =
load
Flags, p, Offset¶ Load from memory at p + Offset.
This is a polymorphic instruction that can load any value type which has a memory representation.
Arguments:
Results:  a (Mem) – Value loaded
Type Variables:  Mem – explicitly provided
 iAddr – from input operand

store
Flags, x, p, Offset¶ Store x to memory at p + Offset.
This is a polymorphic instruction that can store any value type with a memory representation.
Arguments:
Type Variables:  Mem – inferred from x
 iAddr – from input operand
There are also more restricted operations for accessing specific types of memory objects.
Memory operation flags¶
Loads and stores can have flags that loosen their semantics in order to enable optimizations.
Flag     Description
-------  -----------------------------------------
notrap   Memory is assumed to be accessible.
aligned  Trapping allowed for misaligned accesses.
When the notrap flag is set, the behavior is undefined if the memory
is not accessible.
Loads and stores are misaligned if the resultant address is not a multiple of
the expected alignment. By default, misaligned loads and stores are allowed,
but when the aligned flag is set, a misaligned memory access is allowed to
trap.
Local variables¶
One set of restricted memory operations accesses the current function’s stack frame. The stack frame is divided into fixed-size stack slots that are allocated in the function preamble. Stack slots are not typed; they simply represent a contiguous sequence of accessible bytes in the stack frame.

SS =
local
Bytes, Flags…¶ Allocate a stack slot for a local variable in the preamble.
If no alignment is specified, Cretonne will pick an appropriate alignment for the stack slot based on its size and access patterns.
Arguments:  Bytes – Stack slot size in bytes.
Flags:  align(N) – Request at least N bytes alignment.
Results:  SS – Stack slot index.

a =
stack_load
SS, Offset¶ Load a value from a stack slot at the constant offset.
This is a polymorphic instruction that can load any value type which has a memory representation.
The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(\mathrm{sizeof}(a) + \mathit{Offset} \le \mathrm{sizeof}(SS)\).
Arguments:  SS (stack_slot) – A stack slot.
 Offset (offset32) – In-bounds offset into stack slot
Results:  a (Mem) – Value loaded
Type Variables:  Mem – explicitly provided

stack_store
x, SS, Offset¶ Store a value to a stack slot at a constant offset.
This is a polymorphic instruction that can store any value type with a memory representation.
The offset is an immediate constant, not an SSA value. The memory access cannot go out of bounds, i.e. \(\mathrm{sizeof}(x) + \mathit{Offset} \le \mathrm{sizeof}(SS)\).
Arguments:
Type Variables:  Mem – inferred from x
The dedicated stack access instructions are easy for the compiler to reason about because stack slots and offsets are fixed at compile time. For example, the alignment of these stack memory accesses can be inferred from the offsets and stack slot alignments.
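To make the last point concrete, here is a Python sketch of the bounds rule and of one way an alignment could be inferred from a slot alignment and a constant offset. Both function names are hypothetical, and the sketch assumes non-negative offsets and power-of-two slot alignments. It is not a description of what Cretonne actually implements.

```python
def access_in_bounds(access_bytes, offset, slot_bytes):
    """The rule sizeof(value) + Offset <= sizeof(SS), for non-negative offsets."""
    return offset >= 0 and access_bytes + offset <= slot_bytes


def inferred_alignment(slot_align, offset):
    """Largest power of two guaranteed to divide the address SS + offset."""
    if slot_align <= 0 or slot_align & (slot_align - 1):
        raise ValueError("slot alignment must be a power of two")
    if offset == 0:
        return slot_align
    lowest_set_bit = offset & -offset    # largest power of two dividing offset
    return min(slot_align, lowest_set_bit)
```

For a slot aligned to 16 bytes, an access at offset 8 is known to be 8-byte aligned, while an access at offset 32 keeps the full 16-byte alignment.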
It’s also possible to obtain the address of a stack slot, which can be used in unrestricted loads and stores.

addr =
stack_addr
SS, Offset¶ Get the address of a stack slot.
Compute the absolute address of a byte in a stack slot. The offset must refer to a byte inside the stack slot: \(0 \le \mathit{Offset} < \mathrm{sizeof}(SS)\).
Arguments:  SS (stack_slot) – A stack slot.
 Offset (offset32) – In-bounds offset into stack slot
Results:  addr (iAddr) – An integer address type
Type Variables:  iAddr – explicitly provided
The stack_addr instruction can be used to macro-expand the stack access
instructions before instruction selection:
    v1 = stack_load.f64 ss3, 16
    ; Expands to:
    v9 = stack_addr ss3, 16
    v1 = load.f64 v9
Global variables¶
A global variable is an accessible object in memory whose address is
not known at compile time. The address is computed at runtime by
global_addr, possibly using information provided by the linker via
relocations. There are multiple kinds of global variables using different
methods for determining their address. Cretonne does not track the type or even
the size of global variables; they are just pointers to non-stack memory.
When Cretonne is generating code for a virtual machine environment, globals can be used to access data structures in the VM’s runtime. This requires functions to have access to a VM context pointer which is used as the base address. Typically, the VM context pointer is passed as a hidden function argument to Cretonne functions.

GV =
vmctx+Offset
¶ Declare a global variable in the VM context struct.
This declares a global variable whose address is a constant offset from the VM context pointer which is passed as a hidden argument to all functions JIT-compiled for the VM.
Typically, the VM context is a C struct, and the declared global variable is a member of the struct.
Arguments:  Offset – Byte offset from the VM context pointer to the global variable.
Results:  GV – Global variable.
The address of a global variable can also be derived by treating another global variable as a struct pointer. This makes it possible to chase pointers into VM runtime data structures.

GV =
deref(BaseGV)+Offset
¶ Declare a global variable in a struct pointed to by BaseGV.
The address of GV can be computed by first loading a pointer from BaseGV and adding Offset to it.
It is assumed that BaseGV resides in readable memory with the appropriate alignment for storing a pointer.
Chains of deref global variables are possible, but cycles are not allowed. They will be caught by the IL verifier.
Arguments:  BaseGV – Global variable containing the base pointer.
 Offset – Byte offset from the loaded base pointer to the global variable.
Results:  GV – Global variable.

GV =
globalsym
name¶ Declare a global variable at a symbolic address.
The address of GV is symbolic and will be assigned a relocation, so that it can be resolved by a later linking phase.
Arguments:  name – External name.
Results:  GV – Global variable.
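The address computation for vmctx and deref globals, together with the cycle check, can be modeled in Python. The tuple encoding of declarations, the `read_pointer` callback and the function name are all assumptions of this sketch. Symbolic (globalsym) globals are left out because their addresses come from the linker.

```python
def resolve_address(name, declarations, vmctx, read_pointer):
    """Compute the address of a global variable.

    `declarations` maps a name to ("vmctx", offset) or
    ("deref", base_name, offset); `read_pointer(addr)` loads a pointer.
    """
    seen = set()
    chain = []
    while True:
        if name in seen:
            raise ValueError("cycle in deref chain")   # rejected by the verifier
        seen.add(name)
        declaration = declarations[name]
        chain.append(declaration)
        if declaration[0] == "vmctx":
            break
        name = declaration[1]                          # follow BaseGV

    address = vmctx + chain[-1][1]                     # root: vmctx+Offset
    for declaration in reversed(chain[:-1]):
        address = read_pointer(address) + declaration[2]   # deref(BaseGV)+Offset
    return address
```

Walking the chain first and evaluating it from the root afterwards keeps the cycle check separate from any memory access.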
Heaps¶
Code compiled from WebAssembly or asm.js runs in a sandbox where it can’t access all process memory. Instead, it is given a small set of memory areas to work in, and all accesses are bounds checked. Cretonne models this through the concept of heaps.
A heap is declared in the function preamble and can be accessed with the
heap_addr instruction that traps on out-of-bounds accesses or
returns a pointer that is guaranteed to trap. Heap addresses can be smaller than
the native pointer size, for example unsigned i32 offsets on a 64-bit
architecture.
A heap appears as three consecutive ranges of address space:
 The mapped pages are the accessible memory range in the heap. A heap may have a minimum guaranteed size which means that some mapped pages are always present.
 The unmapped pages are a possibly empty range of address space that may be mapped in the future when the heap is grown. They are addressable but not accessible.
 The guard pages are a range of address space that is guaranteed to cause a trap when accessed. They are used to optimize bounds checking for heap accesses with a shared base pointer. They are addressable but not accessible.
The heap bound is the total size of the mapped and unmapped pages. This is
the bound that heap_addr checks against. Memory accesses inside the
heap bounds can trap if they hit an unmapped page (which is not
accessible).

addr =
heap_addr
H, p, Size¶ Bounds check and compute absolute address of heap memory.
Verify that the offset range p .. p + Size - 1 is in bounds for the heap H, and generate an absolute address that is safe to dereference.
 If p + Size is not greater than the heap bound, return an absolute address corresponding to a byte offset of p from the heap’s base address.
 If p + Size is greater than the heap bound, generate a trap.
Arguments:  H (heap) – A heap.
 p (HeapOffset) – An unsigned heap offset
 Size (uimm32) – Size in bytes
Results:  addr (iAddr) – An integer address type
Type Variables:  iAddr – explicitly provided
 HeapOffset – from input operand
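The two cases of heap_addr translate directly into a Python sketch. The `Trap` exception class and the explicit `base` and `bound` parameters are assumptions made here to keep the example self-contained. The sketch models only the trapping behavior, not the variant that returns a pointer guaranteed to trap.

```python
class Trap(Exception):
    """Stands in for a hardware or runtime trap in this sketch."""


def heap_addr(base, bound, p, size):
    """Bounds check the range p .. p + size - 1 and return base + p."""
    if p < 0 or size < 0:
        raise ValueError("p and Size are unsigned")
    if p + size > bound:
        raise Trap("heap access out of bounds")
    return base + p
```

An access that ends exactly at the bound is still in range, because the comparison is "greater than", not "greater than or equal to".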
Two styles of heaps are supported, static and dynamic. They behave differently when resized.
Static heaps¶
A static heap starts out with all the address space it will ever need, so it never moves to a different address. At the base address is a number of mapped pages corresponding to the heap’s current size. Then follows a number of unmapped pages where the heap can grow up to its maximum size. After the unmapped pages follow the guard pages which are also guaranteed to generate a trap when accessed.

H =
static
Base, min MinBytes, bound BoundBytes, guard GuardBytes¶ Declare a static heap in the preamble.
Arguments:  Base – Global variable holding the heap’s base address or reserved_reg.
 MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
 BoundBytes – Fixed heap bound in bytes. This defines the amount of address space reserved for the heap, not including the guard pages.
 GuardBytes – Size of the guard pages in bytes.
Dynamic heaps¶
A dynamic heap can be relocated to a different base address when it is resized, and its bound can move dynamically. The guard pages move when the heap is resized. The bound of a dynamic heap is stored in a global variable.

H =
dynamic
Base, min MinBytes, bound BoundGV, guard GuardBytes¶ Declare a dynamic heap in the preamble.
Arguments:  Base – Global variable holding the heap’s base address or reserved_reg.
 MinBytes – Guaranteed minimum heap size in bytes. Accesses below this size will never trap.
 BoundGV – Global variable containing the current heap bound in bytes.
 GuardBytes – Size of the guard pages in bytes.
Heap examples¶
The SpiderMonkey VM prefers to use fixed heaps with a 4 GB bound and 2 GB of guard pages when running WebAssembly code on 64-bit CPUs. The combination of a 4 GB fixed bound and 1-byte bounds checks means that no code needs to be generated for bounds checks at all:
function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x1_0000_0000, guard 0x8000_0000
ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}
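The arithmetic behind this claim can be checked with a short Python sketch. The constants come from the heap declaration above; the function names are hypothetical. The reasoning is that a 32-bit index plus a small constant offset can never reach past the end of the guard pages.

```python
GB = 1 << 30
BOUND = 4 * GB    # bound 0x1_0000_0000
GUARD = 2 * GB    # guard 0x8000_0000


def farthest_byte(max_index, load_offset, access_bytes):
    """Offset from the heap base of the last byte an access can touch."""
    return max_index + load_offset + access_bytes - 1


def never_escapes(index_bits, load_offset, access_bytes, bound=BOUND, guard=GUARD):
    """True if every access lands in mapped, unmapped or guard pages."""
    max_index = (1 << index_bits) - 1
    return farthest_byte(max_index, load_offset, access_bytes) < bound + guard


# With Size = 1, p + Size <= 2**32 holds for every 32-bit p, so the
# heap_addr check against a 4 GB bound can never fail.
size_one_check_is_trivial = ((1 << 32) - 1) + 1 <= BOUND
```

For the loads above, `never_escapes(32, 20, 4)` holds, so an out-of-range access hits an unmapped or guard page and traps there without an explicit comparison.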
A static heap can also be used for 32-bit code when the WebAssembly module declares a small upper bound on its memory. A 1 MB static bound with a single 4 KB guard page still has opportunities for sharing bounds checking code:
function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    heap0 = static gv0, min 0x1000, bound 0x10_0000, guard 0x1000
ebb0(v0: i32):
    v1 = heap_addr.i32 heap0, v0, 1
    v2 = load.f32 v1+16
    v3 = load.f32 v1+20
    v4 = fadd v2, v3
    return v4
}
If the upper bound on the heap size is too large, a dynamic heap is required instead.
Finally, a runtime environment that simply allocates a heap with
malloc() may not have any guard pages at all. In that case, full
bounds checking is required for each access:
function %add_members(i32) -> f32 spiderwasm {
    gv0 = vmctx+64
    gv1 = vmctx+72
    heap0 = dynamic gv0, min 0x1000, bound gv1, guard 0
ebb0(v0: i32):
    v1 = heap_addr.i64 heap0, v0, 20
    v2 = load.f32 v1+16
    v3 = heap_addr.i64 heap0, v0, 24
    v4 = load.f32 v3+20
    v5 = fadd v2, v4
    return v5
}
Operations¶

a =
select
c, x, y¶ Conditional select.
This instruction selects whole values. Use vselect for lane-wise selection.
Arguments:  c (Testable) – Controlling value to test
 x (Any) – Value to use when c is true
 y (Any) – Value to use when c is false
Results:  a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:  Any – inferred from x
 Testable – from input operand
Constant materialization¶
A few instructions have variants that take immediate operands (e.g.,
band / band_imm), but in general an instruction is required to
load a constant into an SSA value.

a =
iconst
N¶ Integer constant.
Create a scalar integer SSA value with an immediate constant value, or an integer vector where all the lanes have the same value.
Arguments:  N (imm64) – A 64-bit immediate integer.
Results:  a (Int) – A constant integer scalar or vector value
Type Variables:  Int – explicitly provided

a =
f32const
N¶ Floating point constant.
Create a f32 SSA value with an immediate constant value.
Arguments:  N (ieee32) – A 32-bit immediate floating point number.
Results:  a (f32) – A constant f32 scalar value

a =
f64const
N¶ Floating point constant.
Create a f64 SSA value with an immediate constant value.
Arguments:  N (ieee64) – A 64-bit immediate floating point number.
Results:  a (f64) – A constant f64 scalar value

a =
bconst
N¶ Boolean constant.
Create a scalar boolean SSA value with an immediate constant value, or a boolean vector where all the lanes have the same value.
Arguments:  N (bool) – An immediate boolean.
Results:  a (Bool) – A constant boolean scalar or vector value
Type Variables:  Bool – explicitly provided
Live range splitting¶
Cretonne’s register allocator assigns each SSA value to a register or a spill slot on the stack for its entire live range. Since the live range of an SSA value can be quite large, it is sometimes beneficial to split the live range into smaller parts.
A live range is split by creating new SSA values that are copies of the
original value or each other. The copies are created by inserting copy,
spill, or fill instructions, depending on whether the values
are assigned to registers or stack slots.
This approach permits SSA form to be preserved throughout the register allocation pass and beyond.

a =
copy
x¶ Register-register copy.
This instruction copies its input, preserving the value type.
A pure SSA-form program does not need to copy values, but this instruction is useful for representing intermediate stages during instruction transformations, and the register allocator needs a way of representing register copies.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
Results:  a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:  Any – inferred from x

a =
spill
x¶ Spill a register value to a stack slot.
This instruction behaves exactly like copy, but the result value is assigned to a spill slot.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
Results:  a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:  Any – inferred from x

a = fill x
Load a register value from a stack slot.
This instruction behaves exactly like copy, but creates a new SSA value for the spilled input value.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
Results:  a (Any) – Any integer, float, or boolean scalar or vector type
Type Variables:  Any – inferred from x
Register values can be temporarily diverted to other registers by the regmove instruction, and to and from stack slots by regspill and regfill.

regmove x, src, dst
Temporarily divert x from src to dst.
This instruction moves the location of a value from one register to another without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.
The register diversions created by this instruction must be undone before the value leaves the EBB. At the entry to a new EBB, all live values must be in their originally assigned registers.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
 src (regunit) – A register unit in the target ISA
 dst (regunit) – A register unit in the target ISA
Type Variables:  Any – inferred from x

regspill x, src, SS
Temporarily divert x from src to SS.
This instruction moves the location of a value from a register to a stack slot without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.
See also regmove.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
 src (regunit) – A register unit in the target ISA
 SS (stack_slot) – A stack slot.
Type Variables:  Any – inferred from x

regfill x, SS, dst
Temporarily divert x from SS to dst.
This instruction moves the location of a value from a stack slot to a register without creating a new SSA value. It is used by the register allocator to temporarily rearrange register assignments in order to satisfy instruction constraints.
See also regmove.
Arguments:  x (Any) – Any integer, float, or boolean scalar or vector type
 SS (stack_slot) – A stack slot.
 dst (regunit) – A register unit in the target ISA
Type Variables:  Any – inferred from x
Vector operations¶

lo, hi = vsplit x
Split a vector into two halves.
Split the vector x into two separate values, each containing half of the lanes from x. The result may be two scalars if x only had two lanes.
Arguments:  x (TxN) – Vector to split
Results:  lo (half_vector(TxN)) – Low-numbered lanes of x
 hi (half_vector(TxN)) – High-numbered lanes of x
Type Variables:  TxN – inferred from x

a = vconcat x, y
Vector concatenation.
Return a vector formed by concatenating x and y. The resulting vector type has twice as many lanes as each of the inputs. The lanes of x appear as the low-numbered lanes, and the lanes of y become the high-numbered lanes of a.
It is possible to form a vector by concatenating two scalars.
Arguments:  x (Any128) – Low-numbered lanes
 y (Any128) – High-numbered lanes
Results:  a (double_vector(Any128)) – Concatenation of x and y
Type Variables:  Any128 – inferred from x
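The lane ordering of vsplit and vconcat can be illustrated with a small Python model. This is only a sketch for explanation, not Cretonne code: vectors are modelled as Python lists of lanes, and the function names simply mirror the instruction names.

```python
def vsplit(x):
    """Split a vector (a list of lanes) into its low and high halves."""
    if len(x) < 2 or len(x) % 2 != 0:
        raise ValueError("lane count must be even and at least 2")
    half = len(x) // 2
    lo, hi = x[:half], x[half:]
    # A two-lane input produces two scalars rather than one-lane vectors.
    if half == 1:
        return lo[0], hi[0]
    return lo, hi


def vconcat(x, y):
    """Concatenate x (low-numbered lanes) and y (high-numbered lanes)."""
    xs = list(x) if isinstance(x, list) else [x]
    ys = list(y) if isinstance(y, list) else [y]
    if len(xs) != len(ys):
        raise ValueError("both inputs must have the same number of lanes")
    return xs + ys
```

In this model, `vconcat(*vsplit(v))` gives back `v`, which is the round-trip property the two descriptions imply.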

a = vselect c, x, y
Vector lane select.
Select lanes from x or y controlled by the lanes of the boolean vector c.
Arguments:  c (as_bool(TxN)) – Controlling vector
 x (TxN) – Value to use where c is true
 y (TxN) – Value to use where c is false
Results:  a (TxN) – A SIMD vector type
Type Variables:  TxN – inferred from x

a = splat x
Vector splat.
Return a vector whose lanes are all x.
Arguments:  x (lane_of(TxN)) – None
Results:  a (TxN) – A SIMD vector type
Type Variables:  TxN – explicitly provided

a = insertlane x, Idx, y
Insert y as lane Idx in x.
The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.
Arguments:  x (TxN) – SIMD vector to modify
 Idx (uimm8) – Lane index
 y (lane_of(TxN)) – New lane value
Results:  a (TxN) – A SIMD vector type
Type Variables:  TxN – inferred from x

a = extractlane x, Idx
Extract lane Idx from x.
The lane index, Idx, is an immediate value, not an SSA value. It must indicate a valid lane index for the type of x.
Arguments:  x (TxN) – A SIMD vector type
 Idx (uimm8) – Lane index
Results:  a (lane_of(TxN)) – None
Type Variables:  TxN – inferred from x
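The lane-level behaviour of vselect, splat, insertlane, and extractlane can be sketched in Python as follows. Vectors are modelled as lists, and the explicit `lanes` parameter of `splat` is an assumption standing in for the explicitly provided result type; none of this is Cretonne's own implementation.

```python
def vselect(c, x, y):
    """Pick each lane from x where c is true, otherwise from y."""
    if not (len(c) == len(x) == len(y)):
        raise ValueError("all operands must have the same number of lanes")
    return [xi if ci else yi for ci, xi, yi in zip(c, x, y)]


def splat(x, lanes):
    """Return a vector whose lanes are all x."""
    return [x] * lanes


def insertlane(x, idx, y):
    """Return a copy of x with lane idx replaced by y."""
    if not 0 <= idx < len(x):
        raise IndexError("lane index out of range for this vector type")
    out = list(x)
    out[idx] = y
    return out


def extractlane(x, idx):
    """Return lane idx of x."""
    if not 0 <= idx < len(x):
        raise IndexError("lane index out of range for this vector type")
    return x[idx]
```

Note that `insertlane` returns a new list and leaves its input untouched, matching the SSA view in which the instruction defines a new value.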
Integer operations¶

a = icmp Cond, x, y
Integer comparison.
The condition code determines if the operands are interpreted as signed or unsigned integers.
Signed  Unsigned  Condition
eq      eq        Equal
ne      ne        Not equal
slt     ult       Less than
sge     uge       Greater than or equal
sgt     ugt       Greater than
sle     ule       Less than or equal
When this instruction compares integer vectors, it returns a boolean vector of lane-wise comparisons.
Arguments: Results:  a (as_bool(Int)) – None
Type Variables:  Int – inferred from x
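The effect of the condition code on how the same bit patterns are interpreted can be shown with a short Python model. This is a sketch under stated assumptions: scalar operands only, and a `bits` parameter (hypothetical, for illustration) standing in for the operand type.

```python
def to_signed(v, bits):
    """Interpret the low `bits` bits of v as a two's complement integer."""
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def icmp(cond, x, y, bits=32):
    """Model of a scalar integer comparison for the given condition code."""
    mask = (1 << bits) - 1
    ux, uy = x & mask, y & mask
    sx, sy = to_signed(x, bits), to_signed(y, bits)
    results = {
        "eq": ux == uy, "ne": ux != uy,
        # Signed conditions compare the two's complement interpretation.
        "slt": sx < sy, "sge": sx >= sy, "sgt": sx > sy, "sle": sx <= sy,
        # Unsigned conditions compare the raw bit patterns.
        "ult": ux < uy, "uge": ux >= uy, "ugt": ux > uy, "ule": ux <= uy,
    }
    return results[cond]
```

For example, the 32-bit pattern 0xFFFFFFFF is less than 1 under slt (it is -1) but not under ult (it is 4294967295).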

a = icmp_imm Cond, x, Y
Compare scalar integer to a constant.
This is the same as the icmp instruction, except one operand is an immediate constant.
This instruction can only compare scalars. Use icmp for lane-wise vector comparisons.
Arguments: Results:  a (b1) – A boolean type with 1 bits.
Type Variables:  iB – inferred from x

f =
ifcmp
x, y¶ Compare scalar integers and return flags.
Compare two scalar integer values and return integer CPU flags representing the result.
Arguments: Results: Type Variables:  iB – inferred from x

f = ifcmp_imm x, Y
Compare scalar integer to a constant and return flags.
Like icmp_imm, but returns integer CPU flags instead of testing a specific condition code.
Arguments: Results: Type Variables:  iB – inferred from x

a =
iadd
x, y¶ Wrapping integer addition: \(a := x + y \pmod{2^B}\).
This instruction does not depend on the signed/unsigned interpretation of the operands.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = iadd_imm x, Y
Add immediate integer.
Same as iadd, but one operand is an immediate constant.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a = iadd_cin x, y, c_in
Add integers with carry in.
Same as iadd with an additional carry input. Computes:
\[a = x + y + c_{in} \pmod{2^B}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from y

a, c_out = iadd_cout x, y
Add integers with carry out.
Same as iadd with an additional carry output.
\[\begin{split}a &= x + y \pmod{2^B} \\ c_{out} &= x + y \geq 2^B\end{split}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results: Type Variables:  iB – inferred from x

a, c_out = iadd_carry x, y, c_in
Add integers with carry in and out.
Same as iadd with an additional carry input and output.
\[\begin{split}a &= x + y + c_{in} \pmod{2^B} \\ c_{out} &= x + y + c_{in} \geq 2^B\end{split}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results: Type Variables:  iB – inferred from y
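One typical use of the carry instructions is chaining narrow additions into a wider one. The Python sketch below models iadd_cout and iadd_carry for B = 32 and combines them into a 64-bit addition on (low, high) pairs. The helper `add64` and the fixed width are assumptions made for illustration, not part of Cretonne.

```python
B = 32
MASK = (1 << B) - 1


def iadd_cout(x, y):
    """Return (a, c_out) with a = x + y mod 2^B and c_out = (x + y >= 2^B)."""
    total = (x & MASK) + (y & MASK)
    return total & MASK, total >> B


def iadd_carry(x, y, c_in):
    """Return (a, c_out) for x + y + c_in, where c_in is 0 or 1."""
    total = (x & MASK) + (y & MASK) + c_in
    return total & MASK, total >> B


def add64(x_lo, x_hi, y_lo, y_hi):
    """Hypothetical helper: add two 64-bit values given as 32-bit halves."""
    lo, carry = iadd_cout(x_lo, y_lo)
    hi, c_out = iadd_carry(x_hi, y_hi, carry)
    return lo, hi, c_out
```

The low halves are added first, and the carry out of that addition feeds the carry input of the high-half addition.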

a = isub x, y
Wrapping integer subtraction: \(a := x - y \pmod{2^B}\).
This instruction does not depend on the signed/unsigned interpretation of the operands.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = irsub_imm x, Y
Immediate reverse wrapping subtraction: \(a := Y - x \pmod{2^B}\).
Also works as integer negation when \(Y = 0\). Use iadd_imm with a negative immediate operand for the reverse immediate subtraction.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a = isub_bin x, y, b_in
Subtract integers with borrow in.
Same as isub with an additional borrow flag input. Computes:
\[a = x - (y + b_{in}) \pmod{2^B}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from y

a, b_out = isub_bout x, y
Subtract integers with borrow out.
Same as isub with an additional borrow flag output.
\[\begin{split}a &= x - y \pmod{2^B} \\ b_{out} &= x < y\end{split}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results: Type Variables:  iB – inferred from x

a, b_out = isub_borrow x, y, b_in
Subtract integers with borrow in and out.
Same as isub with an additional borrow flag input and output.
\[\begin{split}a &= x - (y + b_{in}) \pmod{2^B} \\ b_{out} &= x < y + b_{in}\end{split}\]
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results: Type Variables:  iB – inferred from y
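The borrow instructions mirror the carry instructions. A minimal Python model for B = 32, treating operands as unsigned, might look like this. The `sub64` helper is hypothetical and only shows how a borrow can be propagated between halves.

```python
B = 32
MASK = (1 << B) - 1


def isub_bout(x, y):
    """Return (a, b_out) with a = x - y mod 2^B and b_out = (x < y)."""
    x &= MASK
    y &= MASK
    return (x - y) & MASK, int(x < y)


def isub_borrow(x, y, b_in):
    """Return (a, b_out) for x - (y + b_in), where b_in is 0 or 1."""
    x &= MASK
    y &= MASK
    return (x - (y + b_in)) & MASK, int(x < y + b_in)


def sub64(x_lo, x_hi, y_lo, y_hi):
    """Hypothetical helper: subtract 64-bit values given as 32-bit halves."""
    lo, borrow = isub_bout(x_lo, y_lo)
    hi, b_out = isub_borrow(x_hi, y_hi, borrow)
    return lo, hi, b_out
```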
Todo
Add and subtract with signed overflow.
For example, see llvm.sadd.with.overflow.* and llvm.ssub.with.overflow.* in LLVM.

a =
imul
x, y¶ Wrapping integer multiplication: \(a := x y \pmod{2^B}\).
This instruction does not depend on the signed/unsigned interpretation of the operands.
Polymorphic over all integer types (vector and scalar).
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a =
imul_imm
x, Y¶ Integer multiplication by immediate constant.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x
Todo
Larger multiplication results.
For example, smulx which multiplies i32 operands to produce an i64 result. Alternatively, smulhi and smullo pairs.

a =
udiv
x, y¶ Unsigned integer division: \(a := \lfloor {x \over y} \rfloor\).
This operation traps if the divisor is zero.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a =
udiv_imm
x, Y¶ Unsigned integer division by an immediate constant.
This instruction never traps because a divisor of zero is not allowed.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a = sdiv x, y
Signed integer division rounded toward zero: \(a := sign(xy) \lfloor {|x| \over |y|}\rfloor\).
This operation traps if the divisor is zero, or if the result is not representable in \(B\) bits two’s complement. This only happens when \(x = -2^{B-1}, y = -1\).
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a =
sdiv_imm
x, Y¶ Signed integer division by an immediate constant.
This instruction never traps because a divisor of -1 or 0 is not allowed.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a =
urem
x, y¶ Unsigned integer remainder.
This operation traps if the divisor is zero.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a =
urem_imm
x, Y¶ Unsigned integer remainder with immediate divisor.
This instruction never traps because a divisor of zero is not allowed.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a =
srem
x, y¶ Signed integer remainder. The result has the sign of the dividend.
This operation traps if the divisor is zero.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
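The rounding and trapping rules for signed division and remainder can be made concrete with a small Python model. This is a sketch, not Cretonne's implementation: the `Trap` exception, the `traps` helper, and the `bits` parameter are assumptions introduced for illustration, and values are passed as unsigned bit patterns.

```python
class Trap(Exception):
    """Stands in for a trap raised by the modelled instruction."""


def to_signed(v, bits):
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def sdiv(x, y, bits=32):
    """Signed division rounded toward zero; traps on zero divisor or overflow."""
    sx, sy = to_signed(x, bits), to_signed(y, bits)
    if sy == 0:
        raise Trap("division by zero")
    if sx == -(1 << (bits - 1)) and sy == -1:
        raise Trap("result not representable")
    q = abs(sx) // abs(sy)
    if (sx < 0) != (sy < 0):
        q = -q
    return q & ((1 << bits) - 1)


def srem(x, y, bits=32):
    """Signed remainder; the result has the sign of the dividend."""
    sx, sy = to_signed(x, bits), to_signed(y, bits)
    if sy == 0:
        raise Trap("division by zero")
    r = abs(sx) % abs(sy)
    if sx < 0:
        r = -r
    return r & ((1 << bits) - 1)


def traps(fn, *args):
    """Return True if calling fn(*args) raises Trap."""
    try:
        fn(*args)
    except Trap:
        return True
    return False
```

Python's own `//` and `%` round toward negative infinity, which is why the model divides the absolute values and fixes up the sign afterwards.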

a =
srem_imm
x, Y¶ Signed integer remainder with immediate divisor.
This instruction never traps because a divisor of 0 or -1 is not allowed.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x
Todo
Integer minimum / maximum.
NEON has smin, smax, umin, and umax instructions. We should replicate those for both scalar and vector integer types. Even if the target ISA doesn’t have scalar operations, these are good pattern matching targets.
Todo
Saturating arithmetic.
Mostly for SIMD use, but again these are good patterns for contraction. Something like usatadd, usatsub, ssatadd, and ssatsub is a good start.
Bitwise operations¶
The bitwise operations operate on any value type: integers, floating point numbers, and booleans. When operating on integer or floating point types, the bitwise operations work on the binary representation of the values. When operating on boolean values, the bitwise operations work as logical operators.

a =
band
x, y¶ Bitwise and.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a = band_imm x, Y
Bitwise and with immediate.
Same as band, but one operand is an immediate constant.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a =
bor
x, y¶ Bitwise or.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a = bor_imm x, Y
Bitwise or with immediate.
Same as bor, but one operand is an immediate constant.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a =
bxor
x, y¶ Bitwise xor.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a = bxor_imm x, Y
Bitwise xor with immediate.
Same as bxor, but one operand is an immediate constant.
Polymorphic over all scalar integer types, but does not support vector types.
Arguments: Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a =
bnot
x¶ Bitwise not.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a =
band_not
x, y¶ Bitwise and not.
Computes x & ~y.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a =
bor_not
x, y¶ Bitwise or not.
Computes x | ~y.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x

a =
bxor_not
x, y¶ Bitwise xor not.
Computes x ^ ~y.
Arguments:  x (bits) – Any integer, float, or boolean scalar or vector type
 y (bits) – Any integer, float, or boolean scalar or vector type
Results:  a (bits) – Any integer, float, or boolean scalar or vector type
Type Variables:  bits – inferred from x
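The three combined operations invert the second operand before applying the base operation. A Python sketch on fixed-width integer bit patterns follows; the `bits` parameter is an assumption for illustration, since Python integers are unbounded and the result must be masked to the modelled width.

```python
def bnot(x, bits=8):
    """Bitwise not, masked to the modelled width."""
    return ~x & ((1 << bits) - 1)


def band_not(x, y, bits=8):
    """x & ~y: clear in x every bit that is set in y."""
    return x & bnot(y, bits)


def bor_not(x, y, bits=8):
    """x | ~y."""
    return (x | bnot(y, bits)) & ((1 << bits) - 1)


def bxor_not(x, y, bits=8):
    """x ^ ~y: bits are set where x and y agree."""
    return (x ^ bnot(y, bits)) & ((1 << bits) - 1)
```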
The shift and rotate operations only work on integer types (scalar and vector). The shift amount does not have to be the same type as the value being shifted. Only the low \(\log_2 B\) bits of the shift amount are significant, where B is the number of bits in the value being shifted.
When operating on an integer vector type, the shift amount is still a scalar type, and all the lanes are shifted the same amount. The shift amount is masked to the number of bits in a lane, not the full size of the vector type.

a = rotl x, y
Rotate left.
Rotate the bits in x by y places.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
 iB – from input operand

a =
rotl_imm
x, Y¶ Rotate left by immediate.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = rotr x, y
Rotate right.
Rotate the bits in x by y places.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
 iB – from input operand

a =
rotr_imm
x, Y¶ Rotate right by immediate.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = ishl x, y
Integer shift left. Shift the bits in x towards the MSB by y places. Shift in zero bits to the LSB.
The shift amount is masked to the size of x.
When shifting a B-bits integer type, this instruction computes:
\[\begin{split}s &:= y \pmod B, \\ a &:= x \cdot 2^s \pmod{2^B}.\end{split}\]
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
 iB – from input operand

a = ishl_imm x, Y
Integer shift left by immediate.
The shift amount is masked to the size of x.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = ushr x, y
Unsigned shift right. Shift bits in x towards the LSB by y places, shifting in zero bits to the MSB. Also called a logical shift.
The shift amount is masked to the size of the register.
When shifting a B-bits integer type, this instruction computes:
\[\begin{split}s &:= y \pmod B, \\ a &:= \lfloor x \cdot 2^{-s} \rfloor.\end{split}\]
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
 iB – from input operand

a =
ushr_imm
x, Y¶ Unsigned shift right by immediate.
The shift amount is masked to the size of the register.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x

a = sshr x, y
Signed shift right. Shift bits in x towards the LSB by y places, shifting in sign bits to the MSB. Also called an arithmetic shift.
The shift amount is masked to the size of the register.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
 iB – from input operand

a =
sshr_imm
x, Y¶ Signed shift right by immediate.
The shift amount is masked to the size of the register.
Arguments: Results:  a (Int) – A scalar or vector integer type
Type Variables:  Int – inferred from x
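The masking of the shift amount and the difference between the three shifts can be modelled in a few lines of Python. This is an illustrative sketch for scalar values only; the `bits` parameter is an assumption standing in for the type of x, and the rotate functions are included for comparison.

```python
def to_signed(v, bits):
    v &= (1 << bits) - 1
    return v - (1 << bits) if v >> (bits - 1) else v


def ishl(x, y, bits=32):
    s = y % bits                      # shift amount is masked
    return (x << s) & ((1 << bits) - 1)


def ushr(x, y, bits=32):
    s = y % bits
    return (x & ((1 << bits) - 1)) >> s   # zero bits enter at the MSB


def sshr(x, y, bits=32):
    s = y % bits
    # Python's >> on a negative int replicates the sign bit.
    return (to_signed(x, bits) >> s) & ((1 << bits) - 1)


def rotl(x, y, bits=32):
    mask = (1 << bits) - 1
    s = y % bits
    x &= mask
    return ((x << s) | (x >> (bits - s))) & mask


def rotr(x, y, bits=32):
    return rotl(x, bits - (y % bits), bits)
```

With a 32-bit value, a shift amount of 33 behaves like a shift by 1 because only the amount modulo 32 is used.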
The bit-counting instructions below are scalar only.

a = clz x
Count leading zero bits.
Starting from the MSB in x, count the number of zero bits before reaching the first one bit. When x is zero, returns the size of x in bits.
Arguments:  x (iB) – A scalar integer type
Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x

a = cls x
Count leading sign bits.
Starting from the MSB after the sign bit in x, count the number of consecutive bits identical to the sign bit. When x is 0 or -1, returns one less than the size of x in bits.
Arguments:  x (iB) – A scalar integer type
Results:  a (iB) – A scalar integer type
Type Variables:  iB – inferred from x
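Both counts can be expressed with Python's `int.bit_length`. The sketch below is a model for explanation only; the `bits` parameter is an assumption representing the width of the scalar type, and x is passed as an unsigned bit pattern.

```python
def clz(x, bits=32):
    """Count leading zero bits; returns `bits` when x is zero."""
    x &= (1 << bits) - 1
    return bits - x.bit_length()


def cls(x, bits=32):
    """Count bits after the sign bit that are identical to the sign bit."""
    mask = (1 << bits) - 1
    x &= mask
    if x >> (bits - 1):
        # Negative value: invert so the copies of the sign bit become zeros.
        x = ~x & mask
    # The sign bit itself is not counted.
    return clz(x, bits) - 1
```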
Floating point operations¶
These operations generally follow IEEE 754-2008 semantics.

a = fcmp Cond, x, y
Floating point comparison.
Two IEEE 754-2008 floating point numbers, x and y, relate to each other in exactly one of four ways:
UN  Unordered when one or both numbers is NaN.
EQ  When \(x = y\). (And \(-0.0 = 0.0\)).
LT  When \(x < y\).
GT  When \(x > y\).
The 14 floatcc condition codes each correspond to a subset of the four relations, except for the empty set which would always be false, and the full set which would always be true.
The condition codes are divided into 7 ‘ordered’ conditions which don’t include UN, and 7 unordered conditions which all include UN.
Ordered             Unordered               Condition
ord  EQ | LT | GT   uno  UN                 NaNs absent / present.
eq   EQ             ueq  UN | EQ            Equal
one  LT | GT        ne   UN | LT | GT       Not equal
lt   LT             ult  UN | LT            Less than
le   LT | EQ        ule  UN | LT | EQ       Less than or equal
gt   GT             ugt  UN | GT            Greater than
ge   GT | EQ        uge  UN | GT | EQ       Greater than or equal
The standard C comparison operators, <, <=, >, >=, are all ordered, so they are false if either operand is NaN. The C equality operator, ==, is ordered, and since inequality is defined as the logical inverse it is unordered. They map to the floatcc condition codes as follows:
C    Cond  Subset
==   eq    EQ
!=   ne    UN | LT | GT
<    lt    LT
<=   le    LT | EQ
>    gt    GT
>=   ge    GT | EQ
This subset of condition codes also corresponds to the WebAssembly floating point comparisons of the same name.
When this instruction compares floating point vectors, it returns a boolean vector with the results of lane-wise comparisons.
Arguments: Results:  a (as_bool(Float)) – None
Type Variables:  Float – inferred from x
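The view of each condition code as a subset of the four relations translates directly into a lookup table. The following Python sketch models scalar fcmp that way; the names `relation` and `FLOATCC` are hypothetical, chosen for this illustration.

```python
import math


def relation(x, y):
    """Return which of the four relations holds between x and y."""
    if math.isnan(x) or math.isnan(y):
        return "UN"
    if x == y:
        return "EQ"
    return "LT" if x < y else "GT"


# Each condition code is the set of relations for which it is true.
FLOATCC = {
    "ord": {"EQ", "LT", "GT"}, "uno": {"UN"},
    "eq": {"EQ"},              "ueq": {"UN", "EQ"},
    "one": {"LT", "GT"},       "ne": {"UN", "LT", "GT"},
    "lt": {"LT"},              "ult": {"UN", "LT"},
    "le": {"LT", "EQ"},        "ule": {"UN", "LT", "EQ"},
    "gt": {"GT"},              "ugt": {"UN", "GT"},
    "ge": {"GT", "EQ"},        "uge": {"UN", "GT", "EQ"},
}


def fcmp(cond, x, y):
    """Model of a scalar floating point comparison."""
    return relation(x, y) in FLOATCC[cond]
```

The table has 14 entries: the 16 possible subsets minus the empty set and the full set.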

f = ffcmp x, y
Floating point comparison returning flags.
Compares two numbers like fcmp, but returns floating point CPU flags instead of testing a specific condition.
Arguments: Results: Type Variables:  Float – inferred from x

a =
fadd
x, y¶ Floating point addition.
Arguments: Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from x

a =
fsub
x, y¶ Floating point subtraction.
Arguments: Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from x

a =
fmul
x, y¶ Floating point multiplication.
Arguments: Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from x

a = fdiv x, y
Floating point division.
Unlike the integer division instructions sdiv and udiv, this can’t trap. Division by zero is infinity or NaN, depending on the dividend.
Arguments: Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from x

a =
sqrt
x¶ Floating point square root.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from x

a = fma x, y, z
Floating point fused multiply-and-add.
Computes \(a := xy+z\) without any intermediate rounding of the product.
Arguments: Results:  a (Float) – Result of applying operator to each lane
Type Variables:  Float – inferred from y
Sign bit manipulations¶
The sign manipulating instructions work as bitwise operations, so they don’t have special behavior for signaling NaN operands. The exponent and trailing significand bits are always preserved.

a =
fneg
x¶ Floating point negation.
Note that this is a pure bitwise operation.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (Float) – x with its sign bit inverted
Type Variables:  Float – inferred from x

a =
fabs
x¶ Floating point absolute value.
Note that this is a pure bitwise operation.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (Float) – x with its sign bit cleared
Type Variables:  Float – inferred from x
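Because fneg and fabs only touch the sign bit, they can be modelled as integer operations on the f32 bit pattern. The Python sketch below works on raw 32-bit patterns so that NaN payloads are visibly preserved; the helper names are assumptions for illustration.

```python
import struct

SIGN_BIT = 0x80000000


def f32_bits(x):
    """Return the IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fneg_bits(b):
    """Invert the sign bit; exponent and significand are untouched."""
    return b ^ SIGN_BIT


def fabs_bits(b):
    """Clear the sign bit; exponent and significand are untouched."""
    return b & ~SIGN_BIT & 0xFFFFFFFF
```

For instance, applying `fneg_bits` to the signaling NaN pattern 0x7FA00000 yields 0xFFA00000: only the sign changes, and the NaN is not quieted.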
Minimum and maximum¶
These instructions return the larger or smaller of their operands. Note that unlike the IEEE 754-2008 minNum and maxNum operations, these instructions return NaN when either input is NaN.
When comparing zeroes, these instructions behave as if \(-0.0 < 0.0\).

a = fmin x, y
Floating point minimum, propagating NaNs.
If either operand is NaN, this returns a NaN.
Arguments: Results:  a (Float) – The smaller of x and y
Type Variables:  Float – inferred from x
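A scalar Python model of the NaN-propagating minimum, including the treatment of negative zero as smaller than positive zero, could look like the sketch below. It is an illustration of the rules stated above, not Cretonne's implementation.

```python
import math


def fmin(x, y):
    """Minimum that returns NaN if either input is NaN and orders -0.0 below 0.0."""
    if math.isnan(x) or math.isnan(y):
        return math.nan
    if x == 0.0 and y == 0.0:
        # The zeroes compare equal, so decide by sign bit.
        return x if math.copysign(1.0, x) < 0 else y
    return x if x < y else y
```

Python's built-in `min` does not give these guarantees, since its result for NaN or mixed-sign zero inputs depends on argument order.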
Rounding¶
These instructions round their argument to a nearby integral value, still represented as a floating point number.

a = ceil x
Round floating point to integral, towards positive infinity.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (Float) – x rounded to integral value
Type Variables:  Float – inferred from x

a = floor x
Round floating point to integral, towards negative infinity.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (Float) – x rounded to integral value
Type Variables:  Float – inferred from x
CPU flag operations¶

a = trueif Cond, f
Test integer CPU flags for a specific condition.
Check the CPU flags in f against the Cond condition code and return true when the condition code is satisfied.
Arguments: Results:  a (b1) – A boolean type with 1 bits.
Conversion operations¶

a =
bitcast
x¶ Reinterpret the bits in x as a different type.
The input and output types must be storable to memory and of the same size. A bitcast is equivalent to storing one type and loading the other type from the same address.
Arguments:  x (Mem) – Any type that can be stored in memory
Results:  a (MemTo) – Bits of x reinterpreted
Type Variables:  MemTo – explicitly provided
 Mem – from input operand
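The store-then-load description of bitcast maps directly onto Python's `struct` module, which writes a value into a byte buffer in one format and reads it back in another. The sketch below covers only the f32/i32 pair; the function names are assumptions for illustration.

```python
import struct


def bitcast_f32_to_i32(x):
    """Store x as an f32, then load the same 4 bytes as an unsigned i32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bitcast_i32_to_f32(b):
    """Store b as an i32, then load the same 4 bytes as an f32."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]
```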

a = breduce x
Convert x to a smaller boolean type in the platform-defined way.
The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Bool) – A scalar or vector boolean type
Results:  a (BoolTo) – A smaller boolean type with the same number of lanes
Type Variables:  BoolTo – explicitly provided
 Bool – from input operand

a = bextend x
Convert x to a larger boolean type in the platform-defined way.
The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Bool) – A scalar or vector boolean type
Results:  a (BoolTo) – A larger boolean type with the same number of lanes
Type Variables:  BoolTo – explicitly provided
 Bool – from input operand

a =
bint
x¶ Convert x to an integer.
True maps to 1 and false maps to 0. The result type must have the same number of vector lanes as the input.
Arguments:  x (Bool) – A scalar or vector boolean type
Results:  a (IntTo) – An integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Bool – from input operand

a =
bmask
x¶ Convert x to an integer mask.
True maps to all 1s and false maps to all 0s. The result type must have the same number of vector lanes as the input.
Arguments:  x (Bool) – A scalar or vector boolean type
Results:  a (IntTo) – An integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Bool – from input operand
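The difference between bint and bmask is only in what a true lane maps to. A scalar Python sketch follows, with a `bits` parameter (an assumption for illustration) standing in for the explicitly provided integer result type.

```python
def bint(x):
    """True maps to 1, false maps to 0."""
    return 1 if x else 0


def bmask(x, bits=32):
    """True maps to all 1 bits, false maps to all 0 bits."""
    return (1 << bits) - 1 if x else 0
```

A vector input can be modelled by applying the function to each lane, for example `[bmask(c, 8) for c in lanes]`. Such a mask can be combined with bitwise operations to select between two values lane by lane.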

a = ireduce x
Convert x to a smaller integer type by dropping high bits.
Each lane in x is converted to a smaller integer type by discarding the most significant bits. This is the same as reducing modulo \(2^n\).
The result type must have the same number of vector lanes as the input, and each lane must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Int) – A scalar or vector integer type
Results:  a (IntTo) – A smaller integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Int – from input operand

a = uextend x
Convert x to a larger integer type by zero-extending.
Each lane in x is converted to a larger integer type by adding zeroes. The result has the same numerical value as x when both are interpreted as unsigned integers.
The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Int) – A scalar or vector integer type
Results:  a (IntTo) – A larger integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Int – from input operand

a = sextend x
Convert x to a larger integer type by sign-extending.
Each lane in x is converted to a larger integer type by replicating the sign bit. The result has the same numerical value as x when both are interpreted as signed integers.
The result type must have the same number of vector lanes as the input, and each lane must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Int) – A scalar or vector integer type
Results:  a (IntTo) – A larger integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Int – from input operand
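The three integer width conversions can be modelled on unsigned bit patterns in Python. This is a scalar sketch with explicit `from_bits` and `to_bits` parameters, which are assumptions standing in for the input and result types.

```python
def ireduce(x, to_bits):
    """Drop the high bits: the same as reducing modulo 2**to_bits."""
    return x & ((1 << to_bits) - 1)


def uextend(x, from_bits, to_bits):
    """Zero-extend: the new high bits are all zero."""
    assert to_bits >= from_bits
    return x & ((1 << from_bits) - 1)


def sextend(x, from_bits, to_bits):
    """Sign-extend: the new high bits replicate the sign bit of x."""
    assert to_bits >= from_bits
    x &= (1 << from_bits) - 1
    if x >> (from_bits - 1):
        x |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return x
```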

a = fpromote x
Convert x to a larger floating point format.
Each lane in x is converted to the destination floating point format. This is an exact operation.
Cretonne currently only supports two floating point formats, f32 and f64. This may change in the future.
The result type must have the same number of vector lanes as the input, and the result lanes must not have fewer bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (FloatTo) – A scalar or vector floating point number
Type Variables:  FloatTo – explicitly provided
 Float – from input operand

a = fdemote x
Convert x to a smaller floating point format.
Each lane in x is converted to the destination floating point format by rounding to nearest, ties to even.
Cretonne currently only supports two floating point formats, f32 and f64. This may change in the future.
The result type must have the same number of vector lanes as the input, and the result lanes must not have more bits than the input lanes. If the input and output types are the same, this is a no-op.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (FloatTo) – A scalar or vector floating point number
Type Variables:  FloatTo – explicitly provided
 Float – from input operand

a =
fcvt_to_uint
x¶ Convert floating point to unsigned integer.
Each lane in x is converted to an unsigned integer by rounding towards zero. If x is NaN or if the unsigned integral value cannot be represented in the result type, this instruction traps.
The result type must have the same number of vector lanes as the input.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (IntTo) – A larger integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Float – from input operand

a =
fcvt_to_sint
x¶ Convert floating point to signed integer.
Each lane in x is converted to a signed integer by rounding towards zero. If x is NaN or if the signed integral value cannot be represented in the result type, this instruction traps.
The result type must have the same number of vector lanes as the input.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (IntTo) – A larger integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Float – from input operand
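The trapping conditions of the float-to-integer conversions can be spelled out in a scalar Python model. The `Trap` exception, the `traps` helper, and the `bits` parameter are assumptions introduced for this sketch; results are returned as unsigned bit patterns.

```python
import math


class Trap(Exception):
    """Stands in for a trap raised by the modelled instruction."""


def _truncate(x):
    if math.isnan(x) or math.isinf(x):
        raise Trap("NaN or infinity is not convertible")
    return math.trunc(x)  # rounds towards zero


def fcvt_to_uint(x, bits=32):
    v = _truncate(x)
    if not 0 <= v < (1 << bits):
        raise Trap("value not representable as unsigned")
    return v


def fcvt_to_sint(x, bits=32):
    v = _truncate(x)
    if not -(1 << (bits - 1)) <= v < (1 << (bits - 1)):
        raise Trap("value not representable as signed")
    return v & ((1 << bits) - 1)


def traps(fn, *args):
    """Return True if calling fn(*args) raises Trap."""
    try:
        fn(*args)
    except Trap:
        return True
    return False
```

In this model the range check is applied after rounding towards zero, so an input such as -0.5 converts to 0 instead of trapping. This follows the wording "the unsigned integral value cannot be represented", but it is an interpretation.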

a =
fcvt_from_uint
x¶ Convert unsigned integer to floating point.
Each lane in x is interpreted as an unsigned integer and converted to floating point using round to nearest, ties to even.
The result type must have the same number of vector lanes as the input.
Arguments:  x (Int) – A scalar or vector integer type
Results:  a (FloatTo) – A scalar or vector floating point number
Type Variables:  FloatTo – explicitly provided
 Int – from input operand

a =
fcvt_from_sint
x¶ Convert signed integer to floating point.
Each lane in x is interpreted as a signed integer and converted to floating point using round to nearest, ties to even.
The result type must have the same number of vector lanes as the input.
Arguments:  x (Int) – A scalar or vector integer type
Results:  a (FloatTo) – A scalar or vector floating point number
Type Variables:  FloatTo – explicitly provided
 Int – from input operand
Legalization operations¶
These instructions are used as helpers when legalizing types and operations for the target ISA.

lo, hi =
isplit
x¶ Split an integer into low and high parts.
Vectors of integers are split lane-wise, so the results have the same number of lanes as the input, but the lanes are half the size.
Returns the low half of x and the high half of x as two independent values.
Arguments:  x (WideInt) – An integer type with lanes from i16 upwards
Results:  lo (half_width(WideInt)) – The low bits of x
 hi (half_width(WideInt)) – The high bits of x
Type Variables:  WideInt – inferred from x

a =
iconcat
lo, hi¶ Concatenate low and high bits to form a larger integer type.
Vectors of integers are concatenated lane-wise such that the result has the same number of lanes as the inputs, but the lanes are twice the size.
Arguments:  lo (NarrowInt) – An integer type with lanes up to i32
 hi (NarrowInt) – An integer type with lanes up to i32
Results:  a (double_width(NarrowInt)) – The concatenation of lo and hi
Type Variables:  NarrowInt – inferred from lo
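As a rough illustration of how isplit and iconcat relate, the following Python sketch models both on a single scalar lane. The function names mirror the instructions, but the explicit bit-width parameters are assumptions added for the model; in the IL the widths come from the operand types.

```python
def isplit(x: int, bits: int) -> tuple[int, int]:
    """Model of isplit on one lane that is `bits` wide: returns (lo, hi)."""
    half = bits // 2
    mask = (1 << half) - 1
    return x & mask, (x >> half) & mask


def iconcat(lo: int, hi: int, half_bits: int) -> int:
    """Model of iconcat on one lane: lo fills the low half, hi the high half."""
    mask = (1 << half_bits) - 1
    return ((hi & mask) << half_bits) | (lo & mask)


lo, hi = isplit(0x1122334455667788, 64)
print(hex(lo), hex(hi))          # 0x55667788 0x11223344
print(hex(iconcat(lo, hi, 32)))  # 0x1122334455667788
```

In this model, concatenating the two results of a split always gives back the original value.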
Extending loads and truncating stores
Most ISAs provide instructions that load an integer value smaller than a register and extend it to the width of the register. Similarly, store instructions that only write the low bits of an integer register are common.
In addition to the normal load and store instructions, Cretonne provides extending loads and truncating stores for 8-, 16-, and 32-bit memory accesses.
These instructions succeed, trap, or have undefined behavior under the same conditions as normal loads and stores.

a = uload8 Flags, p, Offset
Load 8 bits from memory at p + Offset and zero-extend.
This is equivalent to load.i8 followed by uextend.
Arguments:
Results:  a (iExt8) – An integer type with more than 8 bits
Type Variables:  iExt8 – explicitly provided
 iAddr – from input operand

a = sload8 Flags, p, Offset
Load 8 bits from memory at p + Offset and sign-extend.
This is equivalent to load.i8 followed by sextend.
Arguments:
Results:  a (iExt8) – An integer type with more than 8 bits
Type Variables:  iExt8 – explicitly provided
 iAddr – from input operand
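The difference between the zero-extending and sign-extending 8-bit loads can be sketched in Python. This model is illustrative only: memory is represented as a `bytes` object indexed by address, the `to_bits` parameter stands in for the explicitly provided result type, and memory flags and trapping are not modelled.

```python
def uload8(mem: bytes, p: int, offset: int = 0, to_bits: int = 32) -> int:
    """Model of uload8: read one byte at p + offset and zero-extend it."""
    assert to_bits > 8
    return mem[p + offset]  # the upper bits of the result are all zero


def sload8(mem: bytes, p: int, offset: int = 0, to_bits: int = 32) -> int:
    """Model of sload8: read one byte at p + offset and sign-extend it."""
    assert to_bits > 8
    byte = mem[p + offset]
    signed = byte - 0x100 if byte & 0x80 else byte
    return signed & ((1 << to_bits) - 1)  # raw bit pattern of the wider result


mem = bytes([0x7F, 0x80, 0xFF])
print(hex(uload8(mem, 0, 1)))  # 0x80
print(hex(sload8(mem, 0, 1)))  # 0xffffff80
```

The two loads agree whenever the top bit of the loaded byte is clear.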

istore8 Flags, x, p, Offset
Store the low 8 bits of x to memory at p + Offset.
This is equivalent to ireduce.i8 followed by store.i8.
Arguments:
Type Variables:  iExt8 – inferred from x
 iAddr – from input operand
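A truncating store can be modelled in the same spirit. In this Python sketch, memory is a mutable `bytearray` and the address is an index into it; both are assumptions made for illustration, and flags and trapping are again left out.

```python
def istore8(mem: bytearray, x: int, p: int, offset: int = 0) -> None:
    """Model of istore8: write only the low 8 bits of x at p + offset."""
    mem[p + offset] = x & 0xFF  # the upper bits of x are discarded


mem = bytearray(4)
istore8(mem, 0x12345678, 1, 2)
print(mem.hex())  # 00000078
```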

a = uload16 Flags, p, Offset
Load 16 bits from memory at p + Offset and zero-extend.
This is equivalent to load.i16 followed by uextend.
Arguments:
Results:  a (iExt16) – An integer type with more than 16 bits
Type Variables:  iExt16 – explicitly provided
 iAddr – from input operand

a = sload16 Flags, p, Offset
Load 16 bits from memory at p + Offset and sign-extend.
This is equivalent to load.i16 followed by sextend.
Arguments:
Results:  a (iExt16) – An integer type with more than 16 bits
Type Variables:  iExt16 – explicitly provided
 iAddr – from input operand

istore16 Flags, x, p, Offset
Store the low 16 bits of x to memory at p + Offset.
This is equivalent to ireduce.i16 followed by store.i16.
Arguments:
Type Variables:  iExt16 – inferred from x
 iAddr – from input operand

a = uload32 Flags, p, Offset
Load 32 bits from memory at p + Offset and zero-extend.
This is equivalent to load.i32 followed by uextend.
Arguments:
Results:  a (iExt32) – An integer type with more than 32 bits
Type Variables:  iAddr – inferred from p

a = sload32 Flags, p, Offset
Load 32 bits from memory at p + Offset and sign-extend.
This is equivalent to load.i32 followed by sextend.
Arguments:
Results:  a (iExt32) – An integer type with more than 32 bits
Type Variables:  iAddr – inferred from p

istore32 Flags, x, p, Offset
Store the low 32 bits of x to memory at p + Offset.
This is equivalent to ireduce.i32 followed by store.i32.
Arguments:
Type Variables:  iExt32 – inferred from x
 iAddr – from input operand
ISA-specific instructions
Target ISAs can define supplemental instructions that do not make sense to support generally.
Intel
Instructions that can only be used by the Intel target ISA.

q, r = x86_sdivmodx nlo, nhi, d
Extended signed division.
Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as a signed number and divide by the signed denominator d. Trap when d is zero or if the quotient is outside the range of the output.
Return both quotient and remainder.
Arguments:  nlo (iWord) – Low part of numerator
 nhi (iWord) – High part of numerator
 d (iWord) – Denominator
Results:  q (iWord) – Quotient
 r (iWord) – Remainder
Type Variables:  iWord – inferred from nhi

q, r = x86_udivmodx nlo, nhi, d
Extended unsigned division.
Concatenate the bits in nhi and nlo to form the numerator. Interpret the bits as an unsigned number and divide by the unsigned denominator d. Trap when d is zero or if the quotient is larger than the range of the output.
Return both quotient and remainder.
Arguments:  nlo (iWord) – Low part of numerator
 nhi (iWord) – High part of numerator
 d (iWord) – Denominator
Results:  q (iWord) – Quotient
 r (iWord) – Remainder
Type Variables:  iWord – inferred from nhi
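The two extended divisions can be sketched in Python as follows. This is a model under stated assumptions rather than the actual lowering: `Trap` is a hypothetical exception standing in for a Cretonne trap, `bits` stands in for the width of iWord, and the signed variant truncates the quotient toward zero, which is how the x86 IDIV instruction behaves.

```python
class Trap(Exception):
    """Hypothetical stand-in for a Cretonne trap."""


def _to_signed(value: int, bits: int) -> int:
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def x86_udivmodx(nlo: int, nhi: int, d: int, bits: int = 32) -> tuple[int, int]:
    """Model of x86_udivmodx: (nhi:nlo) / d, all unsigned."""
    mask = (1 << bits) - 1
    d &= mask
    if d == 0:
        raise Trap("division by zero")
    numerator = ((nhi & mask) << bits) | (nlo & mask)
    q, r = divmod(numerator, d)
    if q > mask:
        raise Trap("quotient out of range")
    return q, r


def x86_sdivmodx(nlo: int, nhi: int, d: int, bits: int = 32) -> tuple[int, int]:
    """Model of x86_sdivmodx: signed (nhi:nlo) / d, quotient truncated toward zero."""
    mask = (1 << bits) - 1
    divisor = _to_signed(d, bits)
    if divisor == 0:
        raise Trap("division by zero")
    numerator = _to_signed(((nhi & mask) << bits) | (nlo & mask), 2 * bits)
    q = abs(numerator) // abs(divisor)
    if (numerator < 0) != (divisor < 0):
        q = -q
    r = numerator - q * divisor  # remainder takes the sign of the numerator
    if not -(1 << (bits - 1)) <= q < (1 << (bits - 1)):
        raise Trap("quotient out of range")
    return q & mask, r & mask  # raw bit patterns of the results


# 2**32 / 2 fits in 32 bits; -7 / 2 gives quotient -3 and remainder -1.
print(x86_udivmodx(0, 1, 2))                    # (2147483648, 0)
print(x86_sdivmodx(0xFFFFFFF9, 0xFFFFFFFF, 2))  # (4294967293, 4294967295)
```

In this model, dividing a numerator whose high part is at least as large as the unsigned denominator raises `Trap`, because the quotient no longer fits in one word.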

a = x86_cvtt2si x
Convert with truncation floating point to signed integer.
The source floating point operand is converted to a signed integer by rounding towards zero. If the result can’t be represented in the output type, returns the smallest signed value the output type can represent.
This instruction does not trap.
Arguments:  x (Float) – A scalar or vector floating point number
Results:  a (IntTo) – An integer type with the same number of lanes
Type Variables:  IntTo – explicitly provided
 Float – from input operand
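A small Python model may help make the out-of-range behavior concrete. It is a sketch for a single scalar lane: the `bits` parameter stands in for the explicitly provided result type, and treating NaN and infinities as unrepresentable is an assumption based on how the underlying x86 conversion instructions behave.

```python
import math


def x86_cvtt2si(x: float, bits: int = 32) -> int:
    """Model of x86_cvtt2si: truncate toward zero, never trap."""
    smallest = -(1 << (bits - 1))
    largest = (1 << (bits - 1)) - 1
    if math.isnan(x) or math.isinf(x):
        return smallest  # assumption: not representable, same fallback value
    truncated = math.trunc(x)
    return truncated if smallest <= truncated <= largest else smallest


print(x86_cvtt2si(-1.9))          # -1
print(x86_cvtt2si(3e9))           # -2147483648
print(x86_cvtt2si(float("nan")))  # -2147483648
```

Because the fallback value is itself a valid result, a caller that needs to detect overflow has to check the input separately.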

a = x86_fmin x, y
Floating point minimum with Intel semantics.
This is equivalent to the C ternary operator x < y ? x : y, which differs from fmin when either operand is NaN or when comparing +0.0 to -0.0.
When the two operands don’t compare as LT, y is returned unchanged, even if it is a signalling NaN.
Arguments:
Results:  a (Float) – A scalar or vector floating point number
Type Variables:  Float – inferred from x

a = x86_fmax x, y
Floating point maximum with Intel semantics.
This is equivalent to the C ternary operator x > y ? x : y, which differs from fmax when either operand is NaN or when comparing +0.0 to -0.0.
When the two operands don’t compare as GT, y is returned unchanged, even if it is a signalling NaN.
Arguments:
Results:  a (Float) – A scalar or vector floating point number
Type Variables:  Float – inferred from x
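The ternary formulation translates directly into Python, which makes the operand-order dependence easy to see. This is a scalar sketch only; Python floats are binary64, and signalling NaNs are not distinguished from quiet ones here.

```python
import math


def x86_fmin(x: float, y: float) -> float:
    """Model of x86_fmin: x if x < y, otherwise y unchanged."""
    return x if x < y else y


def x86_fmax(x: float, y: float) -> float:
    """Model of x86_fmax: x if x > y, otherwise y unchanged."""
    return x if x > y else y


nan = float("nan")
# Any comparison with NaN is false, so the second operand is always returned.
print(x86_fmin(nan, 1.0))  # 1.0
print(x86_fmin(1.0, nan))  # nan
# +0.0 and -0.0 compare equal, so the result depends on operand order.
print(x86_fmin(0.0, -0.0))  # -0.0
print(x86_fmin(-0.0, 0.0))  # 0.0
```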
Instruction groups
All of the shared instructions are part of the base instruction group.

base.instructions.GROUP
Shared base instruction set
adjust_sp_imm
band
band_imm
band_not
bconst
bextend
bint
bitcast
bmask
bnot
bor
bor_imm
bor_not
br_icmp
br_table
breduce
brff
brif
brnz
brz
bxor
bxor_imm
bxor_not
call
call_indirect
ceil
cls
clz
copy
copy_special
ctz
extractlane
f32const
f64const
fabs
fadd
fallthrough
fcmp
fcopysign
fcvt_from_sint
fcvt_from_uint
fcvt_to_sint
fcvt_to_uint
fdemote
fdiv
ffcmp
fill
floor
fma
fmax
fmin
fmul
fneg
fpromote
fsub
func_addr
global_addr
globalsym_addr
heap_addr
iadd
iadd_carry
iadd_cin
iadd_cout
iadd_imm
icmp
icmp_imm
iconcat
iconst
ifcmp
ifcmp_imm
imul
imul_imm
insertlane
ireduce
irsub_imm
ishl
ishl_imm
isplit
istore16
istore32
istore8
isub
isub_bin
isub_borrow
isub_bout
jump
load
nearest
popcnt
regfill
regmove
regspill
return
rotl
rotl_imm
rotr
rotr_imm
sdiv
sdiv_imm
select
sextend
sload16
sload32
sload8
spill
splat
sqrt
srem
srem_imm
sshr
sshr_imm
stack_addr
stack_load
stack_store
store
trap
trapnz
trapz
trueff
trueif
trunc
udiv
udiv_imm
uextend
uload16
uload32
uload8
urem
urem_imm
ushr
ushr_imm
vconcat
vselect
vsplit
Target ISAs may define further instructions in their own instruction groups:

isa.intel.instructions.GROUP
Intel-specific instruction set
x86_cvtt2si
x86_fmax
x86_fmin
x86_pop
x86_push
x86_sdivmodx
x86_udivmodx
Implementation limits
Cretonne’s intermediate representation imposes some limits on the size of functions and the number of entities allowed. If these limits are exceeded, the implementation will panic.
 Number of instructions in a function
 At most \(2^{31} - 1\).
 Number of EBBs in a function
At most \(2^{31} - 1\).
Every EBB needs at least a terminator instruction anyway.
 Number of secondary values in a function
At most \(2^{31} - 1\).
Secondary values are any SSA values that are not the first result of an instruction.
 Other entities declared in the preamble
At most \(2^{32} - 1\).
This covers things like stack slots, jump tables, external functions, and function signatures.
 Number of arguments to an EBB
 At most \(2^{16}\).
 Number of arguments to a function
At most \(2^{16}\).
This follows from the limit on arguments to the entry EBB. Note that Cretonne may add a handful of ABI register arguments as function signatures are lowered. This is for representing things like the link register, the incoming frame pointer, and callee-saved registers that are saved in the prologue.
 Size of function call arguments on the stack
At most \(2^{32} - 1\) bytes.
This is probably not possible to achieve given the limit on the number of arguments, except by requiring extremely large offsets for stack arguments.
Glossary
 addressable
 Memory in which loads and stores have defined behavior. They either succeed or trap, depending on whether the memory is accessible.
 accessible
 Addressable memory in which loads and stores always succeed without trapping, except where specified otherwise (e.g. with the aligned flag). Heaps, globals, and the stack may contain accessible, merely addressable, and outright unaddressable regions. There may also be additional regions of addressable and/or accessible memory not explicitly declared.
 basic block
 A maximal sequence of instructions that can only be entered from the top, and that contains no branch or terminator instructions except for the last instruction.
 entry block
 The EBB that is executed first in a function. Currently, a Cretonne function must have exactly one entry block which must be the first block in the function. The types of the entry block arguments must match the types of arguments in the function signature.
 extended basic block
 EBB
A maximal sequence of instructions that can only be entered from the top, and that contains no terminator instructions except for the last one. An EBB can contain conditional branches that can fall through to the following instructions in the block, but only the first instruction in the EBB can be a branch target.
The last instruction in an EBB must be a terminator instruction, so execution cannot flow through to the next EBB in the function. (But there may be a branch to the next EBB.)
Note that some textbooks define an EBB as a maximal subtree in the control flow graph where only the root can be a join node. This definition is not equivalent to Cretonne EBBs.
 EBB parameter
 A formal parameter for an EBB is an SSA value that dominates everything in the EBB. For each parameter declared by an EBB, a corresponding argument value must be passed when branching to the EBB. The function’s entry EBB has parameters that correspond to the function’s parameters.
 EBB argument
 Similar to function arguments, EBB arguments must be provided when branching to an EBB that declares formal parameters. When execution begins at the top of an EBB, the formal parameters have the values of the arguments passed in the branch.
 function signature
A function signature describes how to call a function. It consists of:
 The calling convention.
 The number of arguments and return values. (Functions can return multiple values.)
 Type and flags of each argument.
 Type and flags of each return value.
Not all function attributes are part of the signature. For example, a function that never returns could be marked as noreturn, but that is not necessary to know when calling it, so it is just an attribute, and not part of the signature.
 function preamble
A list of declarations of entities that are used by the function body. Some of the entities that can be declared in the preamble are:
 Local variables.
 Functions that are called directly.
 Function signatures for indirect function calls.
 Function flags and attributes that are not part of the signature.
 function body
 The extended basic blocks which contain all the executable code in a function. The function body follows the function preamble.
 intermediate language
 IL
 The language used to describe functions to Cretonne. This reference describes the syntax and semantics of the Cretonne IL. The IL has two forms: a textual form and an in-memory intermediate representation (IR).
 intermediate representation
 IR
 The in-memory representation of IL. The data structures Cretonne uses to represent a program internally are called the intermediate representation. Cretonne’s IR can be converted to text losslessly.
 stack slot
 A fixed-size memory allocation in the current function’s activation frame. Also called a local variable.
 terminator instruction
A control flow instruction that unconditionally directs the flow of execution somewhere else. Execution never continues at the instruction following a terminator instruction.
The basic terminator instructions are jump, return, and trap. Conditional branches and instructions that trap conditionally are not terminator instructions.
 trap
 traps
 trapping
 Terminates execution of the current thread. The specific behavior after a trap depends on the underlying OS. For example, a common behavior is delivery of a signal, with the specific signal depending on the event that triggered it.