Cretonne Meta Language Reference

The Cretonne meta language is used to define instructions for Cretonne. It is a domain specific language embedded in Python. This document describes the Python modules that form the embedded DSL.

The meta language descriptions are Python modules under the lib/cretonne/meta directory. The descriptions are processed in two steps:

  1. The Python modules are imported. This has the effect of building static data structures in global variables in the modules. These static data structures in the base and isa packages use the classes in the cdsl package to describe instruction sets and other properties.
  2. The static data structures are processed to produce Rust source code and constant tables.

The main driver for this source code generation process is the lib/cretonne/meta/build.py script which is invoked as part of the build process if anything in the lib/cretonne/meta directory has changed since the last build.
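As a rough illustration of the two steps, the sketch below builds a static data structure at import time and then turns it into Rust source text. The OPT_LEVELS description and the gen_rust_enum helper are hypothetical stand-ins invented for this example. They are not part of build.py, and the Rust code actually generated for Cretonne looks different.

```python
# Step 1 (stand-in): importing a description module leaves plain data behind.
# OPT_LEVELS is a hypothetical example, not a real Cretonne description.
OPT_LEVELS = {"name": "OptLevel", "values": ["Debug", "Release"]}


def gen_rust_enum(desc):
    """Step 2 (stand-in): turn the static data into Rust source text."""
    lines = ["pub enum {} {{".format(desc["name"])]
    for value in desc["values"]:
        lines.append("    {},".format(value))
    lines.append("}")
    return "\n".join(lines)


print(gen_rust_enum(OPT_LEVELS))
```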

Settings

Settings are used by the environment embedding Cretonne to control the details of code generation. Each setting is defined in the meta language so a compact and consistent Rust representation can be generated. Shared settings are defined in the base.settings module. Some settings are specific to a target ISA, and defined in a settings.py module under the appropriate lib/cretonne/meta/isa/* directory.

Settings can take boolean on/off values, small numbers, or explicitly enumerated symbolic values. Each type is represented by a sub-class of Setting:

Inheritance diagram of Setting, BoolSetting, NumSetting, EnumSetting

class cdsl.settings.Setting(doc)

A named setting variable that can be configured externally to Cretonne.

Settings are normally not named when they are created. They get their name from the extract_names method.

class cdsl.settings.BoolSetting(doc, default=False)

A named setting with a boolean on/off value.

Parameters:
  • doc – Documentation string.
  • default – The default value of this setting.

class cdsl.settings.NumSetting(doc, default=0)

A named setting with an integral value in the range 0–255.

Parameters:
  • doc – Documentation string.
  • default – The default value of this setting.

class cdsl.settings.EnumSetting(doc, *args)

A named setting with an enumerated set of possible values.

The default value is always the first enumerator.

Parameters:
  • doc – Documentation string.
  • args – Tuple of unique strings representing the possible values.

All settings must belong to a group, represented by a SettingGroup object.

class cdsl.settings.SettingGroup(name, parent=None)

A group of settings.

Whenever a Setting object is created, it is added to the currently open group. A setting group must be closed explicitly before another can be opened.

Parameters:
  • name – Short mnemonic name for setting group.
  • parent – Parent settings group.

Normally, a setting group corresponds to all settings defined in a module. Such a module looks like this:

group = SettingGroup('example')

foo = BoolSetting('use the foo')
bar = BoolSetting('enable bars', True)
opt = EnumSetting('optimization level', 'Debug', 'Release')

group.close(globals())
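The interplay between the currently open group and the call to close(globals()) can be sketched with simplified stand-in classes. This is an assumption about the general mechanism, not the cdsl.settings implementation: settings register themselves with the open group when they are created, and closing the group names each setting after the module-level variable that refers to it.

```python
class SettingGroup:
    """Simplified stand-in for cdsl.settings.SettingGroup."""

    _current = None  # the currently open group, if any

    def __init__(self, name):
        assert SettingGroup._current is None, "close the open group first"
        self.name = name
        self.settings = []
        SettingGroup._current = self

    def close(self, module_globals):
        # Name each setting after the variable that refers to it.
        for var_name, obj in list(module_globals.items()):
            if isinstance(obj, Setting) and obj in self.settings:
                obj.name = var_name
        SettingGroup._current = None


class Setting:
    """Simplified stand-in: created unnamed, registers with the open group."""

    def __init__(self, doc):
        assert SettingGroup._current is not None, "no open setting group"
        self.doc = doc
        self.name = None
        SettingGroup._current.settings.append(self)


group = SettingGroup("example")
foo = Setting("use the foo")
bar = Setting("enable bars")
group.close(globals())

print([s.name for s in group.settings])
```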

Instruction descriptions

New instructions are defined as instances of the Instruction class. As instruction instances are created, they are added to the currently open InstructionGroup.

class cdsl.instructions.InstructionGroup(name, doc)

Every instruction must belong to exactly one instruction group. A given target architecture can support instructions from multiple groups, and it does not necessarily support all instructions in a group.

New instructions are automatically added to the currently open instruction group.

close()

Close this instruction group. This function should be called before opening another instruction group.

open()

Open this instruction group such that future new instructions are added to this group.

The basic Cretonne instruction set described in Cretonne Language Reference is defined by the Python module base.instructions. This module has a global variable base.instructions.GROUP which is an InstructionGroup instance containing all the base instructions.

class cdsl.instructions.Instruction(name, doc, ins=(), outs=(), constraints=(), **kwargs)

The operands to the instruction are specified as two tuples: ins and outs. Since the Python singleton tuple syntax is a bit awkward, it is allowed to specify a singleton as just the operand itself, i.e., ins=x and ins=(x,) are both allowed and mean the same thing.

Parameters:
  • name – Instruction mnemonic, also becomes opcode name.
  • doc – Documentation string.
  • ins – Tuple of input operands. This can be a mix of SSA value operands and other operand kinds.
  • outs – Tuple of output operands. The output operands must be SSA values or variable_args.
  • constraints – Tuple of instruction-specific TypeConstraints.
  • is_terminator – This is a terminator instruction.
  • is_branch – This is a branch instruction.
  • is_call – This is a call instruction.
  • is_return – This is a return instruction.
  • can_trap – This instruction can trap.
  • can_load – This instruction can load from memory.
  • can_store – This instruction can store to memory.
  • other_side_effects – Instruction has other side effects.
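The singleton shorthand for ins and outs could be handled by a small normalization step like the one below. The helper name to_tuple is hypothetical, and plain strings stand in for Operand instances.

```python
def to_tuple(operands):
    """Normalize `x`, `(x,)` or `(x, y)` to a tuple of operands."""
    if isinstance(operands, tuple):
        return operands
    return (operands,)


print(to_tuple("x"))         # a bare operand
print(to_tuple(("x", "y")))  # already a tuple
```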

An instruction is defined with a set of distinct input and output operands which must be instances of the Operand class.

class cdsl.operands.Operand(name, typ, doc='')

An instruction operand can be an immediate, an SSA value, or an entity reference. The type of the operand is one of:

  1. A ValueType instance indicates an SSA value operand with a concrete type.
  2. A TypeVar instance indicates an SSA value operand, and the instruction is polymorphic over the possible concrete types that the type variable can assume.
  3. An ImmediateKind instance indicates an immediate operand whose value is encoded in the instruction itself rather than being passed as an SSA value.
  4. An EntityRefKind instance indicates an operand that references another entity in the function, typically something declared in the function preamble.

Cretonne uses two separate type systems for operand kinds and SSA values.

Type variables

Instruction descriptions can be made polymorphic by using cdsl.operands.Operand instances that refer to a type variable instead of a concrete value type. Polymorphism only works for SSA value operands. Other operands have a fixed operand kind.

class cdsl.typevar.TypeVar(name, doc, ints=False, floats=False, bools=False, scalars=True, simd=False, bitvecs=False, base=None, derived_func=None, specials=None)

Type variables can be used in place of concrete types when defining instructions. This makes the instructions polymorphic.

A type variable is restricted to vary over a subset of the value types. This subset is specified by a set of flags that control the permitted base types and whether the type variable can assume scalar or vector types, or both.

Parameters:
  • name – Short name of type variable used in instruction descriptions.
  • doc – Documentation string.
  • ints – Allow all integer base types, or (min, max) bit-range.
  • floats – Allow all floating point base types, or (min, max) bit-range.
  • bools – Allow all boolean base types, or (min, max) bit-range.
  • scalars – Allow type variable to assume scalar types.
  • simd – Allow type variable to assume vector types, or (min, max) lane count range.
  • bitvecs – Allow all BitVec base types, or (min, max) bit-range.

as_bool()

Return a derived type variable that has the same vector geometry as this type variable, but with boolean lanes. Scalar types map to b1.

constrain_types(other)

Constrain the range of types this variable can assume to a subset of those other can assume.

constrain_types_by_ts(ts)

Constrain the range of types this variable can assume to a subset of those in the typeset ts.

static derived(base, derived_func)

Create a type variable that is a function of another.

double_vector()

Return a derived type variable that has twice the number of vector lanes as this one, with the same lane type.

double_width()

Return a derived type variable that has the same number of vector lanes as this one, but the lanes are double the width.

free_typevar()

Get the free type variable controlling this one.

static from_typeset(ts)

Create a type variable from a type set.

get_fresh_copy(name)

Get a fresh copy of self. Can only be called on free typevars.

get_typeset()

Returns the typeset for this TV. If the TV is derived, computes it recursively from the derived function and the base’s typeset.

half_vector()

Return a derived type variable that has half the number of vector lanes as this one, with the same lane type.

half_width()

Return a derived type variable that has the same number of vector lanes as this one, but the lanes are half the width.

lane_of()

Return a derived type variable that is the scalar lane type of this type variable.

When this type variable assumes a scalar type, the derived type will be the same scalar type.

rust_expr()

Get a Rust expression that computes the type of this type variable.

static singleton(typ)

Create a type variable that can only assume a single type.

singleton_type()

If the associated typeset has a single type, return it. Otherwise return None.

to_bitvec()

Return a derived type variable that represents a flat bitvector with the same size as self.
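To make the derived type variable functions above more concrete, the following sketch applies the same transformations to individual concrete types. The Ty tuple and the free functions are a hypothetical model invented for this example. The real methods operate on TypeVar instances and return derived type variables, not concrete types.

```python
from collections import namedtuple

# Hypothetical model of a concrete type: lane kind, lane bits, lane count.
Ty = namedtuple("Ty", ["kind", "bits", "lanes"])


def name(ty):
    base = "{}{}".format(ty.kind, ty.bits)
    return base if ty.lanes == 1 else "{}x{}".format(base, ty.lanes)


def lane_of(ty):
    return ty._replace(lanes=1)


def as_bool(ty):
    # Same geometry with boolean lanes; scalar types map to b1.
    if ty.lanes == 1:
        return Ty("b", 1, 1)
    return Ty("b", ty.bits, ty.lanes)


def double_width(ty):
    return ty._replace(bits=ty.bits * 2)


def half_width(ty):
    return ty._replace(bits=ty.bits // 2)


def double_vector(ty):
    return ty._replace(lanes=ty.lanes * 2)


def half_vector(ty):
    return ty._replace(lanes=ty.lanes // 2)


i32x4 = Ty("i", 32, 4)
for func in (lane_of, as_bool, double_width, half_width,
             double_vector, half_vector):
    print(func.__name__, name(func(i32x4)))
```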

If multiple operands refer to the same type variable they will be required to have the same concrete type. For example, this defines an integer addition instruction:

Int = TypeVar('Int', 'A scalar or vector integer type', ints=True, simd=True)
a = Operand('a', Int)
x = Operand('x', Int)
y = Operand('y', Int)

iadd = Instruction('iadd', 'Integer addition', ins=(x, y), outs=a)

The type variable Int is allowed to vary over all scalar and vector integer value types, but in a given instance of the iadd instruction, the two operands must have the same type, and the result will be the same type as the inputs.
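The "same type variable, same concrete type" rule can be illustrated with a small consistency check. This is only a sketch: check_typevar_binding is a hypothetical helper, and operands, type variables and concrete types are all represented by plain strings.

```python
def check_typevar_binding(operand_typevars, concrete_types):
    """Return the concrete type bound to each type variable.

    operand_typevars maps operand name -> type variable name.
    concrete_types maps operand name -> concrete type name.
    Raises TypeError if a type variable is bound to two different types.
    """
    bound = {}
    for operand, tv in operand_typevars.items():
        ty = concrete_types[operand]
        if bound.setdefault(tv, ty) != ty:
            raise TypeError(
                "{} is bound to both {} and {}".format(tv, bound[tv], ty))
    return bound


# The iadd operands from the example above all use the Int type variable.
IADD = {"x": "Int", "y": "Int", "a": "Int"}
print(check_typevar_binding(IADD, {"x": "i32", "y": "i32", "a": "i32"}))
```

Passing different concrete types for x and y, for example i32 and i64, makes the helper raise TypeError.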

There are some practical restrictions on the use of type variables, see Restricted polymorphism.

Immediate operands

Immediate instruction operands don’t correspond to SSA values, but have values that are encoded directly in the instruction. Immediate operands don’t have types from the cdsl.types.ValueType type system; they often have enumerated values of a specific type. The type of an immediate operand is indicated with an instance of ImmediateKind.

class cdsl.operands.ImmediateKind(name, doc, default_member='imm', rust_type=None, values=None)

The kind of an immediate instruction operand.

Parameters:
  • default_member – The default member name of this kind in the InstructionData data structure.

The base.immediates module predefines all the Cretonne immediate operand types.

base.immediates.boolean = ImmediateKind(bool)

An immediate boolean operand.

This type of immediate boolean can interact with SSA values with any cdsl.types.BoolType type.

base.immediates.floatcc = ImmediateKind(floatcc)

A condition code for comparing floating point values.

This enumerated operand kind is used for the fcmp instruction and corresponds to the condcodes::FloatCC Rust type.

base.immediates.ieee32 = ImmediateKind(ieee32)

A 32-bit immediate floating point operand.

IEEE 754-2008 binary32 interchange format.

base.immediates.ieee64 = ImmediateKind(ieee64)

A 64-bit immediate floating point operand.

IEEE 754-2008 binary64 interchange format.

base.immediates.imm64 = ImmediateKind(imm64)

A 64-bit immediate integer operand.

This type of immediate integer can interact with SSA values with any cdsl.types.IntType type.

base.immediates.intcc = ImmediateKind(intcc)

A condition code for comparing integer values.

This enumerated operand kind is used for the icmp instruction and corresponds to the condcodes::IntCC Rust type.

base.immediates.memflags = ImmediateKind(memflags)

Flags for memory operations like load and store.

base.immediates.offset32 = ImmediateKind(offset32)

A 32-bit immediate signed offset.

This is used to represent an immediate address offset in load/store instructions.

base.immediates.regunit = ImmediateKind(regunit)

A register unit in the current target ISA.

base.immediates.trapcode = ImmediateKind(trapcode)

A trap code indicating the reason for trapping.

The Rust enum type also has a User(u16) variant for user-provided trap codes.

base.immediates.uimm32 = ImmediateKind(uimm32)

An unsigned 32-bit immediate integer operand.

base.immediates.uimm8 = ImmediateKind(uimm8)

An unsigned 8-bit immediate integer operand.

This small operand is used to indicate lane indexes in SIMD vectors and immediate bit counts on shift instructions.

Entity references

Instruction operands can also refer to other entities in the same function. These can be extended basic blocks, or entities declared in the function preamble.

class cdsl.operands.EntityRefKind(name, doc, default_member=None, rust_type=None)

The kind of an entity reference instruction operand.

The base.entities module predefines all the Cretonne entity reference operand types. There are corresponding definitions in the cretonne.entities Rust module.

base.entities.ebb = EntityRefKind(ebb)

A reference to an extended basic block in the same function. This is primarily used in control flow instructions.

base.entities.func_ref = EntityRefKind(func_ref)

A reference to an external function declared in the function preamble. This is used to provide the callee and signature in a call instruction.

base.entities.global_var = EntityRefKind(global_var)

A reference to a global variable.

base.entities.heap = EntityRefKind(heap)

A reference to a heap declared in the function preamble.

base.entities.jump_table = EntityRefKind(jump_table)

A reference to a jump table declared in the function preamble.

base.entities.sig_ref = EntityRefKind(sig_ref)

A reference to a function signature declared in the function preamble. This is used to provide the call signature in an indirect call instruction.

base.entities.stack_slot = EntityRefKind(stack_slot)

A reference to a stack slot declared in the function preamble.

Value types

Concrete value types are represented as instances of ValueType. There are subclasses to represent scalar and vector types.

class cdsl.types.ValueType(name, membytes, doc)

A concrete SSA value type.

All SSA values have a type that is described by an instance of ValueType or one of its subclasses.

Inheritance diagram of ValueType, LaneType, VectorType, IntType, FloatType, BoolType, SpecialType, FlagsType

class cdsl.types.LaneType(name, membytes, doc)

A concrete scalar type that can appear as a vector lane too.

Also tracks a unique set of VectorType instances with this type as the lane type.

by(lanes)

Get a vector type with this type as the lane type.

For example, i32.by(4) returns the i32x4 type.

lane_count()

Return the number of lanes.

class cdsl.types.VectorType(base, lanes)

A concrete SIMD vector type.

A vector type has a lane type which is an instance of LaneType, and a positive number of lanes.

class cdsl.types.SpecialType(name, membytes, doc)

A concrete scalar type that is neither a vector nor a lane type.

Special types cannot be used to form vectors.

class cdsl.types.IntType(bits)

A concrete scalar integer type.

class cdsl.types.FloatType(bits, doc)

A concrete scalar floating point type.

class cdsl.types.BoolType(bits)

A concrete scalar boolean type.

class cdsl.types.FlagsType(name, doc)

A type representing CPU flags.

Flags can’t be stored in memory.

The base.types module predefines all the Cretonne scalar types.

base.types.b1 = BoolType(bits=1)

1-bit bool. This type is abstract (it can’t be stored in memory).

base.types.b16 = BoolType(bits=16)

16-bit bool.

base.types.b32 = BoolType(bits=32)

32-bit bool.

base.types.b64 = BoolType(bits=64)

64-bit bool.

base.types.b8 = BoolType(bits=8)

8-bit bool.

base.types.f32 = FloatType(bits=32)

IEEE single precision.

base.types.f64 = FloatType(bits=64)

IEEE double precision.

base.types.fflags = FlagsType(fflags)

CPU flags from a floating point comparison.

base.types.i16 = IntType(bits=16)

16-bit int.

base.types.i32 = IntType(bits=32)

32-bit int.

base.types.i64 = IntType(bits=64)

64-bit int.

base.types.i8 = IntType(bits=8)

8-bit int.

base.types.iflags = FlagsType(iflags)

CPU flags from an integer comparison.

There are no predefined vector types, but they can be created as needed with the LaneType.by() function.
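A minimal sketch of how by() might create vector types on demand while keeping them unique per lane type is shown below. The classes are simplified stand-ins that only track names and lane counts. The caching dictionary is an assumption about how the unique set of VectorType instances could be kept.

```python
class LaneType:
    """Stand-in for a scalar lane type that caches its vector types."""

    def __init__(self, name):
        self.name = name
        self._vectors = {}  # lane count -> VectorType

    def by(self, lanes):
        # Create each vector type at most once.
        if lanes not in self._vectors:
            self._vectors[lanes] = VectorType(self, lanes)
        return self._vectors[lanes]


class VectorType:
    """Stand-in for a SIMD vector type with a positive number of lanes."""

    def __init__(self, base, lanes):
        assert lanes > 0
        self.base = base
        self.lanes = lanes
        self.name = "{}x{}".format(base.name, lanes)


i32 = LaneType("i32")
i32x4 = i32.by(4)
print(i32x4.name, i32.by(4) is i32x4)
```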

Instruction representation

The Rust in-memory representation of instructions is derived from the instruction descriptions. Part of the representation is generated, and part is written as Rust code in the cretonne.instructions module. The instruction representation depends on the input operand kinds and whether the instruction can produce multiple results.

class cdsl.operands.OperandKind(name, doc, default_member=None, rust_type=None)

An instance of the OperandKind class corresponds to a kind of operand. Each operand kind has a corresponding type in the Rust representation of an instruction.

Inheritance diagram of OperandKind, ImmediateKind, EntityRefKind

Since all SSA value operands are represented as a Value in Rust code, value types don’t affect the representation. Two special operand kinds are used to represent SSA values:

cdsl.operands.VALUE = OperandKind(value)

An SSA value operand. This is a value defined by another instruction.

cdsl.operands.VARIABLE_ARGS = OperandKind(variable_args)

A variable-sized list of value operands. Used for EBB and function call arguments.

When an instruction description is created, it is automatically assigned a predefined instruction format which is an instance of InstructionFormat:

class cdsl.formats.InstructionFormat(*kinds, **kwargs)

Every instruction opcode has a corresponding instruction format which determines the number of operands and their kinds. Instruction formats are identified structurally, i.e., the format of an instruction is derived from the kinds of operands used in its declaration.

The instruction format stores two separate lists of operands: immediates and values. Immediate operands (including entity references) are represented as explicit members in the InstructionData variants. The value operands are stored differently, depending on how many there are. Beyond a certain point, instruction formats switch to an external value list for storing value arguments. Value lists can hold an arbitrary number of values.

All instruction formats must be predefined in the cretonne.formats module.

Parameters:
  • kinds – List of OperandKind objects describing the operands.
  • name – Instruction format name in CamelCase. This is used as a Rust variant name in both the InstructionData and InstructionFormat enums.
  • typevar_operand – Index of the value input operand that is used to infer the controlling type variable. By default, this is 0, the first value operand. The index is relative to the values only, ignoring immediate operands.
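Structural identification means that a format is looked up by the sequence of operand kinds alone. The sketch below shows the idea with a dictionary keyed by tuples of kind names. The registry and the helper functions are hypothetical, and the format names are only illustrative.

```python
FORMATS = {}  # tuple of operand kind names -> format name


def define_format(name, *kinds):
    """Predefine a format; each operand kind signature may appear once."""
    assert kinds not in FORMATS, "duplicate format signature"
    FORMATS[kinds] = name


def lookup_format(*kinds):
    """Find the format for an instruction from its operand kinds."""
    return FORMATS[kinds]


define_format("Unary", "value")
define_format("Binary", "value", "value")
define_format("BinaryImm", "value", "imm64")

# An instruction taking one SSA value and one imm64 gets the matching format.
print(lookup_format("value", "imm64"))
```

An instruction whose operand kinds match no predefined format fails the lookup in this sketch, which mirrors the requirement that all formats are predefined.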

Restricted polymorphism

The instruction format strictly controls the kinds of operands on an instruction, but it does not constrain value types at all. A given instruction description typically does constrain the allowed value types for its value operands. The type variables give a lot of freedom in describing the value type constraints, in practice more freedom than what is needed for normal instruction set architectures. In order to simplify the Rust representation of value type constraints, some restrictions are imposed on the use of type variables.

A polymorphic instruction has a single controlling type variable. For a given opcode, this type variable must be the type of the first result or the type of the input value operand designated by the typevar_operand argument to the InstructionFormat constructor. By default, this is the first value operand, which works most of the time.

The value types of instruction results must be one of the following:

  1. A concrete value type.
  2. The controlling type variable.
  3. A type variable derived from the controlling type variable.

This means that all result types can be computed from the controlling type variable.

Input values to the instruction are allowed a bit more freedom. Input value types must be one of:

  1. A concrete value type.
  2. The controlling type variable.
  3. A type variable derived from the controlling type variable.
  4. A free type variable that is not used by any other operands.

This means that the type of an input operand can either be computed from the controlling type variable, or it can vary independently of the other operands.
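The rules above can be expressed as a small checker. This is a sketch under simplifying assumptions: TV is a hypothetical stand-in for a type variable, concrete types are plain strings, and check_restrictions is not part of the meta language.

```python
class TV:
    """Hypothetical stand-in for a type variable; `base` is set if derived."""

    def __init__(self, name, base=None):
        self.name = name
        self.base = base

    def free_typevar(self):
        return self if self.base is None else self.base.free_typevar()


def check_restrictions(ctrl, ins, outs):
    """Return a list of violations; operand types are str or TV."""
    errors = []
    for ty in outs:
        # Results: concrete, the controlling variable, or derived from it.
        if isinstance(ty, TV) and ty.free_typevar() is not ctrl:
            errors.append("result type {} is not derived from {}".format(
                ty.name, ctrl.name))
    seen = []
    for ty in ins:
        if not isinstance(ty, TV) or ty.free_typevar() is ctrl:
            continue
        # Any other input type variable must be free and used only once.
        if ty.base is not None:
            errors.append("input type {} is derived from a "
                          "non-controlling variable".format(ty.name))
        elif any(ty is other for other in seen):
            errors.append("free type variable {} is used by more than "
                          "one operand".format(ty.name))
        seen.append(ty)
    return errors


Int = TV("Int")
Bool = TV("Bool", base=Int)   # derived from the controlling variable
Other = TV("Other")           # an independent free type variable

print(check_restrictions(Int, [Int, Other], [Bool]))
print(check_restrictions(Int, [Int], [Other]))
```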

Encodings

Encodings describe how Cretonne instructions are mapped to binary machine code for the target architecture. After the legalization pass, all remaining instructions are expected to map 1-1 to native instruction encodings. Cretonne instructions that can’t be encoded for the current architecture are called illegal instructions.

Some instruction set architectures have different CPU modes with incompatible encodings. For example, a modern ARMv8 CPU might support three different CPU modes: A64 where instructions are encoded in 32 bits, A32 where all instructions are 32 bits, and T32 which has a mix of 16-bit and 32-bit instruction encodings. These are incompatible encoding spaces, and while an iadd instruction can be encoded in 32 bits in each of them, it’s not the same 32 bits. It’s a judgement call if CPU modes should be modelled as separate targets, or as sub-modes of the same target. In the ARMv8 case, the different register banks mean that it makes sense to model A64 as a separate target architecture, while A32 and T32 are CPU modes of the 32-bit ARM target.

In a given CPU mode, there may be multiple valid encodings of the same instruction. Both RISC-V and ARMv8’s T32 mode have 32-bit encodings of all instructions with 16-bit encodings available for some opcodes if certain constraints are satisfied.

class cdsl.isa.CPUMode(name, isa)

A CPU mode determines which instruction encodings are active.

All instruction encodings are associated with exactly one CPUMode, and all CPU modes are associated with exactly one TargetISA.

Parameters:
  • name – Short mnemonic name for the CPU mode.
  • isa – Associated TargetISA.

Encodings are guarded by sub-target predicates. For example, the RISC-V “C” extension which specifies the compressed encodings may not be supported, and a predicate would be used to disable all of the 16-bit encodings in that case. This can also affect whether an instruction is legal. For example, x86 has a predicate that controls the SSE 4.1 instruction encodings. When that predicate is false, the SSE 4.1 instructions are not available.

Encodings also have an instruction predicate which depends on the specific values of the instruction’s immediate fields. This is used to ensure that immediate address offsets are within range, for example. The instructions in the base Cretonne instruction set can often represent a wider range of immediates than any specific encoding. The fixed-size RISC-style encodings tend to have more range limitations than CISC-style variable length encodings like x86.
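A typical instruction predicate of this kind is a range check on an immediate. The helper below is a hypothetical sketch of such a check for a signed field. The 12-bit width used in the example is chosen for illustration only.

```python
def is_signed_int(value, bits):
    """True if `value` fits in a two's complement field of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


# A 12-bit signed field covers -2048 to 2047.
print(is_signed_int(2047, 12), is_signed_int(2048, 12))
```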

The diagram below shows the relationship between the classes involved in specifying instruction encodings:

digraph encoding {
    node [shape=record]
    EncRecipe -> SubtargetPred
    EncRecipe -> InstrFormat
    EncRecipe -> InstrPred
    Encoding [label="{Encoding|Opcode+TypeVars}"]
    Encoding -> EncRecipe [label="+EncBits"]
    Encoding -> CPUMode
    Encoding -> SubtargetPred
    Encoding -> InstrPred
    Encoding -> Opcode
    Opcode -> InstrFormat
    CPUMode -> Target
}

An Encoding instance specifies the encoding of a concrete instruction. The following properties are used to select instructions to be encoded:

  • An opcode, e.g. iadd_imm, that must match the instruction’s opcode.
  • Values for any type variables if the opcode represents a polymorphic instruction.
  • An instruction predicate that must be satisfied by the instruction’s immediate operands.
  • The CPU mode that must be active.
  • A sub-target predicate that must be satisfied by the currently active sub-target.

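An encoding specifies an encoding recipe along with some encoding bits that the recipe can use for native opcode fields etc. The encoding recipe has additional constraints that must be satisfied:

  • An InstructionFormat that must match the format required by the opcodes of any encodings that use this recipe.
  • An additional instruction predicate.
  • An additional sub-target predicate.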

The additional predicates in the EncRecipe are merged with the per-encoding predicates when generating the encoding matcher code. Often encodings only need the recipe predicates.
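The selection properties and the merging of recipe predicates with per-encoding predicates can be sketched as a single matching function. Everything here is a hypothetical model: encodings, recipes and instructions are plain dictionaries, predicates are Python callables, and the "RV32" mode name and 12-bit range are illustrative. The generated Rust matcher is organized differently.

```python
def encoding_matches(enc, inst, cpu_mode, subtarget):
    """Check one candidate encoding against an instruction.

    Predicates are callables, or absent when there is no constraint.
    """
    recipe = enc["recipe"]
    if enc["opcode"] != inst["opcode"]:
        return False
    # A polymorphic opcode also has to match the type variable value.
    if enc.get("ctrl_type") not in (None, inst.get("ctrl_type")):
        return False
    if enc["cpu_mode"] != cpu_mode:
        return False
    # Recipe predicates and per-encoding predicates are checked together.
    isa_preds = [recipe.get("isap"), enc.get("isap")]
    inst_preds = [recipe.get("instp"), enc.get("instp")]
    return (all(p(subtarget) for p in isa_preds if p is not None)
            and all(p(inst) for p in inst_preds if p is not None))


# Hypothetical recipe: the immediate must fit in 12 signed bits.
recipe_i = {"instp": lambda inst: -2048 <= inst["imm"] <= 2047}
enc = {"opcode": "iadd_imm", "ctrl_type": "i32", "cpu_mode": "RV32",
       "recipe": recipe_i}
inst = {"opcode": "iadd_imm", "ctrl_type": "i32", "imm": 100}

print(encoding_matches(enc, inst, "RV32", {}))
```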

class cdsl.isa.EncRecipe(name, format, size, ins, outs, branch_range=None, clobbers_flags=True, instp=None, isap=None, emit=None)

A recipe for encoding instructions with a given format.

Many different instructions can be encoded by the same recipe, but they must all have the same instruction format.

The ins and outs arguments are tuples specifying the register allocation constraints for the value operands and results respectively. The possible constraints for an operand are:

  • A RegClass specifying the set of allowed registers.
  • A Register specifying a fixed-register operand.
  • An integer indicating that this result is tied to a value operand, so they must use the same register.
  • A Stack specifying a value in a stack slot.

The branch_range argument must be provided for recipes that can encode branch instructions. It is an (origin, bits) tuple describing the exact range that can be encoded in a branch instruction.

For ISAs that use CPU flags in iflags and fflags value types, the clobbers_flags argument is used to indicate instruction encodings that clobber the CPU flags, so they can’t be used where a flag value is live.

Parameters:
  • name – Short mnemonic name for this recipe.
  • format – All encoded instructions must have this InstructionFormat.
  • size – Number of bytes in the binary encoded instruction.
  • ins – Tuple of register constraints for value operands.
  • outs – Tuple of register constraints for results.
  • branch_range – (origin, bits) range for branches.
  • clobbers_flags – This instruction clobbers iflags and fflags.
  • instp – Instruction predicate.
  • isap – ISA predicate.
  • emit – Rust code for binary emission.

Register constraints

After an encoding recipe has been chosen for an instruction, it is the register allocator’s job to make sure that the recipe’s register constraints are satisfied. Most ISAs have separate integer and floating point registers, and instructions can usually only use registers from one of the banks. Some instruction encodings are even more constrained and can only use a subset of the registers in a bank. These constraints are expressed in terms of register classes.

Sometimes the result of an instruction is placed in a register that must be the same as one of the input registers. Some instructions even use a fixed register for inputs or results.

Each encoding recipe specifies separate constraints for its value operands and results. These constraints are separate from the instruction predicate which can only evaluate the instruction’s immediate operands.

class cdsl.registers.RegBank(name, isa, doc, units, pressure_tracking=True, prefix='r', names=())

A register bank belonging to an ISA.

A register bank controls a set of register units disjoint from all the other register banks in the ISA. The register units are numbered uniquely within the target ISA, and the units in a register bank form a contiguous sequence starting from a sufficiently aligned point that their low bits can be used directly when encoding machine code instructions.

Register units can be given generated names like r0, r1, ..., or a tuple of special register unit names can be provided.

Parameters:
  • name – Name of this register bank.
  • doc – Documentation string.
  • units – Number of register units.
  • pressure_tracking – Enable tracking of register pressure.
  • prefix – Prefix for generated unit names.
  • names – Special names for the first units. May be shorter than units, the remaining units are named using prefix.
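The unit numbering and naming rules can be sketched as follows. Two details are assumptions made for this example and are not stated above: each bank is aligned to the next power of two at or above its size, and generated names are numbered by the unit's position within its bank. The helper functions are hypothetical.

```python
def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p


def layout_banks(bank_sizes):
    """Assign a first unit number to each bank, in order.

    ASSUMPTION: a bank is aligned to the next power of two >= its size,
    so the low bits of a unit number identify the unit within the bank.
    """
    first_units = []
    next_unit = 0
    for units in bank_sizes:
        align = next_power_of_two(units)
        first = (next_unit + align - 1) // align * align
        first_units.append(first)
        next_unit = first + units
    return first_units


def unit_names(units, prefix="r", names=()):
    """Special names come first; the rest are generated from the prefix.

    ASSUMPTION: generated names use the unit's index within the bank.
    """
    return [names[i] if i < len(names) else "{}{}".format(prefix, i)
            for i in range(units)]


print(layout_banks([16, 32]))
print(unit_names(4, prefix="x", names=("zero", "ra")))
```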

Register class constraints

The most common type of register constraint is the register class. It specifies that an operand or result must be allocated one of the registers from the given register class:

IntRegs = RegBank('IntRegs', ISA, 'General purpose registers', units=16, prefix='r')
GPR = RegClass(IntRegs)
R = EncRecipe('R', Binary, ins=(GPR, GPR), outs=GPR)

This defines an encoding recipe for the Binary instruction format where both input operands must be allocated from the GPR register class.

class cdsl.registers.RegClass(bank, count=None, width=1, start=0)

A register class is a subset of register units in a RegBank along with a strategy for allocating registers.

The width parameter determines how many register units are allocated at a time. Usually that is one, but for example the ARM D registers are allocated two units at a time. When multiple units are allocated, it is always a contiguous set of unit numbers.

Parameters:
  • bank – The register bank we’re allocating from.
  • count – The maximum number of allocations in this register class. By default, the whole register bank can be allocated.
  • width – How many units to allocate at a time.
  • start – The first unit to allocate, relative to bank.first.unit.
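The effect of count, width and start can be illustrated by listing the groups of units a register class can hand out. This is a sketch with a hypothetical helper. In particular, the default for count used here (as many width-sized groups as fit in the bank after start) is an assumption.

```python
def class_allocations(bank_first_unit, bank_units,
                      count=None, width=1, start=0):
    """List the unit tuples a register class can allocate.

    ASSUMPTION: when count is None, the class covers as many width-sized
    groups as fit in the bank after `start`.
    """
    if count is None:
        count = (bank_units - start) // width
    assert start + count * width <= bank_units, "class exceeds its bank"
    first = bank_first_unit + start
    return [tuple(range(first + i * width, first + (i + 1) * width))
            for i in range(count)]


# A bank of 8 units starting at unit 32, allocated two units at a time.
print(class_allocations(32, 8, width=2))
```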

Tied register operands

In more compact machine code encodings, it is common to require that the result register is the same as one of the inputs. This is represented with tied operands:

CR = EncRecipe('CR', Binary, ins=(GPR, GPR), outs=0)

This indicates that the result value must be allocated to the same register as the first input value. Tied operand constraints can only be used for result values, so the number always refers to one of the input values.

Fixed register operands

Some instructions use hard-coded input and output registers for some value operands. An example is the pblendvb Intel SSE instruction which takes one of its three value operands in the hard-coded %xmm0 register:

XMM0 = FPR[0]
SSE66_XMM0 = EncRecipe('SSE66_XMM0', Ternary, ins=(FPR, FPR, XMM0), outs=0)

The syntax FPR[0] selects the first register from the FPR register class which consists of all the XMM registers.

Stack operands

Cretonne’s register allocator can assign an SSA value to a stack slot if there aren’t enough registers. It will insert spill and fill instructions as needed to satisfy instruction operand constraints, but it is also possible to have instructions that can access stack slots directly:

CSS = EncRecipe('CSS', Unary, ins=GPR, outs=Stack(GPR))

An output stack value implies a store to the stack, an input value implies a load.
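The four kinds of operand constraints (register class, fixed register, tied operand, stack) can be summarized with a small checker that tests a proposed assignment of locations against a recipe's ins and outs tuples. The representation is hypothetical and much simpler than the real one: a register class is a set of register names, a fixed register is a name, a tied operand is an input index, and STACK marks a stack slot without recording the register class that Stack() wraps.

```python
STACK = "<stack slot>"  # hypothetical marker for a value living on the stack


def satisfies(constraint, loc, in_locs):
    if isinstance(constraint, int):               # tied to an input operand
        return loc == in_locs[constraint]
    if isinstance(constraint, (set, frozenset)):  # register class
        return loc in constraint
    return loc == constraint                      # fixed register or STACK


def check_recipe(ins, outs, in_locs, out_locs):
    """True if the assigned locations satisfy all operand constraints."""
    return (len(ins) == len(in_locs) and len(outs) == len(out_locs)
            and all(satisfies(c, l, in_locs) for c, l in zip(ins, in_locs))
            and all(satisfies(c, l, in_locs) for c, l in zip(outs, out_locs)))


GPR = frozenset("r{}".format(i) for i in range(16))

# Counterpart of the CR recipe: two GPR inputs, result tied to input 0.
print(check_recipe((GPR, GPR), (0,), ("r1", "r2"), ("r1",)))
# Counterpart of the CSS recipe: a GPR input stored to a stack slot.
print(check_recipe((GPR,), (STACK,), ("r1",), (STACK,)))
```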

Targets

Cretonne can be compiled with support for multiple target instruction set architectures. Each ISA is represented by a cdsl.isa.TargetISA instance.

class cdsl.isa.TargetISA(name, instruction_groups)

A target instruction set architecture.

The TargetISA class collects everything known about a target ISA.

Parameters:
  • name – Short mnemonic name for the ISA.
  • instruction_groups – List of InstructionGroup instances that are relevant for this ISA.

The definitions for each supported target live in a package under lib/cretonne/meta/isa.

Cretonne target ISA definitions

The isa package contains sub-packages for each target instruction set architecture supported by Cretonne.

isa.all_isas()

Get a list of all the supported target ISAs. Each target ISA is represented as a cdsl.isa.TargetISA instance.

RISC-V Target

RISC-V is an open instruction set architecture originally developed at UC Berkeley. It is a RISC-style ISA with either a 32-bit (RV32I) or 64-bit (RV64I) base instruction set and a number of optional extensions:

RV32M / RV64M
Integer multiplication and division.
RV32A / RV64A
Atomics.
RV32F / RV64F
Single-precision IEEE floating point.
RV32D / RV64D
Double-precision IEEE floating point.
RV32G / RV64G
General purpose instruction sets. This represents the union of the I, M, A, F, and D instruction sets listed above.

Intel Target Architecture

This target ISA generates code for Intel CPUs with two separate CPU modes:

I32
IA-32 architecture, also known as ‘x86’. Generates code for the Intel 386 and later processors in 32-bit mode.
I64
Intel 64 architecture, also known as ‘x86-64’, ‘x64’, and ‘amd64’. Intel and AMD CPUs running in 64-bit mode.

Floating point is supported only on CPUs with support for SSE2 or later. There is no x87 floating point support.

ARM 32-bit Architecture

This target ISA generates code for ARMv7 and ARMv8 CPUs in 32-bit mode (AArch32). We support both ARM and Thumb2 instruction encodings.

ARM 64-bit Architecture

ARMv8 CPUs running the AArch64 architecture.

Glossary

Illegal instruction
An instruction is considered illegal if there is no encoding available for the current CPU mode. The legality of an instruction depends on the value of sub-target predicates, so it can’t always be determined ahead of time.
CPU mode
Every target defines one or more CPU modes that determine how the CPU decodes binary instructions. Some CPUs can switch modes dynamically with a branch instruction (like ARM/Thumb), while other modes are process-wide (like x86 32/64-bit).
Sub-target predicate
A predicate that depends on the current sub-target configuration. Examples are “Use SSE 4.1 instructions”, “Use RISC-V compressed encodings”. Sub-target predicates can depend on both detected CPU features and configuration settings.
Instruction predicate
A predicate that depends on the immediate fields of an instruction. An example is “the load address offset must be a 10-bit signed integer”. Instruction predicates do not depend on the registers selected for value operands.
Register constraint
Value operands and results correspond to machine registers. Encodings may constrain operands to either a fixed register or a register class. There may also be register constraints between operands, for example some encodings require that the result register is one of the input registers.