SIR, IR, and Lowering
Accepted
Accepted for the V1 compiler-facing SIR/IR schema obligations and representative lowering recipes. Exact opcode names, dump syntax, block-argument representation, and optimizer strategy remain implementation details.
This page defines the handoff contract between semantic analysis, lowering, and backend consumers. Feature pages own source semantics. This page owns the compiler-facing facts that must survive long enough for lowering and backend emission without backends reaching back into AST or SIR.
Ownership Rule
SIR owns resolved source semantics. IR owns executable backend-neutral representation.
SIR may carry language-level semantic evidence such as selected conformances, selected operations, cleanup obligations, attribute-produced semantic facts, type identities, source spans, and comptime facts. IR must not carry unresolved source names, semantic contracts, generic parameters, source-level method calls, attribute syntax, AST IDs, or SIR IDs. IR may carry source-derived debug names and source spans as provenance for diagnostics, deterministic dumps, and debug-friendly generated internal symbols, but those provenance fields are not semantic identity, lookup keys, verifier facts, ABI names, or linkage names.
Backends consume verified IR only. If a backend needs a fact that is not present in IR, the fix belongs in SIR-to-IR lowering or in the IR schema.
SIR Schema Obligations
SIR is typed and resolved but still language-level. It must be able to represent:
- resolved declarations, functions, parameters, locals, fields, types, modules, namespaces, and imports;
- typed expressions, statements, places, values, blocks, and control-flow constructs;
- resolved calls, including inherent calls, contract-operation calls, compiler intrinsic calls, direct function calls, indirect function-pointer calls, dynamic calls, and materialized default arguments for omitted trailing parameters;
- selected conformances, concrete implementer types, substituted operation signatures, receiver lowering modes, required coercions, autoref decisions, temporary materialization, and result types;
- function item values, including declaration identity, instantiated semantic signature, runtime-callable versus comptime fn callability, and function-pointer coercion eligibility;
- cleanup obligations for lexical scopes, including explicit defer, hidden cleanup for source-created temporaries, resource liveness, live-overwrite cleanup, move-disabled cleanup, storage reinitialization, and all normal/early/error/control-flow exit edges;
- fallible expression semantics, including success type, error type, error-set coercions/unions, try, catch, and propagation cleanup obligations;
- optional value semantics, including presence tests, null, orelse, forced unwrap, conditional optional binding, payload copy/move rules, and optional-to-bool coercion;
- slice, array, range, index, element-place, and mutability facts, including bounds facts proven by sema;
- final semantic metadata after attribute-provider evaluation, plus inspectable attribute reflection/provenance where accepted;
- C ABI facts validated by sema, including call convention, export/import/linkage status, link name, and ABI-safe signature facts;
- compiler-owned comptime handle values and demanded comptime evaluation requests;
- source span references for diagnostics, tooling, generated cleanup, inserted coercions, checks, traps, and lowering explanations.
SIR must not preserve purely syntactic ambiguity. For example, a + b in SIR records the selected operation/conformance/intrinsic and result type, not merely a source operator token.
IR Schema Obligations
IR is a structured non-SSA backend-facing representation. V1 freezes verifier-visible categories and representational obligations, not a final serialized format.
IR has at least these categories:
- module, function, block, parameter, local, value, type reference, constant, instruction, terminator, and optional source-span reference;
- optional source-derived debug names for declarations or blocks where they improve dumps or generated internal symbol readability;
- backend-neutral primitive, aggregate, pointer, optional, slice, result, dynamic pointer, descriptor, and comptime-handle value categories where needed by accepted V1 lowering;
- function facts for call convention, linkage, export/import status, external symbol name, parameter types, return type, and ABI-safe lowered types.
IR identities are local to the IR artifact. Lowering may use SIR symbols, declarations, expressions, and type IDs while translating, but the resulting IR must refer to IR-local function, type, block, local, parameter, value, and instruction IDs. Direct calls target IR function IDs, not SIR declaration or symbol IDs. Forward references are allowed when lowering creates function shells before lowering bodies; the verifier checks target existence and signature compatibility.
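As a minimal sketch of the shell-then-body scheme, the following Python models IR-local function IDs and a verifier pass over direct calls. The names IrFunction, IrModule, declare_shell, and verify_direct_calls are illustrative assumptions, not part of the accepted schema, and signatures are reduced to tuples of type names.

```python
from dataclasses import dataclass, field


@dataclass
class IrFunction:
    func_id: int            # IR-local ID, never a SIR declaration or symbol ID
    param_types: tuple
    return_type: str
    # Each entry models one direct call: (callee_id, arg_types, result_type).
    direct_calls: list = field(default_factory=list)


class IrModule:
    def __init__(self):
        self.functions = {}

    def declare_shell(self, param_types, return_type):
        """Create a function shell so bodies lowered later can target it."""
        func_id = len(self.functions)
        self.functions[func_id] = IrFunction(func_id, tuple(param_types), return_type)
        return func_id


def verify_direct_calls(module):
    """Check target existence and signature compatibility for every direct call."""
    errors = []
    for fn in module.functions.values():
        for callee_id, arg_types, result_type in fn.direct_calls:
            target = module.functions.get(callee_id)
            if target is None:
                errors.append(f"fn {fn.func_id}: call to missing function {callee_id}")
            elif (target.param_types, target.return_type) != (tuple(arg_types), result_type):
                errors.append(f"fn {fn.func_id}: signature mismatch calling fn {callee_id}")
    return errors


module = IrModule()
caller = module.declare_shell(["u32"], "u32")
callee = module.declare_shell(["u32"], "u32")
# The caller body is lowered before the callee body; only the shell must exist.
module.functions[caller].direct_calls.append((callee, ["u32"], "u32"))
```

In this model a forward reference is just a call whose target shell already exists; the verifier never consults anything outside the IR module.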
Runtime IR constants are typed. Sema may use comptime_int internally, but a runtime IR integer constant has an explicit IR integer type and a payload that is representable by that type. Untyped integer constants do not survive into runtime IR.
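The representability rule for typed constants can be sketched as a range check at constant-construction time. This is an illustration only: the INT_TYPES table covers a few hypothetical integer type names, and make_int_const is not an API defined by this page.

```python
# Inclusive value ranges for a handful of assumed IR integer types.
INT_TYPES = {
    "u8": (0, 2**8 - 1),
    "i8": (-(2**7), 2**7 - 1),
    "u32": (0, 2**32 - 1),
    "i32": (-(2**31), 2**31 - 1),
}


def make_int_const(ir_type, payload):
    """Build a runtime integer constant; reject untyped or unrepresentable payloads."""
    if ir_type not in INT_TYPES:
        raise ValueError(f"{ir_type!r} is not a concrete IR integer type")
    low, high = INT_TYPES[ir_type]
    if not low <= payload <= high:
        raise ValueError(f"{payload} is not representable as {ir_type}")
    return (ir_type, payload)
```

A comptime_int value would have to pick one of these concrete types before lowering could call such a constructor.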
IR instruction families must be able to represent:
- control flow: branch, conditional branch, return, unreachable, and trap;
- calls: direct calls, indirect function-pointer calls, and dynamic descriptor/vtable calls;
- memory and places: locals, uninitialized storage, address-taking, dereference, load, store, field address/load/store, pointer casts, opaque pointer casts, placement initialization, and lifetime/end-of-storage facts when needed by verifier or cleanup;
- aggregates, optionals, and slices: aggregate construction/extraction, optional construction/test/extraction, array element address, slice construction, slice pointer, slice length, slice element address, and slice range construction;
- checks and traps: bounds checks, slice-range checks, overflow checks, optional-present checks, alignment checks, and stable trap kinds;
- error results: construct success, construct error, test error, extract success, and extract error for backend-neutral T!E representation;
- dynamic dispatch: dynamic pointer construction, descriptor constants, descriptor selection for upcasts/subsets, slot loads or dynamic calls with concrete ABI signatures, and runtime-storable dispose/pointer descriptors;
- compiler-owned comptime handles and compiler-intrinsic calls for comptime-executed units only.
IR cleanup is ordinary control flow: explicit calls, stores, branches, and returns. IR must not contain a source-level defer operation that backends reinterpret.
The following tables define the minimum required V1 families for phase-boundary handoff, not an exhaustive opcode catalog. Implementations may add ordinary scalar, aggregate, constant, conversion, and helper operations as needed, provided they preserve the phase boundary and verifier-visible facts. The descriptive names in the tables are not required opcode spellings. An implementation may combine or split operations, such as representing field_load as field_addr followed by load, when the verifier-visible facts remain equivalent.
IR Value Categories
| Category | Represents | Verifier obligations |
|---|---|---|
| Primitive scalar | Concrete booleans, integers, floats, error tags, enum tags, and pointer-sized values. | Width, signedness, target facts, and operation result types are explicit. |
| Pointer | Typed *T / *const T values. | Pointee type, mutability, and load/store legality are known. |
| Opaque pointer | Erased *opaque / *const opaque storage pointers. | Typed recovery requires an explicit cast/recovery operation with alignment/layout facts. |
| Aggregate value | Struct-like values, arrays, enums, and transparent descriptors when represented as values. | Field/member access uses known layout and field identity. |
| Optional value | ?T values, including null and present payloads. | Payload type is explicit; presence tests and payload extraction match the optional layer. |
| Slice descriptor | []T / []const T pointer-plus-length values. | Element type, mutability, pointer field, and usize length field are explicit. |
| Result value | Backend-neutral representation of T!E. | Success and error component types match all result operations. |
| Function pointer | Runtime-callable *const fn(...) Return values. | Signature and call convention are explicit; comptime fn items cannot appear here. |
| Dynamic pointer | Borrowed *dyn C / *const dyn C values represented as data pointer plus descriptor. | Receiver mutability, descriptor type, and dynamic-call signatures are explicit. |
| Runtime descriptor | Runtime-storable pointer, vtable, dispose, or owner-helper descriptor values produced from comptime metadata. | Descriptor kind and compatible dynamic surface or dispose signature are explicit. |
| Comptime handle | Type, Namespace, Scope, Type.Predicate, function item, and reflection metadata handles in comptime-executed units. | Handle kind is explicit and cannot flow into runtime storage. |
| void / never | No-value completion and non-returning control flow. | never-typed paths do not produce runtime values; void has no payload. |
IR Instruction Families
| Family | Descriptive operations | Purpose |
|---|---|---|
| Control terminators | branch, cond_branch, return, unreachable, trap | Represent structured control flow and stable checked-failure traps. |
| Scalar and conversion ops | arithmetic, comparison, boolean, numeric conversion, enum/error tag, constant, and target-layout constant operations | Represent ordinary primitive computation after sema has selected concrete operand/result types and safety behavior. |
| Calls | direct_call, indirect_call, dyn_call, dyn_load_slot plus indirect_call | Call known functions, function pointers, and dynamic vtable/descriptor slots. |
| Locals and memory | local, uninitialized_local, address_of, deref, load, store, field_addr, field_load, field_store, placement_init, end_lifetime | Represent places, reads/writes, address-taking, dereference, field access, initialized storage, explicit undefined storage, and cleanup-relevant lifetime facts. |
| Pointer casts | ptr_cast, opaque_cast, typed recovery from *opaque, assume_aligned | Represent explicit pointer erasure and recovery with layout/alignment facts. |
| Aggregates | aggregate_construct, field_extract, array_index_addr | Represent concrete aggregate values and element/field access. |
| Slices and arrays | slice_construct, slice_ptr, slice_len, slice_index_addr, slice_range, array_to_slice | Preserve descriptor semantics and array/slice lowering facts. |
| Optionals | optional_null, optional_present, optional_is_present, optional_unwrap, optional-to-bool coercion | Represent absence, presence, orelse, conditional optional binding, forced unwrap after checking, and payload extraction. |
| Checks | bounds_check, slice_range_check, overflow_check, optional_present_check, alignment_check | Make required Checked safety checks explicit before protected operations. |
| Results/errors | result_success, result_error, result_is_error, result_unwrap_success, result_unwrap_error | Represent T!E construction, branching, and extraction. |
| Dynamic descriptors | dyn_make_ptr, dyn_select_descriptor, dyn_descriptor_const, dyn_dispose_descriptor_const, descriptor calls | Represent borrowed dynamic pointers, upcasts/subsets, and runtime-storable metadata descriptors. |
| Comptime intrinsics | compiler-owned intrinsic call with typed handle inputs/outputs | Execute accepted compiler-object operations in comptime-executed units. |
| ABI metadata | function-level callconv, linkage, export/import, link-name, ABI-safe type facts | Provide backend-facing declaration facts without AST attribute access. |
Cleanup does not need its own instruction family. Cleanup blocks are ordinary blocks containing calls, stores, branches, and returns.
Dispatch Lowering
SIR records semantic dispatch evidence. A resolved call includes:
- callee kind: inherent function, contract operation, compiler intrinsic, direct function item, function pointer, or dynamic operation;
- selected declaration or function identity;
- selected conformance identity when the call came from a contract;
- concrete implementer type;
- substituted operation signature;
- receiver mode: value, pointer, const pointer, dynamic pointer, or no receiver;
- required coercions, autoref, temporary materialization, and result type;
- omitted-argument expansion for defaults, including which declaration-scope default expression produced each materialized argument;
- source spans for diagnostics/tooling.
Static contract dispatch lowers to concrete IR. By the time a static contract call reaches IR, contract lookup and visibility checks are complete. IR contains a direct call, primitive intrinsic/lowered operation, or inlined body with concrete argument and return types.
By the time any call reaches IR, default-parameter semantics are gone. Lowering emits a complete concrete argument list. If an argument was omitted in source, SIR records the declaration-scope default expression, and lowering evaluates or lowers that expression as an ordinary argument at the call site.
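A minimal sketch of omitted-argument expansion, assuming each default has already been lowered to some IR operand (represented here as a plain string). The helper name expand_arguments and the operand spellings are hypothetical.

```python
def expand_arguments(param_names, lowered_defaults, supplied):
    """Return the complete argument list that the IR call instruction carries.

    param_names:      parameter names in declaration order
    lowered_defaults: parameter name -> operand produced from its default expression
    supplied:         operands for the arguments written at the call site
    """
    if len(supplied) > len(param_names):
        raise ValueError("too many arguments")
    args = list(supplied)
    # Only trailing parameters may be omitted, so fill from the first gap onward.
    for name in param_names[len(supplied):]:
        if name not in lowered_defaults:
            raise ValueError(f"missing argument for parameter {name!r}")
        args.append(lowered_defaults[name])
    return args
```

For instance, a call that supplies only the first of three parameters yields a three-operand list, and nothing in the result records that two operands came from defaults.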
For example, source arithmetic:
const z = x + y
has SIR evidence like:
selected contract: Add(Rhs, Out)
selected conformance: impl Add(u32, u32) for u32
selected operation: Add.add
dispatch: static
result type: u32
IR then contains a concrete primitive operation or direct call with u32 operands. It does not contain Add, impl, or operator lookup.
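The handoff can be sketched as a function that consumes the SIR evidence and produces an instruction with no contract vocabulary left in it. The record type SirResolvedBinary, the PRIMITIVE_OPS table, and the tuple instruction shape are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SirResolvedBinary:
    contract: str                   # e.g. "Add(Rhs, Out)"
    conformance: str                # e.g. "impl Add(u32, u32) for u32"
    operation: str                  # e.g. "Add.add"
    result_type: str
    ir_function_id: Optional[int]   # None when the operation is a primitive


# Hypothetical mapping from selected primitive operations to IR opcodes.
PRIMITIVE_OPS = {"Add.add": "add"}


def lower_static_binary(sir, lhs, rhs):
    """Lower a statically dispatched binary operation to one concrete instruction."""
    if sir.ir_function_id is None:
        return (PRIMITIVE_OPS[sir.operation], sir.result_type, lhs, rhs)
    return ("direct_call", sir.ir_function_id, sir.result_type, lhs, rhs)


evidence = SirResolvedBinary(
    "Add(Rhs, Out)", "impl Add(u32, u32) for u32", "Add.add", "u32", None
)
instr = lower_static_binary(evidence, "x", "y")
```

Here instr is the tuple ("add", "u32", "x", "y"); the contract, conformance, and operation names stay behind in SIR.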
Dynamic dispatch lowers through descriptor values. SIR records the visible erased surface, selected operation identity, receiver constness, vtable slot identity, and substituted ABI signature. IR represents the dynamic pointer as data plus descriptor and emits either a dynamic-call instruction or descriptor-slot load followed by an indirect call. The backend emits an indirect call with a concrete signature; it does not know what the source contract means.
For example:
var item = it.next()
where it: *dyn Iterator(u8) has SIR evidence like:
receiver type: *dyn Iterator(u8)
selected operation: Iterator(u8).next
slot: next in Iterator(u8) dynamic surface
ABI signature: fn(*opaque) ?u8
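IR can lower this as either a single dyn_call on the dynamic pointer and its next slot, or a dyn_load_slot that yields a function pointer with the ABI signature above, followed by an indirect_call that passes the data pointer.
Both forms are equivalent when the descriptor, slot, and ABI signature are verifier-visible.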
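The equivalence of the two lowering forms can be modelled in a few lines of Python, treating a dynamic pointer as a (data, descriptor) pair and a descriptor as a slot table. The function names mirror the descriptive operation names on this page, but their Python signatures and the counter implementation are assumptions.

```python
def counter_next(state):
    """Concrete implementation behind the 'next' slot; None models a null ?u8."""
    state["n"] += 1
    return state["n"] if state["n"] <= 2 else None


# Descriptor: slot name -> function pointer with a concrete signature.
DESCRIPTOR = {"next": counter_next}


def dyn_call(dyn_ptr, slot):
    """Form 1: one instruction that selects the slot and calls it."""
    data, descriptor = dyn_ptr
    return descriptor[slot](data)


def dyn_load_slot(dyn_ptr, slot):
    """Form 2, step 1: load the function pointer from the descriptor slot."""
    _, descriptor = dyn_ptr
    return descriptor[slot]


def indirect_call(fn_ptr, *args):
    """Form 2, step 2: call through the loaded function pointer."""
    return fn_ptr(*args)


a = ({"n": 0}, DESCRIPTOR)
b = ({"n": 0}, DESCRIPTOR)
form1 = [dyn_call(a, "next") for _ in range(3)]
form2 = [indirect_call(dyn_load_slot(b, "next"), b[0]) for _ in range(3)]
```

Both lists come out as [1, 2, None]; neither form needs to know which source contract the slot belongs to.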
Dynamic upcasts and intersection subset upcasts select already represented descriptor metadata. IR may represent this as descriptor selection from an existing descriptor plus the same data pointer. It must not trigger hidden conformance lookup or dyn-safety checking.
Cleanup Lowering
SIR owns cleanup obligations. Each lexical scope records explicit defer statements, hidden temporaries that require cleanup, resource liveness, live-overwrite cleanup, move effects, storage reinitialization after undefined or cleanup, and which cleanup obligations run on fallthrough, return, try propagation, catch, break, and continue.
IR lowering expands those obligations into ordinary blocks:
- every scope exit edge branches through the required cleanup sequence;
- assignment to a live resource-owning place can lower any required old-value cleanup before the replacement store when source semantics or inserted generated cleanup requires it;
- cleanup order follows the source semantics, including reverse source order for defer;
- moved values disable cleanup for the moved-from owner;
- double-cleanup, missing-cleanup, and move-with-active-defer remain sema/lint facts, not backend inference.
Backends never infer disposal from a type or a source defer; they emit the cleanup calls and branches already present in IR.
For example:
fn load(path: Path) i32!LoadError {
    var file = try File.open(path)
    defer file.dispose()
    return 1
}
lowers conceptually to IR control flow like:
    open_result = direct_call File.open(path)
    cond_branch result_is_error(open_result), open_error, open_success
open_error:
    return result_error(result_unwrap_error(open_result))
open_success:
    file = result_unwrap_success(open_result)
    branch return_path
return_path:
    direct_call File.dispose(&file)
    return result_success(1)
The exact block shape is implementation-defined, but the cleanup call must be explicit on every exit path that leaves the scope while file is still live.
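The expansion of scope-exit edges described in this section can be sketched as a small function that emits one cleanup sequence per exit edge. The input shape (a count of defers registered when the edge leaves the scope) and the name lower_scope_exits are simplifying assumptions; moves and live-overwrite cleanup are not modelled.

```python
def lower_scope_exits(defers, exits):
    """Emit an ordinary instruction list for every edge that leaves the scope.

    defers: cleanup calls in source order
    exits:  (edge_name, active_defer_count, terminator) triples, where the
            count says how many defers were registered when the edge exits
    """
    blocks = {}
    for edge_name, active_count, terminator in exits:
        active = defers[:active_count]
        # Deferred cleanup runs in reverse source order, then the edge terminates.
        blocks[edge_name] = [f"direct_call {call}" for call in reversed(active)]
        blocks[edge_name].append(terminator)
    return blocks


blocks = lower_scope_exits(
    ["a.dispose(&a)", "b.dispose(&b)"],
    [
        ("early_return", 1, "return result_error(err)"),
        ("fallthrough", 2, "return result_success(1)"),
    ],
)
```

The early_return edge disposes only a, while the fallthrough edge disposes b and then a, so a backend only ever sees calls and terminators.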
Error Returns and Traps
SIR preserves T!E semantics: success type, error type, error-set coercions/unions, try, catch, and cleanup on propagation.
IR represents fallible values with a backend-neutral result concept:
- construct success;
- construct error;
- test whether a result is an error;
- extract success;
- extract error.
try expr lowers to an explicit branch on the result value. On the error path, IR runs active cleanup obligations and returns or branches with the propagated error result. On the success path, IR extracts the success value.
For example:
const n = try parse_count(text)
lowers conceptually to:
    tmp = direct_call parse_count(text)
    cond_branch result_is_error(tmp), error_path, success_path
error_path:
    err = result_unwrap_error(tmp)
    run_active_cleanup
    return result_error(err)
success_path:
    n = result_unwrap_success(tmp)
Checked safety failures are traps, not T!E values. Bounds failures, signed overflow, invalid slice ranges, and similar checked failures lower to stable trap kinds and do not change function signatures.
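To contrast the two failure channels, the sketch below models T!E results as tagged tuples that flow through return values, and traps as a separate exception that never appears in a function's result type. The Python helpers reuse the descriptive operation names from this page; parse_count, lowered_try, and checked_index are hypothetical stand-ins.

```python
class Trap(Exception):
    """A checked safety failure with a stable trap kind."""

    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind


def result_success(value):
    return ("ok", value)


def result_error(error):
    return ("err", error)


def result_is_error(result):
    return result[0] == "err"


def result_unwrap_success(result):
    assert result[0] == "ok"
    return result[1]


def result_unwrap_error(result):
    assert result[0] == "err"
    return result[1]


def parse_count(text):
    if not text.isdecimal():
        return result_error("InvalidDigit")
    return result_success(int(text))


def lowered_try(text, cleanup_log):
    """Models `const n = try parse_count(text)` followed by `return n + 1`."""
    tmp = parse_count(text)
    if result_is_error(tmp):
        err = result_unwrap_error(tmp)
        cleanup_log.append("run_active_cleanup")  # cleanup runs on the error edge
        return result_error(err)
    n = result_unwrap_success(tmp)
    return result_success(n + 1)


def checked_index(items, i):
    """A bounds failure traps; it does not become an error result."""
    if not 0 <= i < len(items):
        raise Trap("bounds_check_failure")
    return items[i]
```

The signature of checked_index has no error component at all, which is the point: a trap kind is reported out of band and leaves the function type unchanged.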
Slices, Arrays, and Bounds
SIR owns array, slice, range, index, element-place, mutability, and known-bound facts. Index expressions require usize after accepted coercions. Inline slice-selector bounds have expected type usize; stored range slice selectors must have endpoint type usize in V1.
IR keeps slices as explicit backend-neutral descriptor values instead of flattening them immediately to raw pointer arithmetic. IR can represent:
- slice construction from pointer plus length;
- array-to-slice view formation;
- slice pointer and length extraction;
- array length and known-length metadata;
- element address/index operations;
- slice range construction;
- bounds and range checks when required by safety mode.
Inclusive-end slicing lowers by checking that the inclusive end can become an exclusive bound, computing end + 1, and checking the resulting range against the slice length. Required overflow or invalid-range checks become explicit IR checks/traps in Checked mode.
For example:
const x = items[i]
const view = items[start..=end]
can lower conceptually to:
len = slice_len(items)
bounds_check i < len, trap(bounds_check_failure)
p = slice_index_addr(items, i)
x = load p
overflow_check end + 1, trap(invalid_slice_range)
exclusive = add end, 1
slice_range_check start <= exclusive and exclusive <= len, trap(invalid_slice_range)
view = slice_range(items, start, exclusive)
The C backend may choose descriptor structs, split pointer/length locals, or inline operations away internally. Source and IR semantics remain descriptor-based.
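The inclusive-end recipe can be checked with a short executable model. It assumes a 64-bit usize and uses a Python list in place of a slice descriptor; lower_inclusive_slice and the Trap class are illustrative names only.

```python
USIZE_MAX = 2**64 - 1  # assumption: usize is 64 bits on the modelled target


class Trap(Exception):
    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind


def lower_inclusive_slice(items, start, end):
    """Model items[start..=end] under Checked mode."""
    length = len(items)
    # overflow_check: end + 1 must still fit in usize.
    if end == USIZE_MAX:
        raise Trap("invalid_slice_range")
    exclusive = end + 1
    # slice_range_check: start <= exclusive and exclusive <= len.
    if not (start <= exclusive <= length):
        raise Trap("invalid_slice_range")
    return items[start:exclusive]
```

With a four-element input, start 1 and end 2 select the two middle elements, while end 4 fails the range check and the maximum usize value fails the overflow check before any addition happens.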
Function Items and Comptime-Only Values
Function item values are semantic/comptime values in SIR. SIR records declaration identity, instantiated signature, runtime-callable versus comptime fn callability, and function-pointer coercion eligibility.
Runtime IR must not contain unresolved function item values. It contains one of:
- a direct call to a known function;
- a concrete function-pointer constant with signature and call convention;
- an indirect function-pointer call;
- a comptime handle for function items only inside comptime-executed units.
SIR rejects attempts to store comptime-only values, including Type, Namespace, Scope, Type.Predicate, and comptime fn function items, in runtime storage before IR is emitted.
For example:
const f = scale
const p: *const fn(i32) i32 = scale
SIR records scale as a function item. The first binding remains comptime-only unless a demanded runtime use requires rejection. The second binding lowers to a concrete function-pointer constant with the resolved signature and call convention.
This is rejected before IR:
const bad: *const fn(u32) u32 = fibonacci
when fibonacci is a comptime fn.
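The coercion rule for function items can be sketched as a sema-side check that runs before any IR constant is produced. FunctionItem, SemaError, coerce_to_fn_pointer, and the "default" call convention label are hypothetical; signatures are compared as plain strings for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionItem:
    name: str
    signature: str        # e.g. "fn(i32) i32"
    callconv: str
    comptime_only: bool   # True for a comptime fn


class SemaError(Exception):
    pass


def coerce_to_fn_pointer(item, expected_signature, ir_function_id):
    """Return a function-pointer constant, or reject the coercion in sema."""
    if item.comptime_only:
        raise SemaError(f"{item.name} is a comptime fn and cannot become a runtime function pointer")
    if item.signature != expected_signature:
        raise SemaError(f"{item.name} has signature {item.signature}, expected {expected_signature}")
    # The constant refers to an IR-local function ID, plus signature and call convention.
    return ("fn_ptr_const", ir_function_id, item.signature, item.callconv)


scale = FunctionItem("scale", "fn(i32) i32", "default", comptime_only=False)
fibonacci = FunctionItem("fibonacci", "fn(u32) u32", "default", comptime_only=True)
```

Coercing scale succeeds and yields a constant with explicit signature and call convention, whereas coercing fibonacci raises before any IR exists.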
Optionals and Conditional Bindings
SIR owns optional typing, optional-to-bool coercion, and conditional optional binding semantics. For if const value = optional_expr, SIR records whether the source optional is an lvalue or fresh rvalue, whether the payload is copyable or movable, and the payload binding pattern introduced in the true branch.
IR represents optionals with backend-neutral optional operations:
- construct null;
- construct present payload;
- test presence;
- unwrap payload after a presence branch or explicit present check;
- emit optional_present_check before payload extraction only for source forms that require a checked presence assertion.
For example:
if const item = iter.next() {
    use(item)
}
lowers conceptually to:
    tmp = direct_call Iterator.next(&iter)
    cond_branch optional_is_present(tmp), some_path, none_path
some_path:
    item = optional_unwrap(tmp)
    direct_call use(item)
    branch done
none_path:
    branch done
The optional branch is normal control flow. If the payload binding owns a resource value, cleanup obligations are the same ordinary SIR cleanup obligations described above; optional binding itself does not insert hidden disposal.
Dynamic Metadata and Prelude Owners
Owned dynamic values such as Box(dyn C) are prelude/library constructs, not compiler-special source forms. SIR and IR must support the general primitives that make those constructs implementable:
- allocator calls through static *impl Allocator;
- allocator calls through erased *dyn Allocator;
- opaque allocation and deallocation with explicit size and alignment;
- typed pointer recovery from *opaque;
- placement initialization and ordinary stores into allocated storage;
- compiler-produced dynamic metadata from I.dyn_metadata(C, scope);
- runtime-storable dynamic pointer descriptors and dispose descriptors;
- descriptor-based indirect calls;
- ordinary Disposable.dispose calls and cleanup control flow.
IR should expose these lower-level allocation, pointer, descriptor, and call operations clearly enough that optimization can inline and simplify prelude owner patterns. V1 does not require Box-specific intrinsics, Box-specific IR nodes, or guaranteed devirtualization.
For concrete prelude owner code such as:
var b = try Box(Buffer).create(alloc, buffer)
b.ptr().clear()
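the inlined IR should be able to expose ordinary operations from the list above: an allocator call, opaque allocation with explicit size and alignment, typed pointer recovery from *opaque, placement initialization of the stored value, and an ordinary call to clear on the recovered pointer.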
For erased owner code:
var p = box.ptr()
p.draw()
IR exposes the data pointer plus descriptor, and p.draw() lowers to a descriptor-based indirect call on that pair.
If later optimization proves the descriptor constant and the value does not escape, it may devirtualize. V1 does not require that optimization.
Attribute Metadata
Attribute providers run during sema and mutate the target's semantic metadata through accepted setter APIs. SIR stores the target's final semantic facts plus inspectable attribute metadata/provenance for comptime reflection and tooling.
Examples of final semantic facts include:
- call convention;
- export/import/linkage status;
- external link name;
- copyability and resource metadata;
- layout and alignment constraints;
- public custom metadata exposed by accepted attribute reflection APIs.
IR carries only backend-facing consequences of the final metadata, such as call convention, linkage, external symbol name, ABI-safe lowered types, and layout/alignment facts. IR does not carry raw attribute syntax or provider calls. Backends must not inspect AST attributes, rerun provider evaluation, or redo ABI-safety checks.
C ABI Facts
C ABI validation is complete before backend emission. SIR carries the resolved declaration metadata and diagnostics facts. IR functions and function-pointer values carry the backend-facing ABI facts:
- call convention;
- export/import status;
- external symbol name or link name;
- visibility/linkage status;
- ABI-safe lowered parameter and return types.
V1 C ABI signatures accept only the active scalar ABI-safe surface. If a signature is rejected by C ABI validation, no IR function with that exported/imported ABI is emitted.
Source Spans
SIR carries rich source spans for diagnostics and tooling. IR may carry source-span references on instructions, terminators, checks, traps, calls, and generated cleanup blocks when needed for trap reports, backend diagnostics, verifier diagnostics, deterministic dumps, or debug output.
Source spans are diagnostic references only. They must not authorize backend access to AST or SIR, affect code generation semantics, or become optimizer facts.
Verifier Expectations
The IR verifier must enforce the phase boundary in addition to ordinary type and control-flow checks:
- no unresolved names, semantic contracts, generic parameters, source-level method calls, AST IDs, or SIR IDs;
- all calls have concrete ABI signatures and valid targets or descriptors;
- dynamic calls use descriptor/slot facts with concrete signatures;
- slice/index/range operations use valid descriptor, pointer, and usize operands;
- optional construction, presence tests, and payload extraction use matching optional payload types;
- pointer erasure, typed recovery, dereference, address-taking, and alignment assertions use valid pointee, mutability, layout, and alignment facts;
- explicit checks guard the operations they protect in structured IR terms;
- traps carry stable trap kinds;
- result operations are used with matching success and error types;
- externally visible functions carry complete ABI/export/import/linkage facts;
- cleanup appears as ordinary valid control flow;
- IR printing order is deterministic.
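A few of these expectations can be sketched as a per-instruction check. The dictionary instruction shape, the field names in FORBIDDEN_FIELDS, and the contents of STABLE_TRAP_KINDS (only the kinds that appear in this page's examples) are assumptions; a real verifier would also cover types, control flow, and descriptors.

```python
# Hypothetical field names standing in for source-level facts that must not reach IR.
FORBIDDEN_FIELDS = {"ast_id", "sir_id", "unresolved_name", "generic_param", "contract", "method_call"}
# Trap kinds taken from the examples on this page; the full stable set is not specified here.
STABLE_TRAP_KINDS = {"bounds_check_failure", "invalid_slice_range"}
CALL_OPS = {"direct_call", "indirect_call", "dyn_call"}


def verify_instruction(instr):
    """Return phase-boundary violations for one instruction (empty list if none)."""
    errors = []
    leaked = sorted(FORBIDDEN_FIELDS.intersection(instr))
    if leaked:
        errors.append("carries source-level field(s): " + ", ".join(leaked))
    if instr.get("op") == "trap" and instr.get("kind") not in STABLE_TRAP_KINDS:
        errors.append("trap without stable trap kind")
    if instr.get("op") in CALL_OPS and "signature" not in instr:
        errors.append("call without concrete ABI signature")
    return errors
```

Provenance fields such as a source span or debug name pass through untouched, because the check keys only on the forbidden set and never reads them as facts.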