Arrays, Slices, Ranges, and Indexing¶
Accepted
Accepted for V1 array, slice, range, indexing, slicing, diagnostics, contract dependency, and reflection exposure design; implementation verification is tracked in V1 Compiler Implementation Tasks. Depends on Semantic Contracts and Core Contracts.
Arrays and slices are the core contiguous data model for buffer kernels. The source model should be explicit enough for predictable performance while still leaving room for contract-shaped indexing and iteration.
Array and Slice Types¶
Fixed-size arrays use [N]T:
var frame: [128]f32 = undefined
Rules:
[N]Tis an inline fixed-size array value withNelements.Nmust be a compile-time-known value exactly representable asusize.- integer literals and comptime integer constants may coerce to
usizein array length position. - zero-length arrays are valid.
- assigning an array follows ordinary value semantics and copies elements when the element type is copyable.
Slices use []T and []const T:
var writable: []f32 = undefined
var readonly: []const f32 = undefined
Rules:
[]Tis a borrowed mutable view.[]const Tis a borrowed readonly view.- assigning a slice copies only the slice descriptor.
- slices do not own storage.
- slices do not carry capacity.
[]Tmay coerce to[]const T;[]const Tmust not coerce to[]T.
Transparent Slice Descriptor¶
Slices are transparent struct-like descriptors:
[]T == struct-like {
ptr: *T
len: usize
}
[]const T == struct-like {
ptr: *const T
len: usize
}
The fields are ordinary source-visible fields:
items.ptr
items.len
Because slices are transparent descriptors, manually changing or constructing them is allowed once aggregate construction exists. Suspicious descriptor manipulation should be linted, not hard-banned:
- assigning to
slice.ptr - assigning to
slice.len - constructing a slice descriptor manually
- creating pointer/length pairs from unrelated sources
- increasing
.lenbeyond a known backing array
This follows Catalyst's trust-the-programmer direction while making low-level choices visible to tooling.
Array Literals¶
Array literals use [expr, ...] and produce fixed-size array values:
var xs: [3]f32 = [1.0, 2.0, 3.0]
var ys: [3]f32 = [1.0, 2.0, 3.0]
Rules:
- literal length is the number of elements.
- element expressions evaluate in source order.
- element type is inferred from the elements and/or expected type.
- heterogeneous elements require a common inferred type or expected type.
- empty array literal
[]requires an expected array type. - array literals do not directly produce slices.
- array literals may construct arrays containing non-copyable values when the literal initializes a new owning array value, such as
[move a, move b].
Implicit array-literal-to-slice coercion is not part of V1. Array literals are rvalues, not addressable array places, so they do not provide borrowed slice storage:
process([1.0, 2.0, 3.0]) // error when process expects []const f32
var view: []const f32 = [1.0, 2.0, 3.0][..] // deferred
Use named array storage when a slice view is needed:
const gains: [3]f32 = [1.0, 2.0, 3.0]
process(gains)
Slice literals and explicit anonymous aggregate borrowing syntax such as &[1, 2, 3] are deferred. A slice is a borrowed view, so these features need a general backing-storage and temporary-lifetime design rather than a call-specific exception.
Array-to-Slice Coercion¶
Addressable array places may coerce to slice views when a slice is expected:
fn process(samples: []f32) void
var frame: [128]f32 = make_frame()
process(frame) // allowed: []f32 view
Rules:
- a mutable array place may coerce to
[]T. - a mutable or readonly array place may coerce to
[]const T. - the coercion copies no elements and allocates nothing.
- the slice length is
N. - the slice borrow must not outlive the array storage; obvious local escapes may be diagnosed.
- slices do not implicitly coerce back to arrays.
Array literals and other array rvalues are not included in this coercion. If a pointer-to-array is needed, take the address of a named array place:
const table = [1, 2, 3]
take_array(&table)
Indexing¶
Indexing uses items[i]:
var x = items[i]
items[i] = value
var p: *f32 = &items[i]
Semantic rules:
- index expressions require
usize. - integer literals and integer values may coerce to
usizethrough the general integer coercion rules in Type System. - signed integer values do not coerce to
usizeunless the source type's full value range fits inusize; signed indexes usually require an explicit checked conversion. - negative indexing is not supported.
- indexing produces element access, not a pointer by itself.
- pointer-to-element access uses
&items[i]. - mutable-place contexts require mutable element access.
- value-reading an element copies the element and requires the element type to be copyable.
move items[i]is rejected in V1; moving out of array or slice elements needs a later partial-move design.
Assigning to an element is ordinary place assignment for copyable, non-resource values. Replacing a live resource-owning element through raw element assignment is not a general V1 operation; use type- or container-specific APIs that define cleanup and replacement semantics.
For slices, indexing is equivalent to pointer indexing through the transparent descriptor plus the bounds policy:
items[i] ~= items.ptr[i]
Bounds Behavior¶
Indexing and slicing use the same safety-mode policy:
- compile-time-known out-of-bounds indexing is a compile error.
- compile-time-known invalid slicing ranges are compile errors.
- when bounds are provable, no runtime check is needed.
Checkedsafety mode emits dynamic checks for remaining indexing/slicing operations.Uncheckedsafety mode may omit dynamic checks.- checked safety failures trap in a deterministic panic-like way outside ordinary error returns.
For slicing:
items[start..end]
Checked safety mode checks:
start <= endend <= items.len- for inclusive-end slicing,
start <= endandend < items.len; the exclusive bound is thenend + 1
Slice Expressions¶
Slice ranges are half-open: start is inclusive and end is exclusive. Slice syntax uses the same range-expression surface as first-class ranges, but in indexing position it is consumed by slicing rather than producing a standalone range value.
items[start..end]
items[start..=end]
items[range]
items[start..]
items[..end]
items[..]
Rules:
- omitted start defaults to
0. - omitted end defaults to
items.len. start..endis half-open: it includesstartand excludesend.start..=endis inclusive-end: it includes bothstartandend.items[..]creates a whole-view slice.start == endcreates an empty slice.start > endis invalid.- for inclusive-end slicing,
end == usize.maxis invalid because it cannot name a valid element of any finite slice and cannot be converted to an exclusive bound. - slicing an array or slice always has slice type through sema.
- omitted-bound forms
..end,start.., and..are valid only in slicing contexts in V1. - finite first-class range values such as
Range(usize)andRangeInclusive(usize)are valid slice selectors:items[range]. - inline slice selector bounds have expected type
usize, so ordinary integer coercions apply to each bound.items[0..len]works naturally whenlen: usize, anditems[start_u8..end_u8]works whenu8safely coerces tousize. - stored range values must have endpoint type
usizein V1. A storedRange(u8)orRange(i32)is not accepted as a slice selector without explicit conversion.
Sema types all slice expressions as []T or []const T.
If bounds are compile-time-known, IR may preserve known-length metadata and later optimization may lower more efficiently. The source-level type does not implicitly become [N]T or *[N]T.
Range Expressions¶
First-class Range values and range iteration are V1 design surface.
Range Syntax¶
start..endis the canonical half-open range expression and producesRange(T).start..=endis the inclusive-end range expression and producesRangeInclusive(T).- range operators bind looser than arithmetic, so
a + 1..b - 1means(a + 1)..(b - 1). - range operators are non-associative, so
a..b..cis rejected. - comparisons should not chain through ranges; unclear expressions should be rejected until parenthesized.
Range syntax is intended to be reused by future pattern matching, but V1 defines it only for expressions and slicing.
Range Values¶
Range expressions outside slicing contexts produce finite range values that implement Iterable(Item) and yield values, not element pointers.
V1 standalone ranges are integer ranges:
- endpoint types must be integer types, integer literals, or comptime integer constants that can infer or coerce to the selected integer endpoint type.
- both endpoints must resolve to the same integer type after literal inference and allowed coercions.
- the resulting range has one endpoint type and implements
Iterable(T)for that same integer typeT. - range values are ordinary copyable prelude value types, not resources.
- range values are not compiler magic beyond the literal syntax and primitive lowering hooks needed for efficient iteration and slicing.
Ranges do not implement Sequence(T) in V1 because Sequence(T) is location/indexing based and inherits pointer-yielding iteration.
They may still provide ordinary inherent helpers such as len() and is_empty(). Contract membership is semantic; having a method named len does not make a type a Sequence.
Endpoint Inference¶
Range endpoint inference follows ordinary expression context:
- a typed endpoint can choose the range type, such as
0..lenbecomingRange(usize)whenlen: usize. - an expected iterable item type can choose the endpoint type, such as
for i: usize in 0..10. - an expected range type can choose the endpoint type, such as
const r: Range(usize) = 0..10. - a slicing context gives inline slice selector bounds expected type
usize, as documented above. - if both endpoint expressions are unresolved
comptime_intvalues and no endpoint, item, range, or slicing context supplies an endpoint type, the range falls back to the smallest primitive integer type that can represent both stored endpoint values. - if both fallback endpoint values are non-negative, fallback chooses the smallest
uN; if either endpoint is negative, fallback chooses the smallestiN. - if the fallback endpoint values require a width outside the V1 primitive integer range, the expression is rejected.
Fallback inference uses stored endpoint values, not only yielded values. For 0..256, the selected endpoint type must represent both 0 and 256, so the fallback type is u9. For 0..=255, the fallback type is u8.
Examples:
const a = 0..10 // Range(u4)
const b = -5..5 // Range(i4)
const c = 0..256 // Range(u9)
const d = 0..=255 // RangeInclusive(u8)
Fallback range inference is lintable as ranges/inferred-narrow-endpoints because narrow inferred endpoint types can surprise readers. The lint should suggest spelling the intended range type or loop item type explicitly:
const r: Range(usize) = 0..10
for i: usize in 0..10 {
}
Concrete typed endpoints keep their source types. If endpoint types differ, range inference uses ordinary integer widening to an existing endpoint type only when one endpoint type can represent every value of the other. It does not synthesize a third type for concrete endpoints unless an explicit expected range type requests that type and both endpoints can coerce to it.
const end: u32 = 10
const r = 0..end // Range(u32)
When neither endpoint type represents the other, the unannotated range is rejected:
const start: u32 = 0
const stop: i32 = 10
const bad = start..stop // error: neither endpoint type represents the other
An explicit expected range type may still request a third type through ordinary coercion:
const start: u32 = 0
const stop: i32 = 10
const ok: Range(i64) = start..stop // ok
Iteration¶
Range iteration is successor-based and steps by +1 in the endpoint type.
- range iterators are plain copyable iterator values unless a future custom range family documents otherwise.
- if
start > end, a standalone range is empty; it does not infer a negative step. - inclusive-end ranges must not overflow after yielding their final item; for example,
u8.max..=u8.maxyields one item and then stops.
Deferred Range Forms¶
Deferred:
- standalone omitted-bound ranges such as
..end,start.., and.. - floating-point ranges
- arbitrary ordered ranges
- user-defined range protocols
Supporting custom ranges is a V2 candidate feature; it needs more than Ord(T), including successor or step semantics and overflow/end behavior. Code that needs non-integer stepping in V1 should use an explicit iterator type.
Diagnostics¶
Array, slice, indexing, slicing, and range diagnostics are phase-owned and deterministic.
Parser diagnostics cover malformed source forms:
- malformed array type syntax such as a missing length, missing closing bracket, or missing element type in
[N]T. - malformed slice type syntax such as
[]in type position without an element type. - malformed array literals such as missing commas or missing closing brackets.
- malformed index or slice brackets.
- malformed range operators, including
a..b..canda..=.
Sema diagnostics cover invalid typed meanings:
- array lengths that are not compile-time-known or not exactly representable as
usize. []array literals without an expected array type.- heterogeneous array literal elements without a common inferred or expected element type.
- implicit array-literal-to-slice coercion and slicing of array rvalues.
- array-to-slice coercions from non-addressable array values.
- indexes that cannot resolve to
usizethrough allowed integer coercions. - negative indexing and signed indexes whose full value range does not fit in
usize. - indexing or slicing receivers that do not provide the required
Indexable,MutableIndexable,Sequence, or native array/slice operation. - mutable indexing through readonly receivers or receivers without mutable element access.
- value-reading an element whose type is not copyable.
move items[i]and raw replacement of live resource-owning elements through ordinary element assignment.- slice bounds that cannot resolve to
usizein inline slicing contexts. - stored slice selector ranges whose endpoint type is not
usize. - standalone omitted-bound ranges such as
..end,start.., and..outside slicing contexts. - compile-time-known out-of-bounds indexes, invalid slice ranges, and inclusive-end slice ranges whose end cannot be converted to an exclusive bound.
- standalone range expressions whose endpoints are not integer endpoints, cannot infer one shared integer endpoint type, or require a deferred range domain.
- range expressions that depend on unsupported floating-point, arbitrary ordered, or user-defined range semantics.
Checked runtime bounds failures are not ordinary compile diagnostics and do not use error-return types. When an indexing or slicing operation cannot be proven in bounds, Checked safety mode emits dynamic checks; check failure traps in the deterministic panic-like way documented in Bounds Behavior. Unchecked safety mode may omit those dynamic checks.
Implementation snapshots should cover representative parser and sema failures plus checked bounds lowering. Snapshot coverage is tracked in V1 Compiler Implementation Tasks, not as a precondition for this design surface to be accepted.
Length and Pointer Fields¶
Slices expose .ptr and .len as fields of the transparent descriptor.
Arrays expose no fields in V1. Array length is part of the array type [N]T; public value-level length access comes through the array's visible Sequence(T) conformance:
array.len() // Sequence(T).len(&array), when lookup is unambiguous
Arrays do not expose .ptr; use &array[i], array-to-slice coercion, or future array extension methods when pointer-oriented interop helpers are designed. General extension methods are tracked by CEP-0011: General Extension Methods.
Fields and inherent methods share one ordinary member namespace. A type cannot define both a field and inherent method with the same name. Contract implementations may need qualified syntax to avoid conflicts with ordinary fields.
Contract-Shaped Model¶
Indexing and for loops are specified in terms of semantic contracts. The detailed contract model is documented in Semantic Contracts. The canonical Indexable, MutableIndexable, Sequence, and MutableSequence definitions are documented in Indexing and Sequence Contracts. Iterator contracts are documented in Iterator Contracts.
The important direction is that indexing and iteration are statically resolved contract-shaped syntax for concrete receivers. Indexing uses Indexable/MutableIndexable; iteration uses Iterable. They are not ad hoc magic, and they do not imply dynamic dispatch unless the receiver is explicitly a dyn contract object.
For slices, the len(self) contract implementation can simply return self.len. The ordinary .len field blocks same-name contract method-call sugar, so slice.len() is rejected in V1. Use qualified contract-call syntax when the contract operation is needed explicitly:
Sequence(T).len(&slice)
items[i] chooses the operation from context:
- read contexts use
at, then load/copy the value. - mutable-place contexts use
at_mut, then mutate through the pointer. - if only readonly access is available, mutation is rejected.
- the choice is statically resolved in sema.
Array and slice support is provided by compiler-provided, reflection-visible conformances to these contracts. Sequence(T) provides Iterable(*const T) through default static and dynamic index iterators; MutableSequence(T) provides MutableIterable(*T) through default static and dynamic mutable index iterators. The compiler may still lower native arrays and slices to primitive operations.
Loops¶
for iteration uses the iterator contracts defined in Core Contracts. The loop binding has the iterator item type directly:
for item in items {
// item: *const T for arrays/slices
use(item.*)
}
for var item in items {
// item: *T for mutable arrays/slices
item.* = item.* * 0.5
}
Semantically, plain static for calls Iterable(*const T).iter; for var calls MutableIterable(*T).iter_mut. Arrays and slices obtain these conformances through Sequence(T) and MutableSequence(T). Dynamic iteration means explicitly calling fallible iter_dyn or iter_dyn_mut to create an owning Box(dyn Iterator(Item)), then iterating that boxed iterator.
for var requires a mutable receiver place. Iterating a readonly slice or const array binding with for var is rejected.
Plain for uses readonly iteration even when mutable iteration is available. Use for var to request mutable iteration.
var in for var item in items requests mutable iteration only. The loop binding itself is not reassignable.
Optimized lowering may become the equivalent direct index loop:
var loop_slice = items
var loop_len = loop_slice.len
var i: usize = 0
while i < loop_len {
const item = &loop_slice[i]
...
i = i + 1
}
Rules:
- iteration order is source order from
0tolen, exclusive. - the iterable expression is evaluated once and captured at loop entry.
- reassigning the original slice variable inside the loop does not change the iteration range.
- mutating through an iterator item of type
*Tchanges underlying storage. - no element copies happen by default.
- bounds checks inside optimized index-loop lowerings may be eliminated because
i < lenproves indexing safe.
Array iteration uses the same model through array-to-slice view/coercion or equivalent direct array metadata. Array length N is known and may be optimized aggressively.
SIR and IR¶
AST preserves source syntax:
- index expressions
- slice expressions
- field/member access
- loop syntax
Sema resolves the typed meaning:
- array/slice type
- element type
- mutability of the resulting element place/view
usizeindex requirements- compile-time impossible bounds
- contract-shaped
at/at_mut/len/iter/iter_mutresolution
IR should keep slices as explicit backend-neutral values and should retain array/slice operations instead of flattening them too early to raw C-like pointer arithmetic.
IR should be able to represent:
- slice descriptor values
- array and slice length
- array-to-slice view formation
- element address/index operations
- slice construction from pointer+length
- bounds checks when required by safety mode
- known-length metadata when available
C Lowering¶
Internal generated C may represent slices as deterministic descriptor structs:
typedef struct {
float *ptr;
size_t len;
} cata_slice_f32;
typedef struct {
const float *ptr;
size_t len;
} cata_slice_const_f32;
This is an internal generated-C representation, not the v1 C ABI export shape for slices.
The C backend may optimize internal lowering by splitting descriptors into pointer/length locals or by inlining them away. The source and IR semantics remain slice-based.
Zero Abstraction Tax¶
Contract-shaped indexing and iteration must not impose a runtime abstraction tax when statically resolved.
Requirements:
- array/slice
len,at, andat_mutlower to direct array/slice operations. - normal indexing and loops must not introduce hidden dynamic dispatch, allocation, iterator objects, or vtables.
- optimized output should match what direct built-in array/slice operations would produce.
- future user-defined contract implementations should be statically resolved and inlineable where possible.
Future Decisions¶
Deferred work related to this page is listed here, with CEP links where a focused proposal exists:
- explicit value-copy iteration syntax
- slice literals
- explicit anonymous aggregate borrowing, such as
&[1, 2, 3] - custom ranges and step semantics, tracked by CEP-0014: Custom Ranges and Step Semantics
- C ABI representation for slices, tracked by CEP-0007: C ABI Compound Types and Error Returns
- unsafe or explicitly unchecked indexing syntax, if needed, tracked by CEP-0024: Unchecked Indexing
- array extension methods for pointer-oriented helpers such as
to_c, tracked by CEP-0011: General Extension Methods