Arrays, Slices, Ranges, and Indexing¶

Accepted

Accepted for V1 array, slice, range, indexing, slicing, diagnostics, contract dependency, and reflection exposure design; implementation verification is tracked in V1 Compiler Implementation Tasks. Depends on Semantic Contracts and Core Contracts.

Arrays and slices are the core contiguous data model for buffer kernels. The source model should be explicit enough for predictable performance while still leaving room for contract-shaped indexing and iteration.

Array and Slice Types¶

Fixed-size arrays use [N]T:

var frame: [128]f32 = undefined

Rules:

[N]T is an inline fixed-size array value with N elements.
N must be a compile-time-known value exactly representable as usize.
integer literals and comptime integer constants may coerce to usize in array length position.
zero-length arrays are valid.
assigning an array follows ordinary value semantics and copies elements when the element type is copyable.

Slices use []T and []const T:

var writable: []f32 = undefined
var readonly: []const f32 = undefined

Rules:

[]T is a borrowed mutable view.
[]const T is a borrowed readonly view.
assigning a slice copies only the slice descriptor.
slices do not own storage.
slices do not carry capacity.
[]T may coerce to []const T; []const T must not coerce to []T.

Transparent Slice Descriptor¶

Slices are transparent struct-like descriptors:

[]T == struct-like {
  ptr: *T
  len: usize
}

[]const T == struct-like {
  ptr: *const T
  len: usize
}

The fields are ordinary source-visible fields:

items.ptr
items.len

Because slices are transparent descriptors, manually changing or constructing them is allowed once aggregate construction exists. Suspicious descriptor manipulation should be linted, not hard-banned:

assigning to slice.ptr
assigning to slice.len
constructing a slice descriptor manually
creating pointer/length pairs from unrelated sources
increasing .len beyond a known backing array

This follows Catalyst's trust-the-programmer direction while making low-level choices visible to tooling.

Array Literals¶

Array literals use [expr, ...] and produce fixed-size array values:

var xs: [3]f32 = [1.0, 2.0, 3.0]
var ys: [3]f32 = [1.0, 2.0, 3.0]

Rules:

literal length is the number of elements.
element expressions evaluate in source order.
element type is inferred from the elements and/or expected type.
heterogeneous elements require a common inferred type or expected type.
empty array literal [] requires an expected array type.
array literals do not directly produce slices.
array literals may construct arrays containing non-copyable values when the literal initializes a new owning array value, such as [move a, move b].

Implicit array-literal-to-slice coercion is not part of V1. Array literals are rvalues, not addressable array places, so they do not provide borrowed slice storage:

process([1.0, 2.0, 3.0])       // error when process expects []const f32
var view: []const f32 = [1.0, 2.0, 3.0][..] // deferred

Use named array storage when a slice view is needed:

const gains: [3]f32 = [1.0, 2.0, 3.0]
process(gains)

Slice literals and explicit anonymous aggregate borrowing syntax such as &[1, 2, 3] are deferred. A slice is a borrowed view, so these features need a general backing-storage and temporary-lifetime design rather than a call-specific exception.

Array-to-Slice Coercion¶

Addressable array places may coerce to slice views when a slice is expected:

fn process(samples: []f32) void

var frame: [128]f32 = make_frame()
process(frame) // allowed: []f32 view

Rules:

a mutable array place may coerce to []T.
a mutable or readonly array place may coerce to []const T.
the coercion copies no elements and allocates nothing.
the slice length is N.
the slice borrow must not outlive the array storage; obvious local escapes may be diagnosed.
slices do not implicitly coerce back to arrays.

Array literals and other array rvalues are not included in this coercion. If a pointer-to-array is needed, take the address of a named array place:

const table = [1, 2, 3]
take_array(&table)

Indexing¶

Indexing uses items[i]:

var x = items[i]
items[i] = value
var p: *f32 = &items[i]

Semantic rules:

index expressions require usize.
integer literals and integer values may coerce to usize through the general integer coercion rules in Type System.
signed integer values do not coerce to usize unless the source type's full value range fits in usize; signed indexes usually require an explicit checked conversion.
negative indexing is not supported.
indexing produces element access, not a pointer by itself.
pointer-to-element access uses &items[i].
mutable-place contexts require mutable element access.
value-reading an element copies the element and requires the element type to be copyable.
move items[i] is rejected in V1; moving out of array or slice elements needs a later partial-move design.

Assigning to an element is ordinary place assignment for copyable, non-resource values. Replacing a live resource-owning element through raw element assignment is not a general V1 operation; use type- or container-specific APIs that define cleanup and replacement semantics.

For slices, indexing is equivalent to pointer indexing through the transparent descriptor plus the bounds policy:

items[i] ~= items.ptr[i]

Bounds Behavior¶

Indexing and slicing use the same safety-mode policy:

compile-time-known out-of-bounds indexing is a compile error.
compile-time-known invalid slicing ranges are compile errors.
when bounds are provable, no runtime check is needed.
Checked safety mode emits dynamic checks for remaining indexing/slicing operations.
Unchecked safety mode may omit dynamic checks.
checked safety failures trap in a deterministic panic-like way outside ordinary error returns.

For slicing:

items[start..end]

Checked safety mode checks:

start <= end
end <= items.len
for inclusive-end slicing, start <= end and end < items.len; the exclusive bound is then end + 1

Slice Expressions¶

Slice ranges are half-open: start is inclusive and end is exclusive. Slice syntax uses the same range-expression surface as first-class ranges, but in indexing position it is consumed by slicing rather than producing a standalone range value.

items[start..end]
items[start..=end]
items[range]
items[start..]
items[..end]
items[..]

Rules:

omitted start defaults to 0.
omitted end defaults to items.len.
start..end is half-open: it includes start and excludes end.
start..=end is inclusive-end: it includes both start and end.
items[..] creates a whole-view slice.
start == end creates an empty slice.
start > end is invalid.
for inclusive-end slicing, end == usize.max is invalid because it cannot name a valid element of any finite slice and cannot be converted to an exclusive bound.
slicing an array or slice always has slice type through sema.
omitted-bound forms ..end, start.., and .. are valid only in slicing contexts in V1.
finite first-class range values such as Range(usize) and RangeInclusive(usize) are valid slice selectors: items[range].
inline slice selector bounds have expected type usize, so ordinary integer coercions apply to each bound. items[0..len] works naturally when len: usize, and items[start_u8..end_u8] works when u8 safely coerces to usize.
stored range values must have endpoint type usize in V1. A stored Range(u8) or Range(i32) is not accepted as a slice selector without explicit conversion.

Sema types all slice expressions as []T or []const T.

If bounds are compile-time-known, IR may preserve known-length metadata and later optimization may lower more efficiently. The source-level type does not implicitly become [N]T or *[N]T.

Range Expressions¶

First-class Range values and range iteration are V1 design surface.

Range Syntax¶

start..end is the canonical half-open range expression and produces Range(T).
start..=end is the inclusive-end range expression and produces RangeInclusive(T).
range operators bind looser than arithmetic, so a + 1..b - 1 means (a + 1)..(b - 1).
range operators are non-associative, so a..b..c is rejected.
comparisons should not chain through ranges; unclear expressions should be rejected until parenthesized.

Range syntax is intended to be reused by future pattern matching, but V1 defines it only for expressions and slicing.

Range Values¶

Range expressions outside slicing contexts produce finite range values that implement Iterable(Item) and yield values, not element pointers.

V1 standalone ranges are integer ranges:

endpoint types must be integer types, integer literals, or comptime integer constants that can infer or coerce to the selected integer endpoint type.
both endpoints must resolve to the same integer type after literal inference and allowed coercions.
the resulting range has one endpoint type and implements Iterable(T) for that same integer type T.
range values are ordinary copyable prelude value types, not resources.
range values are not compiler magic beyond the literal syntax and primitive lowering hooks needed for efficient iteration and slicing.

Ranges do not implement Sequence(T) in V1 because Sequence(T) is location/indexing based and inherits pointer-yielding iteration.

They may still provide ordinary inherent helpers such as len() and is_empty(). Contract membership is semantic; having a method named len does not make a type a Sequence.

Endpoint Inference¶

Range endpoint inference follows ordinary expression context:

a typed endpoint can choose the range type, such as 0..len becoming Range(usize) when len: usize.
an expected iterable item type can choose the endpoint type, such as for i: usize in 0..10.
an expected range type can choose the endpoint type, such as const r: Range(usize) = 0..10.
a slicing context gives inline slice selector bounds expected type usize, as documented above.
if both endpoint expressions are unresolved comptime_int values and no endpoint, item, range, or slicing context supplies an endpoint type, the range falls back to the smallest primitive integer type that can represent both stored endpoint values.
if both fallback endpoint values are non-negative, fallback chooses the smallest uN; if either endpoint is negative, fallback chooses the smallest iN.
if the fallback endpoint values require a width outside the V1 primitive integer range, the expression is rejected.

Fallback inference uses stored endpoint values, not only yielded values. For 0..256, the selected endpoint type must represent both 0 and 256, so the fallback type is u9. For 0..=255, the fallback type is u8.

Examples:

const a = 0..10   // Range(u4)
const b = -5..5   // Range(i4)
const c = 0..256  // Range(u9)
const d = 0..=255 // RangeInclusive(u8)

Fallback range inference is lintable as ranges/inferred-narrow-endpoints because narrow inferred endpoint types can surprise readers. The lint should suggest spelling the intended range type or loop item type explicitly:

const r: Range(usize) = 0..10

for i: usize in 0..10 {
}

Concrete typed endpoints keep their source types. If endpoint types differ, range inference uses ordinary integer widening to an existing endpoint type only when one endpoint type can represent every value of the other. It does not synthesize a third type for concrete endpoints unless an explicit expected range type requests that type and both endpoints can coerce to it.

const end: u32 = 10
const r = 0..end // Range(u32)

When neither endpoint type represents the other, the unannotated range is rejected:

const start: u32 = 0
const stop: i32 = 10
const bad = start..stop // error: neither endpoint type represents the other

An explicit expected range type may still request a third type through ordinary coercion:

const start: u32 = 0
const stop: i32 = 10
const ok: Range(i64) = start..stop // ok

Iteration¶

Range iteration is successor-based and steps by +1 in the endpoint type.

range iterators are plain copyable iterator values unless a future custom range family documents otherwise.
if start > end, a standalone range is empty; it does not infer a negative step.
inclusive-end ranges must not overflow after yielding their final item; for example, u8.max..=u8.max yields one item and then stops.

Deferred Range Forms¶

Deferred:

standalone omitted-bound ranges such as ..end, start.., and ..
floating-point ranges
arbitrary ordered ranges
user-defined range protocols

Supporting custom ranges is a V2 candidate feature; it needs more than Ord(T), including successor or step semantics and overflow/end behavior. Code that needs non-integer stepping in V1 should use an explicit iterator type.

Diagnostics¶

Array, slice, indexing, slicing, and range diagnostics are phase-owned and deterministic.

Parser diagnostics cover malformed source forms:

malformed array type syntax such as a missing length, missing closing bracket, or missing element type in [N]T.
malformed slice type syntax such as [] in type position without an element type.
malformed array literals such as missing commas or missing closing brackets.
malformed index or slice brackets.
malformed range operators, including a..b..c and a..=.

Sema diagnostics cover invalid typed meanings:

array lengths that are not compile-time-known or not exactly representable as usize.
[] array literals without an expected array type.
heterogeneous array literal elements without a common inferred or expected element type.
implicit array-literal-to-slice coercion and slicing of array rvalues.
array-to-slice coercions from non-addressable array values.
indexes that cannot resolve to usize through allowed integer coercions.
negative indexing and signed indexes whose full value range does not fit in usize.
indexing or slicing receivers that do not provide the required Indexable, MutableIndexable, Sequence, or native array/slice operation.
mutable indexing through readonly receivers or receivers without mutable element access.
value-reading an element whose type is not copyable.
move items[i] and raw replacement of live resource-owning elements through ordinary element assignment.
slice bounds that cannot resolve to usize in inline slicing contexts.
stored slice selector ranges whose endpoint type is not usize.
standalone omitted-bound ranges such as ..end, start.., and .. outside slicing contexts.
compile-time-known out-of-bounds indexes, invalid slice ranges, and inclusive-end slice ranges whose end cannot be converted to an exclusive bound.
standalone range expressions whose endpoints are not integer endpoints, cannot infer one shared integer endpoint type, or require a deferred range domain.
range expressions that depend on unsupported floating-point, arbitrary ordered, or user-defined range semantics.

Checked runtime bounds failures are not ordinary compile diagnostics and do not use error-return types. When an indexing or slicing operation cannot be proven in bounds, Checked safety mode emits dynamic checks; check failure traps in the deterministic panic-like way documented in Bounds Behavior. Unchecked safety mode may omit those dynamic checks.

Implementation snapshots should cover representative parser and sema failures plus checked bounds lowering. Snapshot coverage is tracked in V1 Compiler Implementation Tasks, not as a precondition for this design surface to be accepted.

Length and Pointer Fields¶

Slices expose .ptr and .len as fields of the transparent descriptor.

Arrays expose no fields in V1. Array length is part of the array type [N]T; public value-level length access comes through the array's visible Sequence(T) conformance:

array.len() // Sequence(T).len(&array), when lookup is unambiguous

Arrays do not expose .ptr; use &array[i], array-to-slice coercion, or future array extension methods when pointer-oriented interop helpers are designed. General extension methods are tracked by CEP-0011: General Extension Methods.

Fields and inherent methods share one ordinary member namespace. A type cannot define both a field and inherent method with the same name. Contract implementations may need qualified syntax to avoid conflicts with ordinary fields.

Contract-Shaped Model¶

Indexing and for loops are specified in terms of semantic contracts. The detailed contract model is documented in Semantic Contracts. The canonical Indexable, MutableIndexable, Sequence, and MutableSequence definitions are documented in Indexing and Sequence Contracts. Iterator contracts are documented in Iterator Contracts.

The important direction is that indexing and iteration are statically resolved contract-shaped syntax for concrete receivers. Indexing uses Indexable/MutableIndexable; iteration uses Iterable. They are not ad hoc magic, and they do not imply dynamic dispatch unless the receiver is explicitly a dyn contract object.

For slices, the len(self) contract implementation can simply return self.len. The ordinary .len field blocks same-name contract method-call sugar, so slice.len() is rejected in V1. Use qualified contract-call syntax when the contract operation is needed explicitly:

Sequence(T).len(&slice)

items[i] chooses the operation from context:

read contexts use at, then load/copy the value.
mutable-place contexts use at_mut, then mutate through the pointer.
if only readonly access is available, mutation is rejected.
the choice is statically resolved in sema.

Array and slice support is provided by compiler-provided, reflection-visible conformances to these contracts. Sequence(T) provides Iterable(*const T) through default static and dynamic index iterators; MutableSequence(T) provides MutableIterable(*T) through default static and dynamic mutable index iterators. The compiler may still lower native arrays and slices to primitive operations.

Loops¶

for iteration uses the iterator contracts defined in Core Contracts. The loop binding has the iterator item type directly:

for item in items {
  // item: *const T for arrays/slices
  use(item.*)
}

for var item in items {
  // item: *T for mutable arrays/slices
  item.* = item.* * 0.5
}

Semantically, plain static for calls Iterable(*const T).iter; for var calls MutableIterable(*T).iter_mut. Arrays and slices obtain these conformances through Sequence(T) and MutableSequence(T). Dynamic iteration means explicitly calling fallible iter_dyn or iter_dyn_mut to create an owning Box(dyn Iterator(Item)), then iterating that boxed iterator.

for var requires a mutable receiver place. Iterating a readonly slice or const array binding with for var is rejected.

Plain for uses readonly iteration even when mutable iteration is available. Use for var to request mutable iteration.

var in for var item in items requests mutable iteration only. The loop binding itself is not reassignable.

Optimized lowering may become the equivalent direct index loop:

var loop_slice = items
var loop_len = loop_slice.len
var i: usize = 0
while i < loop_len {
  const item = &loop_slice[i]
  ...
  i = i + 1
}

Rules:

iteration order is source order from 0 to len, exclusive.
the iterable expression is evaluated once and captured at loop entry.
reassigning the original slice variable inside the loop does not change the iteration range.
mutating through an iterator item of type *T changes underlying storage.
no element copies happen by default.
bounds checks inside optimized index-loop lowerings may be eliminated because i < len proves indexing safe.

Array iteration uses the same model through array-to-slice view/coercion or equivalent direct array metadata. Array length N is known and may be optimized aggressively.

SIR and IR¶

AST preserves source syntax:

index expressions
slice expressions
field/member access
loop syntax

Sema resolves the typed meaning:

array/slice type
element type
mutability of the resulting element place/view
usize index requirements
compile-time impossible bounds
contract-shaped at / at_mut / len / iter / iter_mut resolution

IR should keep slices as explicit backend-neutral values and should retain array/slice operations instead of flattening them too early to raw C-like pointer arithmetic.

IR should be able to represent:

slice descriptor values
array and slice length
array-to-slice view formation
element address/index operations
slice construction from pointer+length
bounds checks when required by safety mode
known-length metadata when available

C Lowering¶

Internal generated C may represent slices as deterministic descriptor structs:

typedef struct {
    float *ptr;
    size_t len;
} cata_slice_f32;

typedef struct {
    const float *ptr;
    size_t len;
} cata_slice_const_f32;

This is an internal generated-C representation, not the v1 C ABI export shape for slices.

The C backend may optimize internal lowering by splitting descriptors into pointer/length locals or by inlining them away. The source and IR semantics remain slice-based.

Zero Abstraction Tax¶

Contract-shaped indexing and iteration must not impose a runtime abstraction tax when statically resolved.

Requirements:

array/slice len, at, and at_mut lower to direct array/slice operations.
normal indexing and loops must not introduce hidden dynamic dispatch, allocation, iterator objects, or vtables.
optimized output should match what direct built-in array/slice operations would produce.
future user-defined contract implementations should be statically resolved and inlineable where possible.

Future Decisions¶

Deferred work related to this page is listed here, with CEP links where a focused proposal exists:

explicit value-copy iteration syntax
slice literals
explicit anonymous aggregate borrowing, such as &[1, 2, 3]
custom ranges and step semantics, tracked by CEP-0014: Custom Ranges and Step Semantics
C ABI representation for slices, tracked by CEP-0007: C ABI Compound Types and Error Returns
unsafe or explicitly unchecked indexing syntax, if needed, tracked by CEP-0024: Unchecked Indexing
array extension methods for pointer-oriented helpers such as to_c, tracked by CEP-0011: General Extension Methods