Literals¶

Accepted

Accepted for V1 literal source forms, including base-prefixed integers and primitive numeric suffixes.

This page owns the source spelling of literal forms. Literal typing, coercion, ownership, and layout rules belong to Built-In Types, Type System, and the relevant feature pages.

Literal Families¶

V1 recognizes these literal and literal-like source forms:

Family	Forms	Canonical semantics
integer	`0`, `42`, `48_000`, `0b1010`, `0o755`, `0xff`	`comptime_int` or a suffixed primitive integer type; see Built-In Types
floating-point	`0.0`, `1e9`, `6.022e23`, `1.0f32`	`comptime_float` or a suffixed primitive floating-point type; see Built-In Types
code point	`'a'`, `'\n'`, `'\u{1f4a9}'`, `'😀'`	one Unicode scalar value as `comptime_int`
string	`"hello"`, `"line\n"`, `"emoji: 😀"`	byte sequence with type `[]const u8` in V1
primitive value	`true`, `false`, `void`, `null`, `undefined`	see Primitive Values
array	`[1, 2, 3]`, `[]`	fixed-size array value
aggregate construction	`Header{ .channels = 2 }`, `.{ .channels = 2 }`	struct or contextual aggregate construction

Literals do not include leading signs: -42 is unary negation applied to the integer literal 42.

Numeric Literal Digits and Separators¶

Numeric literals use digit sequences. A digit sequence may contain _ separators between digits:

0
42
48_000
16_777_217
0b1111_0000
0xff_ff

Rules:

_ is ignored for value construction.
_ must appear between two digits valid for the literal's base.
leading, trailing, and repeated separators are invalid.
separators do not change whether a token is an integer or floating-point literal.

Invalid:

_42
42_
4__2
0x_ff
0b10_
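The separator rules above can be sketched as a small checker. This is an illustrative Python model, not the Catalyst lexer; the function name `valid_digit_sequence` and its `digits` parameter are hypothetical. It checks only the digit sequence that follows any base prefix.

```python
def valid_digit_sequence(text: str, digits: str = "0123456789") -> bool:
    """Return True if text is a digit sequence with '_' only between two digits.

    `digits` is the set of digits valid for the literal's base
    (hypothetical parameter; decimal by default).
    """
    if not text:
        return False
    prev_was_digit = False
    for ch in text:
        if ch in digits:
            prev_was_digit = True
        elif ch == "_":
            if not prev_was_digit:
                return False  # leading or repeated separator
            prev_was_digit = False
        else:
            return False  # character is not valid for this base
    return prev_was_digit  # False when the sequence ends with a separator


HEX_DIGITS = "0123456789abcdefABCDEF"
print(valid_digit_sequence("48_000"))            # a valid decimal sequence
print(valid_digit_sequence("_ff", HEX_DIGITS))   # the part of 0x_ff after the prefix
```

Tracking whether the previous character was a digit covers all three invalid cases (leading, trailing, repeated) in a single pass.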

Integer Literals¶

Integer literals may be decimal or use a base prefix:

Base	Prefixes	Digits	Example
decimal	none	`0` through `9`	`48_000`
binary	`0b`, `0B`	`0`, `1`	`0b1111_0000`
octal	`0o`, `0O`	`0` through `7`	`0o755`
hexadecimal	`0x`, `0X`	`0` through `9`, `a` through `f`, `A` through `F`	`0xdead_beef`
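As a rough illustration of the table, the following Python sketch computes the value of an unsuffixed integer literal. The name `parse_integer_literal` is hypothetical, and the sketch does not model suffixes or typing. This page does not say whether decimal literals may carry leading zeros; the sketch accepts them as an assumption.

```python
PREFIXES = {"0b": 2, "0B": 2, "0o": 8, "0O": 8, "0x": 16, "0X": 16}


def parse_integer_literal(token: str) -> int:
    """Return the value of an unsuffixed integer literal, or raise ValueError."""
    base, body = 10, token
    if token[:2] in PREFIXES:
        base, body = PREFIXES[token[:2]], token[2:]
    # Splitting on '_' yields an empty group for a leading, trailing, or
    # repeated separator, and for a prefix with no digits after it.
    groups = body.split("_")
    if any(group == "" for group in groups):
        raise ValueError(f"malformed digit sequence in {token!r}")
    digits = "".join(groups)
    allowed = "0123456789abcdef"[:base]
    if any(ch.lower() not in allowed for ch in digits):
        raise ValueError(f"digit not valid for base {base} in {token!r}")
    return int(digits, base)


for literal in ["48_000", "0b1111_0000", "0o755", "0xdead_beef"]:
    print(literal, "=", parse_integer_literal(literal))
```

Separators are dropped before conversion, which matches the rule that `_` is ignored for value construction.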

Integer literals have type comptime_int until an expected type or explicit annotation resolves them to a concrete integer type. V1 has no default integer literal type for runtime storage; a stored runtime value must resolve from context or an explicit annotation.

Examples:

const a: i32 = 42
const b = 42          // comptime_int constant
var d: u32 = 48_000
var e = 15u8

Runtime storage without a concrete type is invalid:

var c = 42            // error: runtime storage needs a concrete type

Base-prefixed integer literals are still integer literals. They may use primitive integer suffixes and must pass the same representability checks:

const mask = 0xffu8

Floating-Point Literals¶

Floating-point literals are decimal digit sequences with a fractional part, an exponent, or both:

0.0
1.5
1e9
6.022e23
1.0e-9

Shape:

digits "." digits exponent?
digits exponent

exponent = ("e" | "E") ("+" | "-")? digits

The fractional form requires digits on both sides of the decimal point. This keeps range syntax unambiguous:

1.0     // floating-point literal
1..10   // integer literal, range operator, integer literal
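The shape above can be modeled with a regular expression. This is a Python sketch under stated assumptions: the names `is_float_literal` and `first_number_token` are hypothetical, exponent digits are assumed to allow `_` separators like any other digit sequence, and suffixes are not modeled.

```python
import re

# A decimal digit sequence with optional single '_' separators between digits.
DIGITS = r"[0-9]+(?:_[0-9]+)*"
EXPONENT = rf"[eE][+-]?{DIGITS}"
# digits "." digits exponent?   |   digits exponent
FLOAT = re.compile(rf"(?:{DIGITS}\.{DIGITS}(?:{EXPONENT})?|{DIGITS}{EXPONENT})")
INTEGER = re.compile(DIGITS)


def is_float_literal(token: str) -> bool:
    """Return True if the whole token has the floating-point literal shape."""
    return FLOAT.fullmatch(token) is not None


def first_number_token(source: str):
    """Return the numeric literal at the start of source, preferring the float shape."""
    match = FLOAT.match(source) or INTEGER.match(source)
    return match.group(0) if match else None


print(first_number_token("1.0"))    # the float shape matches the whole text
print(first_number_token("1..10"))  # only the integer before the range operator matches
```

Because the fractional form needs a digit after the point, `1..10` cannot begin with a floating-point literal, so the leading token is the integer `1`.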

Floating-point literals have type comptime_float until an expected type or explicit annotation resolves them to a concrete floating-point type:

const gain: f32 = 0.5
const scale = 1.0       // comptime_float constant

Runtime storage without a concrete type is invalid:

var amount = 1.0        // error: runtime storage needs a concrete type

Hexadecimal floating-point literals are deferred.

Numeric Literal Suffixes¶

Postfix numeric type suffixes are valid in V1. A suffix is a primitive numeric type name immediately following a numeric literal token:

43u8
1.0f32
48_000usize
0xffu8

The suffix gives the literal an explicit destination type. It is shorthand for writing the type in context:

const y: u8 = 43
const z = 43u8

Valid integer suffixes are i1 through i128, u1 through u128, isize, and usize. Valid floating-point suffixes are f32 and f64.

Integer suffixes are valid on integer literals when the value is exactly representable in the suffixed integer type. Floating-point suffixes are valid on integer or floating-point literals when the finite value is in range; they may round to the destination floating-point format at compile time:

const a = 42u8
const b = 42f32
const c = 1.5f32
const d = 0.1f32
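To see what rounding to the destination format means for the examples above, the following Python sketch rounds a value to the nearest `f32`. The helper `round_to_f32` is hypothetical. It goes through Python's 64-bit float first, so it only approximates rounding the exact decimal value at compile time.

```python
import struct


def round_to_f32(value: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value.

    struct.pack raises OverflowError when the value is out of range for
    binary32, which loosely mirrors the "finite value is in range" check.
    """
    return struct.unpack("<f", struct.pack("<f", value))[0]


print(round_to_f32(1.5) == 1.5)   # 1.5 is exactly representable
print(round_to_f32(0.1) == 0.1)   # 0.1 is not, so the stored value differs
```

Values such as 42 and 1.5 survive unchanged, while 0.1 is replaced by the nearest representable `f32` value.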

Suffixes do not bypass representability checks:

256u8  // error: u8 cannot represent 256
42.5u8 // error: u8 cannot represent 42.5
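The integer representability check can be sketched as follows. This Python model is illustrative only: `integer_range` and `fits_integer_suffix` are hypothetical names, two's-complement ranges are assumed for signed types, and `isize`/`usize` are assumed to be 64 bits wide, which this page does not specify.

```python
from fractions import Fraction


def integer_range(suffix: str, pointer_bits: int = 64):
    """Return (min, max) for a suffix such as 'u8', 'i32', or 'usize'.

    pointer_bits is an assumption made for this sketch only.
    """
    signed = suffix[0] == "i"
    bits = pointer_bits if suffix[1:] == "size" else int(suffix[1:])
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def fits_integer_suffix(value: Fraction, suffix: str) -> bool:
    """Return True if the exact literal value is representable in the suffixed type."""
    if value.denominator != 1:
        return False  # a fractional value such as 42.5 is never an exact integer
    low, high = integer_range(suffix)
    return low <= value <= high


print(fits_integer_suffix(Fraction(42), "u8"))
print(fits_integer_suffix(Fraction(256), "u8"))
print(fits_integer_suffix(Fraction("42.5"), "u8"))
```

Using an exact rational for the literal value keeps the check free of floating-point rounding.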

Suffixes apply to the literal token only. They are not part of a leading sign:

-1u8 // error: unary negation applied to 1u8; u8 has no V1 unary negation

Annotations and expected-type context remain valid:

const x: u8 = 43
const y: f32 = 1.0

Code Point Literals¶

Code point literals use single quotes and contain exactly one Unicode scalar value after escape decoding:

'a'
'\n'
'\''
'\u{1f4a9}'
'😀'

A code point literal is not a byte string and does not denote a UTF-8 byte sequence. It has type comptime_int until context resolves it to a representable integer type.

Some displayed characters are not one Unicode scalar value. A single-scalar emoji such as '😀' is valid. A compound emoji or grapheme cluster is invalid in a code point literal because it contains multiple scalar values:

'👨‍👩‍👧‍👦' // error: more than one scalar value

Escapes are allowed. The matching delimiter must be escaped when writing a single quote code point:

'\''

Invalid:

''      // no scalar value
'ab'    // more than one scalar value
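The "exactly one Unicode scalar value" rule can be illustrated in Python, where `len` of a string counts code points rather than displayed characters. The helper `is_single_scalar` is hypothetical and takes the literal's contents after escape decoding.

```python
def is_single_scalar(decoded: str) -> bool:
    """Return True if decoded holds exactly one Unicode scalar value."""
    if len(decoded) != 1:
        return False
    # Surrogate code points are not Unicode scalar values.
    return not (0xD800 <= ord(decoded) <= 0xDFFF)


grinning_face = "\U0001F600"
# Four person emoji joined by three zero-width joiners (U+200D).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"

print(is_single_scalar(grinning_face), len(grinning_face))
print(is_single_scalar(family), len(family))
```

The family emoji renders as one glyph on many systems but is a sequence of several scalar values, which is why it cannot appear in a code point literal.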

String Literals¶

String literals use double quotes and contain zero or more bytes after escape decoding:

""
"hello"
"line\n"
"quote: \""
"h\x65llo"
"pile: \u{1f4a9}"
"emoji: 😀"

String literals may use the escape forms defined in Escapes. An unescaped newline is not valid inside a V1 string literal.

Direct non-ASCII source text contributes its UTF-8 bytes to the resulting literal. The compiler does not normalize or reinterpret those source bytes.

Escapes may also contribute bytes. \u{NNNNNN} contributes the UTF-8 encoding of one Unicode scalar value. \xNN contributes one byte and can produce byte sequences that are not valid UTF-8.
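The difference between the two byte sources can be seen in a short Python sketch. The helper `is_valid_utf8` is hypothetical, and the byte string `b"\xff"` stands in for what a `\xff` escape would contribute.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


direct_text = "emoji: \U0001F600".encode("utf-8")  # direct source text as UTF-8 bytes
scalar_escape = "\U0001F4A9".encode("utf-8")       # what \u{1f4a9} contributes
byte_escape = b"\xff"                              # what \xff contributes

print(direct_text, is_valid_utf8(direct_text))
print(scalar_escape, is_valid_utf8(scalar_escape))
print(byte_escape, is_valid_utf8(byte_escape))
```

Direct text and `\u{...}` escapes always yield well-formed UTF-8, while a lone `\xNN` byte may not, so a string literal is a byte sequence rather than guaranteed text.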

String literals have type []const u8 in V1. Catalyst has no primitive string, str, or String type. Owned and growable text belongs in std.text.

These string forms are deferred:

raw strings
multiline strings
byte-string-specific syntax
automatic adjacent string literal concatenation

Escapes¶

Escapes are recognized in code point and string literals. An escape starts with \ and must be one of the forms in this section. Unknown escapes are errors.

All escape forms are valid in both code point and string literals:

Escape	Meaning
`\0`	null character
`\n`	line feed
`\r`	carriage return
`\t`	tab
`\\`	backslash
`\'`	single quote
`\"`	double quote
`\xNN`	hexadecimal 8-bit byte value; exactly two hex digits
`\u{NNNNNN}`	hexadecimal Unicode scalar value; one or more hex digits

\u{NNNNNN} must denote a valid Unicode scalar value. The maximum valid scalar value is 0x10ffff.

Delimiter escapes are allowed in both code point and string literals for consistency, though only the matching delimiter needs escaping:

'\''       // single quote code point
"\x27"     // string containing one single quote byte
"\""       // string containing one double quote
"it's ok"  // single quote does not need escaping in a string

Invalid:

"\q"        // unknown escape
"\x6"       // byte escape requires exactly two hex digits
"\u{}"      // Unicode escape requires at least one hex digit
"\u{110000}" // Unicode scalar value out of range
"\"         // escape reaches end of literal

Source files may also contain direct Unicode scalar values because Catalyst source files are UTF-8.
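The escape table and the invalid cases above can be sketched as a decoder for the body of a string literal (the text between the double quotes). This is an illustrative Python model, not the Catalyst implementation; `decode_string_body` and `rejects` are hypothetical names, and surrogate code points are rejected on the assumption that they are not Unicode scalar values.

```python
SIMPLE = {"0": 0x00, "n": 0x0A, "r": 0x0D, "t": 0x09,
          "\\": 0x5C, "'": 0x27, '"': 0x22}
HEX = "0123456789abcdefABCDEF"


def decode_string_body(body: str) -> bytes:
    """Decode escapes in a string literal body and return the resulting bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("utf-8")  # direct source text contributes UTF-8 bytes
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("escape reaches end of literal")
        kind = body[i + 1]
        if kind in SIMPLE:
            out.append(SIMPLE[kind])
            i += 2
        elif kind == "x":
            digits = body[i + 2:i + 4]
            if len(digits) != 2 or any(d not in HEX for d in digits):
                raise ValueError("byte escape requires exactly two hex digits")
            out.append(int(digits, 16))  # one raw byte, possibly not valid UTF-8
            i += 4
        elif kind == "u":
            end = body.find("}", i)
            if body[i + 2:i + 3] != "{" or end == -1:
                raise ValueError("malformed Unicode escape")
            digits = body[i + 3:end]
            if not digits or any(d not in HEX for d in digits):
                raise ValueError("Unicode escape requires at least one hex digit")
            value = int(digits, 16)
            if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
                raise ValueError("not a Unicode scalar value")
            out += chr(value).encode("utf-8")
            i = end + 1
        else:
            raise ValueError(f"unknown escape \\{kind}")
    return bytes(out)


def rejects(body: str) -> bool:
    """Return True if decoding the body fails."""
    try:
        decode_string_body(body)
    except ValueError:
        return True
    return False


print(decode_string_body(r"h\x65llo"))
print(rejects(r"\q"), rejects(r"\x6"), rejects(r"\u{110000}"))
```

Raw Python strings (`r"..."`) are used so that the backslashes reach the decoder unchanged.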

Primitive Value Forms¶

These primitive value forms are literal-like tokens:

Form	Meaning	Context requirement
`true`	the true value of `bool`	none
`false`	the false value of `bool`	none
`void`	the sole value of `void`	may infer `void`
`null`	optional absence value	requires optional context
`undefined`	unspecified storage value	requires a concrete storage type

Examples:

var ok: bool = true
var none: ?i32 = null
var scratch: [128]u8 = undefined

Context-free null and undefined are invalid:

var missing = null        // error
var unknown = undefined   // error

See Built-In Types for the canonical typing and legality rules.

Array Literals¶

Array literals use bracket syntax:

[1, 2, 3]
[]

Rules:

elements are separated by commas.
a trailing comma is allowed.
elements evaluate in source order.
the literal length is the number of elements.
[] requires an expected array type.

Array literals produce fixed-size array values, not slices:

const xs: [3]i32 = [1, 2, 3]
const empty: [0]i32 = []

Full element inference, ownership behavior, array-to-slice coercion, and deferred slice literal rules are owned by Arrays, Slices, Ranges, and Indexing.

Aggregate Construction¶

Named aggregate construction writes the owner type followed by field initializers:

Header{ .sample_rate = 48_000, .channels = 2 }

Contextual aggregate construction may omit the owner type when an expected type is available:

const header: Header = .{ .sample_rate = 48_000, .channels = 2 }

Rules:

fields are separated by commas.
a trailing comma is allowed.
each entry uses .field_name = expression.
field initializers evaluate in source order.
field names are not local bindings.

The shorthand form .{ ... } is resolved according to the rules in Expected-Type Shorthand. Struct construction semantics, required fields, defaults, duplicate fields, and unknown-field diagnostics are owned by Structs and Methods.

Leading-Dot Forms¶

Leading-dot forms such as .c are not literals in the scalar-literal sense. They are shorthand expressions completed from an expected owner type:

@export(.c)
fn scale(x: f32) f32 {
  return x
}

const c_callback: *const @callconv(.c) fn(i32) i32 = callback
const opts: ExportOptions = .{ .call_conv = .c }

See Expected-Type Shorthand for leading-dot members and contextual aggregate literals.

Deferred Literal Forms¶

These literal spellings are deferred from the current V1 source-form baseline:

hexadecimal floating-point literals
raw strings and multiline strings
byte-string-specific syntax
slice literals such as &[1, 2, 3]
user-defined or custom literal forms

If a deferred form is promoted, document its source spelling here and link to the semantic page that owns its typing, allocation, ownership, or lowering rules.

Related Details¶

Built-In Types: primitive types, literal typing, primitive value forms, and string literal type.
Type System: numeric coercion and representability rules.
Arrays, Slices, Ranges, and Indexing: array literals, slice boundaries, and range expressions.
Structs and Methods: struct construction and field defaults.
Expected-Type Shorthand: leading-dot members and contextual aggregate literals.
std.text: owned text and text helper APIs above the V1 string literal baseline.
