Literals
Accepted
Accepted for V1 literal source forms, including base-prefixed integers and primitive numeric suffixes.
This page owns the source spelling of literal forms. Literal typing, coercion, ownership, and layout rules belong to Built-In Types, Type System, and the relevant feature pages.
Literal Families
V1 recognizes these literal and literal-like source forms:
| Family | Forms | Canonical semantics |
|---|---|---|
| integer | `0`, `42`, `48_000`, `0b1010`, `0o755`, `0xff` | comptime_int or a suffixed primitive integer type; see Built-In Types |
| floating-point | `0.0`, `1e9`, `6.022e23`, `1.0f32` | comptime_float or a suffixed primitive floating-point type; see Built-In Types |
| code point | `'a'`, `'\n'`, `'\u{1f4a9}'`, `'😀'` | one Unicode scalar value as comptime_int |
| string | `"hello"`, `"line\n"`, `"emoji: 😀"` | byte sequence with type []const u8 in V1 |
| primitive value | `true`, `false`, `void`, `null`, `undefined` | see Primitive Values |
| array | `[1, 2, 3]`, `[]` | fixed-size array value |
| aggregate construction | `Header{ .channels = 2 }`, `.{ .channels = 2 }` | struct or contextual aggregate construction |
Literals do not include leading signs. -42 is unary negation applied to the integer literal 42.
Numeric Literal Digits and Separators
Numeric literals use digit sequences. A digit sequence may contain _ separators between digits:
0
42
48_000
16_777_217
0b1111_0000
0xff_ff
Rules:
- `_` is ignored for value construction.
- `_` must appear between two digits valid for the literal's base.
- leading, trailing, and repeated separators are invalid.
- separators do not change whether a token is an integer or floating-point literal.
Invalid:
_42
42_
4__2
0x_ff
0b10_
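The separator rules can be sketched as a small checker. This is an illustrative Python model, not Catalyst compiler code; the names `valid_digit_sequence` and `digit_value` are hypothetical, and the checker is assumed to receive only the digits that follow any base prefix.

```python
# Hypothetical helpers for illustration; not part of any Catalyst toolchain.
DIGITS = {
    2: "01",
    8: "01234567",
    10: "0123456789",
    16: "0123456789abcdefABCDEF",
}

def valid_digit_sequence(body: str, base: int = 10) -> bool:
    """Check the digits that follow any base prefix against the separator rules."""
    allowed = DIGITS[base]
    prev_is_digit = False
    for ch in body:
        if ch == "_":
            # A separator must directly follow a digit.
            # This rejects leading and repeated separators.
            if not prev_is_digit:
                return False
            prev_is_digit = False
        elif ch in allowed:
            prev_is_digit = True
        else:
            return False
    # The sequence must end on a digit, which rejects a trailing
    # separator and an empty sequence.
    return prev_is_digit

def digit_value(body: str, base: int = 10) -> int:
    """Separators are ignored when the value is constructed."""
    return int(body.replace("_", ""), base)
```

Under this model `0x_ff` is rejected because the digit body `_ff` begins with a separator, which matches the invalid list above.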
Integer Literals
Integer literals may be decimal or use a base prefix:
| Base | Prefixes | Digits | Example |
|---|---|---|---|
| decimal | none | 0 through 9 | `48_000` |
| binary | `0b`, `0B` | 0, 1 | `0b1111_0000` |
| octal | `0o`, `0O` | 0 through 7 | `0o755` |
| hexadecimal | `0x`, `0X` | 0 through 9, a through f, A through F | `0xdead_beef` |
0
42
123456
48_000
0b1010
0o755
0xff
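As a rough model of the table above, the following Python sketch maps an unsuffixed integer literal token to its value. The function name `parse_integer_literal` is hypothetical. Accepting leading zeros in decimal literals is an assumption, since this page does not state a rule for them.

```python
# Hypothetical reference model of unsuffixed integer literal spelling.
PREFIXES = {"0b": 2, "0B": 2, "0o": 8, "0O": 8, "0x": 16, "0X": 16}

def parse_integer_literal(token: str) -> int:
    """Return the value of an integer literal token, or raise ValueError."""
    base, body = 10, token
    if token[:2] in PREFIXES:
        base, body = PREFIXES[token[:2]], token[2:]
    digits = "0123456789abcdef"[:base]
    # Splitting on `_` must leave only non-empty digit runs; an empty run
    # means a leading, trailing, or repeated separator (or no digits at all).
    parts = body.split("_")
    for part in parts:
        if part == "" or any(ch.lower() not in digits for ch in part):
            raise ValueError(f"invalid integer literal: {token!r}")
    return int("".join(parts), base)
```

For example, `parse_integer_literal("0o755")` yields 493 and `parse_integer_literal("0x_ff")` raises `ValueError`.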
Integer literals have type comptime_int until an expected type or explicit annotation resolves them to a concrete integer type. V1 has no default integer literal type for runtime storage; a stored runtime value must resolve from context or an explicit annotation.
Examples:
const a: i32 = 42
const b = 42 // comptime_int constant
var d: u32 = 48_000
var e = 15u8
Runtime storage without a concrete type is invalid:
var c = 42 // error: runtime storage needs a concrete type
Base-prefixed integer literals are still integer literals. They may use primitive integer suffixes and must pass the same representability checks:
const mask = 0xffu8
Floating-Point Literals
Floating-point literals are decimal digit sequences with a fractional part, an exponent, or both:
0.0
1.5
1e9
6.022e23
1.0e-9
Shape:
The fractional form requires digits on both sides of `.`. This keeps range syntax unambiguous:
1.0 // floating-point literal
1..10 // integer literal, range operator, integer literal
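One way to see why the rule keeps ranges unambiguous is a regular-expression sketch of the literal shape. This is a Python approximation rather than the normative grammar. It assumes a lowercase `e` exponent marker with an optional `+` or `-` sign, and that separators are allowed in every digit sequence; only the `-` sign and lowercase `e` appear in the examples on this page.

```python
import re

# Digit sequence with single `_` separators between digits.
_DIGITS = r"[0-9](?:_?[0-9])*"
# Assumed exponent shape: lowercase `e`, optional sign, digits.
_EXP = rf"e[+-]?{_DIGITS}"
# Fraction with optional exponent, or integer digits with a required exponent.
_FLOAT = rf"(?:{_DIGITS}\.{_DIGITS}(?:{_EXP})?|{_DIGITS}{_EXP})"

FLOAT_RE = re.compile(_FLOAT)
# Tiny tokenizer: float literal first, then integer digits, then the range operator.
TOKEN_RE = re.compile(rf"{_FLOAT}|{_DIGITS}|\.\.")

def is_float_literal(token: str) -> bool:
    return FLOAT_RE.fullmatch(token) is not None

def tokenize(source: str) -> list[str]:
    return TOKEN_RE.findall(source)
```

Because a `.` must be followed by a digit to continue a floating-point literal, `tokenize("1..10")` splits into `1`, `..`, and `10`, while `tokenize("1.0")` yields a single token.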
Floating-point literals have type comptime_float until an expected type or explicit annotation resolves them to a concrete floating-point type:
const gain: f32 = 0.5
const scale = 1.0 // comptime_float constant
Runtime storage without a concrete type is invalid:
var amount = 1.0 // error: runtime storage needs a concrete type
Hexadecimal floating-point literals are deferred.
Numeric Literal Suffixes
Postfix numeric type suffixes are valid in V1. A suffix is a primitive numeric type name immediately following a numeric literal token:
43u8
1.0f32
48_000usize
0xffu8
The suffix gives the literal an explicit destination type. It is shorthand for writing the type in context:
const y: u8 = 43
const z = 43u8
Valid integer suffixes are i1 through i128, u1 through u128, isize, and usize. Valid floating-point suffixes are f32 and f64.
Integer suffixes are valid on integer literals when the value is exactly representable in the suffixed integer type. Floating-point suffixes are valid on integer or floating-point literals when the finite value is in range; they may round to the destination floating-point format at compile time:
const a = 42u8
const b = 42f32
const c = 1.5f32
const d = 0.1f32
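The rounding behaviour of a floating-point suffix can be approximated in Python with the standard `struct` module. The helper name `round_to_f32` is hypothetical, and the model is only approximate: Python has already rounded the decimal text to binary64 before this function sees it, whereas a compiler could round the exact decimal value directly.

```python
import struct

def round_to_f32(value: float) -> float:
    """Round a finite Python float (binary64) to the nearest binary32 value.

    struct.pack raises OverflowError when the value is too large for binary32,
    which stands in for the "finite value is in range" check.
    """
    return struct.unpack("<f", struct.pack("<f", value))[0]
```

`42f32` and `1.5f32` are exact in binary32, so they survive unchanged. `0.1f32` is not exactly representable, so the stored value differs slightly from the binary64 approximation of 0.1.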
Suffixes do not bypass representability checks:
256u8 // error: u8 cannot represent 256
42.5u8 // error: u8 cannot represent 42.5
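The integer representability check can be modelled as a range test. The Python sketch below is illustrative only: the names `integer_suffix_range` and `fits` are hypothetical, two's-complement ranges are assumed for signed types, and `isize`/`usize` are assumed to be 64 bits wide unless `pointer_bits` says otherwise.

```python
import re

def integer_suffix_range(suffix: str, pointer_bits: int = 64) -> tuple[int, int]:
    """Return the inclusive (min, max) range for an integer suffix."""
    if suffix in ("isize", "usize"):
        signed, bits = suffix == "isize", pointer_bits  # assumed target width
    else:
        match = re.fullmatch(r"([iu])([1-9][0-9]*)", suffix)
        if match is None or not 1 <= int(match.group(2)) <= 128:
            raise ValueError(f"not a valid integer suffix: {suffix!r}")
        signed, bits = match.group(1) == "i", int(match.group(2))
    if signed:
        # Assumed two's-complement range.
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1

def fits(value: int, suffix: str) -> bool:
    """True if the literal value is exactly representable in the suffixed type."""
    low, high = integer_suffix_range(suffix)
    return low <= value <= high
```

With this model `fits(255, "u8")` holds and `fits(256, "u8")` does not, mirroring the `256u8` error above.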
Suffixes apply to the literal token only. They are not part of a leading sign:
-1u8 // error: unary negation applied to 1u8; u8 has no V1 unary negation
Annotations and expected-type context remain valid:
const x: u8 = 43
const y: f32 = 1.0
Code Point Literals
Code point literals use single quotes and contain exactly one Unicode scalar value after escape decoding:
'a'
'\n'
'\''
'\u{1f4a9}'
'😀'
A code point literal is not a byte string and does not denote a UTF-8 byte sequence. It has type comptime_int until context resolves it to a representable integer type.
Some displayed characters are not one Unicode scalar value. A single-scalar emoji such as '😀' is valid. A compound emoji or grapheme cluster is invalid in a code point literal because it contains multiple scalar values:
'👨‍👩‍👧‍👦' // error: more than one scalar value
Escapes are allowed. The matching delimiter must be escaped when writing a single quote code point:
'\''
Invalid:
'' // no scalar value
'ab' // more than one scalar value
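The "exactly one scalar value" rule can be illustrated in Python, where `len` on a string counts code points rather than displayed characters. The function names are hypothetical, and the input is assumed to be the literal's content after escape decoding.

```python
def is_unicode_scalar(code_point: int) -> bool:
    """Scalar values are all code points up to 0x10FFFF except surrogates."""
    return 0 <= code_point <= 0x10FFFF and not 0xD800 <= code_point <= 0xDFFF

def code_point_value(decoded: str) -> int:
    """Return the integer value of a code point literal's decoded content."""
    if len(decoded) != 1:
        raise ValueError(f"expected exactly one scalar value, found {len(decoded)}")
    value = ord(decoded)
    if not is_unicode_scalar(value):
        raise ValueError("not a Unicode scalar value")
    return value

# The family emoji is four emoji joined by three zero-width joiners (U+200D).
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
```

`code_point_value("a")` is 97, while `FAMILY` is rejected because it contains seven scalar values even though it displays as one glyph.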
String Literals
String literals use double quotes and contain zero or more bytes after escape decoding:
""
"hello"
"line\n"
"quote: \""
"h\x65llo"
"pile: \u{1f4a9}"
"emoji: 😀"
String literals may use Escapes. An unescaped newline is not valid inside a V1 string literal.
Direct non-ASCII source text contributes its UTF-8 bytes to the resulting literal. The compiler does not normalize or reinterpret those source bytes.
Escapes may also contribute bytes. \u{NNNNNN} contributes the UTF-8 encoding of one Unicode scalar value. \xNN contributes one byte and can produce byte sequences that are not valid UTF-8.
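The byte-level consequences can be checked with a few lines of Python. This is only a model of the resulting byte sequences; `is_valid_utf8` is a hypothetical helper, not a Catalyst API.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the byte sequence decodes as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

direct = "é".encode("utf-8")           # direct non-ASCII source text: two bytes
scalar = chr(0x1F4A9).encode("utf-8")  # what \u{1f4a9} contributes: four bytes
single = bytes([0xFF])                 # what \xff contributes: one byte
```

Here `direct` and `scalar` are valid UTF-8, while `single` is not, which is why a `[]const u8` string literal cannot be assumed to hold well-formed text.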
String literals have type []const u8 in V1. Catalyst has no primitive string, str, or String type. Owned and growable text belongs in std.text.
These string forms are deferred:
- raw strings
- multiline strings
- byte-string-specific syntax
- automatic adjacent string literal concatenation
Escapes
Escapes are recognized in code point and string literals. An escape starts with \ and must be one of the forms in this section. Unknown escapes are errors.
All escape forms are valid in both code point and string literals:
| Escape | Meaning |
|---|---|
| `\0` | null character |
| `\n` | line feed |
| `\r` | carriage return |
| `\t` | tab |
| `\\` | backslash |
| `\'` | single quote |
| `\"` | double quote |
| `\xNN` | hexadecimal 8-bit byte value; exactly two hex digits |
| `\u{NNNNNN}` | hexadecimal Unicode scalar value; one or more hex digits |
\u{NNNNNN} must denote a valid Unicode scalar value. The maximum valid scalar value is 0x10ffff.
Delimiter escapes are allowed in both code point and string literals for consistency, though only the matching delimiter needs escaping:
'\'' // single quote code point
"\x27" // string containing one single quote byte
"\"" // string containing one double quote
"it's ok" // single quote does not need escaping in a string
Invalid:
"\q" // unknown escape
"\x6" // byte escape requires exactly two hex digits
"\u{}" // Unicode escape requires at least one hex digit
"\u{110000}" // Unicode scalar value out of range
"\" // escape reaches end of literal
Source files may also contain direct Unicode scalar values because Catalyst source files are UTF-8.
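The escape rules in this section can be sketched as a decoder for the text between a string literal's quotes. This Python model is illustrative, not the compiler's implementation; `decode_string_body` is a hypothetical name, and treating surrogate code points as invalid follows from the Unicode definition of a scalar value rather than from explicit wording on this page.

```python
SIMPLE = {"0": 0x00, "n": 0x0A, "r": 0x0D, "t": 0x09,
          "\\": 0x5C, "'": 0x27, '"': 0x22}
HEX = set("0123456789abcdefABCDEF")

def decode_string_body(body: str) -> bytes:
    """Decode the content between the quotes of a string literal into bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\n":
            raise ValueError("unescaped newline in string literal")
        if ch != "\\":
            out += ch.encode("utf-8")  # direct text contributes its UTF-8 bytes
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("escape reaches end of literal")
        kind = body[i + 1]
        if kind in SIMPLE:
            out.append(SIMPLE[kind])
            i += 2
        elif kind == "x":
            digits = body[i + 2 : i + 4]
            if len(digits) != 2 or not set(digits) <= HEX:
                raise ValueError("byte escape requires exactly two hex digits")
            out.append(int(digits, 16))  # one raw byte, possibly invalid UTF-8
            i += 4
        elif kind == "u":
            end = body.find("}", i)
            digits = body[i + 3 : end] if end != -1 else ""
            if body[i + 2 : i + 3] != "{" or not digits or not set(digits) <= HEX:
                raise ValueError("malformed Unicode escape")
            value = int(digits, 16)
            if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
                raise ValueError("Unicode scalar value out of range")
            out += chr(value).encode("utf-8")
            i = end + 1
        else:
            raise ValueError(f"unknown escape \\{kind}")
    return bytes(out)
```

Each entry in the invalid list above raises `ValueError` in this model, and `decode_string_body("h\\x65llo")` produces the same bytes as `"hello"`.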
Primitive Value Forms
These primitive value forms are literal-like tokens:
| Form | Meaning | Context requirement |
|---|---|---|
| `true` | the true value of bool | none |
| `false` | the false value of bool | none |
| `void` | the sole value of void | may infer void |
| `null` | optional absence value | requires optional context |
| `undefined` | unspecified storage value | requires a concrete storage type |
Examples:
var ok: bool = true
var none: ?i32 = null
var scratch: [128]u8 = undefined
Context-free null and undefined are invalid:
var missing = null // error
var unknown = undefined // error
See Built-In Types for the canonical typing and legality rules.
Array Literals
Array literals use bracket syntax:
[1, 2, 3]
[]
Rules:
- elements are separated by commas.
- a trailing comma is allowed.
- elements evaluate in source order.
- the literal length is the number of elements.
- `[]` requires an expected array type.
Array literals produce fixed-size array values, not slices:
const xs: [3]i32 = [1, 2, 3]
const empty: [0]i32 = []
Full element inference, ownership behavior, array-to-slice coercion, and deferred slice literal rules are owned by Arrays, Slices, Ranges, and Indexing.
Aggregate Construction
Named aggregate construction writes the owner type followed by field initializers:
Header{ .sample_rate = 48_000, .channels = 2 }
Contextual aggregate construction may omit the owner type when an expected type is available:
const header: Header = .{ .sample_rate = 48_000, .channels = 2 }
Rules:
- fields are separated by commas.
- a trailing comma is allowed.
- each entry uses `.field_name = expression`.
- field initializers evaluate in source order.
- field names are not local bindings.
The shorthand form .{ ... } is completed by Expected-Type Shorthand. Struct construction semantics, required fields, defaults, duplicate fields, and unknown-field diagnostics are owned by Structs and Methods.
Leading-Dot Forms
Leading-dot forms such as .c are not literals in the scalar-literal sense. They are shorthand expressions completed from an expected owner type:
@export(.c)
fn scale(x: f32) f32 {
return x
}
const c_callback: *const @callconv(.c) fn(i32) i32 = callback
const opts: ExportOptions = .{ .call_conv = .c }
See Expected-Type Shorthand for leading-dot members and contextual aggregate literals.
Deferred Literal Forms
These literal spellings are deferred from the current V1 source-form baseline:
- hexadecimal floating-point literals
- raw strings and multiline strings
- byte-string-specific syntax
- slice literals such as `&[1, 2, 3]`
- user-defined or custom literal forms
If a deferred form is promoted, document its source spelling here and link to the semantic page that owns its typing, allocation, ownership, or lowering rules.
Related Details
- Built-In Types: primitive types, literal typing, primitive value forms, and string literal type.
- Type System: numeric coercion and representability rules.
- Arrays, Slices, Ranges, and Indexing: array literals, slice boundaries, and range expressions.
- Structs and Methods: struct construction and field defaults.
- Expected-Type Shorthand: leading-dot members and contextual aggregate literals.
- std.text: owned text and text helper APIs above the V1 string literal baseline.