References

vibecode
{"vibecode": {
    "doc": "references",
    "role": "the foundational data structure inside Drinian that maps reference objects to the objects they point at; the table the engine scans to determine reachability for deterministic garbage collection",
    "status": "design — refs hash + reference class hierarchy + uspace as class-level property",
    "key_concepts": ["refs_hash", "reference_class_hierarchy",
        "variable_and_hash_element_subclasses", "uspace_is_a_class_property",
        "deduplicated_pointer_storage", "foundation_for_deterministic_gc"]
}}

The references hash is the structural foundation that makes Drinian's deterministic garbage collection work without reference counting. Every "thing that can hold an object reference" is an entry in this hash, mapping its own ID to the object it points at. When a reference is removed, the engine traces from a root set of uspace references to determine whether the affected object is still reachable. If it is not, it is an orphan and gets collected.

This is the mechanism behind the "root trace at the mutation point" model in garbage-collection.md.

Shape

The references hash is a top-level field in the Drinian hash. Keys are reference IDs; values are object IDs:

json
"references": {
  "2": "4",
  "3": "4",
  "5": "6"
}

Every reference points at exactly one object — that's the contract of a reference. Multiple references can point at the same object (2 and 3 both point at 4 above); that's how sharing works. The hash captures the full edge set of the program's reference graph.

The reference ID is the reference object's own object ID. There's no separate ref_id namespace — a reference is itself an object (instance of puck.uno/reference or a subclass), and the hash maps its identity to whatever target it currently holds.
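As a rough illustration of this shape, the sketch below uses a plain Python dictionary as a stand-in for the references field. The helper `referrers_of` is hypothetical and not part of any documented API; it only shows that sharing appears as several keys carrying the same value.

```python
# Hypothetical stand-in for the "references" field: reference ID -> object ID.
references = {"2": "4", "3": "4", "5": "6"}

def referrers_of(references, object_id):
    """Return the IDs of every reference currently pointing at object_id."""
    return sorted(ref_id for ref_id, target in references.items()
                  if target == object_id)

# References 2 and 3 share object 4; reference 5 is the only one holding 6.
shared = referrers_of(references, "4")
```

Because each key holds a single value, the "exactly one target per reference" contract falls out of the data structure itself rather than needing a separate check.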

Object IDs

Object IDs are integers-as-strings drawn from a single program-wide counter. The counter starts at "1" and proceeds "1", "2", "3", ..., "999", "1000", ... in encounter order. Every object created in the program — variables, hash elements, hashes, function objects, class instances — draws its ID from the same counter, so every ID in a running program is unique across all object kinds.

Platter IDs are different: they're UUIDs (see base-class-use.md § Proposed shape), not from this counter. The reason: platter IDs appear as keys inside user buckets (per nulls.md § Serialization), where integer-strings could collide with user-chosen field names. Object IDs don't have this exposure — they only appear in references, frame locals, and the objects-hash keys, never as markers inside user-controlled bucket data.

The counter is stored as a string, not as an integer, so the sequence can grow indefinitely without bigint machinery. A small increment-the-string-by-one routine handles the counter: the rightmost digit increments and any carry propagates left. There is no overflow concern, even for long-running programs that allocate billions of objects.
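The increment routine described above might look something like the following. This is a minimal sketch, not the engine's implementation; the function name `increment_id` is an assumption. Python integers are already arbitrary-precision, so the string arithmetic here is purely illustrative of the carry logic.

```python
def increment_id(counter: str) -> str:
    """Add one to a non-negative decimal string without converting to int."""
    digits = list(counter)
    i = len(digits) - 1
    while i >= 0:
        if digits[i] != "9":
            # No carry needed: bump this digit and stop.
            digits[i] = chr(ord(digits[i]) + 1)
            return "".join(digits)
        # A 9 rolls over to 0 and the carry moves one position left.
        digits[i] = "0"
        i -= 1
    # Every digit was 9, so the number gains a leading 1.
    return "1" + "".join(digits)
```

For example, `increment_id("999")` walks all three digits, zeroes each, and returns `"1000"`.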

Properties:

The counter is cheap (one string-increment per allocation), the IDs are short (1-4 characters for typical programs), and Drinian snapshots stay readable — references: {"2": "4", "3": "4"} is far more inspectable than UUID equivalents.

For persistent object identity across process restarts (Mikobase records, blockchain entries, etc.), different ID schemes apply — those systems use UUIDs because the cross-process uniqueness requirement is real for them. The in-process Caspian counter is for in-memory program state only.

Reference classes

Every entry on the left side of the hash is a reference object — an instance of puck.uno/reference or one of its subclasses. The reference's class determines what role it plays in the program; the hash entry itself only carries the pointer.

puck.uno/reference is the base. Its responsibility is exactly:

The reference object's classes and bucket carry semantic metadata about what kind of reference this is. The pointer lives in exactly one place — the references hash — so there's no risk of drift between the reference's internal state and the table.

Two subclasses, both V1.0:

| Class | Plays the role of |
| --- | --- |
| puck.uno/variable | A named slot in a scope frame |
| puck.uno/hash_element | A key inside a hash |

puck.uno/variable is a bare reference object — its bucket is empty. The lexical name lives in the enclosing scope as the key in the frame's locals hash; the frame's identity is implicit (the variable only exists because some frame's locals references it, and that's the frame it belongs to). The variable's only state is its identity (its object ID) and its target (the entry in references keyed by that ID). Assignment to a variable ($foo = $bar) rebinds the variable's entry in the references hash to point at the new target.

puck.uno/hash_element carries the parent hash and the key. hash[key] = obj rebinds the hash element's entry to point at the new target. Hash internals are first-class reference objects, not some special-cased container scheme — GC walks the references hash uniformly.

Future reference subclasses (return slots, system-surface references, etc.) can be added without changing the hash shape.
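To make the division of labor concrete, here is a small sketch under stated assumptions: Python classes stand in for puck.uno/reference and its two subclasses, the bucket keys `parent` and `key` are illustrative names, and `rebind` is a hypothetical helper rather than the documented reference API. The point it shows is that the reference object never stores its target; only the references hash does.

```python
from dataclasses import dataclass, field

@dataclass
class Reference:
    """Stand-in for puck.uno/reference: identity and bucket, no pointer field."""
    id: str
    bucket: dict = field(default_factory=dict)

class Variable(Reference):
    """Stand-in for puck.uno/variable: the bucket stays empty."""

class HashElement(Reference):
    """Stand-in for puck.uno/hash_element: bucket records parent hash and key."""
    def __init__(self, id, parent, key):
        super().__init__(id, {"parent": parent, "key": key})

references = {}  # the only place a pointer lives

def rebind(ref, target_id):
    """Hypothetical helper: point ref at target_id by rewriting its hash entry."""
    references[ref.id] = target_id

foo = Variable("2")
rebind(foo, "4")                                  # $foo now holds object 4
elem = HashElement("5", parent="4", key="name")
rebind(elem, "6")                                 # hash 4, key "name" -> object 6
rebind(foo, "6")                                  # reassignment overwrites in place
```

After these calls the hash holds `{"2": "6", "5": "6"}`, and neither `foo` nor `elem` carries a target of its own that could drift out of sync.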

Reference API

The base class provides two operations:

Both are engine-managed. User code typically doesn't call them directly — assignment expressions and hash mutation drive them.

Uspace: a class-level property

Caspian distinguishes uspace (user space) — the reachability graph as the program sees it — from engine bookkeeping that also lives in state but is not part of the program's data.

The distinction matters for GC: an object is in uspace if a trace through references lands on a uspace root. Roots are the subset of reference objects that ground the program's data graph. Engine-internal references (the slots holding the call stack itself, the references hash itself, etc.) don't count as roots even though they're also reference objects.

Uspace is a class-level property, not a per-instance flag. Each reference subclass declares uspace: true or uspace: false in its class definition. The declaration is fixed for the class's lifetime — every instance of the class is uspace if the class declares it, and not otherwise.

The classification:

| Class | uspace |
| --- | --- |
| puck.uno/reference (base) | false |
| puck.uno/variable | true |
| puck.uno/hash_element | false |
| Engine-internal subclasses (state slots, etc.) | false |

Why hash_element is not uspace: a hash element only matters if the hash containing it is reachable through some uspace root. The element itself isn't a root; the variable (or other root) that holds the hash is. Walking from variable roots picks up all reachable hash elements naturally, so making hash_element a root in its own right would double-count.

Why variable is uspace: variables in active scope frames are the program's data anchors. Everything the program "has access to" traces back to a variable.

System surfaces (%foo methods, state slots that the engine exposes as program-visible) get their own reference subclasses that declare uspace: true. Adding new uspace-rooting reference kinds is a class-definition act, not a per-call decision.

Why class-level rather than per-instance: it matches the rest of the design. Truthiness is determined by class membership (see object.md); identity-bearing properties (redact-status, etc.) likewise. The uspace classification is the same kind of thing — what kind of reference is this? — so it lives in the same place.
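One way to picture a class-level declaration is a class attribute that instances never override. The sketch below assumes Python classes as stand-ins for the reference classes, and `uspace_roots` is a hypothetical helper. Python does not enforce that the attribute stays fixed, so the "fixed for the class's lifetime" rule is a convention here, not something the code guarantees.

```python
class Reference:
    uspace = False              # class attribute, shared by every instance

    def __init__(self, id):
        self.id = id

class Variable(Reference):
    uspace = True               # variables ground the program's data graph

class HashElement(Reference):
    uspace = False              # reachable only through whatever holds the hash

def uspace_roots(refs):
    """Return the IDs of references whose class declares uspace."""
    return [r.id for r in refs if type(r).uspace]

live = [Variable("2"), HashElement("5"), Variable("3")]
roots = uspace_roots(live)
```

Root selection consults `type(r)` rather than the instance, so adding a new rooting kind means defining a new subclass, with no per-instance flag to set.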

Lifecycle: creating and destroying references

A reference object is created when a slot opens (variable declared, hash key assigned for the first time). The engine:

  1. Allocates the reference object (with the appropriate subclass).
  2. Inserts a row in references: state.references[ref.id] = target.id.
  3. The reference is now live.

A reference object is destroyed when its slot closes (variable goes out of scope, hash key deleted). The engine:

  1. Removes the row from references (state.references[ref.id] = nil in implementation terms).
  2. Fires a GC trace for the former target: if the target is no longer reachable from any uspace root, it is an orphan and gets collected.
  3. The reference object itself is also collected.

The invariant: the references hash has exactly one entry per live reference object, and no entries for destroyed ones.
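The two lifecycle sequences and the invariant can be sketched as follows. Everything named here (`State`, `live_refs`, `trace_requests`, and the three functions) is an assumption made for illustration; the GC trace itself is only recorded as a request, not performed.

```python
class State:
    """Hypothetical slice of engine state relevant to the reference lifecycle."""
    def __init__(self):
        self.references = {}       # reference ID -> target object ID
        self.live_refs = set()     # IDs of reference objects that currently exist
        self.trace_requests = []   # former targets awaiting a GC trace

def create_reference(state, ref_id, target_id):
    state.live_refs.add(ref_id)               # 1. allocate the reference object
    state.references[ref_id] = target_id      # 2. insert its row

def destroy_reference(state, ref_id):
    former_target = state.references.pop(ref_id)   # 1. remove the row
    state.trace_requests.append(former_target)     # 2. request a trace for the target
    state.live_refs.discard(ref_id)                # 3. the reference itself goes away

def invariant_holds(state):
    """Exactly one row per live reference, none for destroyed ones."""
    return set(state.references) == state.live_refs
```

Checking `invariant_holds` after every create and destroy is a cheap way to catch a row that was left behind or a reference that never got one.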

How GC uses the hash

When the engine modifies a reference (rebinds a variable, pops a frame, overwrites a hash element, etc.), it updates the references hash and then checks for orphans:

  1. Update the row. A rebinding writes the new target in place; a destruction removes the row entirely.
  2. Identify candidate orphans. Any object that just lost an incoming pointer is a candidate.
  3. Trace from uspace roots. Walk every reference where the reference's class declares uspace: true. Follow each one's target. From each target, follow every outgoing reference (the entries in references whose reference object is held by that object, such as the hash elements of a hash).
  4. Mark reachable objects. Anything the walk reaches is alive.
  5. Collect what wasn't reached. Candidates not in the reachable set are orphans, along with everything in their reachability island. The engine fires on_close deepest-first (see garbage-collection.md § Cleanup order).

The walk handles cycles naturally. Two objects pointing at each other but unreachable from any uspace root are both collected.

Cost: O(reachable objects) per trace, bounded by what the program is actually doing. Most reference changes affect tiny graphs.
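The trace can be sketched as an ordinary mark phase over the references hash. Several things here are assumptions: the `owner` map (reference ID to the object that holds it, for example a hash element's parent hash) is illustrative bookkeeping, only non-reference objects are tracked, and the sketch sweeps all known objects instead of starting from candidates, which is simpler but not how the steps above scope the work.

```python
def trace_reachable(references, roots, owner):
    """Return the object IDs reachable from the uspace root references.

    references: reference ID -> target object ID
    roots:      IDs of references whose class declares uspace: true
    owner:      reference ID -> ID of the object holding it (assumed bookkeeping)
    """
    # Index outgoing references per object so each step is a dictionary lookup.
    outgoing = {}
    for ref_id, obj_id in owner.items():
        if ref_id in references:
            outgoing.setdefault(obj_id, []).append(ref_id)

    reachable = set()
    stack = [references[r] for r in roots if r in references]
    while stack:
        obj = stack.pop()
        if obj in reachable:
            continue                      # already marked; this also ends cycles
        reachable.add(obj)
        stack.extend(references[r] for r in outgoing.get(obj, []))
    return reachable

def orphans(references, roots, owner, all_objects):
    """Objects the walk never reached."""
    return set(all_objects) - trace_reachable(references, roots, owner)

# Variable 1 holds hash 10, whose element 11 holds object 12.
# Objects 20 and 21 point at each other but hang off no root.
references = {"1": "10", "11": "12", "22": "21", "23": "20"}
owner = {"11": "10", "22": "20", "23": "21"}
dead = orphans(references, {"1"}, owner, {"10", "12", "20", "21"})
```

In this example the walk marks 10 and 12 and never visits the 20/21 cycle, so both members of the cycle end up in `dead` without any cycle-specific handling.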

Why a hash, not an array of pairs

An earlier sketch used "refs": [[ref_id, object_id], ...] — an array of pairs. A hash is better for three reasons:

The cost of a hash over an array is a few extra bytes per entry for the hash overhead, which is negligible against the benefit of cleanup without refcounting.

Snapshot serialization

When the engine snapshots Drinian (post-V1.0 feature; see drinian.md § V1.0 scope), the references hash serializes verbatim — just IDs on both sides, trivially representable in JSON.

The actual reference objects (instances of puck.uno/variable etc.) and the objects they point at serialize via their classes' to_json methods. This is where redaction of sensitive fields happens.

The references hash is the structure; the objects' to_json outputs are the content.
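A minimal sketch of that split, assuming Python's standard `json` module and a made-up user class: the references hash is written out unchanged, while each object decides what its own `to_json` reveals. The `Account` class, the `"[redacted]"` marker, the `objects` key, and the `snapshot` function are all hypothetical.

```python
import json

class Account:
    """Hypothetical user class with one sensitive field."""
    def __init__(self, name, password):
        self.name = name
        self.password = password

    def to_json(self):
        # Redaction happens here, inside the class's own serializer.
        return {"name": self.name, "password": "[redacted]"}

def snapshot(references, objects):
    """Serialize the references hash verbatim and each object via to_json."""
    return json.dumps({
        "references": references,
        "objects": {oid: obj.to_json() for oid, obj in objects.items()},
    }, sort_keys=True)

text = snapshot({"2": "4"}, {"4": Account("ada", "hunter2")})
```

The references portion needs no per-class logic at all, since both sides of every entry are already plain strings.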

Open questions


© 2026 Puck.uno