Execution model

Lince is a fetch-decode-execute interpreter. Each core executes one SPARC instruction at a time; there is no decode cache, no block translator, no JIT. This simplicity costs performance. In return it buys determinism, debuggability, and a small surface for hardware-modelling bugs.

The `step()` cycle

A single iteration of lince::core::step() does the following:

flowchart TD
    A[Sample interrupts] --> B{Pending trap?}
    B -- yes --> H[enter_trap]
    B -- no  --> C[Fetch insn at PC]
    C --> D[Decode → DecodedInsn]
    D --> E[execute → ExecStatus]
    E --> F{ExecStatus}
    F -- Ok / Branch --> G[Advance PC/nPC]
    F -- Trap statuses --> H
    F -- ErrorMode --> X[Halt core]
    G --> Y[commit_psr_pipeline]
    H --> Y
    Y --> Z[Done]

The function returns a step count and an optional HaltReason so the outer round-robin loop can decide whether to continue, swap cores, or exit early.
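As a rough illustration of how an outer loop might consume that return value, here is a minimal sketch. `StepResult`, `FakeCore`, `run_round_robin`, and `HaltReason::Breakpoint` are hypothetical names invented for this example; only `HaltReason::ErrorMode` appears in the text, and the real scheduling policy may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical shapes: only HaltReason::ErrorMode is named in the text.
enum class HaltReason { ErrorMode, Breakpoint };

struct StepResult {
    std::uint64_t steps;             // instructions retired by this call
    std::optional<HaltReason> halt;  // set when the core must stop
};

// Stand-in core: runs `budget` instructions, then reports ErrorMode.
struct FakeCore {
    std::uint64_t budget;
    std::uint64_t executed = 0;

    StepResult step() {
        if (executed == budget) return {0, HaltReason::ErrorMode};
        ++executed;
        return {1, std::nullopt};
    }
};

// One step per core per pass, until the step budget is spent or a core halts.
inline std::optional<HaltReason> run_round_robin(std::vector<FakeCore>& cores,
                                                 std::uint64_t max_steps) {
    assert(!cores.empty());
    std::uint64_t total = 0;
    while (total < max_steps) {
        for (auto& core : cores) {
            const StepResult r = core.step();
            total += r.steps;
            if (r.halt) return r.halt;      // exit early
            if (total >= max_steps) break;  // budget spent mid-pass
        }
    }
    return std::nullopt;  // budget exhausted, no core halted
}
```

With two cores whose budgets are 5 and 2, the second core halts on its third turn, so the loop returns `HaltReason::ErrorMode` after the first core has executed three instructions.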

Branch delay slots and annul

SPARC V8 has architectural delay slots: the instruction immediately following a control-transfer instruction (CTI) is always fetched, and is executed unless the branch annuls it, before the transfer of control takes effect.

Lince models this without a pipeline:

1. JMPL, CALL, Bicc, RETT, etc. do not mutate PC and nPC directly. They set branch_taken_ and compute branch_target_ on CpuState.
2. The next step() fetches the instruction at the current nPC (the delay slot) and executes it normally.
3. After the delay slot, the loop adopts branch_target_ as the new PC instead of the post-incremented nPC.
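The three steps above can be sketched as a pair of small functions. This is an illustration under assumptions, not Lince's actual code: `BranchState`, `take_branch`, and `advance` are hypothetical names, and only `branch_taken_`, `branch_target_`, and the `Ok` / `Branch` statuses come from the text. A CTI sitting in a delay slot is not handled.

```cpp
#include <cassert>
#include <cstdint>

enum class ExecStatus { Ok, Branch };

// Hypothetical miniature of CpuState with only the fields needed here.
struct BranchState {
    std::uint32_t pc = 0;
    std::uint32_t npc = 4;
    bool branch_taken_ = false;
    std::uint32_t branch_target_ = 0;
};

// Handler side: a taken CTI records its target and leaves PC/nPC alone.
inline ExecStatus take_branch(BranchState& s, std::uint32_t target) {
    assert((target & 0x3u) == 0);  // SPARC instructions are word aligned
    s.branch_taken_ = true;
    s.branch_target_ = target;
    return ExecStatus::Branch;
}

// Loop side: called once after every executed instruction.
inline void advance(BranchState& s, ExecStatus status) {
    if (status == ExecStatus::Ok && s.branch_taken_) {
        // The instruction that just ran was the delay slot: adopt the target.
        s.pc = s.branch_target_;
        s.npc = s.branch_target_ + 4;
        s.branch_taken_ = false;
    } else {
        // Sequential flow, or the CTI itself: the delay slot is fetched next.
        s.pc = s.npc;
        s.npc += 4;
    }
}
```

Starting from PC = 0x100 and nPC = 0x104, a branch to 0x200 first moves PC to 0x104 (the delay slot), and only the following `advance` lands on 0x200.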

For annulled branches (Bicc,a):

- If the branch is taken, the delay slot is executed normally.
- If the branch is not taken, CpuState::annul_next_ is set; the next step() skips execution but still advances PC and nPC.

A hardware interrupt clears annul_next_ on trap entry (SPARC V8 §5.1.2.2). This is implemented in CpuState::enter_trap().
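A minimal sketch of the annul behaviour, assuming a simplified state struct. `AnnulState`, `step_once`, and `enter_trap_sketch` are hypothetical names; only `annul_next_` is taken from the text, and the fetch/decode/execute work is reduced to a counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of CpuState; only annul_next_ is named in the text.
struct AnnulState {
    std::uint32_t pc = 0;
    std::uint32_t npc = 4;
    bool annul_next_ = false;
    std::uint32_t executed = 0;  // instructions that actually ran
};

// One loop iteration. Returns true if the instruction at PC was executed.
inline bool step_once(AnnulState& s) {
    const bool run = !s.annul_next_;
    s.annul_next_ = false;  // an annul request covers exactly one slot
    if (run) ++s.executed;  // stand-in for fetch, decode, execute
    s.pc = s.npc;           // PC and nPC advance either way
    s.npc += 4;
    return run;
}

// Trap entry drops any pending annul before redirecting to the vector.
inline void enter_trap_sketch(AnnulState& s, std::uint32_t vector) {
    assert((vector & 0xFu) == 0);  // trap table entries are 16 bytes apart
    s.annul_next_ = false;
    s.pc = vector;
    s.npc = vector + 4;
}
```

The point of clearing the flag on trap entry is that the first instruction of the trap handler must not be skipped on behalf of a branch that executed before the interrupt arrived.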

PSR write pipeline

SPARC V8 §5.1.2.3 distinguishes between immediate and delayed fields of the Processor Status Register:

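| Field | Semantics | Implementation |
| --- | --- | --- |
| `ICC` (`n,z,v,c`) | Immediate | Written directly. |
| `PIL` | Immediate | Written directly. |
| `S` (supervisor) | Delayed by 3 cycles | Buffered in `pending_psr_`. |
| `ET` (trap enable) | Delayed by 3 cycles | Buffered in `pending_psr_`. |
| `PS` (previous-S) | Delayed by 3 cycles | Buffered in `pending_psr_`. |
| `CWP` (current window pointer) | Delayed by 3 cycles | Buffered in `pending_psr_`. |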

step() calls commit_psr_pipeline() once per cycle. Three back-to-back cycles must pass before a WRPSR to a delayed field becomes architecturally visible. The window-overflow / underflow logic relies on this: SAVE sees the committed CWP, not whatever the previous instruction may have stashed in pending_psr_.
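The split between immediate and delayed fields can be sketched with two bit masks and a countdown. The bit positions follow the SPARC V8 PSR layout; everything else is an assumption made for illustration: `PsrPipeline`, `write_psr`, `countdown`, and the single-slot buffer (a second write before the first commits simply replaces it) are not taken from Lince, which may buffer writes differently.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kIccMask = 0x00F00000u;      // n,z,v,c (bits 23:20)
constexpr std::uint32_t kPilMask = 0x00000F00u;      // PIL (bits 11:8)
constexpr std::uint32_t kImmediateMask = kIccMask | kPilMask;
constexpr std::uint32_t kDelayedMask = 0x000000FFu;  // S, PS, ET, CWP (bits 7:0)
constexpr int kPsrDelay = 3;                         // cycles before commit

struct PsrPipeline {
    std::uint32_t psr = 0;           // architecturally visible value
    std::uint32_t pending_psr_ = 0;  // buffered delayed fields
    int countdown = 0;               // 0 means nothing is pending

    // WRPSR: immediate fields land now, delayed fields are parked.
    void write_psr(std::uint32_t value) {
        psr = (psr & ~kImmediateMask) | (value & kImmediateMask);
        pending_psr_ = value & kDelayedMask;
        countdown = kPsrDelay;
    }

    // Called once per cycle; the third call after a write commits it.
    void commit_psr_pipeline() {
        assert(countdown >= 0 && countdown <= kPsrDelay);
        if (countdown == 0) return;
        if (--countdown == 0) {
            psr = (psr & ~kDelayedMask) | pending_psr_;
        }
    }
};
```

Writing a value that sets the `n` flag, PIL = 15, S = 1, and CWP = 3 makes the flag and PIL visible at once, while S and CWP stay at their old values until `commit_psr_pipeline()` has run three times.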

Unit-test trap

Tests that drive execute() directly, without going through step(), must call commit_psr_pipeline() three times by hand before asserting on delayed PSR fields. See tests/unit/test_handlers_special.cpp for the canonical pattern.

Trap dispatch

When a handler returns an ExecStatus other than Ok or Branch, the step loop:

1. Clears annul_next_.
2. Maps the ExecStatus to a SPARC tt (trap type) via status_to_tt().
3. Calls enter_trap(tt):
   - Decrements CWP and saves PC and nPC into %l1 and %l2 of the new (trap) window.
   - Sets S=1, ET=0, PS=old_S.
   - Computes TBR = (TBA & 0xFFFFF000) | (tt << 4).
   - Sets PC = TBR, nPC = TBR + 4.
4. If ET was already 0 at the moment the trap occurred, the core enters ErrorMode instead and the outer loop returns HaltReason::ErrorMode.

RETT undoes step 3: it restores S from PS, sets ET=1, increments CWP, and asks the loop to branch to the return target.
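The state changes made by trap entry and RETT can be sketched as follows. This is a simplified model under stated assumptions: `TrapState`, `trap_vector`, `enter_trap_sketch2`, `rett_sketch`, and the window count of 8 are hypothetical, the register file is not modelled (so nothing is saved into the trap window), and PSR updates are applied immediately instead of going through the write pipeline.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8;  // assumption for this sketch

struct TrapState {
    std::uint32_t pc = 0;
    std::uint32_t npc = 4;
    std::uint32_t tbr = 0;
    std::uint32_t cwp = 0;
    bool s = false;   // PSR.S
    bool ps = false;  // PSR.PS
    bool et = true;   // PSR.ET
    bool error_mode_ = false;
};

// TBR keeps its base address (bits 31:12) and receives tt in bits 11:4.
constexpr std::uint32_t trap_vector(std::uint32_t tbr, std::uint32_t tt) {
    return (tbr & 0xFFFFF000u) | ((tt & 0xFFu) << 4);
}

// Returns false if the core went into error mode instead of trapping.
inline bool enter_trap_sketch2(TrapState& c, std::uint32_t tt) {
    assert(tt <= 0xFFu);
    if (!c.et) {  // trap while traps are disabled
        c.error_mode_ = true;
        return false;
    }
    c.et = false;
    c.ps = c.s;  // remember the previous privilege level
    c.s = true;
    c.cwp = (c.cwp + kNumWindows - 1) % kNumWindows;  // decrement, wrapping
    c.tbr = trap_vector(c.tbr, tt);
    c.pc = c.tbr;
    c.npc = c.tbr + 4;
    return true;
}

// RETT restores S from PS, re-enables traps, and increments CWP.
inline void rett_sketch(TrapState& c) {
    c.s = c.ps;
    c.et = true;
    c.cwp = (c.cwp + 1) % kNumWindows;
}
```

For example, with a trap base of 0x40000000 and tt = 0x05, the handler entry point is 0x40000050. A second trap taken before RETT re-enables ET puts the sketch into error mode.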

Full trap reference

ErrorMode

SPARC V8 §7.1: a trap that fires while PSR.ET == 0 causes the processor to halt and signal an exception to the outside world. Lince models this by setting error_mode_ = true on the offending core and returning HaltReason::ErrorMode from the next run_for / run_until boundary.

The CLI reacts by dumping a post-mortem of every register on core 0. Library users can call emu->core(idx) and inspect pc(), psr(), tbr(), wim(), the global registers, and the active window.
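A library user could format their own post-mortem along these lines. The accessor names `pc()`, `psr()`, `tbr()`, and `wim()` come from the text above; `CoreView`, its field layout, the return types, and `post_mortem` are stand-ins invented for this sketch, since the real type returned by `emu->core(idx)` is not shown here. The trap type is recovered from TBR bits 11:4, which is often the first thing worth checking after an ErrorMode halt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in for whatever emu->core(idx) returns; return types are assumed.
struct CoreView {
    std::uint32_t pc_, psr_, tbr_, wim_;
    std::uint32_t pc() const { return pc_; }
    std::uint32_t psr() const { return psr_; }
    std::uint32_t tbr() const { return tbr_; }
    std::uint32_t wim() const { return wim_; }
};

// One-line summary of the control registers plus the decoded trap type.
inline std::string post_mortem(const CoreView& c) {
    char buf[96];
    const int n = std::snprintf(
        buf, sizeof buf, "pc=%08x psr=%08x tbr=%08x wim=%08x tt=%02x",
        static_cast<unsigned>(c.pc()), static_cast<unsigned>(c.psr()),
        static_cast<unsigned>(c.tbr()), static_cast<unsigned>(c.wim()),
        static_cast<unsigned>((c.tbr() >> 4) & 0xFFu));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```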

What the interpreter intentionally does not do

- No decode caching (every instruction is re-decoded on every fetch).
- No instruction-translation cache (no JIT, no IR).
- No batch execution (no quantum-internal optimisation across instructions).
- No speculative or out-of-order execution.

These are deliberate choices: see Design principles for the rationale.