Traps and interrupts

This page is the reference for how Tero models the SPARC V8 trap mechanism: trap types and tt encoding, the trap base register (TBR), the enter_trap / leave_trap (RETT) entry-exit sequence, the PSR ET/S/PS/CWP pipeline, register windows (CWP/WIM, SAVE/RESTORE over/underflow), WRPSR semantics, and how external interrupts from the IRQ(A)MP are sampled and delivered. Every claim below is anchored to the code that implements it.

It belongs to the Developer Manual. For the broader fetch-decode-execute loop see Execution model; for the scheduler and multi-core delivery see Multicore and timing; for the interrupt controllers themselves see IRQMP.


SPARC V8 trap model

A trap on SPARC V8 is a control transfer caused by an exceptional condition. Tero classifies traps the way the architecture manual (§7.1) does:

  • Synchronous (precise) traps — caused by the instruction the core is executing: illegal opcode, privileged instruction in user mode, FP disabled, window overflow/underflow, misaligned access, divide-by-zero, tag overflow, Ticc software traps. These are produced inside core::step from the ExecStatus an instruction handler returns.
  • Asynchronous (interrupt) traps — external interrupt requests routed through the IRQ(A)MP. These are sampled between instructions by the run loop (Emulator::sample_interrupts), never from inside a handler.

Both kinds funnel through the single architectural entry point CpuState::enter_trap (src/core/src/cpu_state.cpp:87), which performs the SPARC V8 §7.3 entry sequence. There is exactly one place that builds the trap frame, so synchronous and asynchronous traps are byte-identical at the window/PSR level.

Scope of the model

Tero models the integer-unit trap set the RTEMS leon3 testsuite exercises plus the FP traps. The MMU-related traps (data_access_MMU_miss, instruction_access_MMU_miss, etc.) are not generated because the SRMMU is deferred (see the roadmap). Watchpoint and r_register_access traps are likewise out of scope. A guest that depends on one of these traps will never receive it; do not assume otherwise.


Trap types (tt) and the dispatch mapping

The architectural tt constants live in src/core/include/tero/core/trap.hpp as enum class TrapType. Only the subset Tero can actually raise is enumerated:

TrapType                    tt        SPARC V8 trap                 Raised by
--------------------------  --------  ----------------------------  ------------------------------------------------------
InstructionAccessException  0x01      instruction_access_exception  bus error on I-fetch (step.cpp:55)
IllegalInstruction          0x02      illegal_instruction           unknown/UNIMP encoding, bad WRPSR.CWP, RETT with ET==1
PrivilegedInstruction       0x03      privileged_instruction        privileged op while S==0
FpDisabled                  0x04      fp_disabled                   FPop while PSR.EF==0
WindowOverflow              0x05      window_overflow               SAVE into a WIM-invalid window
WindowUnderflow             0x06      window_underflow              RESTORE/RETT into a WIM-invalid window
MemAddressNotAligned        0x07      mem_address_not_aligned       misaligned load/store/PC/RETT target
FpException                 0x08      fp_exception                  SoftFloat IEEE exception or invalid FP register
DataAccessException         0x09      data_access_exception         bus error on a load/store
TagOverflow                 0x0A      tag_overflow                  TADDccTV/TSUBccTV overflow
DivisionByZero              0x2A      division_by_zero              UDIV/SDIV with zero divisor
InterruptLevelBase + L      0x10 + L  interrupt_level_L             IRQ at level L ∈ 1..15
SoftwareTrapBase + n        0x80 + n  trap_instruction              Ticc with n = (rs1 + op2) & 0x7F

There are two ways a handler signals a trap, and core::step (src/core/src/step.cpp:87-93) reconciles them after execute() returns:

  1. By status. Most handlers return an ExecStatus (e.g. WinOverflow, PrivInsn). detail::status_to_tt (src/core/src/handlers_internal.hpp:245) is the single mapping from ExecStatus to tt. It returns std::nullopt for non-trap statuses (Ok, Unsupported, ErrorMode, TrapInsn).
  2. By explicit request. Handlers that must choose the tt byte themselves — Ticc (tt = 0x80 + imm7) and the FP handlers (tt = 0x08 with a stamped FSR.ftt) — call CpuState::raise_trap(tt) (src/core/include/tero/core/cpu_state.hpp:614), which latches pending_trap/pending_tt. The step loop checks has_pending_trap() first and uses that tt verbatim.
// src/core/src/step.cpp:87 — reconcile the two trap-signalling paths
std::optional<std::uint8_t> tt;
if (state.has_pending_trap()) {
    tt = state.pending_tt();          // Ticc / FP: handler chose tt
    state.clear_pending_trap();
} else {
    tt = detail::status_to_tt(status); // status-derived tt
}

Software-trap composition

Ticc does not have a fixed tt. exec_ticc (src/core/src/handlers_special.cpp:122) evaluates the condition, computes tt_num = (rs1 + op2) & 0x7F, and calls raise_trap(SoftwareTrapBase | tt_num). RTEMS uses these for system calls and for the window-flush/ta 0 paths.
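The composition above can be sketched as a small standalone function. This is only an illustration: kSoftwareTrapBase and ticc_tt are hypothetical names, not the identifiers used in exec_ticc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the SoftwareTrapBase constant in trap.hpp.
constexpr std::uint32_t kSoftwareTrapBase = 0x80U;

// Compose the tt byte for a taken Ticc from the two operand values.
// Only the low 7 bits of the sum are kept, so the result is always 0x80..0xFF.
constexpr std::uint8_t ticc_tt(std::uint32_t rs1_value, std::uint32_t op2_value) {
    const std::uint32_t tt_num = (rs1_value + op2_value) & 0x7FU;
    return static_cast<std::uint8_t>(kSoftwareTrapBase | tt_num);
}
```

Because the sum is masked to 7 bits before the base is applied, OR and addition give the same byte, which is why the table's 0x80 + n and the handler's SoftwareTrapBase | tt_num agree.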


The trap base register (TBR)

TBR is laid out per SPARC V8 §7.4 (tbr:: constants in src/core/include/tero/core/cpu_state.hpp:201):

 31                    12 11        4 3      0
+------------------------+-----------+--------+
|         TBA            |    tt     |  0000  |
+------------------------+-----------+--------+
   trap base (writable)    trap type   always 0
  • TBA [31:12] — the trap-table base, written by software with WRTBR. CpuState::write_tba (:427) writes only this field; it preserves tt and the zero field, which WRTBR must not disturb.
  • tt [11:4] — written only by trap dispatch (enter_trap step 1) and is read-only to software.
  • [3:0] — hardwired zero (each trap vector is 16 bytes = 4 instructions).

The full vector address is TBR = TBA | (tt << 4), giving each trap type a 4-instruction slot in the trap table.
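As a worked illustration of the field layout, the sketch below installs a tt and rewrites TBA on a raw 32-bit TBR value. The constant and function names are assumptions made for this example; the real masks are the tbr:: constants in cpu_state.hpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical field masks derived from the layout shown above.
constexpr std::uint32_t kTbaMask = 0xFFFFF000U;  // TBA, bits [31:12]
constexpr unsigned kTtShift = 4U;                // tt starts at bit 4
constexpr std::uint32_t kTtMask = 0xFFU << kTtShift;  // tt, bits [11:4]

// Trap dispatch: replace tt, keep TBA, leave bits [3:0] zero.
constexpr std::uint32_t install_tt(std::uint32_t tbr, std::uint8_t tt) {
    return (tbr & kTbaMask) | (static_cast<std::uint32_t>(tt) << kTtShift);
}

// WRTBR: replace TBA only, keep the tt that dispatch last wrote.
constexpr std::uint32_t write_tba(std::uint32_t tbr, std::uint32_t value) {
    return (value & kTbaMask) | (tbr & kTtMask);
}
```

With a trap table at 0x40000000, a window_overflow (tt = 0x05) vectors to 0x40000050 and software trap 3 (tt = 0x83) to 0x40000830.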


enter_trap — the entry sequence

CpuState::enter_trap(saved_pc, saved_npc, tt) (src/core/src/cpu_state.cpp:87) is the canonical SPARC V8 §7.3 trap-entry sequence. Order matters and is commented step-by-step in the source:

  1. TBR.tt ← tt — install the trap type, preserving TBA and the zero field (:91).
  2. CWP ← (CWP − 1) mod NWINDOWS (:97, NWINDOWS = 8). This must happen before writing the locals so r[17]/r[18] land in the handler's window, not the interrupted context's.
  3. Save the return pair: new-window %l1 (r17) ← saved_pc, %l2 (r18) ← saved_npc (:101, RegL1/RegL2). The handler reads these back and feeds them to RETT/JMPL.
  4. PS ← S, S ← 1, ET ← 0 (:104-108). Copying S into PS before forcing S=1 is the architectural order; ET=0 disables further trap delivery so a nested fault becomes a hard error (see Error mode).
  5. Jump to the vector: PC ← entry, nPC ← entry + 4 (:116). The vector is TBR normally, but TBA alone when single-vector trapping is on — see Single-vector trapping.
  6. Clear the pending annul flag (exec_.annul_next = false, :122). SPARC V8 §5.1.2.2: the delay-slot annul mechanism is per-CTI and does not survive a trap, so the first handler instruction is never silently dropped.
  7. Drop the pending-trap marker and wake the core (:126-127) so the step loop does not re-dispatch and a powered-down core resumes.
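The steps above can be sketched on a toy state struct. This is a simplified model under stated assumptions, not the real CpuState: TrapEntryModel, toy_enter_trap, and the per-window l1/l2 arrays are hypothetical, and step 7 (pending-trap marker and wake-up) is omitted.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8U;
constexpr std::uint32_t kTbaMask = 0xFFFFF000U;

// Hypothetical toy state: only the fields the entry sequence touches.
struct TrapEntryModel {
    std::uint32_t pc = 0U, npc = 4U, tbr = 0U, cwp = 0U;
    bool s = false, ps = false, et = true, annul_next = false;
    std::array<std::uint32_t, kNumWindows> l1{};  // %l1 of each window
    std::array<std::uint32_t, kNumWindows> l2{};  // %l2 of each window
};

// Like the real entry point, this does not check ET; the caller must.
inline void toy_enter_trap(TrapEntryModel& c, std::uint32_t saved_pc,
                           std::uint32_t saved_npc, std::uint8_t tt,
                           bool svt = false) {
    c.tbr = (c.tbr & kTbaMask) | (static_cast<std::uint32_t>(tt) << 4U);  // 1
    c.cwp = (c.cwp + kNumWindows - 1U) % kNumWindows;                     // 2
    c.l1[c.cwp] = saved_pc;                                               // 3
    c.l2[c.cwp] = saved_npc;
    c.ps = c.s;                                                           // 4
    c.s = true;
    c.et = false;
    const std::uint32_t entry = svt ? (c.tbr & kTbaMask) : c.tbr;         // 5
    c.pc = entry;
    c.npc = entry + 4U;
    c.annul_next = false;                                                 // 6
}
```

Decrementing CWP before the writes to l1/l2 is what places the return pair in the handler's window rather than the interrupted one.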

enter_trap does not check ET

enter_trap assumes the caller has already verified ET == 1. The precondition is enforced by core::step (step.cpp:96) and Emulator::sample_interrupts (emulator.cpp:1324) — they enter error mode or skip delivery when ET == 0. Leaving the check out of enter_trap lets unit tests exercise the entry path in isolation (src/core/include/tero/core/cpu_state.hpp:633).

Single-vector trapping (SVT)

LEON3/LEON4 support single-vector trapping via ASR17.SVT (bit 13, Asr17SvtBit in cpu_state.hpp:470). When svt_enabled() is true (:497), enter_trap enters at TBA (the tt offset is dropped) so a single handler dispatches every trap:

// src/core/src/cpu_state.cpp:116 — vector selection
const std::uint32_t tbr_v = tbr();
const std::uint32_t entry = svt_enabled() ? (tbr_v & tbr::TbaMask) : tbr_v;
set_pc(entry);
set_npc(entry + 4U);

This matches the Gaisler SIS oracle and the LEON3 manual. tt is still written to TBR so the single handler can read it back and branch.


RETT and leave_trap — the exit sequence

RETT is the only way out of a trap. The handler is exec_rett (src/core/src/handlers_special.cpp:140); the state mutation lives in CpuState::leave_trap (src/core/src/cpu_state.cpp:130).

Pre-checks (order matters)

exec_rett validates conditions in this exact order before touching state:

// src/core/src/handlers_special.cpp:145
if (s.et())  return ExecStatus::IllegalInsn;   // (1) ET==1 → illegal
if (!s.s())  return ExecStatus::PrivInsn;       // (2) S==0  → privileged
const std::uint32_t new_cwp = (s.cwp() + 1U) % NumWindows;
if (s.window_invalid(new_cwp)) return ExecStatus::WinUnderflow;  // (3) WIM
const std::uint32_t target = s.read_r(insn.rs1) + alu_op2(s, insn);
if ((target & 0x3U) != 0U) return ExecStatus::AlignError;        // (4) align
s.leave_trap(target);

Deliberate SIS-matching deviation from the manual

The SPARC V8 manual orders the checks as privileged-first when ET==1 && S==0. The Gaisler SIS reference simulator checks ET==1 → illegal first, regardless of S. Tero follows SIS bit-for-bit (handlers_special.cpp:141-144). The only state where the two differ — RETT with traps already enabled and in user mode — is unreachable for a real guest, so matching the oracle costs nothing and keeps the lockstep comparator green.
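To make the difference concrete, the sketch below encodes both orderings of the first two checks as pure functions of ET and S. The enum and function names are hypothetical, and the manual ordering is transcribed from the description above rather than from Tero's code.

```cpp
#include <cassert>

// Hypothetical result type covering only the ET/S pre-checks.
enum class RettCheck { Ok, Illegal, Privileged };

// SIS ordering, as followed by exec_rett: ET==1 wins regardless of S.
constexpr RettCheck sis_order(bool et, bool s) {
    if (et) return RettCheck::Illegal;
    if (!s) return RettCheck::Privileged;
    return RettCheck::Ok;
}

// Manual ordering as described above: privileged first when ET==1 && S==0.
constexpr RettCheck manual_order(bool et, bool s) {
    if (et && !s) return RettCheck::Privileged;
    if (et) return RettCheck::Illegal;
    if (!s) return RettCheck::Privileged;
    return RettCheck::Ok;
}
```

Enumerating all four (ET, S) combinations shows the two functions disagree only for ET==1, S==0.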

leave_trap state mutation

leave_trap(target) (cpu_state.cpp:130) performs the SPARC V8 §B.26 exit:

  1. CWP ← (CWP + 1) mod NWINDOWS — the inverse of enter_trap's decrement.
  2. S ← PS (re-arm the prior supervisor bit; PS itself is preserved so a nested trap below this RETT still has the right snapshot) and ET ← 1 (re-enable traps).
  3. request_branch(target) — the step loop applies target as the new nPC (PC ← old nPC), giving the architectural PC ← nPC; nPC ← target update through the same branch mechanism CALL/Bicc/JMPL use.
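A minimal sketch of the exit mutation, assuming a toy state struct (TrapExitModel and toy_leave_trap are hypothetical names). Unlike the real code, which defers the PC update to the step loop through request_branch, the sketch applies the delayed transfer inline so the end state is visible.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8U;

// Hypothetical toy state: only the fields the exit sequence touches.
struct TrapExitModel {
    std::uint32_t pc = 0U, npc = 4U, cwp = 0U;
    bool s = true, ps = false, et = false;
};

inline void toy_leave_trap(TrapExitModel& c, std::uint32_t target) {
    c.cwp = (c.cwp + 1U) % kNumWindows;  // 1: undo the entry decrement
    c.s = c.ps;                          // 2: restore S; PS is left as-is
    c.et = true;                         //    re-enable traps
    c.pc = c.npc;                        // 3: delayed transfer, PC <- nPC
    c.npc = target;                      //    nPC <- target
}
```

In the usual jmpl %l1 / rett %l2 pair, nPC already holds the saved PC when RETT executes, so the sketch ends with PC at the interrupted instruction and nPC at its successor.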

Trap-entry / RETT control flow

flowchart TD
    A["core::step: execute() returns ExecStatus"] --> B{has_pending_trap?}
    B -- yes --> C["tt = pending_tt (Ticc / FP)"]
    B -- no --> D["tt = status_to_tt(status)"]
    C --> E{tt has value?}
    D --> E
    E -- no --> N["advance PC/nPC<br/>(normal delay-slot rule)"]
    E -- yes --> F{PSR.ET == 1?}
    F -- "no" --> G["set_error_mode(true)<br/>halt, preserve PC/nPC"]
    F -- "yes" --> H["enter_trap(pc, npc, tt)"]
    H --> H1["TBR.tt = tt"]
    H1 --> H2["CWP = CWP-1 mod 8"]
    H2 --> H3["l1=pc, l2=npc"]
    H3 --> H4["PS=S; S=1; ET=0"]
    H4 --> H5["PC = SVT ? TBA : TBR<br/>nPC = entry+4"]
    H5 --> K["handler runs (ET=0)"]
    K --> L["RETT %l1/%l2-derived target"]
    L --> M{"ET==1? S==0?<br/>WIM(new_cwp)? aligned?"}
    M -- "any fails" --> P["raise illegal / priv /<br/>window_underflow / mem_not_aligned"]
    M -- "all pass" --> Q["leave_trap(target)"]
    Q --> Q1["CWP = CWP+1 mod 8"]
    Q1 --> Q2["S=PS; ET=1"]
    Q2 --> Q3["request_branch(target)"]
    Q3 --> R["resume interrupted code"]

WRPSR is applied immediately

SPARC V8 §5.1.2.3 permits WRPSR's effect on S, ET, PS, and CWP to be deferred up to three instructions (the ICC and PIL fields are always immediate). That deferral is implementation latitude, and Tero does not take it — it applies every writable PSR field at once, matching the Gaisler SIS oracle. Trap entry and exit set the PSR the same direct way; there is no pending-write buffer anywhere.

// src/core/src/cpu_state.cpp:69 — write_psr_writable (immediate)
void CpuState::write_psr_writable(std::uint32_t value) noexcept {
    const std::uint32_t new_val = (psr() & psr::ReadOnlyMask)
                                | (value & psr::WritableMask);
    set_psr(new_val);
}

Why WRPSR does not model the 3-instruction delay

Real SPARC software (RTEMS and every trap handler) pads WRPSR with three NOPs, so the observable result is identical whether the write lands immediately or after three instructions. The Gaisler SIS oracle (sparc.c WRPSR) applies it immediately, and Tero must match the oracle. An earlier delay model (pending_psr_ / commit_psr_pipeline()) diverged from SIS whenever a trap was taken inside the three-instruction window: trap entry dropped the still-pending CWP change, which desynced the register-window state on trap-dense SMP paths (for example, smpschededf03's ISR-exit window reload read a stale frame). The verbatim rationale is in the source comment at cpu_state.cpp:69-81.

write_psr_writable only touches the writable fields. psr::ReadOnlyMask (cpu_state.hpp:128) protects IMPL, VER, the reserved field, and EC (no coprocessor on LEON). EF is writable — Tero has an FPU and software toggles PSR.EF to enable/disable it. The WRPSR handler (exec_write_special, handlers_special.cpp:73) also rejects a CWP field ≥ NWINDOWS with illegal_instruction and requires S==1.
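The masking can be illustrated with concrete values. The sketch below derives a read-only mask from the standard SPARC V8 PSR bit positions (impl [31:28], ver [27:24], reserved [19:14], EC bit 13); the constant names and the exact mask value are assumptions for illustration and may not match psr::ReadOnlyMask bit for bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical masks built from the SPARC V8 PSR layout.
constexpr std::uint32_t kImplVerMask  = 0xFF000000U;  // impl + ver
constexpr std::uint32_t kReservedMask = 0x000FC000U;  // bits [19:14]
constexpr std::uint32_t kEcMask       = 0x00002000U;  // bit 13
constexpr std::uint32_t kReadOnlyMask = kImplVerMask | kReservedMask | kEcMask;
constexpr std::uint32_t kWritableMask = ~kReadOnlyMask;
constexpr std::uint32_t kCwpMask      = 0x1FU;        // bits [4:0]
constexpr std::uint32_t kNumWindows   = 8U;

// Merge a WRPSR value into the current PSR, keeping read-only fields.
constexpr std::uint32_t toy_write_psr(std::uint32_t psr, std::uint32_t value) {
    return (psr & kReadOnlyMask) | (value & kWritableMask);
}

// The handler-level check: a CWP field of NWINDOWS or more is rejected.
constexpr bool cwp_field_valid(std::uint32_t value) {
    return (value & kCwpMask) < kNumWindows;
}
```

Writing all ones to a PSR of 0xF3000080 yields 0xF3F01FFF under these masks: icc, EF, PIL, S, PS, ET, and CWP are set, while impl/ver, the reserved bits, and EC keep their previous values.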


Register windows

Tero models 8 windows (NumWindows = 8, cpu_state.hpp:31). The register file is one host-order blob (int_state_, an ir::GuestState) with the windowed slots laid out per layout:: so the reference interpreter and the IR/JIT share one representation (see Layers and modules).

CWP, the window file, and overlap

reg_offset(cwp, r) / window_slot(cwp, r) (cpu_state.hpp:63-80) map an architectural register number to a byte offset:

  • r0..r7 (globals) → the globals region, window-independent.
  • r8..r23 (outs + locals) → the current window.
  • r24..r31 (ins) → alias the outs of window (cwp + 1) mod 8. SAVE decrements CWP, so the caller sits at CWP+1; its outs are the callee's ins. This overlap is what makes SAVE/RESTORE cheap and is why window overflow/underflow detection is needed.
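The aliasing rule can be sketched with a flat index instead of byte offsets. The layout below (8 globals followed by 16 owned registers per window) is a hypothetical simplification and is not the layout:: scheme the real register file uses.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8U;
constexpr std::uint32_t kGlobals = 8U;
constexpr std::uint32_t kOwnedPerWindow = 16U;  // 8 outs + 8 locals

// Map (cwp, r) to an index into a flat file of 8 + 8 * 16 registers.
constexpr std::uint32_t toy_slot(std::uint32_t cwp, std::uint32_t r) {
    if (r < 8U) {
        return r;  // globals: same slot in every window
    }
    if (r < 24U) {
        return kGlobals + cwp * kOwnedPerWindow + (r - 8U);  // outs, locals
    }
    // ins: alias the outs of the caller's window, (cwp + 1) mod 8
    const std::uint32_t caller = (cwp + 1U) % kNumWindows;
    return kGlobals + caller * kOwnedPerWindow + (r - 24U);
}
```

For example, %i0 (r24) seen from window 5 and %o0 (r8) seen from window 6 resolve to the same slot, which is how a caller's outgoing arguments become the callee's incoming ones without any copy.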

WIM and window-invalid detection

WIM (window invalid mask) has one bit per window. CpuState::window_invalid(cwp) (cpu_state.hpp:414) is (wim() & (1 << (cwp % 8))) != 0. set_wim (:405) masks the value to the valid NWINDOWS bits. Software marks one window invalid to act as the overflow/underflow tripwire.
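A short sketch of the tripwire test, written as free functions over raw WIM and CWP values. The function names are assumptions for this example; the real accessors are members of CpuState.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8U;
constexpr std::uint32_t kWimMask = (1U << kNumWindows) - 1U;  // 0xFF

// Keep only the bits that correspond to implemented windows.
constexpr std::uint32_t mask_wim(std::uint32_t value) {
    return value & kWimMask;
}

// True when the WIM bit for the given window is set.
constexpr bool window_invalid(std::uint32_t wim, std::uint32_t cwp) {
    return (wim & (1U << (cwp % kNumWindows))) != 0U;
}

// SAVE targets CWP - 1; RESTORE and RETT target CWP + 1.
constexpr bool save_would_overflow(std::uint32_t wim, std::uint32_t cwp) {
    return window_invalid(wim, (cwp + kNumWindows - 1U) % kNumWindows);
}
constexpr bool restore_would_underflow(std::uint32_t wim, std::uint32_t cwp) {
    return window_invalid(wim, (cwp + 1U) % kNumWindows);
}
```

With WIM = 0x02 (window 1 invalid), a SAVE from window 2 would overflow and a RESTORE from window 0 would underflow, while a SAVE from window 3 proceeds normally.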

SAVE / RESTORE over/underflow

SAVE (exec_save, handlers_regwin.cpp:11) and RESTORE (exec_restore, :26) both compute the destination window, check WIM before committing, and only then move CWP and write the result:

// src/core/src/handlers_regwin.cpp:11 — SAVE
const std::uint32_t new_cwp = (s.cwp() + NumWindows - 1U) % NumWindows;
if (s.window_invalid(new_cwp)) return ExecStatus::WinOverflow;   // trap, no commit
const std::uint32_t sum = s.read_r(insn.rs1) + alu_op2(s, insn);
s.set_cwp(new_cwp);
s.write_r(insn.rd, sum);

RESTORE is the mirror image (+1 window, WinUnderflow). Because the WIM check precedes the CWP move, a faulting SAVE/RESTORE leaves CWP untouched: the trap handler sees the pre-instruction window, spills/refills, and re-executes the instruction (the saved PC in %l1 points back at it).

End-to-end window roundtrip test

tests/integration/test_regwin_roundtrip.cpp loads a bare-metal SPARC program (tests/guest-programs/asm/regwin-roundtrip/regwin_roundtrip.S) that chains 7 SAVEs through one overflow trap and 7 RESTOREs through one underflow trap, verifying every window's %l0 marker survives the trap → handler-spill → RETT → retry loop. The asm uses RTEMS-style WIM rotation (right for overflow, left for underflow), adapted from bsps/sparc/leon3/start/win_ovflow.S.
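The WIM rotation the handlers perform can be sketched in C++ for an 8-window file. This is an illustration of the rotate-right / rotate-left idea only; the function names are hypothetical and the real handlers are SPARC assembly.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8U;
constexpr std::uint32_t kWimMask = (1U << kNumWindows) - 1U;

// Overflow handler: move the invalid bit one window down (toward CWP - 1).
constexpr std::uint32_t wim_rotate_right(std::uint32_t wim) {
    return ((wim >> 1U) | (wim << (kNumWindows - 1U))) & kWimMask;
}

// Underflow handler: move the invalid bit one window up (toward CWP + 1).
constexpr std::uint32_t wim_rotate_left(std::uint32_t wim) {
    return ((wim << 1U) | (wim >> (kNumWindows - 1U))) & kWimMask;
}
```

The two rotations are inverses, so an overflow followed by the matching underflow returns the tripwire to its original window.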


Hardware interrupt sampling and delivery

External interrupts are not raised inside instruction handlers. The run loop samples them between instructions through Emulator::sample_interrupts(core_idx) (src/runtime/src/emulator.cpp:1285). The flow is query → wake → gate → ack → enter_trap:

  1. Query the controller: pending_mask(core_idx) (emulator.cpp:1292) returns the per-core bitmap of asserted lines from the IRQ(A)MP. Empty mask → return.
  2. Find the level: scan bits MaxIrqLevel (15) down to 1; the highest set bit is the candidate interrupt level (:1302-1308).
  3. Wake a powered-down core (:1320). A pending interrupt wakes the core regardless of PSR.ET — RTEMS SMP boot relies on this, parking the secondary CPU with ET=0 and waking it with an IPI.
  4. Gate by ET (:1324): if ET==0, leave the IRQ pending and return — the core resumes at its current PC.
  5. Gate by PIL (:1330): SPARC V8 §7.1 — the interrupt is taken only when level > PSR.PIL. Otherwise it stays pending in the controller.
  6. Acknowledge: acknowledge(core, decision.ack_mask) auto-clears the pending bit (or the force bit if forced) per GR712RC §8. The engine passes the controller bits through opaquely — the ack_mask is formed by the architecture in evaluate_interrupt, where the GRLIB level == bit identity (1u << level) now lives, so the engine never reconstructs it. See the force-precedence rule (Decision 39) in IRQMP.
  7. Enter the trap (:1347): enter_trap(pc, npc, 0x10 + level).

A lower-level interrupt stays pending until either the PIL drops or the handler clears the source (writing ICR/the force register).
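The level scan and the two gates can be sketched as one pure function. This is a simplified model, assuming hypothetical names (pick_interrupt, interrupt_tt); it leaves out the wake-up and acknowledge steps, which have side effects on the core and the controller.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr int kMaxIrqLevel = 15;

// Return the level to deliver, or nullopt when nothing is taken.
inline std::optional<int> pick_interrupt(std::uint32_t pending_mask, bool et,
                                         std::uint32_t pil) {
    int level = 0;
    for (int l = kMaxIrqLevel; l >= 1; --l) {  // highest level wins
        if (((pending_mask >> l) & 1U) != 0U) {
            level = l;
            break;
        }
    }
    if (level == 0) return std::nullopt;  // nothing asserted on 1..15
    if (!et) return std::nullopt;         // ET gate: stays pending
    if (static_cast<std::uint32_t>(level) <= pil) {
        return std::nullopt;              // PIL gate: stays pending
    }
    return level;
}

// tt for a delivered interrupt: 0x10 + level.
constexpr std::uint8_t interrupt_tt(int level) {
    return static_cast<std::uint8_t>(0x10 + level);
}
```

With levels 3 and 10 both pending, ET set, and PIL = 0, the sketch selects level 10 and the resulting tt is 0x1A; level 3 remains pending in the controller.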

Extended interrupts (EIRQ, lines 16..31)

The SPARC tt field encodes only 15 interrupt levels (tt = 0x11..0x1F), so IRQs 16..31 are redirected. IrqMP::pending_mask (src/peripherals/src/irqmp.cpp:154) folds any masked extended-IRQ bit onto the MPSTAT.EIRQ redirection level so sample_interrupts (which scans only 1..15) sees it:

// src/peripherals/src/irqmp.cpp:161 — fold extended IRQs to the EIRQ level
const std::uint32_t eirq_level = (aload(mpstat_) >> MpstatEirqShift) & MpstatFieldMask;
if (eirq_level != 0 && (mask & ExtendedIrqMask) != 0U) {
    mask |= (1U << eirq_level);
}

On acknowledge of the EIRQ level, the controller pops the actual extended index into EID[cpu] so the handler can read which line fired (irqmp.cpp:98). The GR740 sibling IrqAMP (src/peripherals/src/irqamp.cpp:136) implements the same redirection — they are distinct GRLIB IP cores; do not bolt GR740 features onto IrqMP.

Self-directed IPIs (instruction-boundary latency)

A core can interrupt itself (RTEMS uses this for some scheduler paths). A self-IPI is just the core writing its own IFORCE/force bit, which pending_mask then reports. Round-boundary sampling would delay it a full quantum; instead Emulator::poll_self_interrupt(core_idx) (emulator.cpp:1371) re-samples at each instruction boundary on both the Switch path (emulator.cpp:648) and the JIT path (emulator.cpp:1007):

// src/runtime/src/emulator.cpp:1371
void Emulator::poll_self_interrupt(std::size_t core_idx) {
    auto& state = cores_[core_idx];
    // enter_trap saves PC/nPC as the return pair, well-defined only at an
    // instruction boundary; in a delay slot (npc != pc+4) the IRQ waits one
    // more instruction, exactly as on hardware.
    if (state.npc() != state.pc() + 4U) return;
    sample_interrupts(core_idx);
}

The delay-slot guard is load-bearing: taking a trap mid-delay-slot would save a return pair that does not reconstruct the branch. This fixed smpmulticast01 across all six SMP configs (see the campaign notes referenced from MEMORY.md).


Error mode (ET=0)

If a trap condition arises while PSR.ET == 0, SPARC V8 §7.3 says the processor enters error_mode and halts until reset. core::step detects this before calling enter_trap:

// src/core/src/step.cpp:95
if (tt.has_value()) {
    if (!state.et()) {
        state.set_error_mode(true);   // preserve PC/nPC for post-mortem
        return {ExecStatus::ErrorMode};
    }
    state.enter_trap(pc, npc, *tt);
    return {status};
}

error_mode() is a one-way latch (cpu_state.hpp:663). On the next step the loop short-circuits at step.cpp:15 and returns ErrorMode without fetching, so the caller sees a stable status. The run loop scans every core for error mode each round and returns HaltReason::HaltedMode (or, if a GDB client is attached, redirects through the stub with the appropriate stop signal — SIGSEGV for an unclassified access fault — at the offending PC).

How the stop signal is chosen (architecture-neutral)

The GDB transport never reads the SPARC tt. The halted CPU entity reports a coarse, ISA-neutral GdbFaultClass (src/interfaces/include/tero/igdb_registers.hpp) through the IGdbRegisters::gdb_fault_class() capability; the stub turns that into an RSP stop signal via stop_signal_from_fault_class (src/runtime/src/gdb_stub_transport.cpp). The SPARC tt → class decode (gdb_fault_class_from_tt) lives in the SPARC CPU entity (src/runtime/src/cpu.cpp), so a non-SPARC core supplies its own mapping and the transport stays architecture-agnostic.

The CLI prints a full core-0 post-mortem (PC/nPC/TBR/tt/PSR/WIM and the register file) when this happens. Library users can call emu->core(0) and inspect any field directly.


Special cases and policies

Tagged-arithmetic trap commits its result

TADDccTV/TSUBccTV (SPARC V8 §B.14 and §B.16) compute the result and update icc first, then trap if V is set (Decision 10). The handler writes rd and icc before returning ExecStatus::TagOverflow. The manual leaves the result "unpredictable" when the trap fires; deterministically committing it makes test expectations reproducible.

FP-disabled vs FP-exception

  • op=10, op3 ∈ {0x34, 0x35} (FPop1/FPop2) decode as InsnKind::FpOp. When PSR.EF == 0 the dispatch raises FpDisabled (tt=0x04); when enabled, the SoftFloat-backed handler runs.
  • A SoftFloat IEEE exception or invalid FP-register access raises fp_exception (tt=0x08) via raise_fp_exception (src/core/src/fpu_handlers.cpp:30), which stamps FSR.ftt before calling raise_trap.
  • op=10, op3 ∈ {0x36, 0x37} (coprocessor) decode as InsnKind::Unknown and trap with IllegalInstruction — there is no coprocessor on LEON (PSR.EC stays 0).

Misaligned PC

A misaligned PC synthesizes mem_address_not_aligned against the fetch (step.cpp:37) and flows through the same trap path as any other synchronous trap — there is no separate fetch-alignment trap type.


How to add or debug a trap path

Adding a new synchronous trap:

  1. Add the tt to TrapType in src/core/include/tero/core/trap.hpp if it is not already there.
  2. Either return a new ExecStatus from the handler and extend detail::status_to_tt (handlers_internal.hpp:245), or call raise_trap(tt) directly if the handler must choose the tt byte.
  3. Per project rule, write at least three tests (normal, edge, trap) under tests/unit/. tests/unit/test_traps.cpp is the home for trap-entry tests.

Adding a new interrupt source: drive the IRQ(A)MP through the IInterruptSource bridge handed to the peripheral (PeripheralContext::irqs). Sampling, gating, and tt = 0x10 + level delivery are already handled by sample_interrupts — you do not touch enter_trap.

Debugging: the --trace observer (IEmulatorObserver) fires on_trap_taken(core, level, tt, pc) from sample_interrupts (emulator.cpp:1342) for interrupt traps. For synchronous traps, run under the lockstep comparator (scripts/lockstep_compare.py) against SIS — a per-core PC divergence at a trap boundary localises a wrong tt, a wrong window move, or a PSR-pipeline mismatch. The error-mode post-mortem dumps the faulting frame directly.

When in doubt, read the manual

Trap semantics, register-window corner cases, and RETT ordering are exactly the areas where guessing creates multi-day bugs. The authoritative sources are the SPARC V8 Architecture Manual (§4.2 PSR, §7 traps, §B.26 RETT) and the SIS oracle. If a behaviour is not in the code or the manuals, stop and ask — do not invent it.