Status archive — historical session logs

This page archives the chronological session logs and per-phase completion summaries that piled up while Lince's MVP and follow-up scope were being built. They are preserved here for historical context and traceability; the current state of the project lives in Status.

The architectural decisions taken during implementation are indexed in Design decisions — that page is the searchable, authoritative list. The summaries below give the chronological context of when and why each decision happened.


Design decisions taken during implementation (Phases 0–4)

These are judgment calls beyond what CLAUDE.md already freezes. They should survive context wipes — the reasoning lives here, not in diffs. (Authoritative index in architecture/decisions.md.)

  1. SystemBus is non-copyable and non-movable. It owns RAM via unique_ptr<Ram> and peripherals will cache raw pointers into it; moving the bus would invalidate them. If you need to relocate a bus, construct a new one.
  2. Big-endian translation lives in SystemBus, not in Ram. Ram holds raw bytes as they appear on the wire; SystemBus::{encode,decode}_be() does the BE shuffle at the typed-access boundary. Keeps RAM trivially snapshot-able and matches how a real memory controller behaves.
  3. Single region per access. A byte-span access that straddles two regions returns ErrorCode::BusError — real hardware latches one transaction against one target. Do not silently split transfers.
  4. MMIO requires 1 / 2 / 4-byte naturally aligned accesses. Anything else is rejected at the bus with BusError or AlignmentError. CPU alignment traps belong in Phase 2 handlers, not in the bus.
  5. Bus does not own peripherals. SystemBus::map_peripheral takes a non-owning IPeripheral*. When Emulator arrives (Phase 5) it will own unique_ptr<IPeripheral> and hand raw pointers to the bus. For now, tests own the DummyPeripheral directly.
  6. Warning set is stricter than the CLAUDE.md minimum. The lince::warnings INTERFACE target enables, on top of -Wall -Wextra -Wpedantic -Werror: -Wshadow -Wnon-virtual-dtor -Wold-style-cast -Wcast-align -Wunused -Woverloaded-virtual -Wconversion -Wsign-conversion -Wnull-dereference -Wdouble-promotion -Wformat=2. Everything builds with 0 warnings / 0 errors under this set.
  7. tests/support/dummy_peripheral is a test fixture, not a module. It lives in the test tree to exercise the IPeripheral + DMA contract end-to-end. It must not leak into lince_peripherals.
  8. FPop1/FPop2 decoded as InsnKind::FpOp. The decoder recognizes op=10, op3∈{0x34, 0x35} as FP instructions rather than lumping them into Unknown. The handler returns ExecStatus::FpDisabled (tt=0x04) for all FPop encodings, which is the correct LEON3 behavior when no FPU is present (PSR.EF=0). Coprocessor opcodes (op3∈{0x36, 0x37}) remain Unknown and map to IllegalInstruction.
  9. Instruction-fetch vs data-access bus errors are distinguished. The step() loop uses ExecStatus::InsnFetchError (tt=0x01, instruction_access_exception) for failed fetches and BusError (tt=0x09, data_access_exception) for load/store failures. Both were previously mapped to DataAccessException.
  10. TADDccTV / TSUBccTV set icc and write rd even when trapping. Per SPARC V8 §B.30, the tagged-add/sub trap variant computes the result and condition codes first, then traps if V is set. The handler writes rd and icc before returning ExecStatus::TagOverflow, matching the architectural spec that the result is "unpredictable" — our choice is to write it anyway for determinism in tests.
  11. handlers.cpp split into category files. The monolithic handlers.cpp was split into handlers_alu.cpp, handlers_branch.cpp, handlers_loadstore.cpp, handlers_regwin.cpp, handlers_special.cpp, with shared helpers (alu_op2, eval_cond) in handlers_internal.hpp. The public execute() dispatcher remains in handlers.cpp. This keeps each file focused without changing the public API.
  12. MemCtrl (FTMCTRL) is a passive stub. MCFG1–MCFG4 are readable/writable with no side effects. MCFG3 bit 27 (reserved, reads as 1) is forced. No timing, no bank switching — just enough for the RTEMS probe.
  13. IrqMP IFORCE write semantics. Writing IFORCE uses a clear-then-set protocol: the upper 16 bits clear bits, the lower 16 bits set bits (both masked to IRQ lines 1–15). This matches GRLIB behavior where software can atomically set and clear force bits in one write.
  14. IrqMP pending_mask(0) includes IFR0. CPU 0's pending mask is IPEND | IFR0 | (current_mask & IFR0), i.e., the CPU-0-local force register contributes directly. For CPU N>0, pending_mask(N) is IPEND | (current_mask & IFRN). This matches the GR712RC single-CPU force register design.
  15. GPTimer control register writable mask is 0x2B. Bits 0 (EN), 1 (RS), 3 (IE), 5 (CH) are directly writable. Bit 2 (LD) is a write-only trigger: writing 1 loads the counter from the reload register, and the bit then reads back as 0. Bit 4 (IP) uses write-0-clear semantics (writing 1 has no effect, writing 0 clears the pending bit). Bit 6 (DH) is read-only 0.
  16. GPTimer prescaler underflow logic. On tick(), the prescaler counter is decremented first; when it reaches zero, the prescaler value is reloaded from the reload register and all enabled sub-timers are ticked (counter decrement, underflow → reload if RS=1 / stop if RS=0, IP set if IE=1).
  17. GPTimer timer4 watchdog defaults. After reset, timer4 has EN=RS=IE=1 and counter/reload both set to 0xFFFF. This matches the GR712RC default where the watchdog timer is armed and must be disabled or fed by software. (The HW values audit of 2026-04-24 later corrected the RS claim: the reset value 0x09 has EN=1, RS=0, IE=1.)
  18. ApbUart uses std::queue<uint8_t> for the RX FIFO (max 8 entries). No TX FIFO is modeled — transmit() drains immediately via ICharacterDevice. The status register bit 31 (FA, FIFO available) always reads as 1 since the queue fits within 8 entries.
  19. PeripheralContext now includes ICharacterDevice*. Added for APBUart to inject console I/O. Default is nullptr; the Emulator (Phase 5) will wire it to the configured character device.
  20. All peripheral MMIO handlers reject non-word accesses. Byte and half-word reads/writes return ErrorCode::AlignmentError. This is stricter than GR712RC (which allows byte accesses to the UART data register), but matches the MVP approach: defer narrowing support to when RTEMS actually requires it.
  21. BA encoding uses disp22=0, not 1. Earlier test_bare_metal.cpp helpers encoded the unconditional branch BA with disp22=1, which targets PC+4 (the delay slot itself) and silently broke any test that relied on BA to land on a specific instruction. The SPARC V8 convention is BA,a to the label; the fix in the test helpers sets disp22=0 so that BA .+0 is a proper self-branch where one is intended.
  22. hello_uart.S must enable CTRL.TE before transmitting. ApbUart drops writes to the data register unless the TX enable bit is set in the control register. Any asm test that writes directly to APBUart MMIO must first st the TE bit into CTRL. Missing this is silent — the test "runs" but no output is produced.
  23. Emulator exposes injection, not construction, for services. set_character_device() and set_logger() swap the defaults at runtime. This lets the CLI (lince_app) wire StdoutCharDevice / StdoutLogger without forcing them into EmulatorConfig, and lets tests swap in CapturingCharDevice without touching config. SMP2 wrappers will use the same injection points.
  24. Idle-time skip is bounded to 1 ms. When all cores are powered down, run_until() jumps simulated time forward to the next event. However, the GPTimer uses direct IInterruptSource::raise() calls rather than the EventScheduler, so next_event_time() does not account for timer interrupts. Without a bound, time would jump to the deadline and the GPTimer's periodic interrupt would be delivered only once, preventing the RTEMS clock from advancing. The 1 ms bound (kMaxIdleNs) ensures timer interrupts arrive at roughly their expected rate even when all cores are idle.
  25. 4 KB RAM mapped at physical address 0x0. Real GR712RC boots from ROM at address 0. The RTEMS idle loop does lda [%o0] 0x1c, %g0 where %o0 = 0xFFFFFFF0; the SPARC V8 address wrap-around means the access lands at physical address 0. Mapping 4 KB of RAM at address 0 avoids trap 9 (DataAccessException) in the idle loop.
  26. GPTimer bootloader prescaler initialization. The emulator simulates the GR712RC ROM bootloader's timer setup by writing to the prescaler value and reload registers during initialize(). Without this, the prescaler counter starts at 0 and takes 0xFFFF ticks (≈65 ms) before the first underflow, greatly delaying the first timer interrupt.
  27. Secondary cores start in power-down mode. On real GR712RC hardware, only the primary core (CPU 0) starts executing at reset. Secondary cores are parked in power-down mode (wr %g0, %asr19) and wait for the primary to release them via IRQMP. The emulator now sets is_powered_down = true for all cores except CPU 0 after loading the ELF image.
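
The bounded idle skip from Decision 24 can be sketched as follows. This is a minimal illustration, not the emulator's code: only kMaxIdleNs and the 1 ms value come from the text above, while next_idle_wakeup() and its parameters are hypothetical names, and overflow of the 64-bit time value is ignored.

```cpp
#include <algorithm>
#include <cstdint>

// 1 ms bound from Decision 24, expressed in nanoseconds.
constexpr std::uint64_t kMaxIdleNs = 1'000'000ULL;

// Hypothetical helper: picks the simulated time to jump to when every core
// is powered down. `next_event_ns` is what the event scheduler reports; it
// does not know about GPTimer interrupts, hence the extra bound.
constexpr std::uint64_t next_idle_wakeup(std::uint64_t now_ns,
                                         std::uint64_t next_event_ns,
                                         std::uint64_t deadline_ns) {
    const std::uint64_t bounded = now_ns + kMaxIdleNs;
    // Never skip past the next scheduled event, the run_until() deadline,
    // or the 1 ms idle bound, whichever comes first.
    return std::min({next_event_ns, deadline_ns, bounded});
}
```

With no event scheduled for 5 ms, the helper stops after 1 ms, which gives the timer a chance to tick and raise its interrupt before the next skip.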

catch_discover_tests is configured so Catch2's SKIP(...) output is caught by ctest as a skip, not a failure. This keeps the counter honest and lets conditional integration tests like test_rtems_boot live in-tree without breaking CI.


Phase 3 completion summary

Phase 3 (Traps, window overflow, privileged instructions) is complete. The five-step plan from plans/phase3_traps/ was implemented:

  • Step 1: PSR, WIM, TBR state and access instructions (RD/WR special registers with privilege checks). CpuState exposes write_psr_writable, write_tba, window_invalid, and per-field accessors.
  • Step 2: Trap dispatch core. CpuState::enter_trap() / leave_trap(), raise_trap(), has_pending_trap(). The step() loop dispatches synchronous traps through status_to_tt() and handles ErrorMode.
  • Step 3: Window overflow/underflow. SAVE checks WIM before rotating CWP and returns WinOverflow; RESTORE returns WinUnderflow. These trap statuses flow through the step loop into enter_trap.
  • Step 4: RETT and privileged instructions. Full SPARC V8 §B.26 pre-check ordering (S→ET→WIM→alignment). leave_trap() restores S←PS, ET←1, and requests a branch to the return target.
  • Step 5: Remaining synchronous traps:
      • TagOverflow for TADDccTV/TSUBccTV (writes result+icc first).
      • FpDisabled for all FPop1/FPop2 encodings.
      • InsnFetchError (tt=0x01) distinguished from BusError (tt=0x09).
      • DivisionByZero (already in status_to_tt from Phase 2 handler).
      • All privileged registers now produce proper trap entries through step().
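
The Step 3 window check can be illustrated with a small standalone model. It is a sketch under assumptions: ExecStatus, WinOverflow and WinUnderflow are names from this page, NWINDOWS = 8 follows from the NWINDOWS-1=7 field recorded in the HW values audit, and WindowState, do_save() and do_restore() are hypothetical stand-ins for the real CpuState and handlers.

```cpp
#include <cstdint>

constexpr std::uint32_t kNumWindows = 8;  // NWINDOWS (ASR17 reports NWINDOWS-1 = 7)

enum class ExecStatus { Ok, WinOverflow, WinUnderflow };

// Hypothetical reduced state: only the fields the window check needs.
struct WindowState {
    std::uint32_t cwp = 0;  // current window pointer
    std::uint32_t wim = 0;  // window invalid mask, one bit per window
};

// SAVE moves to window CWP-1 (mod NWINDOWS). WIM is checked first, so CWP
// is left untouched when the target window is marked invalid.
inline ExecStatus do_save(WindowState& w) {
    const std::uint32_t next = (w.cwp + kNumWindows - 1U) % kNumWindows;
    if ((w.wim >> next) & 1U) return ExecStatus::WinOverflow;
    w.cwp = next;
    return ExecStatus::Ok;
}

// RESTORE moves to window CWP+1 (mod NWINDOWS) with the same WIM check.
inline ExecStatus do_restore(WindowState& w) {
    const std::uint32_t next = (w.cwp + 1U) % kNumWindows;
    if ((w.wim >> next) & 1U) return ExecStatus::WinUnderflow;
    w.cwp = next;
    return ExecStatus::Ok;
}
```

Returning a status instead of trapping inside the handler mirrors the flow described above, where the step loop turns the status into a trap entry.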

Phase 4 completion summary

Phase 4 (Minimal Peripherals) is complete. Four peripherals implemented, each with full unit tests:

  • MemCtrl (FTMCTRL stub): 4 readable/writable config registers (MCFG1–MCFG4). MCFG3 bit 27 forced to 1. No side effects, no timing. Enough for the RTEMS memory controller probe.

  • IrqMP (multi-core interrupt controller): 31 interrupt lines (1–15 per-CPU maskable, 16–28 broadcast), 2 CPU contexts (mask, force, ICR), MPSTAT read-only fields, IPEND/IFORCE with clear-then-set write semantics. pending_mask() computes per-CPU interrupt bitmap for the core step loop. external_assert() routes broadcast bits to all CPU force registers.

  • GPTimer (4 sub-timers + prescaler): Configurable prescaler, per-timer counter/reload/control registers. Control writable mask 0x2B; LD is write-only trigger; IP is write-0-clear. tick() decrements prescaler, then sub-timers on underflow. Timer4 defaults to watchdog mode (EN+RS+IE, counter/reload = 0xFFFF). raise_count exposed for test verification.

  • ApbUart (serial port): Data register with std::queue<uint8_t> RX FIFO (8 deep), status register with DR/RI/TI/TE/FA bits, control register with IT+IR+TE+RI enables, scaler register. TX drains immediately through ICharacterDevice*. update_irq() calls IInterruptSource::raise/lower based on enabled interrupt conditions.

All peripherals implement IPeripheral and receive PeripheralContext via attach(). ICharacterDevice* was added to PeripheralContext for ApbUart.
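
The GPTimer control-register write rules summarized above (writable mask 0x2B, LD as a write-only trigger, IP as write-0-clear) can be sketched like this. SubTimer and write_ctrl() are hypothetical names, and the LD action (copying the reload value into the counter) is assumed here from the GRLIB description of the bit; the real peripheral may be structured differently.

```cpp
#include <cstdint>

constexpr std::uint32_t kCtrlWritable = 0x2B;     // EN(0) | RS(1) | IE(3) | CH(5)
constexpr std::uint32_t kCtrlLd       = 1U << 2;  // write-only load trigger
constexpr std::uint32_t kCtrlIp       = 1U << 4;  // sticky interrupt-pending bit

// Hypothetical reduced sub-timer: just the three registers involved.
struct SubTimer {
    std::uint32_t counter = 0;
    std::uint32_t reload  = 0;
    std::uint32_t ctrl    = 0;
};

inline void write_ctrl(SubTimer& t, std::uint32_t value) {
    // Write-0-clear: IP stays set only if it was set and the write carries 1.
    const std::uint32_t ip = t.ctrl & value & kCtrlIp;
    // Only the directly writable bits are stored; LD and DH (bit 6) never are.
    t.ctrl = (value & kCtrlWritable) | ip;
    // LD acts as a trigger: load the counter, do not keep the bit.
    if (value & kCtrlLd) t.counter = t.reload;
}
```

Writing 0x7F to a timer with IP already set therefore reads back as 0x3B: the four writable bits plus the preserved IP bit, with LD and DH dropped.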

Phase 5 completion summary

Phase 5 (RTEMS Boot) is complete. The emulator successfully boots a real RTEMS hello-world.elf up to the point where it prints "Hello World" via the UART and gracefully halts using ta 0.

Key accomplishments during the bring-up:

  • Stack Pointer logic: Fixed %sp initialization to top-of-RAM in the ELF Loader to match reference emulator behaviour and prevent INTERNAL_ERROR_NO_MEMORY_FOR_HEAP.
  • AMBA PnP emulation: Mapped memory and dummy AMBA Plug&Play structs dynamically inside Emulator::initialize so that the standard devices (IRQMP, APBUART, GPTIMER) appear where RTEMS expects them. Also fixed a uint32_t boundary wrap-around bug in AddressRange.
  • FPU Stubbing: Addressed traps caused by hard-float libc.a binaries that unconditionally executed FPU operations (ldd, std, FBfcc, st %fsr). The emulator forced the FPU bits (PSR.EF, PSR.EC) to read-only 0, indicating soft-float, and gracefully NOPed any FPU instruction it encountered. (Superseded by the SoftFloat 3e integration that landed later.)

Deferred (out of MVP scope, later resolved)

  • sp04 idle-loop hang: Fixed in Phase 6. Idle-time skipping now jumps simulated time past the sleeping interval; sp04 reaches *** END OF TEST reliably.
  • sp11 ErrorMode mid-run: Fixed. The root cause was a stale annul_next_ flag in CpuState::enter_trap() (Decision 37).

Decisions taken in the 2026-04-25 session (Phase 6 completion)

  1. enter_trap() clears annul_next_. SPARC V8 §5.1.2.2 specifies that the annul-bit mechanism is per-CTI and does not persist past a trap entry. The emulator's CpuState::enter_trap() now explicitly clears annul_next_ so that the first instruction of an ISR is not silently dropped when a hardware interrupt arrives immediately after an annulled delay slot at the end of a quantum. This was the root cause of sp11's ErrorMode crash.

  2. WRPSR unit tests drain the 3-instruction PSR pipeline. SPARC V8 §5.1.2.3 mandates that changes to S, ET, PS, and CWP are not visible until three instructions after a WRPSR. The write_psr_writable() implementation buffers those delayed fields in pending_psr_ and commits them via commit_psr_pipeline(), called by the step loop. Unit tests that call execute() directly (bypassing the step loop) must manually call commit_psr_pipeline() three times before asserting on delayed fields. This was the cause of two test failures (#108 and #149); the fix is in the tests, not the implementation.
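
Why direct-execute tests need three explicit commits can be seen in a reduced model of the delayed write. This is a sketch only: commit_psr_pipeline() is the name used above, but PsrModel, write_delayed() and the whole-register buffering are hypothetical simplifications (the real write_psr_writable() delays only S, ET, PS and CWP).

```cpp
#include <cstdint>
#include <optional>

constexpr std::uint32_t kSBit = 1U << 7;  // PSR.S
constexpr int kPsrDelay = 3;              // instructions before the write is visible

// Hypothetical model: buffers one pending PSR value and exposes it after
// commit_psr_pipeline() has been called kPsrDelay times.
class PsrModel {
public:
    void write_delayed(std::uint32_t value) {
        pending_ = value;
        countdown_ = kPsrDelay;
    }

    // Called once per executed instruction by the step loop; tests that
    // bypass the step loop must call it themselves.
    void commit_psr_pipeline() {
        if (pending_ && --countdown_ == 0) {
            psr_ = *pending_;
            pending_.reset();
        }
    }

    std::uint32_t psr() const { return psr_; }

private:
    std::uint32_t psr_ = kSBit;  // supervisor mode at reset
    std::optional<std::uint32_t> pending_;
    int countdown_ = 0;
};
```

A test that asserts on S right after the write still sees the old value; only the third commit_psr_pipeline() call makes the new one visible.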

Decisions taken in the 2026-04-24 session

  1. Emulator::add_peripheral injects ctx.bus = &bus_. Previously PeripheralContext::bus was left null for user-defined peripherals, which silently broke any DMA-capable custom peripheral attached through the public API. Now the runtime always wires the bus; the rest of PeripheralContext (irq, scheduler, logger, chardev) was already handled.

  2. DemoDmaDevice does endian-safe byte-wise XOR. The original implementation memcpy'd a uint32_t over dma_read and XORed the host-endian word, which produced different results on little-endian vs big-endian hosts. The corrected version XORs each byte against the corresponding byte of the mask in BE order (MSB ↔ addr+0), so the result matches what a SPARC ld; xor; st sequence would produce on the wire.

  3. sptest pass criterion is "*** END OF TEST" in console. The previous test required both HaltReason != ErrorMode and the END OF TEST string. RTEMS sptests print END OF TEST and then unwind into a halt that triggers a trap-while-ET=0 (ErrorMode), so the strict check failed all 8 actually-passing tests. The functional pass criterion is the END OF TEST banner, period.
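
The endian-safe XOR from item 2 can be sketched as below. xor_be_word() is a hypothetical helper, not the DemoDmaDevice code; it shows only the byte ordering rule (mask MSB applied to the byte at addr+0), which makes the result independent of host endianness.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// XORs a 4-byte word held in bus (big-endian) byte order against `mask`.
// bytes[0] is the byte at addr+0 and pairs with the mask's most
// significant byte, so no host-endian word is ever formed.
inline void xor_be_word(std::uint8_t* bytes, std::uint32_t mask) {
    for (std::size_t i = 0; i < 4; ++i) {
        const unsigned shift = 24U - 8U * static_cast<unsigned>(i);
        bytes[i] = static_cast<std::uint8_t>(bytes[i] ^ (mask >> shift));
    }
}
```

For the bytes 12 34 56 78 and mask 0xFF00FF00 the result is ED 34 A9 78 on any host, the same bytes a SPARC ld; xor; st sequence would leave in memory.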


HW values audit (2026-04-24)

Full audit of all hardcoded hardware constants against the GR712RC user manual. Five bugs were found and fixed:

  1. GPTimer Timer 4 reset comment was wrong. The comment said RS=1 but 0x09 = 0b1001 has RS=0 (bit 1 is clear). Correct value is EN=1, RS=0, IE=1. The code value was already correct; only the comment was fixed.

  2. Cache config registers used placeholder values. Both I$ and D$ were 0x08101004 (fake). Replaced with GR712RC-specific values: I$ = 0x132308e8 (4-way, 16 KiB, LRU, snooping, MMU present), D$ = 0x1b2208f8 (4-way, 16 KiB, LRU, snooping, MMU present, write-through). These are read-only registers accessed via ASI 0x02 at addresses 0x08 and 0x0C.

  3. ASR17 reset value was incomplete. Only had V8 mul/div (bit 20) and NWINDOWS-1=7. Added GR712RC-specific fields: FPU type [11:10]=01 (GRFPU) and watchpoints [7:5]=010 (2 watchpoints). New value: (1U << 20) | (1U << 11) | (2U << 5) | 7U = 0x100847. (Later refined again on 2026-05-04 against the GR712RC §4.2.10 spec — see memory project_asr17_spec_alignment.md.)

  4. PSR reset was missing LEON3 impl/version fields. After reset, PSR had only S=1 (impl=0, ver=0). Fixed to include impl=0xF (Gaisler) and ver=0x3 (LEON3FT). New reset: (0xFU << 28) | (0x3U << 24) | kSBit = 0xF3000080. These fields are in the read-only mask, so WRPSR preserves them — but they must be correct at reset because RTEMS probes them.

  5. FTMCTRL P&P device ID was wrong. The AHB Plug&Play descriptor used device ID 0x00F (MCTRL, simple memory controller) instead of 0x054 (FTMCTRL, fault-tolerant memory controller with EDAC). The GR712RC has FTMCTRL, not MCTRL. Config word changed from 0x0100f020 to 0x01054020.

Two initially suspected P&P bugs were verified as not bugs:

  • APB P&P IRQ field encoding at [4:0] (5-bit) was correct all along.
  • AHB P&P address descriptor offsets at +0x10 (word 4 of 32-byte entry) were correct for the GRLIB format.
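
The PSR reset value and the FTMCTRL config word from items 4 and 5 can be cross-checked at compile time. This is a sketch: kSBit is the name used above, kPsrReset and pnp_id() are hypothetical, and the identification-word layout (vendor [31:24], device [23:12], version [9:5], IRQ [4:0]) is my reading of the GRLIB AHB Plug&Play format, with version 1 and IRQ 0 inferred from the low bits 0x020 of the recorded words.

```cpp
#include <cstdint>

constexpr std::uint32_t kSBit = 1U << 7;  // PSR.S (supervisor)

// PSR at reset: impl = 0xF (Gaisler), ver = 0x3, S = 1.
constexpr std::uint32_t kPsrReset = (0xFU << 28) | (0x3U << 24) | kSBit;
static_assert(kPsrReset == 0xF3000080U, "PSR reset value from the audit");

// Hypothetical builder for an AHB P&P identification word.
// Assumed layout: vendor [31:24], device [23:12], version [9:5], irq [4:0].
constexpr std::uint32_t pnp_id(std::uint32_t vendor, std::uint32_t device,
                               std::uint32_t version, std::uint32_t irq) {
    return (vendor << 24) | (device << 12) | (version << 5) | irq;
}

// Vendor 0x01 with device 0x054 (FTMCTRL) gives the corrected word;
// device 0x00F (MCTRL) gives the old, wrong one.
static_assert(pnp_id(0x01U, 0x054U, 1U, 0U) == 0x01054020U, "FTMCTRL word");
static_assert(pnp_id(0x01U, 0x00FU, 1U, 0U) == 0x0100F020U, "old MCTRL word");
```

Keeping such constants behind static_assert checks means a mistyped field shift or device ID fails the build instead of surfacing as a failed RTEMS probe.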

2026-05-03 — IRQ delivery loop fix

Two coordinated changes that close the interrupt delivery loop end-to-end for RTEMS real-time clock ticks:

IrqMP: auto-ack on trap delivery (Decision 39). Added IrqMP::acknowledge(cpu, bit). Invoked by Emulator::sample_interrupts before enter_trap, clearing the corresponding pending / iforce / ifr0 bit. Implements GR712RC §8 (p.~115): "When a processor takes an interrupt trap, the corresponding pending bit will automatically be cleared". Force-precedence rule: if the bit came from the Interrupt Force Register, that is cleared first; only otherwise is the shared pending register cleared.
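
The force-precedence rule can be sketched for a single CPU as follows. IrqState and this free-standing acknowledge() are hypothetical; the real IrqMP::acknowledge(cpu, bit) also selects the per-CPU context, which is left out here.

```cpp
#include <cstdint>

// Hypothetical single-CPU view of the controller state.
struct IrqState {
    std::uint32_t pending = 0;  // shared pending register (IPEND)
    std::uint32_t force   = 0;  // this CPU's interrupt force register
};

// Auto-acknowledge on trap delivery. If the bit was forced, only the force
// bit is cleared; the shared pending bit is cleared only otherwise.
inline void acknowledge(IrqState& s, unsigned bit) {
    const std::uint32_t m = 1U << bit;
    if (s.force & m) {
        s.force &= ~m;
    } else {
        s.pending &= ~m;
    }
}
```

When the same line is both forced and pending, two trap deliveries are needed to drain it: the first consumes the force bit, the second the pending bit.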

GPTimer: IRQ pulse on every underflow (Decision 40). Removed the !was_ip guard from process_timer_tick(). The IRQ line now pulses on each underflow with IE == 1, independent of the sticky IP status bit. GR712RC §11: IP is a status register (clear by writing 0), not the IRQ line. Without the pulse-per-underflow, the RTEMS clock driver would only get one interrupt because the IRQMP auto-clears the pending bit on trap acknowledge — and the timer would never re-assert the line if it only raised on the 0→1 transition of IP.
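
The difference between the sticky IP bit and the IRQ pulse shows up in a reduced tick model. This is a sketch, not the real function body: process_timer_tick and raise_count are names from this page, while TimerModel is hypothetical, RS=1 is assumed, and the prescaler is taken to have already underflowed for each call.

```cpp
#include <cstdint>

// Hypothetical reduced sub-timer with auto-reload (RS=1) assumed.
struct TimerModel {
    std::uint32_t counter = 0;
    std::uint32_t reload  = 0;
    bool ie = true;        // interrupt enable
    bool ip = false;       // sticky interrupt-pending status bit
    int raise_count = 0;   // number of IRQ pulses sent to the controller
};

// One sub-timer tick. Returns true when the counter underflows.
inline bool process_timer_tick(TimerModel& t) {
    if (t.counter != 0) {
        --t.counter;
        return false;
    }
    t.counter = t.reload;  // underflow: reload
    if (t.ie) {
        t.ip = true;       // status bit stays set until software clears it
        ++t.raise_count;   // pulse on every underflow, no !was_ip guard
    }
    return true;
}
```

With reload = 1 the timer underflows every second tick, and raise_count keeps increasing even though software never clears IP, which is what lets the controller's auto-acknowledge and the RTEMS clock tick coexist.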

End-to-end validation: the timer fix is exercised implicitly by every RTEMS integration test that uses rtems_task_wake_after() (notably hello-world and hello-lince). Without periodic re-arming the BSP clock driver would deadlock on the first wake_after call.

These fixes were a prerequisite for SMP validation of the timer loop (secondary cores must also receive periodic timer interrupts).