Status archive: historical session logs
This page archives the chronological session logs and per-phase completion summaries that accumulated while Lince's MVP and follow-up scope were being built. They are preserved here for historical context and traceability; the current state of the project lives in Status.
The architectural decisions taken during implementation are indexed in Design decisions — that page is the searchable, authoritative list. The summaries below give the chronological context of when and why each decision happened.
Design decisions taken during implementation (Phases 0–4)
These are judgment calls beyond what CLAUDE.md already freezes. They
should survive context wipes — the reasoning lives here, not in
diffs. (Authoritative index in
architecture/decisions.md.)
- SystemBus is non-copyable and non-movable. It owns RAM via unique_ptr<Ram> and peripherals will cache raw pointers into it; moving the bus would invalidate them. If you need to relocate a bus, construct a new one.
- Big-endian translation lives in SystemBus, not in Ram. Ram holds raw bytes as they appear on the wire; SystemBus::{encode,decode}_be() does the BE shuffle at the typed-access boundary. This keeps RAM trivially snapshot-able and matches how a real memory controller behaves.
- Single region per access. A byte-span access that straddles two regions returns ErrorCode::BusError, because real hardware latches one transaction against one target. Do not silently split transfers.
- MMIO requires 1 / 2 / 4-byte naturally aligned accesses. Anything else is rejected at the bus with BusError or AlignmentError. CPU alignment traps belong in Phase 2 handlers, not in the bus.
- Bus does not own peripherals. SystemBus::map_peripheral takes a non-owning IPeripheral*. When Emulator arrives (Phase 5) it will own unique_ptr<IPeripheral> and hand raw pointers to the bus. For now, tests own the DummyPeripheral directly.
- Warning set is stricter than the CLAUDE.md minimum. The lince::warnings INTERFACE target enables, on top of -Wall -Wextra -Wpedantic -Werror: -Wshadow -Wnon-virtual-dtor -Wold-style-cast -Wcast-align -Wunused -Woverloaded-virtual -Wconversion -Wsign-conversion -Wnull-dereference -Wdouble-promotion -Wformat=2. Everything builds with 0 warnings / 0 errors under this set.
- tests/support/dummy_peripheral is a test fixture, not a module. It lives in the test tree to exercise the IPeripheral + DMA contract end-to-end. It must not leak into lince_peripherals.
- FPop1/FPop2 decoded as InsnKind::FpOp. The decoder recognizes op=10, op3∈{0x34, 0x35} as FP instructions rather than lumping them into Unknown. The handler returns ExecStatus::FpDisabled (tt=0x04) for all FPop encodings, which is the correct LEON3 behavior when no FPU is present (PSR.EF=0). Coprocessor opcodes (op3∈{0x36, 0x37}) remain Unknown and map to IllegalInstruction.
- Instruction-fetch vs data-access bus errors are distinguished. The step() loop uses ExecStatus::InsnFetchError (tt=0x01, instruction_access_exception) for failed fetches and BusError (tt=0x09, data_access_exception) for load/store failures. Both were previously mapped to DataAccessException.
- TADDccTV / TSUBccTV set icc and write rd even when trapping. Per SPARC V8 §B.30, the tagged-add/sub trap variant computes the result and condition codes first, then traps if V is set. The handler writes rd and icc before returning ExecStatus::TagOverflow, matching the architectural spec that the result is "unpredictable"; our choice is to write it anyway for determinism in tests.
- handlers.cpp split into category files. The monolithic handlers.cpp was split into handlers_alu.cpp, handlers_branch.cpp, handlers_loadstore.cpp, handlers_regwin.cpp, handlers_special.cpp, with shared helpers (alu_op2, eval_cond) in handlers_internal.hpp. The public execute() dispatcher remains in handlers.cpp. This keeps each file focused without changing the public API.
- MemCtrl (FTMCTRL) is a passive stub. MCFG1–MCFG4 are readable/writable with no side effects. MCFG3 bit 27 (reserved, reads as 1) is forced. No timing and no bank switching, just enough for the RTEMS probe.
- IrqMP IFORCE write semantics. Writing IFORCE uses a clear-then-set protocol: the upper 16 bits clear bits, the lower 16 bits set bits (both masked to IRQ lines 1–15). This matches GRLIB behavior where software can atomically set and clear force bits in one write.
- IrqMP pending_mask(0) includes IFR0. CPU 0's pending mask is IPEND | IFR0 | (current_mask & IFR0), i.e., the CPU-0-local force register contributes directly. For CPU N>0, pending_mask(N) is IPEND | (current_mask & IFRN). This matches the GR712RC single-CPU force register design.
- GPTimer control register writable mask is 0x2B. Bits 0 (EN), 1 (RS), 3 (IE), 5 (CH) are directly writable. Bit 2 (LD) is write-only: it triggers a reload of the counter from the reload register, then clears. Bit 4 (IP) uses write-0-clear semantics (writing 1 has no effect, writing 0 clears the pending bit). Bit 6 (DH) is read-only 0.
- GPTimer prescaler underflow logic. On tick(), the prescaler counter is decremented first; when it reaches zero, the prescaler value is reloaded from the reload register and all enabled sub-timers are ticked (counter decrement, underflow → reload if RS=1 / stop if RS=0, IP set if IE=1).
- GPTimer timer4 watchdog defaults. After reset, timer4 has EN=RS=IE=1 and counter/reload both set to 0xFFFF. This matches the GR712RC default where the watchdog timer is armed and must be disabled or fed by software.
- ApbUart uses std::queue<uint8_t> for the RX FIFO (max 8 entries). No TX FIFO is modeled; transmit() drains immediately via ICharacterDevice. The status register bit 31 (FA, FIFO available) always reads as 1 since the queue fits within 8 entries.
- PeripheralContext now includes ICharacterDevice*. Added for APBUart to inject console I/O. Default is nullptr; the Emulator (Phase 5) will wire it to the configured character device.
- All peripheral MMIO handlers reject non-word accesses. Byte and half-word reads/writes return ErrorCode::AlignmentError. This is stricter than GR712RC (which allows byte accesses to the UART data register), but matches the MVP approach: defer narrowing support to when RTEMS actually requires it.
- BA encoding uses disp22=0, not 1. Earlier test_bare_metal.cpp helpers encoded the unconditional branch BA with disp22=1, which actually targets PC+4 (the delay slot itself) and silently broke any test that relied on BA to land on a specific instruction. The SPARC V8 convention is BA,a to the label; the fix in the test helpers sets disp22=0 so BA .+0 becomes a proper self-branch when intended.
- hello_uart.S must enable CTRL.TE before transmitting. ApbUart drops writes to the data register unless the TX enable bit is set in the control register. Any asm test that writes directly to APBUart MMIO must first st the TE bit into CTRL. Missing this is silent: the test "runs" but no output is produced.
- Emulator exposes injection, not construction, for services. set_character_device() and set_logger() swap the defaults at runtime. This lets the CLI (lince_app) wire StdoutCharDevice / StdoutLogger without forcing them into EmulatorConfig, and lets tests swap in CapturingCharDevice without touching config. SMP2 wrappers will use the same injection points.
- Idle-time skip is bounded to 1 ms. When all cores are powered down, run_until() jumps simulated time forward to the next event. However, the GPTimer uses direct IInterruptSource::raise() calls rather than the EventScheduler, so next_event_time() does not account for timer interrupts. Without a bound, time would jump to the deadline and the GPTimer's periodic interrupt would be delivered only once, preventing the RTEMS clock from advancing. The 1 ms bound (kMaxIdleNs) ensures timer interrupts arrive at roughly their expected rate even when all cores are idle.
- 4 KB RAM mapped at physical address 0x0. Real GR712RC boots from ROM at address 0. The RTEMS idle loop does lda [%o0] 0x1c, %g0 where %o0 = 0xFFFFFFF0; the SPARC V8 address wrap-around means the access lands at physical address 0. Mapping 4 KB of RAM at address 0 avoids trap 9 (DataAccessException) in the idle loop.
- GPTimer bootloader prescaler initialization. The emulator simulates the GR712RC ROM bootloader's timer setup by writing to the prescaler value and reload registers during initialize(). Without this, the prescaler counter starts at 0 and takes 0xFFFF ticks (≈65 ms) before the first underflow, greatly delaying the first timer interrupt.
- Secondary cores start in power-down mode. On real GR712RC hardware, only the primary core (CPU 0) starts executing at reset. Secondary cores are parked in power-down mode (wr %g0, %asr19) and wait for the primary to release them via IRQMP. The emulator now sets is_powered_down = true for all cores except CPU 0 after loading the ELF image.
- catch_discover_tests is configured so Catch2's SKIP(...) output is caught by ctest as a skip, not a failure. This keeps the counter honest and lets conditional integration tests like test_rtems_boot live in-tree without breaking CI.
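The big-endian decision above (raw bytes in Ram, byte shuffling at the typed-access boundary) can be illustrated with a minimal sketch. The free functions encode_be32 and decode_be32 are hypothetical stand-ins; the real SystemBus::{encode,decode}_be() signatures are not recorded in this archive.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers illustrating the BE shuffle; not the actual
// SystemBus::encode_be()/decode_be() signatures.
constexpr std::array<std::uint8_t, 4> encode_be32(std::uint32_t value) {
    // Most significant byte goes to the lowest address (index 0).
    return {{static_cast<std::uint8_t>(value >> 24),
             static_cast<std::uint8_t>(value >> 16),
             static_cast<std::uint8_t>(value >> 8),
             static_cast<std::uint8_t>(value)}};
}

constexpr std::uint32_t decode_be32(const std::array<std::uint8_t, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}

// The round trip is independent of host endianness because only shifts
// are used, never memcpy of a host-order word.
static_assert(decode_be32(encode_be32(0xDEADBEEFU)) == 0xDEADBEEFU,
              "encode/decode must round-trip");
```

Because the conversion uses shifts only, the byte array is identical on little-endian and big-endian hosts, which is what makes a raw RAM snapshot portable.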
Phase 3 completion summary
Phase 3 (Traps, window overflow, privileged instructions) is complete. The
five-plan-step structure from plans/phase3_traps/ was implemented:
- Step 1: PSR, WIM, TBR state and access instructions (RD/WR special registers with privilege checks). CpuState exposes write_psr_writable, write_tba, window_invalid, and per-field accessors.
- Step 2: Trap dispatch core. CpuState::enter_trap() / leave_trap(), raise_trap(), has_pending_trap(). The step() loop dispatches synchronous traps through status_to_tt() and handles ErrorMode.
- Step 3: Window overflow/underflow. SAVE checks WIM before rotating CWP and returns WinOverflow; RESTORE returns WinUnderflow. These trap statuses flow through the step loop into enter_trap.
- Step 4: RETT and privileged instructions. Full SPARC V8 §B.26 pre-check ordering (S→ET→WIM→alignment). leave_trap() restores S←PS, ET←1, and requests a branch to the return target.
- Step 5: Remaining synchronous traps:
  - TagOverflow for TADDccTV/TSUBccTV (writes result+icc first).
  - FpDisabled for all FPop1/FPop2 encodings.
  - InsnFetchError (tt=0x01) distinguished from BusError (tt=0x09).
  - DivisionByZero (already in status_to_tt from Phase 2 handler).
  - All privileged registers now produce proper trap entries through step().
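As an illustration of the status-to-trap-type mapping that the step loop relies on, here is a minimal sketch. The enum is a hypothetical subset and the function signature is an assumption; the tt values for InsnFetchError, FpDisabled and BusError are the ones quoted above, and the remaining ones follow the SPARC V8 trap type table.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of the status enum; the real ExecStatus has more
// members and may order them differently.
enum class ExecStatus {
    InsnFetchError,
    IllegalInstruction,
    FpDisabled,
    WinOverflow,
    WinUnderflow,
    BusError,
    TagOverflow,
    DivisionByZero
};

// Sketch of a status_to_tt()-style mapping onto SPARC V8 trap types.
constexpr std::uint8_t status_to_tt(ExecStatus status) {
    switch (status) {
        case ExecStatus::InsnFetchError:     return 0x01;  // instruction_access_exception
        case ExecStatus::IllegalInstruction: return 0x02;  // illegal_instruction
        case ExecStatus::FpDisabled:         return 0x04;  // fp_disabled
        case ExecStatus::WinOverflow:        return 0x05;  // window_overflow
        case ExecStatus::WinUnderflow:       return 0x06;  // window_underflow
        case ExecStatus::BusError:           return 0x09;  // data_access_exception
        case ExecStatus::TagOverflow:        return 0x0A;  // tag_overflow
        case ExecStatus::DivisionByZero:     return 0x2A;  // division_by_zero
    }
    return 0x00;  // not reached for valid enumerators
}
```

Keeping fetch and data failures as separate enumerators is what allows the two bus-error cases to land on different trap vectors.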
Phase 4 completion summary
Phase 4 (Minimal Peripherals) is complete. Four peripherals implemented, each with full unit tests:
- MemCtrl (FTMCTRL stub): 4 readable/writable config registers (MCFG1–MCFG4). MCFG3 bit 27 forced to 1. No side effects, no timing. Enough for the RTEMS memory controller probe.
- IrqMP (multi-core interrupt controller): 31 interrupt lines (1–15 per-CPU maskable, 16–28 broadcast), 2 CPU contexts (mask, force, ICR), MPSTAT read-only fields, IPEND/IFORCE with clear-then-set write semantics. pending_mask() computes per-CPU interrupt bitmap for the core step loop. external_assert() routes broadcast bits to all CPU force registers.
- GPTimer (4 sub-timers + prescaler): Configurable prescaler, per-timer counter/reload/control registers. Control writable mask 0x2B; LD is write-only trigger; IP is write-0-clear. tick() decrements prescaler, then sub-timers on underflow. Timer4 defaults to watchdog mode (EN+RS+IE, counter/reload = 0xFFFF). raise_count exposed for test verification.
- ApbUart (serial port): Data register with std::queue<uint8_t> RX FIFO (8 deep), status register with DR/RI/TI/TE/FA bits, control register with IT+IR+TE+RI enables, scaler register. TX drains immediately through ICharacterDevice*. update_irq() calls IInterruptSource::raise/lower based on enabled interrupt conditions.
All peripherals implement IPeripheral and receive PeripheralContext via
attach(). ICharacterDevice* was added to PeripheralContext for ApbUart.
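The GPTimer control-register write rules summarized above (writable mask 0x2B, LD as a write-only trigger, IP as write-0-clear) can be sketched as follows. TimerRegs and write_control are hypothetical names, and the sketch assumes LD copies the reload value into the counter, as on the GRLIB GPTIMER.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register block for one sub-timer.
struct TimerRegs {
    std::uint32_t counter = 0;
    std::uint32_t reload = 0;
    std::uint32_t control = 0;
};

constexpr std::uint32_t kCtrlWritableMask = 0x2BU;  // EN(0) | RS(1) | IE(3) | CH(5)
constexpr std::uint32_t kCtrlLd = 1U << 2;          // write-only load trigger
constexpr std::uint32_t kCtrlIp = 1U << 4;          // sticky interrupt-pending status
constexpr std::uint32_t kCtrlDh = 1U << 6;          // read-only, always 0

inline void write_control(TimerRegs& timer, std::uint32_t value) {
    // Replace only the directly writable bits; keep everything else.
    std::uint32_t next =
        (timer.control & ~kCtrlWritableMask) | (value & kCtrlWritableMask);
    // LD triggers a counter load and is never stored.
    if ((value & kCtrlLd) != 0U) {
        timer.counter = timer.reload;
    }
    next &= ~kCtrlLd;
    // IP: writing 0 clears the bit, writing 1 leaves it unchanged.
    if ((value & kCtrlIp) == 0U) {
        next &= ~kCtrlIp;
    }
    next &= ~kCtrlDh;
    timer.control = next;
}
```

A write of all ones therefore loads the counter and stores only the four writable bits, while a pending IP bit survives until software writes it as 0.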
Phase 5 completion summary
Phase 5 (RTEMS Boot) is complete. The emulator successfully boots a real
RTEMS hello-world.elf up to the point where it prints "Hello World" via
the UART and gracefully halts using ta 0.
Key accomplishments during the bring-up:
- Stack Pointer logic: Fixed %sp initialization to top-of-RAM in the ELF Loader to match reference emulator behaviour and prevent INTERNAL_ERROR_NO_MEMORY_FOR_HEAP.
- AMBA PnP emulation: Mapped memory and dummy AMBA Plug&Play structs dynamically inside Emulator::initialize so that the standard devices (IRQMP, APBUART, GPTIMER) appear where RTEMS expects them. Also fixed a uint32_t boundary wrap-around bug in AddressRange.
- FPU Stubbing: Addressed traps caused by hard-float libc.a binaries that unconditionally executed FPU operations (ldd, std, FBfcc, st %fsr). The emulator forced the FPU bits (PSR.EF, PSR.EC) to read-only 0, indicating soft-float, and gracefully NOP'd any FPU instructions it encountered. (Superseded by the SoftFloat 3e integration that landed later.)
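The archive does not describe the AddressRange wrap-around bug in detail. One common form of such a bug, shown here purely as an assumption, is computing base + size in 32 bits for a range that ends at 0xFFFFFFFF. The struct layout and method names below are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout; the real AddressRange may differ.
struct AddressRange {
    std::uint32_t base;
    std::uint32_t size;  // in bytes, non-zero

    // Naive check: base + size wraps to 0 when the range ends at the top
    // of the 32-bit address space, so nothing ever matches.
    constexpr bool contains_naive(std::uint32_t addr) const {
        return addr >= base && addr < base + size;
    }

    // Overflow-safe check: compare the offset against the size instead of
    // computing the (possibly wrapped) end address.
    constexpr bool contains(std::uint32_t addr) const {
        return addr >= base && (addr - base) < size;
    }
};
```

With a 4 KiB range at 0xFFFFF000, the naive form rejects 0xFFFFFFF0 while the offset-based form accepts it.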
Deferred (out of MVP scope, later resolved)
- sp04 idle-loop hang: Fixed in Phase 6. Idle-time skipping now jumps simulated time past the sleeping interval; sp04 reaches *** END OF TEST reliably.
- sp11 ErrorMode mid-run: Fixed. The root cause was a stale annul_next_ flag in CpuState::enter_trap() (Decision 37).
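The idle-time skip interacts with the 1 ms bound (kMaxIdleNs) recorded in the design decisions earlier on this page. A minimal sketch of how the next wake-up time could be chosen follows; next_idle_time is a hypothetical helper, and only the constant name and its 1 ms value come from the log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kMaxIdleNs = 1'000'000;  // 1 ms bound from the decision log

// Hypothetical helper: when every core is powered down, jump to the
// earliest of the next scheduled event, the run deadline, and the idle
// bound, so that timer interrupts raised outside the scheduler are
// still sampled at roughly their expected rate.
constexpr std::uint64_t next_idle_time(std::uint64_t now_ns,
                                       std::uint64_t next_event_ns,
                                       std::uint64_t deadline_ns) {
    return std::min({next_event_ns, deadline_ns, now_ns + kMaxIdleNs});
}
```

If the next scheduled event is 5 ms away, the skip stops after 1 ms; if an event is due in 0.4 ms, the event time wins.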
Decisions taken in the 2026-04-25 session (Phase 6 completion)
- enter_trap() clears annul_next_. SPARC V8 §5.1.2.2 specifies that the annul-bit mechanism is per-CTI and does not persist past a trap entry. The emulator's CpuState::enter_trap() now explicitly clears annul_next_ to prevent the first instruction of an ISR handler from being silently dropped when a hardware interrupt arrives immediately after an annulled delay slot at the end of a quantum. This was the root cause of sp11's ErrorMode crash.
- WRPSR unit tests drain the 3-instruction PSR pipeline. SPARC V8 §5.1.2.3 mandates that changes to S, ET, PS, and CWP are not visible until three instructions after a WRPSR. The write_psr_writable() implementation buffers those delayed fields in pending_psr_ and commits them via commit_psr_pipeline(), called by the step loop. Unit tests that call execute() directly (bypassing the step loop) must manually call commit_psr_pipeline() three times before asserting on delayed fields. This was the cause of two test failures (#108 and #149); the fix is in the tests, not the implementation.
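The three-step drain can be pictured with a toy model. PsrPipeline is a hypothetical stand-in that reuses the names from the log; the countdown mechanism is an assumption about how the delay might be tracked, not a description of the real CpuState.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a delayed PSR write; not the real CpuState.
class PsrPipeline {
public:
    // Buffer the delayed fields and arm a three-instruction delay.
    void write_psr_writable(std::uint32_t delayed_fields) {
        pending_psr_ = delayed_fields;
        countdown_ = 3;
    }

    // Called once per retired instruction by the step loop; the buffered
    // value becomes visible on the third call after a write.
    void commit_psr_pipeline() {
        if (countdown_ > 0 && --countdown_ == 0) {
            visible_ = pending_psr_;
        }
    }

    std::uint32_t visible() const { return visible_; }

private:
    std::uint32_t visible_ = 0;
    std::uint32_t pending_psr_ = 0;
    int countdown_ = 0;
};
```

A unit test that bypasses the step loop has to make the three commit calls itself, otherwise it asserts against the old, still-visible value.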
Decisions taken in the 2026-04-24 session
- Emulator::add_peripheral injects ctx.bus = &bus_. Previously PeripheralContext::bus was left null for user-defined peripherals, which silently broke any DMA-capable custom peripheral attached through the public API. Now the runtime always wires the bus; the rest of PeripheralContext (irq, scheduler, logger, chardev) was already handled.
- DemoDmaDevice does endian-safe byte-wise XOR. The original implementation memcpy'd a uint32_t over dma_read and XORed the host-endian word, which produced different results on little-endian vs big-endian hosts. The corrected version XORs each byte against the corresponding byte of the mask in BE order (MSB ↔ addr+0), so the result matches what a SPARC ld; xor; st sequence would produce on the wire.
- sptest pass criterion is "*** END OF TEST in console". The previous test required HaltReason != ErrorMode and the END OF TEST string. RTEMS sptests print END OF TEST and then unwind into a halt that triggers a trap-while-ET=0 (ErrorMode), so the strict check failed all 8 actually-passing tests. The functional pass criterion is the END OF TEST banner, period.
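The byte-wise XOR described for DemoDmaDevice can be sketched as below. xor_be32 is a hypothetical helper operating on a 4-byte buffer; the real device works on whatever dma_read returns, whose type is not shown in this archive.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// XOR a 4-byte big-endian word in place. Byte 0 (addr+0) pairs with the
// most significant byte of the mask, so the result is the same on
// little-endian and big-endian hosts.
inline void xor_be32(std::array<std::uint8_t, 4>& bytes, std::uint32_t mask) {
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned shift = 24U - 8U * static_cast<unsigned>(i);
        const auto mask_byte = static_cast<std::uint8_t>(mask >> shift);
        bytes[i] = static_cast<std::uint8_t>(bytes[i] ^ mask_byte);
    }
}
```

No host-order word is ever materialized, which is the property the memcpy-based version lacked.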
HW values audit (2026-04-24)
Full audit of all hardcoded hardware constants against the GR712RC user manual. Five bugs were found and fixed:
- GPTimer Timer 4 reset comment was wrong. The comment said RS=1 but 0x09 = 0b1001 has RS=0 (bit 1 is clear). Correct value is EN=1, RS=0, IE=1. The code value was already correct; only the comment was fixed.
- Cache config registers used placeholder values. Both I$ and D$ were 0x08101004 (fake). Replaced with GR712RC-specific values: I$ = 0x132308e8 (4-way, 16 KiB, LRU, snooping, MMU present), D$ = 0x1b2208f8 (4-way, 16 KiB, LRU, snooping, MMU present, write-through). These are read-only registers accessed via ASI 0x02 at addresses 0x08 and 0x0C.
- ASR17 reset value was incomplete. Only had V8 mul/div (bit 20) and NWINDOWS-1=7. Added GR712RC-specific fields: FPU type [11:10]=01 (GRFPU) and watchpoints [7:5]=010 (2 watchpoints). New value: (1U << 20) | (1U << 11) | (2U << 5) | 7U = 0x100847. (Later refined again on 2026-05-04 against the GR712RC §4.2.10 spec; see memory project_asr17_spec_alignment.md.)
- PSR reset was missing LEON3 impl/version fields. After reset, PSR had only S=1 (impl=0, ver=0). Fixed to include impl=0xF (Gaisler) and ver=0x3 (LEON3FT). New reset: (0xFU << 28) | (0x3U << 24) | kSBit = 0xF3000080. These fields are in the read-only mask, so WRPSR preserves them, but they must be correct at reset because RTEMS probes them.
- FTMCTRL P&P device ID was wrong. The AHB Plug&Play descriptor used device ID 0x00F (MCTRL, simple memory controller) instead of 0x054 (FTMCTRL, fault-tolerant memory controller with EDAC). The GR712RC has FTMCTRL, not MCTRL. Config word changed from 0x0100f020 to 0x01054020.
Two initially suspected P&P bugs were verified as not bugs:
- APB P&P IRQ field encoding at [4:0] (5-bit) was correct all along.
- AHB P&P address descriptor offsets at +0x10 (word 4 of 32-byte entry) were correct for the GRLIB format.
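Several of the audited constants can be checked at compile time. The sketch below recomputes the PSR reset value, the Timer 4 control value, and the two FTMCTRL/MCTRL Plug&Play words. The constant names other than kSBit and the pnp_id helper are hypothetical. The field layout (vendor [31:24], device [23:12], version [9:5], IRQ [4:0]) is assumed from the GRLIB identification-word format, and the version value of 1 is inferred from the 0x020 low bits of the quoted words.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kSBit = 1U << 7;  // PSR.S (supervisor)

// PSR reset: impl=0xF in [31:28], ver=0x3 in [27:24], S=1.
constexpr std::uint32_t kPsrReset = (0xFU << 28) | (0x3U << 24) | kSBit;
static_assert(kPsrReset == 0xF3000080U, "PSR reset value from the audit");

// Timer 4 control reset: EN (bit 0) and IE (bit 3) set, RS (bit 1) clear.
constexpr std::uint32_t kTimer4CtrlReset = (1U << 0) | (1U << 3);
static_assert(kTimer4CtrlReset == 0x09U, "EN=1, RS=0, IE=1");

// Hypothetical helper composing a Plug&Play identification word.
constexpr std::uint32_t pnp_id(std::uint32_t vendor, std::uint32_t device,
                               std::uint32_t version, std::uint32_t irq) {
    return (vendor << 24) | (device << 12) | (version << 5) | irq;
}
static_assert(pnp_id(0x01U, 0x054U, 1U, 0U) == 0x01054020U, "FTMCTRL word");
static_assert(pnp_id(0x01U, 0x00FU, 1U, 0U) == 0x0100F020U, "old MCTRL word");
```

Encoding such constants as checked expressions rather than bare hex literals makes a mismatch between a comment and the value fail the build.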
2026-05-03: IRQ delivery loop fix
Two coordinated changes that close the interrupt delivery loop end-to-end for RTEMS real-time clock ticks:
IrqMP: auto-ack on trap delivery (Decision 39).
Added IrqMP::acknowledge(cpu, bit). Invoked by
Emulator::sample_interrupts before enter_trap, clearing the
corresponding pending / iforce / ifr0 bit. Implements GR712RC §8
(p.~115): "When a processor takes an interrupt trap, the corresponding
pending bit will automatically be cleared". Force-precedence rule: if the
bit came from the Interrupt Force Register, that is cleared first; only
otherwise is the shared pending register cleared.
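The force-precedence rule can be sketched with a small model. IrqMpModel is hypothetical and simplified to one force register per CPU and one shared pending register; the real IrqMP::acknowledge also distinguishes iforce from ifr0, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical model of the acknowledge path.
struct IrqMpModel {
    std::uint32_t pending = 0;        // shared pending register
    std::uint32_t force[2] = {0, 0};  // per-CPU force registers

    // Clear the source of the interrupt being taken: the force bit wins
    // if it is set; only otherwise is the shared pending bit cleared.
    void acknowledge(unsigned cpu, unsigned bit) {
        assert(cpu < 2U && bit >= 1U && bit <= 15U);
        const std::uint32_t mask = 1U << bit;
        if ((force[cpu] & mask) != 0U) {
            force[cpu] &= ~mask;
        } else {
            pending &= ~mask;
        }
    }
};
```

When both a forced and a pending request exist on the same line, the first trap consumes the forced one and the pending one is delivered on the next sample.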
GPTimer: IRQ pulse on every underflow (Decision 40).
Removed the !was_ip guard from process_timer_tick(). The IRQ line
now pulses on each underflow with IE == 1, independent of the sticky
IP status bit. GR712RC §11: IP is a status register (clear by
writing 0), not the IRQ line. Without the pulse-per-underflow, the RTEMS
clock driver would only get one interrupt because the IRQMP auto-clears
the pending bit on trap acknowledge — and the timer would never
re-assert the line if it only raised on the 0→1 transition of IP.
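The effect of dropping the !was_ip guard can be seen in a reduced model. SubTimer below is a hypothetical stand-in with auto-reload (RS=1) assumed. It counts IRQ raises the way the raise_count test hook does, but it is not the real process_timer_tick().

```cpp
#include <cassert>
#include <cstdint>

// Reduced, hypothetical model of one sub-timer with RS=1.
struct SubTimer {
    std::uint32_t counter = 0;
    std::uint32_t reload = 0;
    bool ie = true;       // interrupt enable
    bool ip = false;      // sticky status bit, cleared only by software
    int raise_count = 0;  // number of IRQ pulses issued

    void process_timer_tick() {
        if (counter > 0U) {
            --counter;
            return;
        }
        // Underflow: reload, then pulse the IRQ line whenever IE is set,
        // regardless of whether IP was already set.
        counter = reload;
        if (ie) {
            ip = true;
            ++raise_count;
        }
    }
};
```

With reload = 2, six ticks produce two underflows and two pulses even though IP is never cleared in between; with the old guard the second pulse would have been suppressed.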
End-to-end validation: the timer fix is exercised implicitly by every
RTEMS integration test that uses rtems_task_wake_after() (notably
hello-world and hello-lince). Without periodic re-arming the BSP
clock driver would deadlock on the first wake_after call.
These fixes were a prerequisite for SMP validation of the timer loop (secondary cores must also receive periodic timer interrupts).