Tero — Project Status

Last updated: 2026-06-10. Landmarks this session: architecture audited 100 % clean against its own frozen principles (the two violations found — peripheral identity and histogram I/O — were fixed; Decisions 60–74 added), and the documentation gained the entity-model / compose / runtime-decomposition pages plus a known-debt register. Read this after CLAUDE.md.

CLAUDE.md is the frozen architectural contract. This file is the living handoff between sessions: what has been built, what is next, and the judgment calls that happened during implementation and are not visible from the file tree alone. Older session logs (Phase 3/4/5 summaries, the 2026-04-24 HW audit, the 2026-05-03 IRQ delivery fix) have been moved to Status archive.


State at a glance

| Area | Status |
| --- | --- |
| MVP (Phases 0–6) | Done |
| FPU: Berkeley SoftFloat 3e | Done |
| GDB remote stub (RSP, late-binding, RTEMS-aware) | Done |
| SMP validation (N=2 / N=4) | Done, above the ≥ 50 % gate |
| Save / Restore lifecycle | Done |
| PROM (mkprom2 boot ROM) | Done |
| Phase 9.1: ADRs + 1:1 roadmap landed | Done (2026-05-15) |
| Phase 9.2: tero-bench v0 + baseline | Done (2026-05-15); see Performance |
| Phase 10.1: per-PC decode cache | Done |
| Phase 10.2: threaded-code dispatch | Removed; ~1.5×, superseded by IR JIT (Decision 59) |
| Phase 11: arch-neutral IR | Done; multi-arch seam |
| Phase 12: tiered LLVM JIT (default translation) | Done; ~2000 MIPS; LLVM ≥ 18 mandatory (JIT) |
| Phase 14: per-core perf wins (cache 8192, event-bounded N=1 quantum, IRQ-poll fast-path) | Done (2026-06-05); ~2× uniprocessor Dhrystone, bit-exact (Performance) |
| Multi-core time model: TimeAdvance::Concurrent (SIS-faithful shared timeline) | Done (2026-06-05); default, Sum kept as legacy. Fixes the ~N× inflated multi-core %realtime (ADR-005, plans/adr-005-multicore-time-model.md) |
| GR740 1:1 real-time | Met in MultiThread: 1.82× realtime on compute-bound load; SingleThread structurally short, bounded by one host thread (Performance) |
| SIS oracle cross-validation | 99.2 % (242/244), via scripts/oracle_compare.py; sole real divergence is smplock01 (Test results) |
| SRMMU, cache control | Deferred; see roadmap |

Build & test state

  • Configure: cmake -S . -B build -G Ninja
  • Build: cmake --build build (0 warnings / 0 errors under -Werror with the project's strict warning set).
  • Test: ctest --test-dir build --output-on-failure reports 499 / 499 (≥ 497 pass + ≤ 2 conditional skips for missing external artifacts). The skips fire only when an external dependency is unavailable (sparc-gaisler-rtems5-gdb, the RTEMS hello ELF, or nm); they do not indicate a bug.
  • The test_rtems_sptests, test_rtems_fptests and test_rtems_smptests integration tests run in CTest. A sptest counts as PASS when its captured UART contains *** END OF TEST (Decision 36). The SMP gate is ≥ 50 %; current rates are 77.8 % (42/54 N=2 GR712RC) and 74.1 % (40/54 N=4 GR740) — both well above the gate.
  • End-to-end: ./build/src/app/tero-emu --image tests/guest-programs/rtems/hello-world/hello-world.elf prints "Hello World" via APBUART and returns cleanly.
  • Toolchain: GCC ≥ 13 or Clang ≥ 17, CMake ≥ 3.25, Ninja, LLVM ≥ 18 (mandatory — the binary-translation JIT links LLVM ORCv2; found via find_package(LLVM REQUIRED CONFIG)). SPARC cross-toolchain (RCC 1.3.2) at /opt/rcc-1.3.2-gcc/bin/ is used for tests/guest-programs/asm/ and any RTEMS rebuild.
  • Execution method is runtime-selected: EmulatorConfig::translation (default true = tiered LLVM JIT; false = the core::step Switch interpreter, the correctness oracle). Both paths are compiled into every build — there is no TERO_ENABLE_JIT build flag.
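
The sptest PASS rule and the SMP gate described above reduce to a substring check and a percentage comparison. The following is a minimal sketch, not the harness code: the helper names (sptest_passed, pass_rate_percent, meets_smp_gate) are hypothetical, and only the marker string and the 50 % threshold come from this page.

```cpp
#include <cassert>
#include <string_view>

// Marker named in the text (Decision 36): a test counts as PASS when the
// captured UART output contains this string.
constexpr std::string_view kEndOfTest = "*** END OF TEST";

// Hypothetical helper: true when the captured UART text contains the marker.
inline bool sptest_passed(std::string_view uart_capture) {
    return uart_capture.find(kEndOfTest) != std::string_view::npos;
}

// Hypothetical helper: pass rate in percent for `passed` out of `total` tests.
inline double pass_rate_percent(int passed, int total) {
    assert(total > 0);
    return 100.0 * passed / total;
}

// Hypothetical helper: the SMP gate from the text (pass rate >= 50 %).
inline bool meets_smp_gate(int passed, int total) {
    return pass_rate_percent(passed, total) >= 50.0;
}
```

With the figures quoted above, 42 of 54 and 40 of 54 both clear the gate, while anything below 27 of 54 would not.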

Modules: real code vs. stub

| Target | State |
| --- | --- |
| tero_interfaces | Complete: strong types, Result<T>, ICpuBus, and every interface header from CLAUDE.md. |
| tero_defaults | Complete: StdoutLogger, NullFaultInjector, StdoutCharDevice, DebugPublisher. |
| tero_bus | Complete: Ram, SystemBus (routing + BE typed access + IBusMaster). |
| tero_core | Complete: CpuState (8 register windows, PSR/WIM/TBR/Y with 3-instruction pipeline, FP context), full SPARC V8 integer decoder + FPU dispatcher (FPop1/FPop2 backed by Berkeley SoftFloat 3e), handlers split by category, step() with trap dispatch + error mode + annul. |
| tero_peripherals | Complete: MemCtrl (FTMCTRL), IrqMP / IrqAMP (GR712RC and GR740 controllers), GPTimer (4 sub-timers + watchdog), ApbUart (FIFO + IRQ), Prom (mkprom2 boot ROM). |
| tero_ir | Complete: arch-neutral IrBlock/IrInst/IrExit, opaque GuestState, (PhysAddr, ModeCtx) block cache, reference IR interpreter. Guest-ISA-neutral (depends only on tero_interfaces). |
| tero_arch_sparc | Complete: SPARC frontend with translate_block (SPARC → IR), mode_ctx_of, take_exception, CpuState ↔ GuestState sync. Partial-but-correct: un-lowered ops (FP, atomics, RETT/Ticc, the cc-setting divide/multiply-step forms, most special-register access, and the non-predicable annulled Bicc,a delay slots: stores and LDD) end the block and fall back to core::step. |
| tero_jit | Complete: IR → native via LLVM ORCv2 (LLJIT); tiered baseline-O0 / background-O2 (ADR-002), self-loop + region chaining, inline RAM. Isolated module; LLVM (≥ 18) mandatory. |
| tero_runtime | Complete: Emulator (drives both the Switch interpreter and run_ir_quantum per EmulatorConfig::translation), EmulatorConfig + gr712rc_config() / gr740_config() factories, EventScheduler, CpuBusBridge, ElfLoader (detects mkprom2 .rom and flattens PT_LOAD into PROM), GdbStub (RSP over TCP, late-binding, RTEMS thread-awareness B1, stop-on-ErrorMode with TT → signal mapping). Links tero_arch_sparc + tero_jit. |
| tero_app | Full CLI: --image, --ram, --cores, --budget, --turbo, --gdb-port, --gdb-wait, --verbose, --version, --help. Default pacing is Realtime; --turbo switches to free-running. |
| demo_dma_device | Complete (in examples/demo-dma/). Reference custom peripheral with DMA + IRQ; ~200 LOC. |
| tests | 499 Catch2 cases. Suite includes unit tests per module + integration tests for bare_metal, demo_dma_device, hello_uart_elf, rtems_boot, rtems_sptests, rtems_fptests, rtems_smptests, rtems_hello_tero (PROM), smp_atomics, gdb_stub_protocol (codec + late-binding + qSymbol + error-mode + RTEMS-aware live guest), gdb_stub_rtems (real GDB binary, conditional). |

External dependencies (FetchContent)

  • Catch2 v3.5.3 — test framework, only fetched when TERO_BUILD_TESTS=ON.
  • tl::expected v1.1.0 — stand-in for std::expected, which libstdc++ gates behind C++23. The project is fixed on C++20, so tero::Result<T> is tl::expected<T, ErrorCode>.
  • fmt — vendored for logging.
  • Berkeley SoftFloat 3e — vendored under third-party/softfloat3e/, built into tero_core. Configured with the 8086-SSE specialization (upstream default).

LLVM is not FetchContent — it is a host requirement

LLVM (≥ 18) is the one mandatory dependency CMake does not fetch or vendor. Install it from your distribution (llvm-devel / llvm-NN-dev) or use nix develop (the flake pins LLVM 19). See Installation.


2026-06-10 — TERO rename + machine-script grammar v2 + kits as .tero files

The project was renamed Lince → TERO (Translating Emulator for Real-time Onboard-systems; availability web-verified, repo now claaj/tero, old URLs redirect): namespaces tero::, include roots include/tero/, binaries tero-emu/tero-bench, options TERO_*, script extension .tero, component-ABI symbol tero_component_abi_version with ComponentAbiVersion reset to 1 (no external libraries existed). plans/ and references/ stay untouched as history. Gates: full fast suite + plugin suites + [boot] incl. the mkprom2 PROM path, all green; zero guest-visible change.

Then the machine-script track (plans/soc-config-file-design.md) executed end-to-end on branch feat/grammar-v2:

  • F0 — grammar v2 (793be41): bounded expressions + let, new with inline props, dotted assignment, kwarg map, connect obj.port peer:IInterface with real interface tokens, indexed ports, entity-check (always-on hard invariants + strict verb + tero-emu --entity-check dry-run). One grammar, no aliases. Decision 75.
  • F1 — edge-based PnP (9439299): connect <bridge>.slaves <dev>:IAmbaPnp, edge order = slot, slaves[N] pins; pnp_slot/ pnp_publish deleted; kits migrated byte-identically (kit tests passed unchanged). The git-history audit corrected the rollout premise — the 5 IAmbaPnp implementors are the complete historically-published set. Decision 76. Slot semantics amended by Decision 78 (see below).
  • F2/F3 — kits as files (7c22b32): machines/gr712rc.tero + gr740.tero embedded at configure time; the C++ compositions deleted after dump-byte-equality proof on both SoCs; kit entity-check surfaced missing cpu1..3 documentation edges (fixed). Decision 77.
  • F4: machine files installed under share/tero/machines/; the gr712rc-derived.tero example teaches copy-the-kit-and-edit (validated by the CLI dry-run); docs updated.
  • Follow-up — order-independent slaves + StdoutMonitor (2026-06-10): PnP slot numbers became internal (records compact in edge order; GRLIB software matches by identity + BAR, never by slot), so slaves[N] pinning and pnp_ahb_slot were removed — slaves[N] now fails build() with a diagnostic. The kits' tables compact (GR740 APB 0,1,3,4,5 → 0..4); RTEMS [boot] passed unchanged on both kits. The Console component was renamed StdoutMonitor (ComponentKind::Monitor) — it is a host stdout sink, not hardware. Decision 78.
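
The "edge order = slot" rule from the follow-up above can be pictured as a plain enumeration of connect edges. This is only an illustrative sketch: SlaveEdge and assign_slots are hypothetical names, not the compose module's API, and the device names used with it are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one `connect <bridge>.slaves <dev>:IAmbaPnp` edge.
struct SlaveEdge {
    std::string device;  // instance name of the connected device
};

// Assign plug-and-play slots purely from edge order: the i-th edge gets
// slot i, so the resulting table is always compact (0..n-1, no gaps).
inline std::vector<std::pair<std::size_t, std::string>>
assign_slots(const std::vector<SlaveEdge>& edges) {
    std::vector<std::pair<std::size_t, std::string>> table;
    table.reserve(edges.size());
    for (std::size_t i = 0; i < edges.size(); ++i) {
        table.emplace_back(i, edges[i].device);
    }
    return table;
}
```

Five edges always yield slots 0..4 under this scheme, which mirrors the compaction the GR740 APB table went through (0,1,3,4,5 → 0..4). Software that matches devices by identity + BAR is unaffected by the renumbering.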

2026-06-09/10 — Cleanliness sweep + documentation completion

The architecture was audited against its own frozen principles (singletons, direct I/O, dependency direction, strong types, compile-time behaviour flags, arch-agnosticism). Exactly two violations were found and both were fixed; a duplication/smell audit followed; the documentation gained the entity-model / compose / decomposition coverage it was missing.

  • Peripheral identity unified (5a6d121, merge 6f6eafa). IPeripheral::name() is now final and returns the runtime-injected PeripheralSpec::instance_name; authors implement device_class(). find_entity(x)->name() == x holds for every entity kind; add_peripheral() peripherals became findable; the observer reports instance names. ComponentAbiVersion 1 → 2 (v1 component libraries are rejected at load). All ~33 in-tree implementors renamed; the sniffer-plugin test stubs were caught by the Switch build (the JIT build does not compile the plugins). Decision 70.
  • Opcode-histogram I/O moved out of the core (d185a8a + 9b1dd78). ~ExecutionEngine's fopen/getenv CSV dump became Emulator::opcode_histogram() + the pure format_opcode_histogram_csv(); tero-bench and tero-emu own the I/O policy (verified end-to-end in build-histogram; on 2026-06-10 the $TERO_HISTOGRAM_OUT env var was replaced by an explicit --histogram-out flag on both clients). Decision 74.
  • Cheap smell fixes (9b1dd78): MIRROR WARNING comments on the two run-loop mirrors in engine_translate.cpp; ctx.irq empty-guard unified in Soc::add_peripheral.
  • Docs: new pages architecture/entity-model.md, architecture/runtime-decomposition.md, modules/compose.md, guide/machine-scripts.md, development/known-debt.md; stale pages updated for the decomposition/compose reality; Decisions 60–74 added (entity model, compose, runtime decomposition, identity, histogram).
  • Gates: fast suite 11283 assertions / 1104 cases green on Switch, JIT and ASan/UBSan builds; 6 sniffer/connector plugin suites green; histogram CSV byte-identical pre/post.
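
The identity rule from Decision 70 (name() is final and returns the injected instance name, authors implement device_class()) can be sketched in a few lines. The classes below are reduced, hypothetical stand-ins: the real IPeripheral and PeripheralSpec carry more members, and Registry and DemoUart exist only for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the spec injected at construction time.
struct PeripheralSpec {
    std::string instance_name;
};

class IPeripheral {
public:
    explicit IPeripheral(PeripheralSpec spec) : spec_(std::move(spec)) {}
    virtual ~IPeripheral() = default;

    // `final`: derived classes cannot override the instance identity.
    virtual std::string_view name() const final { return spec_.instance_name; }

    // Authors describe the kind of device instead.
    virtual std::string_view device_class() const = 0;

private:
    PeripheralSpec spec_;
};

// Example device; only device_class() is author-provided.
class DemoUart : public IPeripheral {
public:
    using IPeripheral::IPeripheral;
    std::string_view device_class() const override { return "uart"; }
};

// Hypothetical registry keyed by instance name.
class Registry {
public:
    void add(std::unique_ptr<IPeripheral> p) {
        std::string key{p->name()};  // copy the name before moving the owner
        entities_.emplace(std::move(key), std::move(p));
    }
    const IPeripheral* find_entity(const std::string& name) const {
        const auto it = entities_.find(name);
        return it == entities_.end() ? nullptr : it->second.get();
    }

private:
    std::unordered_map<std::string, std::unique_ptr<IPeripheral>> entities_;
};
```

Because name() cannot be overridden, two instances of the same device class stay distinguishable and the invariant find_entity(x)->name() == x holds by construction in this sketch.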

2026-06-05 — Per-core perf wins on main: ~2× uniprocessor Dhrystone

Four bit-exact per-core optimizations landed on main, each found by profiling and confirmed by a controlled same-host back-to-back A/B (governor=performance). Together they roughly double uniprocessor Dhrystone throughput: Dhrystone N=1 measures ~295 MIPS on main (governor performance), up from a ~150 MIPS pre-Phase-14 baseline. The full ctest suite stays green (709 cases pass, 0 fail) across uniprocessor + SMP N=2/N=4 and Switch + JIT + MultiThread.

  1. Dispatch-cache anchor (lever A) (IrBlock::jit_entry; TieredJit::*_anchored; Emulator::run_ir_quantum). Stash the stable CacheEntry address in the block on first compile and reuse it to resolve the best fn / count executions without the per-dispatch unordered_map hash (the #1 call-heavy hotspot): Dhrystone +19.5% (238 → 285 MIPS). Bit-exact (anchor read only after a validated find). Previously PR #15; merged here.
  2. IR BlockCache 1024 → 8192 slots (src/ir/include/tero/ir/block_cache.hpp). The direct-mapped index covered only a 4 KiB PC window at 1024, so call-heavy guests aliased hot code against libc and constantly re-translated. 8192 (a 32 KiB window) removes the aliasing: Dhrystone +39% (153 → 212 MIPS), p99 slice jitter 28.7 ms → 8.9 ms, cpubound flat. 16384/32768 measured no further gain. Bit-exact (a cache only changes eviction frequency, never results).
  3. Event-bounded single-core quantum (Emulator::run_core_quantum, src/runtime/src/emulator.cpp). A true single core has no round-robin to preserve, so it now runs a large burst (cap 1 << 16) but never past the next scheduled event — Dhrystone +40% (212 → 298 MIPS) and more accurate (events fire at sub-quantum precision instead of rounding up to the round boundary). Gated strictly to cores == 1, so SMP interleaving and sim-time accounting are byte-unchanged; bit-exact between JIT and Switch.
  4. Interrupt-poll fast-path via IInterruptController::raw_pending() (src/interfaces/include/tero/iinterrupt_controller.hpp; IrqMP/IrqAMP; Emulator::sample_interrupts). The dispatcher polls the controller at every block boundary, but it is empty ~99.99% of the time at a 100 Hz tick. raw_pending() is a maintained single-word superset; a 0 result proves the full scan would return 0, so the poll early-outs. Dhrystone +2.7% (301.9 → 310.2 MIPS, controlled A/B). Provably bit-exact; the reader early-out is gated to SingleThread (no MT behavior change).
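
The arithmetic behind items 2 and 3 can be checked with a small sketch. It assumes the direct-mapped index is derived from the word-aligned PC (SPARC V8 instructions are 4 bytes) and that one step advances time by one unit. Both are simplifications for illustration, and slot_index, window_bytes and burst_steps are hypothetical names rather than functions from block_cache.hpp or emulator.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed indexing: drop the two always-zero PC bits, then mask by slot count.
// With 4-byte instructions the cache covers `slots * 4` bytes before two PCs
// start to alias onto the same slot.
inline std::uint32_t slot_index(std::uint32_t pc, std::uint32_t slots) {
    assert(slots != 0 && (slots & (slots - 1)) == 0);  // power of two
    return (pc >> 2) & (slots - 1);
}

// Size of the PC window that maps without aliasing.
inline std::uint32_t window_bytes(std::uint32_t slots) {
    return slots * 4u;
}

// Event-bounded burst: run up to `cap` steps, but never past the next event.
// Simplification: one step is treated as one unit of simulated time.
inline std::uint64_t burst_steps(std::uint64_t now, std::uint64_t next_event,
                                 std::uint64_t cap) {
    const std::uint64_t until_event = next_event > now ? next_event - now : 0;
    return std::min(cap, until_event);
}
```

Under these assumptions 1024 slots give a 4 KiB window and 8192 slots give 32 KiB, so two hot routines 4 KiB apart collide in the smaller cache but not in the larger one. The burst helper shows why event timing gets sharper: the quantum ends exactly at the event instead of at the next round boundary.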

Rejected (measured net-negative, kept off / unmerged): cross-region JIT block linking ("lever B", ~9% slower), MT host-thread affinity pinning (net-negative on a shared host), and bigger JIT region fusion (jit_max_region_blocks 8 → 16/32/64, −8%). Method note: perf comparisons must be same-host back-to-back with governor=performance — the committed tests/results/*.csv are thermal-stale from other hosts and have produced phantom regressions. Full write-up: Performance.


2026-05-25 — Arch-neutral IR + tiered LLVM JIT landed on main

A two-part change merged the performance track into main and made LLVM mandatory (commits 981ebe8, then 72071c4 / e41d052 raising the LLVM floor to 18).

What landed.

  1. Arch-neutral IR (tero_ir) — a guest-ISA-neutral basic-block IR (IrBlock/IrInst/IrExit) over an opaque GuestState blob, a (PhysAddr, ModeCtx) block cache, and a reference IR interpreter. The seam for multi-arch: a new ISA is a new frontend, not a new core.
  2. SPARC frontend (tero_arch_sparc): translate_block, mode_ctx_of, take_exception, and the CpuState ↔ GuestState sync.
  3. Tiered LLVM JIT (tero_jit) — IR → native via ORCv2; baseline O0 on the calling thread, background O2 promotion (ADR-002); self-loop + region chaining; inline RAM. cpubound-mix ~2000 MIPS single-core.
  4. Runtime selection — the four-way Dispatch enum and the TERO_ENABLE_JIT build flag are gone; EmulatorConfig::translation (bool, default true) picks JIT vs Switch interpreter at runtime (Decision 57). GDB works under translation (Decision 58).
  5. Threaded-code dispatch removed — the Phase 10.2 prototype reached ~1.5× over Switch, missed its targets, and is superseded by the IR JIT (Decision 59). Its results pages remain as a historical record.
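
Why the block cache is keyed by (PhysAddr, ModeCtx) rather than by address alone can be shown with a toy lookup table. Everything below is a hypothetical stand-in: the real strong types, the contents of ModeCtx and the IrBlock layout are not reproduced here, and a hash map replaces the direct-mapped cache only to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Hypothetical stand-ins for the strong types named in the text.
struct PhysAddr { std::uint32_t value; };
struct ModeCtx  { std::uint32_t bits; };  // assumed: mode state that affects translation

// A block is identified by where it starts AND the mode it was translated for.
struct BlockKey {
    PhysAddr pc;
    ModeCtx mode;
    bool operator==(const BlockKey& other) const {
        return pc.value == other.pc.value && mode.bits == other.mode.bits;
    }
};

struct BlockKeyHash {
    std::size_t operator()(const BlockKey& key) const {
        const std::uint64_t packed =
            (std::uint64_t{key.mode.bits} << 32) | key.pc.value;
        return std::hash<std::uint64_t>{}(packed);
    }
};

// Placeholder for a translated block; the real IrBlock holds IR instructions.
struct IrBlockStub { int id; };

using BlockMap = std::unordered_map<BlockKey, IrBlockStub, BlockKeyHash>;
```

Two translations of the same physical address under different mode contexts land in separate entries, so a mode change never reuses a block that was translated under other assumptions.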

Validation. Layered lockstep: IR-interpreter vs core::step, JIT vs IR-interpreter, JIT vs core::step — all byte-identical across an RTEMS boot; sptests/smptests under both methods; ASan/UBSan/LSan on the isolated tero_jit_tests exe. Full write-up: IR and LLVM JIT and Design decisions (Decisions 49–59).

Doc reconciliation. This pass also fixed pages that predated the merge and still claimed "no JIT / no decode cache" (execution model, design principles, architecture overview) and the roadmap's "JIT not in scope" line, and added the three new modules to the layer graphs.


2026-05-13 — smpschededf02 fixed: RTEMS task stack overflow

Outcome. N=2 smptests went from 41 → 42 PASS (smpschededf02 was the last "TIMEOUT cluster" canary that turned out to have an actually fixable upstream cause). FAIL count unchanged: no regressions.

Root cause. RTEMS's SPARC default CPU_STACK_MINIMUM_SIZE is 4 KiB. The smpschededf02 EDF worker tasks are created with RTEMS_MINIMUM_STACK_SIZE, so each gets exactly 4 KiB. Over a long EDF run their %sp drifts monotonically downward and eventually crosses below their allocated stack base — overflowing into the back-to-back-allocated stack of an adjacent task. The overflow hits that task's saved [%fp - 12] slot inside its dispatched _Thread_Do_dispatch frame; when the victim resumes via _CPU_Context_switch's retl path on a different CPU, it reloads the corrupted slot via ld [%fp + -12], %o0 at +688 and calls _Thread_queue_Do_acquire_critical with a non-lock pointer — eternal PIL=15 spin.

Fix. tests/guest-programs/rtems/build_smptests.sh now passes -DCONFIGURE_MINIMUM_TASK_STACK_SIZE=16384 to the SMP test build. 4 KiB → 16 KiB; ample headroom and no other test regressed.

Investigation infrastructure left in tree (all opt-in, no production cost):

  1. IEmulatorObserver::on_instruction(cpu, pc) — new per-instruction hook in src/interfaces/include/tero/iemulator_observer.hpp, wired in src/runtime/src/emulator.cpp between the GDB-stub break check and the actual core::step(). Default is empty; one branch per instruction when no observer is installed. See Decision 47.
  2. Dispatch probe — extended tests/integration/test_smpschededf02_trace.cpp with a DispatchProbe observer that snapshots register and memory state (per-CPU executing pointer, incoming TCB's saved fp/sp slot, the bad TCB's TCB+0x40/+0x48 fields) at scheduler entry points. Hidden behind the [.smpsched_dispatch_probe] Catch2 tag. The probe is what nailed down the overflow timeline (sim 16924 → 76453 → 80873 → 80955: TCB 0x4003ba58's saved fp/sp slot transitioning from valid to wild).
  3. Register-window roundtrip test: tests/integration/test_regwin_roundtrip.cpp loads tests/guest-programs/asm/regwin-roundtrip/regwin_roundtrip.S, a bare-metal SPARC program that chains 7 SAVEs through one window-overflow trap + 7 RESTOREs through one window-underflow trap. Fixes a documentation gap: previously, only trap delivery was tested (tests/unit/test_traps.cpp), not the full roundtrip through real spill/fill handlers. The PASS rules out tero's basic window mechanics as a source of dispatch-frame corruption, which is useful ground truth that pointed the investigation at upstream stack layout instead of enter_trap/leave_trap.
  4. Quieter batch logger: tests/integration/rtems_csv_harness.hpp now installs a StdoutLogger(LogLevel::Error) per emulator instance. The Info-level default lets a single misbehaving guest emit millions of [WARN] [prom] ignored write lines and bury the per-test outcome from the harness; Error-level cuts that to zero while leaving real fault lines visible.
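
The shape of the per-instruction hook from item 1 (empty default, one branch per instruction when nothing is installed) might look roughly like this. The interface is reduced to the single hook, the parameter types are assumptions, and PcRecorder and run are hypothetical illustrations rather than code from the tree.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reduced, hypothetical version of the observer interface: the hook has an
// empty default body, so existing observers need no changes.
class IEmulatorObserver {
public:
    virtual ~IEmulatorObserver() = default;
    virtual void on_instruction(int /*cpu*/, std::uint32_t /*pc*/) {}
};

// Example observer that records every PC executed on CPU 0.
class PcRecorder : public IEmulatorObserver {
public:
    void on_instruction(int cpu, std::uint32_t pc) override {
        if (cpu == 0) pcs.push_back(pc);
    }
    std::vector<std::uint32_t> pcs;
};

// Toy run loop: the only cost without an observer is one null check per step.
inline std::uint32_t run(std::uint32_t pc, int steps, IEmulatorObserver* observer) {
    for (int i = 0; i < steps; ++i) {
        if (observer != nullptr) observer->on_instruction(0, pc);
        pc += 4;  // stand-in for core::step(); real code decodes and executes
    }
    return pc;
}
```

A probe such as the DispatchProbe described in item 2 would sit where PcRecorder does here, filtering on the PCs of interest and snapshotting state instead of storing every address.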

What this rules out. Tero's _CPU_Context_switch interpretation is correct: it reads the saved fp/sp from the incoming TCB faithfully. The corruption is upstream, in the contents of that slot at the moment of save. Tero's save/restore/enter_trap/leave_trap are correct (regwin_roundtrip test). The basic register-window mechanics tested in unit tests already covered trap delivery; the new asm test covers the handler-RETT-retry loop end-to-end.

Files touched.

| File | Change |
| --- | --- |
| src/interfaces/include/tero/iemulator_observer.hpp | Added on_instruction(cpu, pc) virtual hook with empty default. |
| src/runtime/src/emulator.cpp | Fire observer_->on_instruction(...) before each core::step() in run_until_unpaced. |
| tests/guest-programs/rtems/build_smptests.sh | cflags now include -DCONFIGURE_MINIMUM_TASK_STACK_SIZE=16384. |
| tests/guest-programs/asm/regwin-roundtrip/regwin_roundtrip.S | New bare-metal SPARC test. |
| tests/guest-programs/asm/CMakeLists.txt | Build target for regwin_roundtrip.elf. |
| tests/integration/test_regwin_roundtrip.cpp | New Catch2 case [core][regwin][trap]. |
| tests/integration/test_smpschededf02_trace.cpp | New [.smpsched_dispatch_probe] test case + DispatchProbe class. |
| tests/integration/rtems_csv_harness.hpp | Install StdoutLogger(LogLevel::Error) per emulator. |
| tests/CMakeLists.txt | Wire TERO_REGWIN_ROUNDTRIP_ELF define + new test source. |

2026-05-12 — GDB stub: late-binding, RTEMS thread-awareness B1

A single-session pass over the GDB stub that turned it from "works if you connect first" into a development-grade debugger backend:

  1. Late-binding attach. EmulatorConfig::gdb_stub_wait_for_client = false no longer leaves the listening socket unserviced. run_until_unpaced calls GdbStub::poll_accept() once per quantum; a successful accept arms a stop reply and the run loop returns HaltReason::Breakpoint so the CLI hands over to process_until_resume(). Second clients are accepted-and-closed to drain the listen backlog. See Decision 41.

  2. qSymbol handshake. qSupported now advertises qSymbol+. A state machine in handle_q_symbol asks GDB for _Per_CPU_Information, stores the resolved address, and resets on unsolicited qSymbol:: (GDB file reload). Empty <addr_hex> ⇒ non-RTEMS guest, thread-awareness stays disabled. See Decision 43.

  3. Stop-on-ErrorMode with TT → signal mapping. When a core enters error_mode and a client is attached, the run loop redirects through GdbStub::report_error_mode(core), which reads TBR[11:4] and emits a stop reply with SIGILL / SIGBUS / SIGFPE / SIGSEGV / SIGTRAP according to the trap class. Mapping table in signal_from_tt() (gdb_stub_transport.cpp); GDB's internal signal numbers, not Linux's. Per-client error_reported_ latch prevents infinite re-notify on c. See Decisions 44 and 45.

  4. RTEMS thread-awareness B1. Report the executing thread per core as a real Objects_Id with its 4-byte ASCII name. Out of scope for now (logged in Debugging with GDB): B2 (enumerate blocked/ready threads), B3 (per-thread register context), B4 (priority/state in qThreadExtraInfo). Offsets hardcoded in rtems_layout::* and verified empirically against hello-world.elf via DWARF; each read is gated on three sanity checks. See Decision 42.

  5. Coherent stop replies. send_stop_reply now prefers the resolved RTEMS Objects_Id over core+1. Per-call read (no cache) so context switches in RTEMS never need stub invalidation. See Decision 46.
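
The TT → signal step from item 3 can be sketched as a bit-field extraction followed by a lookup. Only two rows of the mapping are stated on this page (TT=0x07 → SIGBUS and the SIGSEGV default); the remaining rows below are assumptions derived from SPARC V8 trap names, and signal_from_tt_sketch is a hypothetical name, not the table in gdb_stub_transport.cpp. The numeric values are GDB's internal signal numbers.

```cpp
#include <cassert>
#include <cstdint>

// GDB's internal signal numbers (not the host's), as sent in RSP stop replies.
enum GdbSignal : int {
    kSigIll = 4,
    kSigTrap = 5,
    kSigFpe = 8,
    kSigBus = 10,
    kSigSegv = 11,
};

// The trap type (TT) occupies bits 11:4 of the SPARC V8 TBR.
inline std::uint8_t tt_from_tbr(std::uint32_t tbr) {
    return static_cast<std::uint8_t>((tbr >> 4) & 0xFFu);
}

// Illustrative mapping, see the hedges above for which rows are assumed.
inline int signal_from_tt_sketch(std::uint8_t tt) {
    switch (tt) {
        case 0x02: return kSigIll;   // illegal_instruction (assumed mapping)
        case 0x03: return kSigIll;   // privileged_instruction (assumed mapping)
        case 0x07: return kSigBus;   // mem_address_not_aligned (from the text)
        case 0x08: return kSigFpe;   // fp_exception (assumed mapping)
        default:
            if (tt >= 0x80) return kSigTrap;  // Ticc software traps (assumed)
            return kSigSegv;                  // default (from the text)
    }
}
```

For example, a TBR of 0x40000070 carries TT=0x07, which this sketch reports as signal 10 (SIGBUS in GDB's numbering).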

Verification.

  • [gdb_stub] suite: 20 cases / 828 assertions. Includes 2 late-binding tests, 1 qSymbol handshake test (2-round flow + restart), 2 ErrorMode tests (default → SIGSEGV; TT=0x07 → SIGBUS), 2 RTEMS-aware tests (forged memory, fallback when symbol absent), and a [!mayfail] live-guest test against hello-world.elf.
  • Regression sweep [emulator]+[rtems][boot]+[gdb_stub]: 43 cases / 949 assertions, zero regressions.

Files touched. Summary table preserved in the commit; full doc in Debugging with GDB.


Next steps (in order of value)

  1. B2 — enumerate blocked / ready threads. Walk _Objects_Information_table[OBJECTS_CLASSIC_API][OBJECTS_RTEMS_TASKS].local_table[]. Needs one more symbol via qSymbol and 2–3 more offsets. ~1 day.
  2. B3 — per-thread register context. g / G read from Thread_Control.Registers when the selected thread isn't the executing one. Enables bt on blocked threads. ~2 days.
  3. Hardware watchpoints (Z2 / Z3 / Z4) via per-load/store bus hooks. SPARC V8 has no real watchpoint hardware; emulation in the bus is straightforward but touches the hot path. ~1 day.
  4. qXfer:features:read:target.xml for explicit register description, particularly %asr17. Cosmetic. ~0.5 day.
  5. Phase 8 items (GPIO, Flash, CAN, SpaceWire, RegisterBank refactor). Detailed in the roadmap.

Session notes for the next agent

  • The user works in Spanish; responses in Spanish are welcome, mixed technical English is also fine (code/identifiers in English).
  • When in doubt about hardware semantics, read the manual, do not guess (CLAUDE.md §Rules for AI Agents, rule 1). SPARC V8 corner cases are the classic footgun here.
  • plans/roadmap.md is the work-tracking source of truth. docs/development/roadmap.md is its user-facing mirror. If they drift, the plans/ one wins.
  • docs/architecture/decisions.md is the searchable index of every judgment call that's not directly readable from the code. Add new decisions as numbered entries there; cross-reference them as "Decision N" from prose elsewhere.
  • Immediate WRPSR (Decision 38): write_psr_writable() (cpu_state.cpp:69) applies every writable PSR field at once, straight to psr_, matching the SIS oracle. SPARC V8 §5.1.2.3 permits an up-to-3-instruction delay on S/ET/PS/CWP, but it is implementation latitude (software pads WRPSR with NOPs). The earlier pending_psr_ / commit_psr_pipeline() delay model was removed because it desynced the register windows when a trap fired inside the window (smpschededf03).
  • ASR19 power-down: Writing to ASR19 sets is_powered_down_ = true on the core. Any raised trap (internal or external IRQ above PIL) clears it.
  • Idle-time skipping: When all active cores are powered-down, the emulator time-jumps to min(next_event_time, sim_time + kMaxIdleNs). kMaxIdleNs = 1 ms to keep the GPTimer ticking at roughly the right rate.
  • Secondary cores start powered-down; the primary (CPU 0) releases them via IrqMP. load_elf() sets is_powered_down = true for all cores except 0.
  • PeripheralContext includes ICharacterDevice* for UART I/O, wired by Emulator::set_character_device() (default StdoutCharDevice).
  • All peripheral MMIO still rejects non-word accesses with AlignmentError.
  • SPARC cross-toolchain is at /opt/rcc-1.3.2-gcc/bin/.
  • The RTEMS thread-awareness offsets in rtems_layout::* are hardcoded against RCC 1.3.2. If RCC is bumped, run ./build/tests/tero_tests "[gdb_stub][rtems-aware]" to check alignment — a [!mayfail] failure means the offsets drifted and you need to refresh them from new DWARF (how-to).
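
The idle-time skipping note above boils down to one predicate and one min(). The sketch below restates it under simple assumptions: Core is a hypothetical two-flag struct, time is a plain nanosecond counter, and the helper names are invented for this example. Only kMaxIdleNs = 1 ms and the min(next_event_time, sim_time + kMaxIdleNs) rule come from the notes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kMaxIdleNs = 1'000'000;  // 1 ms, as stated in the notes

// Hypothetical per-core flags; the real core state carries far more.
struct Core {
    bool active;
    bool powered_down;
};

// True when every active core is powered down, i.e. nothing can execute.
inline bool all_active_cores_idle(const std::vector<Core>& cores) {
    return std::all_of(cores.begin(), cores.end(), [](const Core& core) {
        return !core.active || core.powered_down;
    });
}

// Target time of an idle skip: the next event, but at most kMaxIdleNs ahead.
inline std::uint64_t idle_skip_target(std::uint64_t sim_time,
                                      std::uint64_t next_event_time) {
    return std::min(next_event_time, sim_time + kMaxIdleNs);
}
```

The cap matters when the next event is far away: time then advances in steps of at most 1 ms rather than in one large jump, which is what keeps the GPTimer ticking at roughly the right rate.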