Tero — Project Status
Last updated: 2026-06-10. Landmarks this session: architecture audited 100 %
clean against its own frozen principles (the two violations found — peripheral
identity and histogram I/O — were fixed; Decisions 60–74 added), and the
documentation gained the entity-model / compose / runtime-decomposition pages
plus a known-debt register.
Read this after CLAUDE.md.
CLAUDE.md is the frozen architectural contract. This file is the living
handoff between sessions: what has been built, what is next, and the
judgment calls that happened during implementation and are not visible
from the file tree alone. Older session logs — Phase 3/4/5 summaries,
the 2026-04-24 HW audit, the 2026-05-03 IRQ delivery fix — have been
moved to Status archive.
State at a glance
| Area | Status |
|---|---|
| MVP (Phases 0–6) | Done |
| FPU — Berkeley SoftFloat 3e | Done |
| GDB remote stub (RSP, late-binding, RTEMS-aware) | Done |
| SMP validation (N=2 / N=4) | Done — above the ≥ 50 % gate |
| Save / Restore lifecycle | Done |
| PROM (mkprom2 boot ROM) | Done |
| Phase 9.1 — ADRs + 1:1 roadmap landed | Done (2026-05-15) |
| Phase 9.2 — tero-bench v0 + baseline | Done (2026-05-15) — see Performance |
| Phase 10.1 — per-PC decode cache | Done |
| Phase 10.2 — threaded-code dispatch | Removed — ~1.5×, superseded by IR JIT (Decision 59) |
| Phase 11 — arch-neutral IR | Done — multi-arch seam |
| Phase 12 — tiered LLVM JIT (default translation) | Done — ~2000 MIPS; LLVM ≥ 18 mandatory (JIT) |
| Phase 14 — per-core perf wins (cache 8192, event-bounded N=1 quantum, IRQ-poll fast-path) | Done (2026-06-05) — ~2× uniprocessor Dhrystone, bit-exact (Performance) |
| Multi-core time model — TimeAdvance::Concurrent (SIS-faithful shared timeline) | Done (2026-06-05) — default; Sum kept as legacy. Fixes the ~N× inflated multi-core %realtime (ADR-005, plans/adr-005-multicore-time-model.md) |
| GR740 1:1 real-time | Met in MultiThread — 1.82× realtime on compute-bound load; SingleThread structurally short, bounded by one host thread (Performance) |
| SIS oracle cross-validation | 99.2 % (242/244) — scripts/oracle_compare.py; sole real divergence is smplock01 (Test results) |
| SRMMU, cache control | Deferred — see roadmap |
Build & test state
- Configure: cmake -S . -B build -G Ninja
- Build: cmake --build build — 0 warnings / 0 errors under -Werror with the project's strict warning set.
- Test: ctest --test-dir build --output-on-failure — 499 / 499 (≥ 497 pass + ≤ 2 conditional skips for missing external artifacts). The skips fire only when an external dependency is unavailable (sparc-gaisler-rtems5-gdb, the RTEMS hello ELF, or nm); they do not indicate a bug.
- The test_rtems_sptests, test_rtems_fptests and test_rtems_smptests integration tests run in CTest. A sptest counts as PASS when its captured UART contains *** END OF TEST (Decision 36). The SMP gate is ≥ 50 %; current rates are 77.8 % (42/54 N=2 GR712RC) and 74.1 % (40/54 N=4 GR740) — both well above the gate.
- End-to-end: ./build/src/app/tero-emu --image tests/guest-programs/rtems/hello-world/hello-world.elf prints "Hello World" via APBUART and returns cleanly.
- Toolchain: GCC ≥ 13 or Clang ≥ 17, CMake ≥ 3.25, Ninja, LLVM ≥ 18 (mandatory — the binary-translation JIT links LLVM ORCv2; found via find_package(LLVM REQUIRED CONFIG)). SPARC cross-toolchain (RCC 1.3.2) at /opt/rcc-1.3.2-gcc/bin/ is used for tests/guest-programs/asm/ and any RTEMS rebuild.
- Execution method is runtime-selected: EmulatorConfig::translation (default true = tiered LLVM JIT; false = the core::step Switch interpreter, the correctness oracle). Both paths are compiled into every build — there is no TERO_ENABLE_JIT build flag.
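The PASS rule and the SMP gate above reduce to two small checks. The following is a minimal sketch, not the harness's actual code: the helper names sptest_passed, pass_rate_percent and smp_gate_met are hypothetical, and only the banner string, the 50 % gate and the 42/54 and 40/54 counts come from the text.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: a captured UART log counts as PASS when it contains
// the RTEMS end-of-test banner (the rule the text attributes to Decision 36).
inline bool sptest_passed(std::string_view uart_log) {
    return uart_log.find("*** END OF TEST") != std::string_view::npos;
}

// Hypothetical helper: pass rate of a test set, in percent.
inline double pass_rate_percent(int passed, int total) {
    assert(total > 0);
    return 100.0 * passed / total;
}

// Hypothetical helper: true when the pass rate reaches the gate (50 % by default).
inline bool smp_gate_met(int passed, int total, double gate_percent = 50.0) {
    return pass_rate_percent(passed, total) >= gate_percent;
}
```

With the counts quoted above, smp_gate_met(42, 54) and smp_gate_met(40, 54) both hold, which matches the 77.8 % and 74.1 % figures.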
Modules: real code vs. stub
| Target | State |
|---|---|
| tero_interfaces | Complete — strong types, Result<T>, ICpuBus, and every interface header from CLAUDE.md. |
| tero_defaults | Complete — StdoutLogger, NullFaultInjector, StdoutCharDevice, DebugPublisher. |
| tero_bus | Complete — Ram, SystemBus (routing + BE typed access + IBusMaster). |
| tero_core | Complete — CpuState (8 register windows, PSR/WIM/TBR/Y with 3-instruction pipeline, FP context), full SPARC V8 integer decoder + FPU dispatcher (FPop1/FPop2 backed by Berkeley SoftFloat 3e), handlers split by category, step() with trap dispatch + error mode + annul. |
| tero_peripherals | Complete — MemCtrl (FTMCTRL), IrqMP / IrqAMP (GR712RC and GR740 controllers), GPTimer (4 sub-timers + watchdog), ApbUart (FIFO + IRQ), Prom (mkprom2 boot ROM). |
| tero_ir | Complete — arch-neutral IrBlock/IrInst/IrExit, opaque GuestState, (PhysAddr, ModeCtx) block cache, reference IR interpreter. Guest-ISA-neutral (depends only on tero_interfaces). |
| tero_arch_sparc | Complete — SPARC frontend: translate_block (SPARC → IR), mode_ctx_of, take_exception, CpuState↔GuestState sync. Partial-but-correct: un-lowered ops (FP, atomics, RETT/Ticc, the cc-setting divide/multiply-step forms, most special-register access, and the non-predicable annulled Bicc,a delay slots — stores and LDD) end the block and fall back to core::step. |
| tero_jit | Complete — IR → native via LLVM ORCv2 (LLJIT); tiered baseline-O0 / background-O2 (ADR-002), self-loop + region chaining, inline RAM. Isolated module; LLVM (≥ 18) mandatory. |
| tero_runtime | Complete — Emulator (drives both the Switch interpreter and run_ir_quantum per EmulatorConfig::translation), EmulatorConfig + gr712rc_config() / gr740_config() factories, EventScheduler, CpuBusBridge, ElfLoader (detects mkprom2 .rom and flattens PT_LOAD into PROM), GdbStub (RSP over TCP, late-binding, RTEMS thread-awareness B1, stop-on-ErrorMode with TT → signal mapping). Links tero_arch_sparc + tero_jit. |
| tero_app | Full CLI: --image, --ram, --cores, --budget, --turbo, --gdb-port, --gdb-wait, --verbose, --version, --help. Default pacing is Realtime; --turbo switches to free-running. |
| demo_dma_device | Complete (in examples/demo-dma/). Reference custom peripheral with DMA + IRQ; ~200 LOC. |
| tests | 499 Catch2 cases. Suite includes unit tests per module + integration tests for bare_metal, demo_dma_device, hello_uart_elf, rtems_boot, rtems_sptests, rtems_fptests, rtems_smptests, rtems_hello_tero (PROM), smp_atomics, gdb_stub_protocol (codec + late-binding + qSymbol + error-mode + RTEMS-aware live guest), gdb_stub_rtems (real GDB binary, conditional). |
External dependencies (FetchContent)
- Catch2 v3.5.3 — test framework, only fetched when TERO_BUILD_TESTS=ON.
- tl::expected v1.1.0 — stand-in for std::expected, which libstdc++ gates behind C++23. The project is fixed on C++20, so tero::Result<T> is tl::expected<T, ErrorCode>.
- fmt — vendored for logging.
- Berkeley SoftFloat 3e — vendored under third-party/softfloat3e/, built into tero_core. Configured with the 8086-SSE specialization (upstream default).
LLVM is not FetchContent — it is a host requirement
LLVM (≥ 18) is the one mandatory dependency CMake does not fetch or
vendor. Install it from your distribution (llvm-devel / llvm-NN-dev)
or use nix develop (the flake pins LLVM 19). See
Installation.
2026-06-10 — TERO rename + machine-script grammar v2 + kits as .tero files
The project was renamed Lince → TERO (Translating Emulator for Real-time
Onboard-systems; availability web-verified, repo now claaj/tero, old URLs
redirect): namespaces tero::, include roots include/tero/, binaries
tero-emu/tero-bench, options TERO_*, script extension .tero,
component-ABI symbol tero_component_abi_version with ComponentAbiVersion
reset to 1 (no external libraries existed). plans/ and references/
stay untouched as history. Gates: full fast suite + plugin suites + [boot]
incl. the mkprom2 PROM path, all green; zero guest-visible change.
Then the machine-script track (plans/soc-config-file-design.md) executed
end-to-end on branch feat/grammar-v2:
- F0 — grammar v2 (793be41): bounded expressions + let, new with inline props, dotted assignment, kwarg map, connect obj.port peer:IInterface with real interface tokens, indexed ports, entity-check (always-on hard invariants + strict verb + tero-emu --entity-check dry-run). One grammar, no aliases. Decision 75.
- F1 — edge-based PnP (9439299): connect <bridge>.slaves <dev>:IAmbaPnp, edge order = slot, slaves[N] pins; pnp_slot/pnp_publish deleted; kits migrated byte-identically (kit tests passed unchanged). The git-history audit corrected the rollout premise — the 5 IAmbaPnp implementors are the complete historically-published set. Decision 76. Slot semantics amended by Decision 78 (see below).
- F2/F3 — kits as files (7c22b32): machines/gr712rc.tero + gr740.tero embedded at configure time; the C++ compositions deleted after dump-byte-equality proof on both SoCs; kit entity-check surfaced missing cpu1..3 documentation edges (fixed). Decision 77.
- F4: machine files installed under share/tero/machines/; the gr712rc-derived.tero example teaches copy-the-kit-and-edit (validated by the CLI dry-run); docs updated.
- Follow-up — order-independent slaves + StdoutMonitor (2026-06-10): PnP slot numbers became internal (records compact in edge order; GRLIB software matches by identity + BAR, never by slot), so slaves[N] pinning and pnp_ahb_slot were removed — slaves[N] now fails build() with a diagnostic. The kits' tables compact (GR740 APB 0,1,3,4,5 → 0..4); RTEMS [boot] passed unchanged on both kits. The Console component was renamed StdoutMonitor (ComponentKind::Monitor) — it is a host stdout sink, not hardware. Decision 78.
2026-06-09/10 — Cleanliness sweep + documentation completion
The architecture was audited against its own frozen principles (singletons, direct I/O, dependency direction, strong types, compile-time behaviour flags, arch-agnosticism). Exactly two violations were found and both were fixed; a duplication/smell audit followed; the documentation gained the entity-model / compose / decomposition coverage it was missing.
- Peripheral identity unified (5a6d121, merge 6f6eafa). IPeripheral::name() is now final and returns the runtime-injected PeripheralSpec::instance_name; authors implement device_class(). find_entity(x)->name() == x holds for every entity kind; add_peripheral() peripherals became findable; the observer reports instance names. ComponentAbiVersion 1 → 2 (v1 component libraries are rejected at load). All ~33 in-tree implementors renamed; the sniffer-plugin test stubs were caught by the Switch build (the JIT build does not compile the plugins). Decision 70.
- Opcode-histogram I/O moved out of the core (d185a8a + 9b1dd78). ~ExecutionEngine's fopen/getenv CSV dump became Emulator::opcode_histogram() + the pure format_opcode_histogram_csv(); tero-bench and tero-emu own the I/O policy (verified end-to-end in build-histogram; on 2026-06-10 the $TERO_HISTOGRAM_OUT env var was replaced by an explicit --histogram-out flag on both clients). Decision 74.
- Cheap smell fixes (9b1dd78): MIRROR WARNING comments on the two run-loop mirrors in engine_translate.cpp; ctx.irq empty-guard unified in Soc::add_peripheral.
- Docs: new pages architecture/entity-model.md, architecture/runtime-decomposition.md, modules/compose.md, guide/machine-scripts.md, development/known-debt.md; stale pages updated for the decomposition/compose reality; Decisions 60–74 added (entity model, compose, runtime decomposition, identity, histogram).
- Gates: fast suite 11283 assertions / 1104 cases green on Switch, JIT and ASan/UBSan builds; 6 sniffer/connector plugin suites green; histogram CSV byte-identical pre/post.
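The identity rule from the first item (a final name() fed by the injected instance name, with authors supplying only device_class()) can be illustrated with a minimal sketch. Every type here (PeripheralSpec as a one-field struct, PeripheralBase, DemoUart, Registry) is a simplified stand-in invented for illustration, not the project's real headers or signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Illustrative stand-in: the real PeripheralSpec is not shown in the text.
struct PeripheralSpec { std::string instance_name; };

class PeripheralBase {
public:
    explicit PeripheralBase(PeripheralSpec spec) : spec_(std::move(spec)) {}
    virtual ~PeripheralBase() = default;
    // Identity comes from the injected spec; `final` stops authors overriding it.
    virtual std::string_view name() const final { return spec_.instance_name; }
    // Authors only describe what kind of device this is.
    virtual std::string_view device_class() const = 0;
private:
    PeripheralSpec spec_;
};

// Hypothetical device: two instances share a class but not a name.
class DemoUart : public PeripheralBase {
public:
    using PeripheralBase::PeripheralBase;
    std::string_view device_class() const override { return "apbuart"; }
};

// Hypothetical registry: lookup is by instance name only.
class Registry {
public:
    PeripheralBase& add(std::unique_ptr<PeripheralBase> p) {
        items_.push_back(std::move(p));
        return *items_.back();
    }
    PeripheralBase* find_entity(std::string_view wanted) const {
        for (const auto& p : items_)
            if (p->name() == wanted) return p.get();
        return nullptr;
    }
private:
    std::vector<std::unique_ptr<PeripheralBase>> items_;
};
```

Because name() cannot be overridden, the invariant find_entity(x)->name() == x follows from the lookup itself rather than from each author's discipline.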
2026-06-05 — Per-core perf wins on main: ~2× uniprocessor Dhrystone
Four bit-exact per-core optimizations landed on main, each found by
profiling and confirmed by a controlled same-host back-to-back A/B
(governor=performance). Together they roughly double uniprocessor Dhrystone
throughput — Dhrystone N=1 measures ~295 MIPS on main (governor
performance), up from a ~150 MIPS pre-Phase-12 baseline. The full ctest suite
stays green (709 cases pass, 0 fail) across uniprocessor + SMP N=2/N=4 and
Switch + JIT + MultiThread.
- Dispatch-cache anchor (lever A) (IrBlock::jit_entry; TieredJit::*_anchored; Emulator::run_ir_quantum). Stash the stable CacheEntry address in the block on first compile and reuse it to resolve the best fn / count executions without the per-dispatch unordered_map hash (the #1 call-heavy hotspot): Dhrystone +19.5% (238 → 285 MIPS). Bit-exact (anchor read only after a validated find). Previously PR #15; merged here.
- IR BlockCache 1024 → 8192 slots (src/ir/include/tero/ir/block_cache.hpp). The direct-mapped index covered only a 4 KiB PC window at 1024, so call-heavy guests aliased hot code against libc and constantly re-translated. 8192 (a 32 KiB window) removes the aliasing: Dhrystone +39% (153 → 212 MIPS), p99 slice jitter 28.7 ms → 8.9 ms, cpubound flat. 16384/32768 measured no further gain. Bit-exact (a cache only changes eviction frequency, never results).
- Event-bounded single-core quantum (Emulator::run_core_quantum, src/runtime/src/emulator.cpp). A true single core has no round-robin to preserve, so it now runs a large burst (cap 1 << 16) but never past the next scheduled event — Dhrystone +40% (212 → 298 MIPS) and more accurate (events fire at sub-quantum precision instead of rounding up to the round boundary). Gated strictly to cores == 1, so SMP interleaving and sim-time accounting are byte-unchanged; bit-exact between JIT and Switch.
- Interrupt-poll fast-path via IInterruptController::raw_pending() (src/interfaces/include/tero/iinterrupt_controller.hpp; IrqMP/IrqAMP; Emulator::sample_interrupts). The dispatcher polls the controller at every block boundary, but it is empty ~99.99% of the time at a 100 Hz tick. raw_pending() is a maintained single-word superset; a 0 result proves the full scan would return 0, so the poll early-outs. Dhrystone +2.7% (301.9 → 310.2 MIPS, controlled A/B). Provably bit-exact; the reader early-out is gated to SingleThread (no MT behavior change).
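The window arithmetic behind the BlockCache resize can be checked with a small sketch. It assumes a one-slot-per-instruction index of the form (pc >> 2) & (slots - 1); that formula is an assumption chosen because it reproduces the 4 KiB and 32 KiB windows quoted above, and it is not taken from block_cache.hpp. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed indexing scheme: one slot per 4-byte-aligned SPARC instruction
// address, masked to the table size. `slots` must be a power of two.
constexpr std::uint32_t slot_index(std::uint32_t pc, std::uint32_t slots) {
    return (pc >> 2) & (slots - 1);
}

// Bytes of guest code that map to distinct slots before the index wraps.
constexpr std::uint32_t window_bytes(std::uint32_t slots) {
    return slots * 4;
}

// Two different PCs alias when they compete for the same slot.
constexpr bool aliases(std::uint32_t a, std::uint32_t b, std::uint32_t slots) {
    return a != b && slot_index(a, slots) == slot_index(b, slots);
}
```

Under this assumption, two blocks 4 KiB apart evict each other in a 1024-slot table but coexist in an 8192-slot one, which is the aliasing the text describes between hot code and libc.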
Rejected (measured net-negative, kept off / unmerged): cross-region JIT
block linking ("lever B", ~9% slower), MT host-thread affinity pinning
(net-negative on a shared host), and bigger JIT region fusion
(jit_max_region_blocks 8 → 16/32/64, −8%). Method note: perf comparisons
must be same-host back-to-back with governor=performance — the committed
tests/results/*.csv are thermal-stale from other hosts and have produced
phantom regressions. Full write-up:
Performance.
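For the event-bounded single-core quantum listed among the wins above, the bound itself is a one-line minimum. This is a sketch under a simplifying assumption: the burst cap and the distance to the next event are expressed in the same unit, which the text does not state. The name single_core_quantum is hypothetical; only the 1 << 16 cap comes from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Burst cap named in the text.
constexpr std::uint64_t kBurstCap = std::uint64_t{1} << 16;

// Hypothetical helper: how far a lone core may run before control returns
// to the scheduler. Assumes `now` and `next_event` share one unit.
constexpr std::uint64_t single_core_quantum(std::uint64_t now,
                                            std::uint64_t next_event) {
    if (next_event <= now) return 0;               // an event is already due
    return std::min(kBurstCap, next_event - now);  // never run past the event
}
```

A distant event yields the full cap, while a near one shortens the burst so the event fires at its exact time instead of at the next round boundary.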
2026-05-25 — Arch-neutral IR + tiered LLVM JIT landed on main
A two-part change merged the performance track into main and made LLVM
mandatory (commits 981ebe8, then 72071c4 / e41d052 raising the LLVM
floor to 18).
What landed.
- Arch-neutral IR (tero_ir) — a guest-ISA-neutral basic-block IR (IrBlock/IrInst/IrExit) over an opaque GuestState blob, a (PhysAddr, ModeCtx) block cache, and a reference IR interpreter. The seam for multi-arch: a new ISA is a new frontend, not a new core.
- SPARC frontend (tero_arch_sparc) — translate_block, mode_ctx_of, take_exception, and the CpuState↔GuestState sync.
- Tiered LLVM JIT (tero_jit) — IR → native via ORCv2; baseline O0 on the calling thread, background O2 promotion (ADR-002); self-loop + region chaining; inline RAM. cpubound-mix ~2000 MIPS single-core.
- Runtime selection — the four-way Dispatch enum and the TERO_ENABLE_JIT build flag are gone; EmulatorConfig::translation (bool, default true) picks JIT vs Switch interpreter at runtime (Decision 57). GDB works under translation (Decision 58).
- Threaded-code dispatch removed — the Phase 10.2 prototype reached ~1.5× over Switch, missed its targets, and is superseded by the IR JIT (Decision 59). Its results pages remain as a historical record.
Validation. Layered lockstep: IR-interpreter vs core::step,
JIT vs IR-interpreter, JIT vs core::step — all byte-identical across an
RTEMS boot; sptests/smptests under both methods; ASan/UBSan/LSan on the
isolated tero_jit_tests exe. Full write-up: IR and LLVM JIT
and Design decisions (Decisions 49–59).
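The lockstep idea in the validation paragraph (run a reference and a candidate executor side by side and demand byte-identical state after every step) can be sketched generically. The State alias, StepFn and first_divergence below are illustrative names, not the project's test harness; a real harness would compare full guest state rather than a toy byte vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using State = std::vector<std::uint8_t>;     // stand-in for a guest-state blob
using StepFn = std::function<void(State&)>;  // advances one step in place

// Runs both executors in lockstep and returns the index of the first step
// after which their states differ, or nullopt if all `steps` matched.
inline std::optional<std::size_t> first_divergence(State a, State b,
                                                   const StepFn& reference,
                                                   const StepFn& candidate,
                                                   std::size_t steps) {
    for (std::size_t i = 0; i < steps; ++i) {
        reference(a);
        candidate(b);
        if (a != b) return i;  // byte-wise comparison of the whole blob
    }
    return std::nullopt;
}
```

Layering the comparison (IR interpreter against core::step, then JIT against the IR interpreter) localizes a divergence to one translation stage, since each pair differs by only one component.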
Doc reconciliation. This pass also fixed pages that predated the merge
and still claimed "no JIT / no decode cache" (execution model, design
principles, architecture overview) and the roadmap's "JIT not in scope"
line, and added the three new modules to the layer graphs.
2026-05-13 — smpschededf02 fixed: RTEMS task stack overflow
Outcome. N=2 smptests went from 41 → 42 PASS (smpschededf02 was the last "TIMEOUT cluster" canary that turned out to have an actually fixable upstream cause). FAIL count unchanged: no regressions.
Root cause. RTEMS's SPARC default CPU_STACK_MINIMUM_SIZE is
4 KiB. The smpschededf02 EDF worker tasks are created with
RTEMS_MINIMUM_STACK_SIZE, so each gets exactly 4 KiB. Over a long
EDF run their %sp drifts monotonically downward and eventually
crosses below their allocated stack base — overflowing into the
back-to-back-allocated stack of an adjacent task. The overflow hits
that task's saved [%fp - 12] slot inside its dispatched
_Thread_Do_dispatch frame; when the victim resumes via
_CPU_Context_switch's retl path on a different CPU, it reloads
the corrupted slot via ld [%fp + -12], %o0 at +688 and calls
_Thread_queue_Do_acquire_critical with a non-lock pointer — eternal
PIL=15 spin.
Fix. tests/guest-programs/rtems/build_smptests.sh now passes
-DCONFIGURE_MINIMUM_TASK_STACK_SIZE=16384 to the SMP test build.
4 KiB → 16 KiB; ample headroom and no other test regressed.
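The overflow mechanism can be modelled with plain address arithmetic. This is a toy sketch: the region addresses and the 5000-byte stack depth are invented for illustration (the text gives no measured depth), and StackRegion, sp_after, overflowed and lands_in are hypothetical names. It assumes, as the root-cause paragraph does, that stacks grow downward and are allocated back to back.

```cpp
#include <cassert>
#include <cstdint>

// A task stack occupying the half-open range [base, base + size).
struct StackRegion {
    std::uint32_t base;
    std::uint32_t size;
};

// Stack pointer after consuming `depth` bytes from the top of the region.
constexpr std::uint32_t sp_after(StackRegion s, std::uint32_t depth) {
    return s.base + s.size - depth;
}

// The stack has overflowed once sp drops below the region's base.
constexpr bool overflowed(StackRegion s, std::uint32_t sp) {
    return sp < s.base;
}

// True when `addr` falls inside another task's region (the victim).
constexpr bool lands_in(StackRegion victim, std::uint32_t addr) {
    return addr >= victim.base && addr < victim.base + victim.size;
}
```

With a 4 KiB region the invented 5000-byte depth crosses into the neighbour directly below, while the same depth stays inside a 16 KiB region.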
Investigation infrastructure left in tree (all opt-in, no production cost):
- IEmulatorObserver::on_instruction(cpu, pc) — new per-instruction hook in src/interfaces/include/tero/iemulator_observer.hpp, wired in src/runtime/src/emulator.cpp between the GDB-stub break check and the actual core::step(). Default is empty; one branch per instruction when no observer is installed. See Decision 47.
- Dispatch probe — extended tests/integration/test_smpschededf02_trace.cpp with a DispatchProbe observer that snapshots register and memory state (per-CPU executing pointer, incoming TCB's saved fp/sp slot, the bad TCB's TCB+0x40/+0x48 fields) at scheduler entry points. Hidden behind the [.smpsched_dispatch_probe] Catch2 tag. The probe is what nailed down the overflow timeline (sim 16924 → 76453 → 80873 → 80955: TCB 0x4003ba58's saved fp/sp slot transitioning from valid to wild).
- Register-window roundtrip test — tests/integration/test_regwin_roundtrip.cpp loads tests/guest-programs/asm/regwin-roundtrip/regwin_roundtrip.S, a bare-metal SPARC program that chains 7 SAVEs through one window-overflow trap + 7 RESTOREs through one window-underflow trap. Fixes a documentation gap: previously, only trap delivery was tested (tests/unit/test_traps.cpp), not the full roundtrip through real spill/fill handlers. The PASS rules out tero's basic window mechanics as a source of dispatch-frame corruption — useful ground truth that pointed the investigation at upstream stack layout instead of enter_trap/leave_trap.
- Quieter batch logger — tests/integration/rtems_csv_harness.hpp now installs a StdoutLogger(LogLevel::Error) per emulator instance. The Info-level default lets a single misbehaving guest emit millions of [WARN] [prom] ignored write lines and bury the per-test outcome from the harness; Error-level cuts that to zero while leaving real fault lines visible.
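The per-instruction hook in the first item follows a common opt-in pattern: an empty virtual default plus a null check in the run loop. The sketch below shows that pattern only. The real parameter types of on_instruction are not given in the text, so plain integers are assumed, and IObserverSketch, PcCounter and run are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Illustrative stand-in for the observer interface (parameter types assumed).
struct IObserverSketch {
    virtual ~IObserverSketch() = default;
    // Empty default: implementers override only what they need.
    virtual void on_instruction(unsigned /*cpu*/, std::uint32_t /*pc*/) {}
};

// Example observer: counts how often each PC is about to execute.
struct PcCounter : IObserverSketch {
    std::map<std::uint32_t, std::uint64_t> hits;
    void on_instruction(unsigned, std::uint32_t pc) override { ++hits[pc]; }
};

// Toy run loop that "executes" `count` sequential 4-byte instructions.
inline std::uint32_t run(IObserverSketch* observer, std::uint32_t pc,
                         unsigned count) {
    for (unsigned i = 0; i < count; ++i) {
        if (observer) observer->on_instruction(0, pc);  // the single extra branch
        pc += 4;                                        // stand-in for core::step()
    }
    return pc;
}
```

When no observer is installed the loop pays only the pointer test, which is the "one branch per instruction" cost the text mentions.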
What this rules out. Tero's _CPU_Context_switch interpretation
is correct: it reads the saved fp/sp from the incoming TCB faithfully.
The corruption is upstream, in the contents of that slot at the
moment of save. Tero's save/restore/enter_trap/leave_trap
are correct (regwin_roundtrip test). The basic register-window
mechanics tested in unit tests already covered trap delivery; the
new asm test covers the handler-RETT-retry loop end-to-end.
Files touched.
| File | Change |
|---|---|
| src/interfaces/include/tero/iemulator_observer.hpp | Added on_instruction(cpu, pc) virtual hook with empty default. |
| src/runtime/src/emulator.cpp | Fire observer_->on_instruction(...) before each core::step() in run_until_unpaced. |
| tests/guest-programs/rtems/build_smptests.sh | cflags now include -DCONFIGURE_MINIMUM_TASK_STACK_SIZE=16384. |
| tests/guest-programs/asm/regwin-roundtrip/regwin_roundtrip.S | New bare-metal SPARC test. |
| tests/guest-programs/asm/CMakeLists.txt | Build target for regwin_roundtrip.elf. |
| tests/integration/test_regwin_roundtrip.cpp | New Catch2 case [core][regwin][trap]. |
| tests/integration/test_smpschededf02_trace.cpp | New [.smpsched_dispatch_probe] test case + DispatchProbe class. |
| tests/integration/rtems_csv_harness.hpp | Install StdoutLogger(LogLevel::Error) per emulator. |
| tests/CMakeLists.txt | Wire TERO_REGWIN_ROUNDTRIP_ELF define + new test source. |
2026-05-12 — GDB stub: late-binding, RTEMS thread-awareness B1
A single-session pass over the GDB stub that turned it from "works if you connect first" into a development-grade debugger backend:
- Late-binding attach. EmulatorConfig::gdb_stub_wait_for_client = false no longer leaves the socket lonely. run_until_unpaced calls GdbStub::poll_accept() once per quantum; a successful accept arms a stop reply and the run loop returns HaltReason::Breakpoint so the CLI hands over to process_until_resume(). Second clients are accepted-and-closed to drain the listen backlog. See Decision 41.
- qSymbol handshake. qSupported now advertises qSymbol+. A state machine in handle_q_symbol asks GDB for _Per_CPU_Information, stores the resolved address, and resets on unsolicited qSymbol:: (GDB file reload). Empty <addr_hex> ⇒ non-RTEMS guest, thread-awareness stays disabled. See Decision 43.
- Stop-on-ErrorMode with TT → signal mapping. When a core enters error_mode and a client is attached, the run loop redirects through GdbStub::report_error_mode(core), which reads TBR[11:4] and emits a stop reply with SIGILL / SIGBUS / SIGFPE / SIGSEGV / SIGTRAP according to the trap class. Mapping table in signal_from_tt() (gdb_stub_transport.cpp); GDB's internal signal numbers, not Linux's. Per-client error_reported_ latch prevents infinite re-notify on c. See Decisions 44 and 45.
- RTEMS thread-awareness B1. Report the executing thread per core as a real Objects_Id with its 4-byte ASCII name. Out of scope for now (logged in Debugging with GDB): B2 (enumerate blocked/ready threads), B3 (per-thread register context), B4 (priority/state in qThreadExtraInfo). Offsets hardcoded in rtems_layout::* and verified empirically against hello-world.elf via DWARF; each read is gated on three sanity checks. See Decision 42.
- Coherent stop replies. send_stop_reply now prefers the resolved RTEMS Objects_Id over core+1. Per-call read (no cache) so context switches in RTEMS never need stub invalidation. See Decision 46.
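The trap-type extraction and signal mapping from the ErrorMode item can be sketched as follows. Only two mapping rows are confirmed by this page (TT=0x07 gives SIGBUS, and the default is SIGSEGV); the other rows are assumptions added for illustration and are not the contents of signal_from_tt(). The helper names tt_from_tbr, signal_from_tt_sketch and stop_reply are hypothetical. The numeric values are GDB's protocol-level signal numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// GDB protocol signal numbers (not host-OS values).
enum GdbSignal : unsigned {
    kSigIll = 4, kSigTrap = 5, kSigFpe = 8, kSigBus = 10, kSigSegv = 11
};

// The trap type (tt) occupies TBR bits 11:4 on SPARC V8.
constexpr std::uint8_t tt_from_tbr(std::uint32_t tbr) {
    return static_cast<std::uint8_t>((tbr >> 4) & 0xFF);
}

// Assumed mapping, for illustration only.
constexpr GdbSignal signal_from_tt_sketch(std::uint8_t tt) {
    switch (tt) {
        case 0x02: return kSigIll;   // illegal_instruction (assumed row)
        case 0x07: return kSigBus;   // mem_address_not_aligned (per the text)
        case 0x08: return kSigFpe;   // fp_exception (assumed row)
        default:   return kSigSegv;  // default (per the text)
    }
}

// RSP stop reply of the simple form "S" followed by two hex digits.
inline std::string stop_reply(GdbSignal sig) {
    static constexpr char hex[] = "0123456789abcdef";
    return std::string{'S', hex[(sig >> 4) & 0xF], hex[sig & 0xF]};
}
```

For a TBR whose tt field is 0x07, this sketch yields the packet body S0a, since SIGBUS is 10 in GDB's numbering.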
Verification.
- [gdb_stub] suite: 20 cases / 828 assertions. Includes 2 late-binding tests, 1 qSymbol handshake test (2-round flow + restart), 2 ErrorMode tests (default → SIGSEGV; TT=0x07 → SIGBUS), 2 RTEMS-aware tests (forged memory, fallback when symbol absent), and a [!mayfail] live-guest test against hello-world.elf.
- Regression sweep [emulator] + [rtems][boot] + [gdb_stub]: 43 cases / 949 assertions, zero regressions.
Files touched. Summary table preserved in the commit; full doc in Debugging with GDB.
Next steps (in order of value)
- B2 — enumerate blocked / ready threads. Walk _Objects_Information_table[OBJECTS_CLASSIC_API][OBJECTS_RTEMS_TASKS].local_table[]. Needs one more symbol via qSymbol and 2–3 more offsets. ~1 day.
- B3 — per-thread register context. g/G read from Thread_Control.Registers when the selected thread isn't the executing one. Enables bt on blocked threads. ~2 days.
- Hardware watchpoints (Z2/Z3/Z4) via per-load/store bus hooks. SPARC V8 has no real watchpoint hardware; emulation in the bus is straightforward but touches the hot path. ~1 day.
- qXfer:features:read:target.xml for explicit register description, particularly %asr17. Cosmetic. ~0.5 day.
- Phase 8 items (GPIO, Flash, CAN, SpaceWire, RegisterBank refactor). Detailed in the roadmap.
Session notes for the next agent
- The user works in Spanish; responses in Spanish are welcome, mixed technical English is also fine (code/identifiers in English).
- When in doubt about hardware semantics, read the manual, do not guess (CLAUDE.md §Rules for AI Agents, rule 1). SPARC V8 corner cases are the classic footgun here.
- plans/roadmap.md is the work-tracking source of truth. docs/development/roadmap.md is its user-facing mirror. If they drift, the plans/ one wins.
- docs/architecture/decisions.md is the searchable index of every judgment call that's not directly readable from the code. Add new decisions as numbered entries there; cross-reference them as "Decision N" from prose elsewhere.
- Immediate WRPSR (Decision 38): write_psr_writable() (cpu_state.cpp:69) applies every writable PSR field at once, straight to psr_, matching the SIS oracle. SPARC V8 §5.1.2.3 permits an up-to-3-instruction delay on S/ET/PS/CWP, but it is implementation latitude (software pads WRPSR with NOPs). The earlier pending_psr_ / commit_psr_pipeline() delay model was removed because it desynced the register windows when a trap fired inside the window (smpschededf03).
- ASR19 power-down: Writing to ASR19 sets is_powered_down_ = true on the core. Any raised trap (internal or external IRQ above PIL) clears it.
- Idle-time skipping: When all active cores are powered-down, the emulator time-jumps to min(next_event_time, sim_time + kMaxIdleNs). kMaxIdleNs = 1 ms to keep the GPTimer ticking at roughly the right rate.
- Secondary cores start powered-down; the primary (CPU 0) releases them via IrqMP. load_elf() sets is_powered_down = true for all cores except 0.
- PeripheralContext includes ICharacterDevice* for UART I/O, wired by Emulator::set_character_device() (default StdoutCharDevice).
- All peripheral MMIO still rejects non-word accesses with AlignmentError.
- SPARC cross-toolchain is at /opt/rcc-1.3.2-gcc/bin/.
- The RTEMS thread-awareness offsets in rtems_layout::* are hardcoded against RCC 1.3.2. If RCC is bumped, run ./build/tests/tero_tests "[gdb_stub][rtems-aware]" to check alignment — a [!mayfail] failure means the offsets drifted and you need to refresh them from new DWARF (how-to).
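The idle-time skipping note above reduces to a clamped minimum. A minimal sketch, assuming simulated time is a 64-bit nanosecond counter (the text does not state the representation): idle_jump_target is a hypothetical name, and the extra max guard against an already-due event is an addition for safety, not something the note describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// 1 ms, as stated in the idle-time skipping note.
constexpr std::uint64_t kMaxIdleNs = 1'000'000;

// Hypothetical helper: the new simulated time when every active core is
// powered down. Returns `sim_time` unchanged when an event is already due.
constexpr std::uint64_t idle_jump_target(std::uint64_t sim_time,
                                         std::uint64_t next_event_time) {
    return std::max(sim_time,
                    std::min(next_event_time, sim_time + kMaxIdleNs));
}
```

An event 250 µs away is reached in one jump, while an event 10 ms away is approached in 1 ms steps, which keeps periodic timers advancing at roughly their intended rate.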