Architecture overview¶
Are you a user looking to run or embed Tero?
This section is the Developer Manual — it explains how the emulator
is built internally. You do not need it to run tero-emu or to
embed tero_runtime in your own program.
For those tasks, use the User Manual instead:
| Goal | Go here |
|---|---|
| Build and install | Getting started → Installation |
| Boot an RTEMS image | Getting started → Quickstart |
| Look up every CLI flag | Guide → CLI reference |
| Embed as a library | Guide → Embedding as a library |
| Configure EmulatorConfig | Guide → Configuration |
| Write a custom peripheral | Peripherals → Custom peripherals |
This is the canonical entry point for developers working on the Tero source code: the "how the whole project is conceived" landing page. It ties the layers, the execution model, and the design rules into one big picture, then hands off to the focused pages for detail. For the current state and judgment calls, see development/status.
The sister pages in this section drill into individual concerns:
- Layers and modules: CMake target/layer graph + strict dependency rules
- Entity object model: everything modelled is a tero::IEntity; capabilities via get_interface<T>()
- Runtime decomposition: Emulator as a facade over Soc + ExecutionEngine + DebugServer
- Design principles: the non-negotiable invariants
- Design decisions / ADRs: the canonical decision log (ADR-001..004 + the numbered decisions)
- Execution model: run_for loop, quantum, round-robin, idle-skip, pacing
- IR and LLVM JIT: the binary-translation engine
- Multicore and timing: round-robin, idle skip, CPI/clock
- Memory and bus: SystemBus, RAM, BE handling, DMA
- Peripheral system: PeripheralSpec, lifecycle, ports
- Traps and interrupts: TBR, sampling, ErrorMode
- Adding a frontend: the multi-arch IR seam
1. What Tero is (and is not)¶
Tero is a C++20 functional emulator for Cobham Gaisler's
GR712RC (dual-core LEON3FT) and GR740 (quad-core LEON4FT)
SPARC V8 systems-on-chip. Its primary goal is to run the RTEMS 5 and
RTEMS 6 testsuite for BSP leon3 with high pass rates across
uniprocessor and SMP configurations. Its long-horizon goal (Phase 9+) is
sustained 1:1 wall-clock real-time emulation of the GR740.
What Tero is:
- A functional emulator with two execution methods — a fetch-decode-execute Switch interpreter (the correctness oracle) and a default binary-translation path (arch-neutral IR + tiered LLVM JIT). See Execution model and IR and LLVM JIT.
- A standalone C++ library, designed so an SMP2 (ECSS-E-ST-40-07) wrapper — maintained in a separate repository — can drive it under an external scheduler. The single-thread execution mode is a first-class, permanently-supported option for exactly this reason (ADR-001).
- Capable of booting RTEMS in uniprocessor and SMP configurations (validated N=2 GR712RC, N=4 GR740), with FPU, GDB stub, and PROM/MKPROM2 boot.
What Tero is NOT (frozen non-goals):
- Not cycle-accurate. 1:1 is wall-clock, not pipeline-cycle (ADR-004). Caches and the SRMMU are not modelled.
- Not a host for Linux guests. Tero is RTEMS-focused; no MMU-using guest is required today, so the SRMMU is stubbed and deferred.
- Not the SMP2 wrapper itself (separate repository) and not a networking guest stack.
The FPU is implemented (Berkeley SoftFloat 3e, vendored).
2. The big picture¶
Tero is a stack of CMake static libraries with strictly unidirectional
dependencies, orchestrated by one entry-point class: Emulator. Since
the 2026-06 entity-model refactor, Emulator is a facade over three
subsystems — Soc (the assembled machine), ExecutionEngine (cores,
run loop, clock, JIT), and DebugServer (GDB stub + breakpoints) — see
Runtime decomposition. The diagram below
shows both the layering (who links whom) and, on the right, the
execution flow (how a run_until call turns into retired guest
instructions).
```mermaid
flowchart TB
  subgraph stack["Library stack (link-time dependencies)"]
    direction TB
    app["tero_app<br/><i>CLI: tero-emu</i>"]
    comp["tero_compose<br/><i>Machine, kits, .tero scripts,<br/>component-library loader</i>"]
    rt["tero_runtime<br/><i>Emulator (facade), Soc, ExecutionEngine,<br/>DebugServer, ElfLoader, GdbStub</i>"]
    per["tero_peripherals<br/><i>IrqMP/IrqAMP, GPTimer, ApbUart,<br/>MemCtrl, Prom, GRGPIO</i>"]
    bus["tero_bus<br/><i>SystemBus, Ram</i>"]
    defs["tero_defaults<br/><i>StdoutLogger, StdoutCharDevice,<br/>NullFaultInjector, DebugPublisher</i>"]
    arch["tero_arch_sparc<br/><i>SPARC → IR frontend, sync</i>"]
    jit["tero_jit<br/><i>IR → native, LLVM ORCv2</i>"]
    ir["tero_ir<br/><i>arch-neutral IR, GuestState,<br/>BlockCache, IR interpreter</i>"]
    core["tero_core<br/><i>SPARC V8 ISA, CpuState,<br/>decoder, step, handlers</i>"]
    iface["tero_interfaces<br/><i>strong types, Result<T>, I* contracts</i>"]
    app --> comp
    comp --> rt
    rt --> per --> bus
    rt --> defs
    rt --> arch
    rt --> jit
    rt --> core
    rt --> bus
    arch --> ir
    arch --> core
    jit --> ir
    core --> ir
    core -.-> iface
    bus -.-> iface
    per -.-> iface
    ir -.-> iface
    defs -.-> iface
  end
  subgraph flow["Execution flow (run_until)"]
    direction TB
    f0["run_until(deadline)"] --> f1{pacing?}
    f1 -- Realtime --> f2["slice + sleep_until"]
    f1 -- Turbo --> f3
    f2 --> f3["run_until_unpaced: per round"]
    f3 --> f4["sample_interrupts / idle-skip"]
    f4 --> f5{translation?}
    f5 -- true --> f6["run_ir_quantum<br/>JIT block / IR fallback"]
    f5 -- false --> f7["core::step ×quantum"]
    f6 --> f8["fire events + tick peripherals"]
    f7 --> f8
  end
  rt -. drives .-> f0
```
Three facts to internalise from the diagram:
- tero_core links tero_ir. CpuState's integer register file is an ir::GuestState byte blob (state unification), so the Switch interpreter and the IR engine read/write the same bytes with no sync. The SPARC↔IR bridge (mode_ctx_of, translate_block) lives in tero_arch_sparc, the only module linking both core and ir. See Layers.
- The two execution methods share one Emulator and one CpuState. They are not separate emulators; translation is a runtime field, and the JIT path falls back to core::step for traps, delay slots, annulled slots, and untranslatable ops.
- tero::core is a PRIVATE link of tero_runtime (src/runtime/CMakeLists.txt:34-48). The runtime's run loop drives cores through the arch-neutral ir::IArchitecture seam, so neutral consumers do not transitively link the SPARC core. See Layers.
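The state-unification point (one byte blob, two engines, no sync step) can be illustrated with a minimal sketch. Every name here (GuestBlob, reg_offset, interp_add, IrAdd, ir_exec) and the 4-bytes-per-register layout is hypothetical and chosen for illustration; it is not Tero's actual CpuState or ir::GuestState definition.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an opaque, byte-addressed guest-state blob.
struct GuestBlob {
    std::array<std::byte, 32 * 4> bytes{};  // room for 32 x 32-bit registers
};

// Hypothetical layout: register r lives at byte offset r * 4.
constexpr std::size_t reg_offset(unsigned r) { return r * 4u; }

// Typed accessors, as an interpreter-style engine would use them.
inline std::uint32_t read_reg(const GuestBlob& g, unsigned r) {
    assert(r < 32);
    std::uint32_t v = 0;
    std::memcpy(&v, g.bytes.data() + reg_offset(r), sizeof v);
    return v;
}
inline void write_reg(GuestBlob& g, unsigned r, std::uint32_t v) {
    assert(r < 32);
    std::memcpy(g.bytes.data() + reg_offset(r), &v, sizeof v);
}

// "Interpreter" engine: addresses registers by number.
inline void interp_add(GuestBlob& g, unsigned rd, unsigned rs1, unsigned rs2) {
    write_reg(g, rd, read_reg(g, rs1) + read_reg(g, rs2));
}

// "IR" engine: knows only byte offsets into the blob, no register names.
struct IrAdd {
    std::size_t dst, lhs, rhs;
};
inline void ir_exec(GuestBlob& g, const IrAdd& op) {
    std::uint32_t a = 0, b = 0;
    std::memcpy(&a, g.bytes.data() + op.lhs, sizeof a);
    std::memcpy(&b, g.bytes.data() + op.rhs, sizeof b);
    const std::uint32_t sum = a + b;
    std::memcpy(g.bytes.data() + op.dst, &sum, sizeof sum);
}
```

Because both engines touch the same bytes, a value written by interp_add is immediately visible to ir_exec and vice versa; there is nothing to copy when execution switches from one path to the other.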
3. The non-negotiable design principles¶
These shape every line of code. Each is restated with full rationale and the code that enforces it in Design principles:
| Principle | One-line statement |
|---|---|
| Zero singletons / global mutable state | Two Emulators in one process must never interfere. |
| Zero direct I/O from the core | Everything goes through ILogger / ICharacterDevice. |
| Time as a parameter | The core computes simulated time; it never samples the host clock (except gated Realtime pacing). |
| Configuration by struct | EmulatorConfig is a plain struct — no file parsing, no behaviour build-flags. |
| Errors as values | The public boundary returns Result<T> = tl::expected<T, ErrorCode>. |
| Strong types everywhere | PhysAddr, VirtAddr, CoreId, SimTimeNs — never a bare uint32_t. |
| Switch interpreter is the oracle | Every translated path is validated bit-identical against core::step. |
| Round-robin single-thread default | TSO and atomics are correct by construction; MultiThread is the opt-in escape hatch. |
| Big-endian at the typed boundary | RAM stores raw bytes; the byte-swap happens in the accessor / IR op. |
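Two of these rows, "Strong types everywhere" and "Errors as values", are easiest to see in code. The sketch below is an assumption-laden illustration: Strong, the tag structs, the ErrorCode enumerators, kRamSize, and word_index are all hypothetical, and the small Result class only stands in for the tl::expected-based Result<T> named in the table.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical strong-type wrapper: a distinct Tag yields a distinct C++ type.
template <typename Tag, typename Rep>
struct Strong {
    Rep value{};
    constexpr explicit Strong(Rep v) : value(v) {}
};
struct PhysAddrTag {};
struct VirtAddrTag {};
using PhysAddr = Strong<PhysAddrTag, std::uint32_t>;
using VirtAddr = Strong<VirtAddrTag, std::uint32_t>;

// A bare integer or the wrong address kind cannot be passed by accident.
static_assert(!std::is_convertible_v<std::uint32_t, PhysAddr>);
static_assert(!std::is_convertible_v<VirtAddr, PhysAddr>);

enum class ErrorCode { Unaligned, OutOfRange };  // hypothetical enumerators

// Minimal stand-in for an expected-like Result<T>.
template <typename T>
class Result {
public:
    Result(T v) : data_(std::move(v)) {}
    Result(ErrorCode e) : data_(e) {}
    [[nodiscard]] bool has_value() const { return std::holds_alternative<T>(data_); }
    [[nodiscard]] const T& value() const {
        assert(has_value());
        return std::get<T>(data_);
    }
    [[nodiscard]] ErrorCode error() const { return std::get<ErrorCode>(data_); }

private:
    std::variant<T, ErrorCode> data_;
};

constexpr std::uint32_t kRamSize = 0x1000;  // hypothetical RAM size in bytes

// Failures come back as values; nothing is thrown and nothing is logged.
[[nodiscard]] inline Result<std::uint32_t> word_index(PhysAddr addr) {
    if (addr.value % 4 != 0) return ErrorCode::Unaligned;
    if (addr.value >= kRamSize) return ErrorCode::OutOfRange;
    return addr.value / 4;
}
```

The caller is forced to look at the result (via [[nodiscard]]) and to spell out which kind of address it is passing, which is the point of both principles.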
4. The src/ modules at a glance¶
The authoritative dependency edges are in Layers and modules; this is the orientation map.
tero_interfaces (src/interfaces/)¶
Header-only vocabulary library. Defines the strong types and Result<T>
(types.hpp), AddressRange, the PeripheralContext aggregate, the
entity-model base (IEntity, IEntityRegistry, IConnectable,
IMmio — see Entity object model), and every I*
contract (IPeripheral, ICpu, IGdbRegisters, ICpuBus,
IBusMaster, IInterruptSource, IInterruptController, IScheduler,
IEvent, ILogger, ICharacterDevice, ITimeSource, IPublisher,
IFaultInjector, IEmulatorObserver, IPort, IMemoryRegion, the
per-protocol comms interfaces). Plus GatedMutex (ADR-001) and
BreakpointSet. No logic, no state. Full inventory:
tero_interfaces.
tero_core (src/core/)¶
The SPARC V8 integer + FP ISA. CpuState (8-window register file, PSR,
WIM/TBR/Y/PC/nPC, ASR19
power-down, error mode, the per-PC decode cache, and the integer state
exposed as a GuestState blob). The decoder (DecodedInsn + InsnKind),
the category handlers (handlers_alu/branch/loadstore/regwin/special),
the FP handlers (SoftFloat 3e), step (the single-instruction driver),
and the trap constants. Reaches the world only through the injected
ICpuBus.
tero_bus (src/bus/)¶
Physical memory routing. Ram (raw std::vector<std::byte>,
endianness-agnostic) and SystemBus (the central physical-address router;
owns RAM, holds non-owning IPeripheral* MMIO regions, does the
byte-order shuffle in encode/decode (MemEndian, default
big-endian), and implements
IBusMaster so DMA shares the exact CPU memory map). See
Memory and bus.
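The split between endianness-agnostic storage and a typed accessor that applies the byte order can be sketched as follows. RawRam, write_u32_be, and read_u32_be are hypothetical names for illustration; the real Ram and SystemBus encode/decode interfaces may look different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical RAM model: raw bytes only, with no notion of endianness.
class RawRam {
public:
    explicit RawRam(std::size_t size) : bytes_(size) {}
    std::byte load(std::size_t off) const {
        assert(off < bytes_.size());
        return bytes_[off];
    }
    void store(std::size_t off, std::byte b) {
        assert(off < bytes_.size());
        bytes_[off] = b;
    }

private:
    std::vector<std::byte> bytes_;
};

// Typed accessors: the big-endian ordering lives here, not in RawRam.
inline void write_u32_be(RawRam& ram, std::size_t off, std::uint32_t v) {
    for (unsigned i = 0; i < 4; ++i) {
        // Most significant byte goes to the lowest address.
        ram.store(off + i, static_cast<std::byte>((v >> (24u - 8u * i)) & 0xFFu));
    }
}
inline std::uint32_t read_u32_be(const RawRam& ram, std::size_t off) {
    std::uint32_t v = 0;
    for (unsigned i = 0; i < 4; ++i) {
        v = (v << 8) | std::to_integer<std::uint32_t>(ram.load(off + i));
    }
    return v;
}
```

Written this way, the result does not depend on the host's byte order, and a DMA master that goes through the same accessors sees exactly what the CPU sees.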
tero_peripherals (src/peripherals/)¶
The GRLIB IP cores. Default GR712RC MMIO layout (from the kit and the
DefaultBase constants):
| Class | MMIO base | IRQ | Description |
|---|---|---|---|
| MemCtrl (FTMCTRL) | 0x80000000 | none | Passive stub; MCFG1–4 readable/writable, no side effects. |
| ApbUart (console) | 0x80000100 | 2 | Primary serial port; 8-byte RX FIFO, immediate TX via ICharacterDevice. |
| IrqMP | 0x80000200 | none | GR712RC multi-core interrupt controller (mask/force/broadcast). |
| GPTimer | 0x80000300 | 8 | 4 sub-timers + prescaler; timer 4 is an armed watchdog. |
| GRGPIO ×2 | 0x80000900 / 0x80000A00 | none | GPIO ports, per-pin ISignalPort. |
GR740 substitutes IrqAMP (0xFF904000) — a sibling IP core, not a
GR712RC bolt-on (feedback_irqmp_vs_irqamp) — and maps its own UARTs
(IRQ 29/30). Prom is auto-wired when prom_size != 0. See
Peripheral system.
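As an orientation aid, the default GR712RC bases above can be fed to a toy address router. This is a sketch only: Region, Router, make_default_router, the instance names, and in particular the 0x100-byte window size are assumptions made for the example, not values taken from the Tero sources.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

struct Region {
    std::uint32_t base;
    std::uint32_t size;
    std::string_view name;
};

class Router {
public:
    // Registers a region; returns false if it overlaps an existing one.
    bool add(const Region& r) {
        assert(r.size > 0);
        const std::uint32_t r_last = r.base + (r.size - 1);
        for (const Region& e : regions_) {
            const std::uint32_t e_last = e.base + (e.size - 1);
            if (!(r_last < e.base || e_last < r.base)) return false;
        }
        regions_.push_back(r);
        return true;
    }
    // Finds the region that contains addr, if any.
    std::optional<Region> find(std::uint32_t addr) const {
        for (const Region& e : regions_) {
            if (addr >= e.base && addr - e.base < e.size) return e;
        }
        return std::nullopt;
    }

private:
    std::vector<Region> regions_;
};

inline Router make_default_router() {
    constexpr std::uint32_t kWindow = 0x100;  // assumed window size per device
    Router r;
    r.add({0x80000000u, kWindow, "MemCtrl"});
    r.add({0x80000100u, kWindow, "ApbUart"});
    r.add({0x80000200u, kWindow, "IrqMP"});
    r.add({0x80000300u, kWindow, "GPTimer"});
    r.add({0x80000900u, kWindow, "GRGPIO0"});
    r.add({0x80000A00u, kWindow, "GRGPIO1"});
    return r;
}
```

Rejecting overlapping regions at registration time keeps the lookup unambiguous; an address that falls in no region is reported as absent rather than silently routed.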
tero_defaults (src/defaults/)¶
Standalone implementations of the swappable services: StdoutLogger,
StdoutCharDevice, NullFaultInjector, DebugPublisher. Everything an
SMP2 wrapper would replace lives here, behind the Emulator::set_*
injection points.
Translation stack (tero_ir, tero_arch_sparc, tero_jit)¶
The binary-translation path (translation = true, default), layered
alongside the interpreter:
- tero_ir: the architecture-neutral IR (IrOp / IrBlock / BlockExit), the opaque byte-addressed GuestState, the (PhysAddr, ModeCtx)-keyed BlockCache, the reference IR interpreter, and the IArchitecture / IArchFrontend seam. Knows no guest ISA.
- tero_arch_sparc: the SPARC frontend, providing translate_block (SPARC → IR), mode_ctx_of, take_exception, and sparc_layout. The only module bridging core and ir.
- tero_jit: lowers the IR to native code via LLVM ORCv2 (LLJIT), tiered baseline-O0 / background-O2 (ADR-002). Owns LLVM.
Full detail: Binary translation — a primer (concept-first, for readers new to the technique); IR and LLVM JIT (the reference); Adding a frontend for the multi-arch seam; EmuGen for the planned frontend generator (design, gated).
tero_runtime (src/runtime/)¶
The orchestrator. Emulator is a facade over three subsystems
(Runtime decomposition): Soc (peripherals,
IRQ bridges, bus, the entity registry), ExecutionEngine (per-core
register blobs, the round-robin run loop, per-core IR caches + JITs —
implementation split across the engine_*.cpp translation units), and
DebugServer (GDB stub + breakpoints). Plus EmulatorConfig,
ElfLoader, CpuBusBridge (adapts SystemBus to ICpuBus),
EventScheduler, the GRLIB Plug&Play table builder, and the config
validator.
tero_compose (src/compose/)¶
Board composition above the runtime: the Machine object graph that
lowers to an EmulatorConfig, the gr712rc_config() / gr740_config()
kits (src/compose/src/kits.cpp), the .tero script front-end,
and the dlopen loader for component libraries. Produces a config and
stops; execution belongs to tero_runtime. See
tero_compose.
tero_app (src/app/)¶
The CLI tero-emu: argument parsing, EmulatorConfig construction, UART
terminal options, and the ErrorMode post-mortem dump. The only
translation unit allowed to do direct I/O.
5. The execution cycle (orientation)¶
The full mechanism is in Execution model; the 30-second version:
- Emulator::run_for / run_until delegate to the ExecutionEngine (engine_run_loop.cpp); the facade keeps no run state of its own.
- run_until slices the span into pacing_slice_ns chunks under both pacing modes; Realtime additionally calls sleep_until between chunks. sim_time_ is identical either way.
- run_until_unpaced runs scheduling rounds until the deadline. Each round: late-binding GDB checks → sync the SoC up-counter → drain core-release wakes → sample_interrupts per core → error-mode check → either idle-skip (all cores powered down: jump sim_time_ toward the next event/deadline) or run one quantum per core (1000 instructions, round-robin) → fire scheduled events → tick peripherals.
- Each core's quantum runs via run_ir_quantum (JIT/IR, when translation && !observer), via the universal IR-interpret path (run_ir_interpret_quantum, for a non-SPARC frontend or force_ir_interpret), or via the per-step core::step Switch loop. The JIT path falls back to core::step for delay slots, annul, and untranslatable ops.
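The round structure above (idle-skip or one quantum per core, then events) can be reduced to a toy loop. This is a sketch under stated assumptions, not the engine's code: Core, Machine, the 10 ns-per-instruction cost model, and the single "wake core 0" event are hypothetical; only the 1000-instruction quantum comes from the text.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

struct Core {
    bool powered_down = false;
    std::uint64_t retired = 0;  // guest instructions retired so far
};

constexpr std::uint64_t kNoEvent = std::numeric_limits<std::uint64_t>::max();
constexpr std::uint64_t kQuantum = 1000;  // instructions per core per round
constexpr std::uint64_t kNsPerInsn = 10;  // hypothetical cost model

struct Machine {
    std::vector<Core> cores;
    std::uint64_t sim_time_ns = 0;
    std::uint64_t next_event_ns = kNoEvent;  // one pending event at most
};

inline void run_until_unpaced(Machine& m, std::uint64_t deadline_ns) {
    assert(!m.cores.empty());
    while (m.sim_time_ns < deadline_ns) {
        const bool all_idle = std::all_of(m.cores.begin(), m.cores.end(),
                                          [](const Core& c) { return c.powered_down; });
        if (all_idle) {
            // Idle-skip: jump to the next event or the deadline, never backwards.
            m.sim_time_ns = std::max(m.sim_time_ns, std::min(m.next_event_ns, deadline_ns));
        } else {
            // Round-robin: every running core gets one quantum per round.
            for (Core& c : m.cores) {
                if (!c.powered_down) c.retired += kQuantum;
            }
            m.sim_time_ns += kQuantum * kNsPerInsn;
        }
        if (m.sim_time_ns >= m.next_event_ns) {
            // Fire the event; in this toy model it simply wakes core 0.
            m.cores[0].powered_down = false;
            m.next_event_ns = kNoEvent;
        }
    }
}
```

Simulated time here is a pure function of the work done and the events scheduled; nothing reads the host clock, which is why pacing can be layered on top without changing the result.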
6. Adding a custom peripheral¶
The declarative model (spec + lifecycle + ports + validation) is in Peripheral system. The 30-second summary:
- Implement IPeripheral (mmio_range, attach, mmio_read, mmio_write, tick, publish; optionally find_port for ISignalPorts).
- Push a PeripheralSpec into cfg.peripherals with your factory, IRQ lines, and any port connections.
- Emulator::create(std::move(cfg)) → initialize().
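```cpp
// Tero headers omitted; MyDma is your own IPeripheral implementation.
#include <memory>   // std::make_unique
#include <utility>  // std::move

// Start from the stock GR712RC board description.
auto cfg = tero::compose::gr712rc_config();

// Register the custom peripheral: name, factory, and the IRQ line it drives.
cfg.peripherals.push_back({
    .instance_name = "my_dma",
    .factory = [](const tero::PeripheralContext&) {
        return std::make_unique<MyDma>(tero::PhysAddr{0x80000800});
    },
    .irqs = {tero::IrqLine{10}},
});

// create() returns a Result-style value; check it before dereferencing.
auto emu = tero::runtime::Emulator::create(std::move(cfg));
if (emu) {
    (*emu)->initialize();
}
```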
Emulator::add_peripheral(IPeripheral, IrqLine) is kept as sugar for
test and REPL code that inserts peripherals after initialize(). References:
examples/custom-board/ (multi-IRQ + ISignalPort wiring),
examples/demo-dma/ (DMA-capable peripheral, both registration patterns).
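For readers who want to see the shape of the first step before opening the examples, here is a self-contained sketch of an MMIO device. It is deliberately not written against the real interface: PeripheralLike is a hypothetical subset of an IPeripheral-style contract with guessed signatures, AddrRange stands in for the real range type, and the register map, window size, and transfer time of this toy MyDma are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in types; the real tero::AddressRange / IPeripheral signatures may differ.
struct AddrRange {
    std::uint32_t base;
    std::uint32_t size;
};

class PeripheralLike {  // hypothetical subset of the peripheral contract
public:
    virtual ~PeripheralLike() = default;
    virtual AddrRange mmio_range() const = 0;
    virtual std::uint32_t mmio_read(std::uint32_t offset) = 0;
    virtual void mmio_write(std::uint32_t offset, std::uint32_t value) = 0;
    virtual void tick(std::uint64_t elapsed_ns) = 0;
};

// Toy "DMA" device: CTRL at offset 0x0 (bit 0 = start), STATUS at 0x4 (bit 0 = done).
class MyDma final : public PeripheralLike {
public:
    explicit MyDma(std::uint32_t base) : base_(base) {}

    AddrRange mmio_range() const override { return {base_, kWindow}; }

    std::uint32_t mmio_read(std::uint32_t offset) override {
        assert(offset < kWindow);
        switch (offset) {
            case kCtrl: return ctrl_;
            case kStatus: return done_ ? 1u : 0u;
            default: return 0;  // unmapped registers read as zero
        }
    }

    void mmio_write(std::uint32_t offset, std::uint32_t value) override {
        assert(offset < kWindow);
        if (offset != kCtrl) return;  // STATUS is read-only
        ctrl_ = value;
        if (value & 1u) {             // start bit: begin a transfer
            busy_ns_ = kTransferNs;
            done_ = false;
        }
    }

    // Time arrives as a parameter; the device never samples a clock itself.
    void tick(std::uint64_t elapsed_ns) override {
        if (busy_ns_ == 0) return;
        busy_ns_ = elapsed_ns >= busy_ns_ ? 0 : busy_ns_ - elapsed_ns;
        if (busy_ns_ == 0) done_ = true;  // a real device would raise its IRQ here
    }

private:
    static constexpr std::uint32_t kWindow = 0x100;     // assumed MMIO window
    static constexpr std::uint32_t kCtrl = 0x0;
    static constexpr std::uint32_t kStatus = 0x4;
    static constexpr std::uint64_t kTransferNs = 1000;  // invented transfer time

    std::uint32_t base_;
    std::uint32_t ctrl_ = 0;
    std::uint64_t busy_ns_ = 0;
    bool done_ = false;
};
```

The device keeps all of its state in members and receives elapsed time through tick, which matches the "zero global state" and "time as a parameter" principles from section 3.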
7. Testing and conventions¶
Test strategy¶
- Framework: Catch2 v3 (FetchContent-pinned).
- Unit tests (tests/unit/): individual decoupled modules (CpuState, SystemBus, instruction categories).
- Integration tests (tests/integration/): full Emulator orchestration; test_demo_dma_device.cpp, test_bare_metal.cpp.
- Assembly execution (tests/guest-programs/asm/): real .S SPARC programs compiled with the RCC cross-toolchain.
- RTEMS system tests (tests/guest-programs/rtems/): real, unmodified RTEMS binaries; the acceptance gate for core changes.
The IR/JIT path additionally validates itself against the oracle at block
granularity (Emulator::run_oracle_lockstep) and across full RTEMS boots
(the Tero-vs-SIS lockstep comparator).
Style and naming¶
- Namespaces: tero, tero::core, tero::bus, tero::peripherals, tero::runtime.
- Types PascalCase; functions snake_case; members trailing_underscore_.
- #pragma once in every header; includes ordered project → third-party → stdlib.
- [[nodiscard]] aggressively on Result<T> and value-returning getters.
- The whole tree builds with zero warnings under a strict -Werror set (Decision 6).