
Architecture overview

Are you a user looking to run or embed Tero?

This section is the Developer Manual — it explains how the emulator is built internally. You do not need it to run tero-emu or to embed tero_runtime in your own program.

For those tasks, use the User Manual instead:

| Goal | Go here |
| --- | --- |
| Build and install | Getting started → Installation |
| Boot an RTEMS image | Getting started → Quickstart |
| Look up every CLI flag | Guide → CLI reference |
| Embed as a library | Guide → Embedding as a library |
| Configure EmulatorConfig | Guide → Configuration |
| Write a custom peripheral | Peripherals → Custom peripherals |

This is the canonical entry point for developers working on the Tero source code: the landing page for how the whole project is conceived. It ties the layers, the execution model, and the design rules into one big picture, then hands off to the focused pages for detail. For the current state and judgment calls, see development/status.

The sister pages in this section drill into individual concerns:

1. What Tero is (and is not)

Tero is a C++20 functional emulator for Cobham Gaisler's GR712RC (dual-core LEON3FT) and GR740 (quad-core LEON4FT) SPARC V8 systems-on-chip. Its primary goal is to run the RTEMS 5 and RTEMS 6 test suites for the leon3 BSP with high pass rates across uniprocessor and SMP configurations. Its long-horizon goal (Phase 9+) is sustained 1:1 wall-clock real-time emulation of the GR740.

What Tero is:

  • A functional emulator with two execution methods — a fetch-decode-execute Switch interpreter (the correctness oracle) and a default binary-translation path (arch-neutral IR + tiered LLVM JIT). See Execution model and IR and LLVM JIT.
  • A standalone C++ library, designed so an SMP2 (ECSS-E-ST-40-07) wrapper — maintained in a separate repository — can drive it under an external scheduler. The single-thread execution mode is a first-class, permanently-supported option for exactly this reason (ADR-001).
  • Capable of booting RTEMS in uniprocessor and SMP configurations (validated N=2 GR712RC, N=4 GR740), with FPU, GDB stub, and PROM/MKPROM2 boot.

What Tero is NOT (frozen non-goals):

  • Not cycle-accurate. 1:1 is wall-clock, not pipeline-cycle (ADR-004). Caches and the SRMMU are not modelled.
  • Not a Linux host. Tero is RTEMS-focused; no MMU-using guest is required today, so the SRMMU is stubbed and deferred.
  • Not the SMP2 wrapper itself (separate repository) and not a networking guest stack.

The FPU is implemented (Berkeley SoftFloat 3e, vendored).

2. The big picture

Tero is a stack of CMake static libraries with strictly unidirectional dependencies, orchestrated by one entry-point class: Emulator. Since the 2026-06 entity-model refactor, Emulator is a facade over three subsystems — Soc (the assembled machine), ExecutionEngine (cores, run loop, clock, JIT), and DebugServer (GDB stub + breakpoints) — see Runtime decomposition. The diagram below shows both the layering (who links whom) and, on the right, the execution flow (how a run_until call turns into retired guest instructions).

```mermaid
flowchart TB
    subgraph stack["Library stack (link-time dependencies)"]
        direction TB
        app["tero_app<br/><i>CLI: tero-emu</i>"]
        comp["tero_compose<br/><i>Machine, kits, .tero scripts,<br/>component-library loader</i>"]
        rt["tero_runtime<br/><i>Emulator (facade), Soc, ExecutionEngine,<br/>DebugServer, ElfLoader, GdbStub</i>"]
        per["tero_peripherals<br/><i>IrqMP/IrqAMP, GPTimer, ApbUart,<br/>MemCtrl, Prom, GRGPIO</i>"]
        bus["tero_bus<br/><i>SystemBus, Ram</i>"]
        defs["tero_defaults<br/><i>StdoutLogger, StdoutCharDevice,<br/>NullFaultInjector, DebugPublisher</i>"]
        arch["tero_arch_sparc<br/><i>SPARC → IR frontend, sync</i>"]
        jit["tero_jit<br/><i>IR → native, LLVM ORCv2</i>"]
        ir["tero_ir<br/><i>arch-neutral IR, GuestState,<br/>BlockCache, IR interpreter</i>"]
        core["tero_core<br/><i>SPARC V8 ISA, CpuState,<br/>decoder, step, handlers</i>"]
        iface["tero_interfaces<br/><i>strong types, Result&lt;T&gt;, I* contracts</i>"]

        app --> comp
        comp --> rt
        rt --> per --> bus
        rt --> defs
        rt --> arch
        rt --> jit
        rt --> core
        rt --> bus
        arch --> ir
        arch --> core
        jit --> ir
        core --> ir
        core -.-> iface
        bus -.-> iface
        per -.-> iface
        ir -.-> iface
        defs -.-> iface
    end

    subgraph flow["Execution flow (run_until)"]
        direction TB
        f0["run_until(deadline)"] --> f1{pacing?}
        f1 -- Realtime --> f2["slice + sleep_until"]
        f1 -- Turbo --> f3
        f2 --> f3["run_until_unpaced: per round"]
        f3 --> f4["sample_interrupts / idle-skip"]
        f4 --> f5{translation?}
        f5 -- true --> f6["run_ir_quantum<br/>JIT block / IR fallback"]
        f5 -- false --> f7["core::step ×quantum"]
        f6 --> f8["fire events + tick peripherals"]
        f7 --> f8
    end

    rt -. drives .-> f0
```

Three facts to internalise from the diagram:

  1. tero_core links tero_ir. CpuState's integer register file is an ir::GuestState byte blob (state unification), so the Switch interpreter and the IR engine read/write the same bytes with no sync. The SPARC↔IR bridge (mode_ctx_of, translate_block) lives in tero_arch_sparc, the only module linking both core and ir. See Layers.
  2. The two execution methods share one Emulator and one CpuState. They are not separate emulators; translation is a runtime field, and the JIT path falls back to core::step for traps, delay slots, annulled slots, and untranslatable ops.
  3. tero::core is a PRIVATE link of tero_runtime (src/runtime/CMakeLists.txt:34-48). The runtime's run loop drives cores through the arch-neutral ir::IArchitecture seam, so neutral consumers do not transitively link the SPARC core. See Layers.
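
The state-unification idea in the first fact can be pictured with a minimal sketch. It assumes nothing about the real ir::GuestState layout: GuestBlob, reg_offset, interp_add and ir_add are hypothetical names invented for illustration. The point is only that one engine addresses registers by number, the other by raw byte offset, and both touch the same bytes, so no copy or sync step is needed between them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an opaque, byte-addressed guest-state blob.
struct GuestBlob {
    std::array<std::byte, 32 * 4> bytes{};  // 32 integer registers, 4 bytes each

    std::uint32_t load32(std::size_t off) const {
        assert(off + 4 <= bytes.size());
        std::uint32_t v;
        std::memcpy(&v, bytes.data() + off, sizeof v);
        return v;
    }
    void store32(std::size_t off, std::uint32_t v) {
        assert(off + 4 <= bytes.size());
        std::memcpy(bytes.data() + off, &v, sizeof v);
    }
};

// Assumed layout for this sketch: register r lives at byte offset r * 4.
constexpr std::size_t reg_offset(unsigned r) { return r * 4u; }

// "Interpreter" view: addresses registers by number.
inline void interp_add(GuestBlob& s, unsigned rd, unsigned rs1, unsigned rs2) {
    s.store32(reg_offset(rd), s.load32(reg_offset(rs1)) + s.load32(reg_offset(rs2)));
}

// "IR" view: addresses the very same storage by raw byte offset.
inline void ir_add(GuestBlob& s, std::size_t dst, std::size_t a, std::size_t b) {
    s.store32(dst, s.load32(a) + s.load32(b));
}
```

A write made through interp_add is immediately visible to ir_add and vice versa, which is the property the shared blob is meant to provide.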

3. The non-negotiable design principles

These shape every line of code. Each is restated with full rationale and the code that enforces it in Design principles:

| Principle | One-line statement |
| --- | --- |
| Zero singletons / global mutable state | Two Emulators in one process must never interfere. |
| Zero direct I/O from the core | Everything goes through ILogger / ICharacterDevice. |
| Time as a parameter | The core computes simulated time; it never samples the host clock (except gated Realtime pacing). |
| Configuration by struct | EmulatorConfig is a plain struct: no file parsing, no behaviour build-flags. |
| Errors as values | The public boundary returns Result<T> = tl::expected<T, ErrorCode>. |
| Strong types everywhere | PhysAddr, VirtAddr, CoreId, SimTimeNs, never a bare uint32_t. |
| Switch interpreter is the oracle | Every translated path is validated bit-identical against core::step. |
| Round-robin single-thread default | TSO and atomics are correct by construction; MultiThread is the opt-in escape hatch. |
| Big-endian at the typed boundary | RAM stores raw bytes; the byte-swap happens in the accessor / IR op. |
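
Two of these principles, strong types and errors as values, are easy to show in a few lines. The sketch below is illustrative only: it uses std::variant as a portable stand-in for the tl::expected alias named above, and entry_point and unwrap are hypothetical functions, not part of the Tero API.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Strong types: distinct wrappers, so a PhysAddr cannot be passed where a
// CoreId is expected even though both hold a 32-bit integer.
struct PhysAddr { std::uint32_t value; };
struct CoreId   { std::uint32_t value; };

enum class ErrorCode { InvalidCore, Unmapped };

// Stand-in for Result<T>; the real alias is stated to be tl::expected<T, ErrorCode>.
template <typename T>
using Result = std::variant<T, ErrorCode>;

// Hypothetical boundary function: failure is reported as a value, never thrown.
[[nodiscard]] inline Result<PhysAddr> entry_point(CoreId core) {
    if (core.value >= 2) {
        return ErrorCode::InvalidCore;
    }
    return PhysAddr{0x40000000u + core.value * 0x1000u};
}

// Hypothetical helper: extracts the value, asserting that the call succeeded.
inline PhysAddr unwrap(const Result<PhysAddr>& r) {
    assert(std::holds_alternative<PhysAddr>(r));
    return std::get<PhysAddr>(r);
}
```

Calling entry_point(PhysAddr{0}) does not compile, which is the payoff of the wrappers; the caller is also forced by [[nodiscard]] to look at the returned value.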

4. The src/ modules at a glance

The authoritative dependency edges are in Layers and modules; this is the orientation map.

tero_interfaces (src/interfaces/)

Header-only vocabulary library. Defines the strong types and Result<T> (types.hpp), AddressRange, the PeripheralContext aggregate, the entity-model base (IEntity, IEntityRegistry, IConnectable, IMmio — see Entity object model), and every I* contract (IPeripheral, ICpu, IGdbRegisters, ICpuBus, IBusMaster, IInterruptSource, IInterruptController, IScheduler, IEvent, ILogger, ICharacterDevice, ITimeSource, IPublisher, IFaultInjector, IEmulatorObserver, IPort, IMemoryRegion, the per-protocol comms interfaces). Plus GatedMutex (ADR-001) and BreakpointSet. No logic, no state. Full inventory: tero_interfaces.

tero_core (src/core/)

The SPARC V8 integer + FP ISA. CpuState (8-window register file, PSR, WIM/TBR/Y/PC/nPC, ASR19 power-down, error mode, the per-PC decode cache, and the integer state exposed as a GuestState blob). The decoder (DecodedInsn + InsnKind), the category handlers (handlers_alu/branch/loadstore/regwin/special), the FP handlers (SoftFloat 3e), step (the single-instruction driver), and the trap constants. Reaches the world only through the injected ICpuBus.

tero_bus (src/bus/)

Physical memory routing. Ram (raw std::vector<std::byte>, endianness-agnostic) and SystemBus (the central physical-address router; owns RAM, holds non-owning IPeripheral* MMIO regions, does the byte-order shuffle in encode/decode (MemEndian, default big-endian), and implements IBusMaster so DMA shares the exact CPU memory map). See Memory and bus.
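
The split between raw storage and a typed, byte-order-aware accessor can be sketched as follows. RawRam, write_be32 and read_be32 are hypothetical names for illustration; the real encode/decode path and MemEndian handling in SystemBus are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Raw, endianness-agnostic storage: just bytes, no notion of word order.
struct RawRam {
    std::vector<std::byte> bytes;
    explicit RawRam(std::size_t size) : bytes(size) {}
};

// Typed accessors: the big-endian conversion happens here, not in the storage.
inline void write_be32(RawRam& ram, std::size_t off, std::uint32_t v) {
    assert(off + 4 <= ram.bytes.size());
    for (std::size_t i = 0; i < 4; ++i) {
        // Most significant byte goes to the lowest address.
        ram.bytes[off + i] = static_cast<std::byte>((v >> (24 - 8 * i)) & 0xFFu);
    }
}

inline std::uint32_t read_be32(const RawRam& ram, std::size_t off) {
    assert(off + 4 <= ram.bytes.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        v = (v << 8) | std::to_integer<std::uint32_t>(ram.bytes[off + i]);
    }
    return v;
}
```

Because the conversion is written with shifts rather than a host-dependent cast, the same code gives the same guest-visible result on little-endian and big-endian hosts.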

tero_peripherals (src/peripherals/)

The GRLIB IP cores. Default GR712RC MMIO layout (from the kit and the DefaultBase constants):

| Class | MMIO base | IRQ | Description |
| --- | --- | --- | --- |
| MemCtrl (FTMCTRL) | 0x80000000 | | Passive stub; MCFG1–4 readable/writable, no side effects. |
| ApbUart (console) | 0x80000100 | 2 | Primary serial port; 8-byte RX FIFO, immediate TX via ICharacterDevice. |
| IrqMP | 0x80000200 | | GR712RC multi-core interrupt controller (mask/force/broadcast). |
| GPTimer | 0x80000300 | 8 | 4 sub-timers + prescaler; timer 4 is an armed watchdog. |
| GRGPIO ×2 | 0x80000900 / 0x80000A00 | | GPIO ports, per-pin ISignalPort. |

GR740 substitutes IrqAMP (0xFF904000) — a sibling IP core, not a GR712RC bolt-on (feedback_irqmp_vs_irqamp) — and maps its own UARTs (IRQ 29/30). Prom is auto-wired when prom_size != 0. See Peripheral system.
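
As a rough picture of how a physical address resolves to one of these devices, here is a minimal lookup over the GR712RC base addresses listed above. The 0x100-byte window per device is an assumption made for this sketch (the page does not state region sizes), and Region, kGr712rcMap and route are hypothetical names, not the SystemBus API.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string_view>

struct Region {
    std::string_view name;
    std::uint32_t base;
    std::uint32_t size;  // ASSUMPTION: 0x100 bytes per device, for illustration
};

// Base addresses taken from the default GR712RC layout described above.
constexpr std::array<Region, 6> kGr712rcMap{{
    {"MemCtrl", 0x80000000u, 0x100u},
    {"ApbUart", 0x80000100u, 0x100u},
    {"IrqMP",   0x80000200u, 0x100u},
    {"GPTimer", 0x80000300u, 0x100u},
    {"GRGPIO0", 0x80000900u, 0x100u},
    {"GRGPIO1", 0x80000A00u, 0x100u},
}};

// Returns the name of the region that owns addr, or an empty view if unmapped.
inline std::string_view route(std::uint32_t addr) {
    for (const Region& r : kGr712rcMap) {
        assert(r.size != 0);
        // Subtract first so the comparison cannot overflow near 0xFFFFFFFF.
        if (addr >= r.base && addr - r.base < r.size) {
            return r.name;
        }
    }
    return {};
}
```

With this map, 0x80000800 (the address used for the custom DMA example later on this page) is unmapped, which is why it is free for a user-supplied peripheral.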

tero_defaults (src/defaults/)

Standalone implementations of the swappable services: StdoutLogger, StdoutCharDevice, NullFaultInjector, DebugPublisher. Everything an SMP2 wrapper would replace lives here, behind the Emulator::set_* injection points.
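
The injection pattern behind these defaults can be sketched with a simplified contract. ICharSink, CaptureSink, ToyUart and set_sink are hypothetical; the real ICharacterDevice interface and the Emulator::set_* signatures are not shown on this page.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified output contract.
struct ICharSink {
    virtual ~ICharSink() = default;
    virtual void put(char c) = 0;
};

// A replacement a host might inject: captures output in memory instead of stdout.
struct CaptureSink final : ICharSink {
    std::string captured;
    void put(char c) override { captured.push_back(c); }
};

// Toy UART that only knows the interface and holds no global state.
class ToyUart {
public:
    void set_sink(ICharSink* sink) { sink_ = sink; }

    void write_tx(char c) {
        assert(sink_ != nullptr && "a sink must be injected before use");
        sink_->put(c);
    }

private:
    ICharSink* sink_ = nullptr;  // non-owning; the host keeps the sink alive
};
```

Two ToyUart instances wired to two different sinks never see each other's output, which is the same property the zero-singletons rule asks of two Emulators in one process.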

Translation stack (tero_ir, tero_arch_sparc, tero_jit)

The binary-translation path (translation = true, default), layered alongside the interpreter:

  • tero_ir — the architecture-neutral IR (IrOp/IrBlock/BlockExit), the opaque byte-addressed GuestState, the (PhysAddr, ModeCtx)-keyed BlockCache, the reference IR interpreter, and the IArchitecture/IArchFrontend seam. Knows no guest ISA.
  • tero_arch_sparc — the SPARC frontend: translate_block (SPARC → IR), mode_ctx_of, take_exception, sparc_layout. The only module bridging core and ir.
  • tero_jit — lowers the IR to native code via LLVM ORCv2 (LLJIT), tiered baseline-O0 / background-O2 (ADR-002). Owns LLVM.
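
A cache keyed by (PhysAddr, ModeCtx) lets the same code address hold a distinct translation per mode context. The sketch below shows that keying with a plain hash map; BlockKey, BlockKeyHash, TranslatedBlock and ToyBlockCache are hypothetical names, and the contents of the real ModeCtx and BlockCache are not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>

// Composite key: a block is identified by where it starts AND the mode it was translated for.
struct BlockKey {
    std::uint32_t phys_addr;
    std::uint32_t mode_ctx;
    bool operator==(const BlockKey& o) const {
        return phys_addr == o.phys_addr && mode_ctx == o.mode_ctx;
    }
};

struct BlockKeyHash {
    std::size_t operator()(const BlockKey& k) const {
        // Pack both halves into one 64-bit value and reuse the standard hash.
        const std::uint64_t packed = (std::uint64_t{k.phys_addr} << 32) | k.mode_ctx;
        return std::hash<std::uint64_t>{}(packed);
    }
};

struct TranslatedBlock { int id; };  // placeholder for IR or native code

class ToyBlockCache {
public:
    // Returns the cached block, "translating" (here: allocating an id) on a miss.
    const TranslatedBlock& lookup(BlockKey key) {
        auto it = blocks_.find(key);
        if (it == blocks_.end()) {
            it = blocks_.emplace(key, TranslatedBlock{next_id_++}).first;
            ++translations_;
        }
        return it->second;
    }
    int translations() const { return translations_; }

private:
    std::unordered_map<BlockKey, TranslatedBlock, BlockKeyHash> blocks_;
    int next_id_ = 0;
    int translations_ = 0;
};
```

Looking up the same address under two mode contexts yields two entries, while repeating a lookup is a hit and does not translate again.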

Full detail: Binary translation — a primer (concept-first, for readers new to the technique); IR and LLVM JIT (the reference); Adding a frontend for the multi-arch seam; EmuGen for the planned frontend generator (design, gated).

tero_runtime (src/runtime/)

The orchestrator. Emulator is a facade over three subsystems (Runtime decomposition): Soc (peripherals, IRQ bridges, bus, the entity registry), ExecutionEngine (per-core register blobs, the round-robin run loop, per-core IR caches + JITs — implementation split across the engine_*.cpp translation units), and DebugServer (GDB stub + breakpoints). Plus EmulatorConfig, ElfLoader, CpuBusBridge (adapts SystemBus to ICpuBus), EventScheduler, the GRLIB Plug&Play table builder, and the config validator.

tero_compose (src/compose/)

Board composition above the runtime: the Machine object graph that lowers to an EmulatorConfig, the gr712rc_config() / gr740_config() kits (src/compose/src/kits.cpp), the .tero script front-end, and the dlopen loader for component libraries. Produces a config and stops; execution belongs to tero_runtime. See tero_compose.

tero_app (src/app/)

The CLI tero-emu: argument parsing, EmulatorConfig construction, UART terminal options, and the ErrorMode post-mortem dump. The only translation unit allowed to do direct I/O.

5. The execution cycle (orientation)

The full mechanism is in Execution model; the 30-second version:

  • Emulator::run_for / run_until delegate to the ExecutionEngine (engine_run_loop.cpp); the facade keeps no run state of its own.
  • run_until slices the span into pacing_slice_ns chunks under both pacing modes; Realtime additionally calls sleep_until between chunks. sim_time_ is identical either way.
  • run_until_unpaced runs scheduling rounds until the deadline. Each round: late-binding GDB checks → sync the SoC up-counter → drain core-release wakes → sample_interrupts per core → error-mode check → either idle-skip (all cores powered down: jump sim_time_ toward the next event/deadline) or run one quantum per core (1000 instructions, round-robin) → fire scheduled events → tick peripherals.
  • Each core's quantum runs via run_ir_quantum (JIT/IR, when translation && !observer), via the universal IR-interpret path (run_ir_interpret_quantum, for a non-SPARC frontend or force_ir_interpret), or via the per-step core::step Switch loop. The JIT path falls back to core::step for delay slots, annul, and untranslatable ops.
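
The round structure described in the bullets above (idle-skip when every core is powered down, otherwise one quantum per core, then event firing) can be sketched with a toy engine. Everything here is a simplification under stated assumptions: ToyCore and ToyEngine are hypothetical, a flat cost of 10 ns per instruction is invented for the example, there is a single pending event that wakes core 0, and the sketch may overshoot the deadline by up to one round. GDB checks, interrupt sampling, error mode and peripheral ticks are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct ToyCore {
    bool powered_down = false;
    std::uint64_t retired = 0;  // instructions retired so far
};

struct ToyEngine {
    std::vector<ToyCore> cores;
    std::uint64_t sim_time_ns = 0;
    bool event_pending = false;
    std::uint64_t event_ns = 0;  // when the pending event fires (it wakes core 0)

    static constexpr std::uint64_t kQuantum = 1000;  // instructions per core per round
    static constexpr std::uint64_t kNsPerInsn = 10;  // ASSUMED flat cost, illustration only

    void run_until_unpaced(std::uint64_t deadline_ns) {
        assert(!cores.empty());
        while (sim_time_ns < deadline_ns) {
            const bool all_idle = std::all_of(
                cores.begin(), cores.end(),
                [](const ToyCore& c) { return c.powered_down; });

            if (all_idle) {
                // Idle-skip: jump straight to the next event or to the deadline.
                const bool event_first = event_pending && event_ns < deadline_ns;
                sim_time_ns = event_first ? std::max(event_ns, sim_time_ns) : deadline_ns;
            } else {
                // One quantum per running core, visited in round-robin order.
                for (ToyCore& c : cores) {
                    if (!c.powered_down) c.retired += kQuantum;
                }
                sim_time_ns += kQuantum * kNsPerInsn;
            }

            // Fire the scheduled event once simulated time has reached it.
            if (event_pending && sim_time_ns >= event_ns) {
                cores[0].powered_down = false;
                event_pending = false;
            }
        }
    }
};
```

Note that simulated time advances purely from the loop's own arithmetic; nothing in the sketch reads the host clock, which matches the "time as a parameter" rule.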

6. Adding a custom peripheral

The declarative model (spec + lifecycle + ports + validation) is in Peripheral system. The 30-second summary:

  1. Implement IPeripheral (mmio_range, attach, mmio_read, mmio_write, tick, publish; optionally find_port for ISignalPorts).
  2. Push a PeripheralSpec into cfg.peripherals with your factory, IRQ lines, and any port connections.
  3. Call Emulator::create(std::move(cfg)), then initialize() on the result.
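```cpp
// Start from the stock GR712RC kit configuration.
auto cfg = tero::compose::gr712rc_config();

// Declare the peripheral: name, factory, and the IRQ line(s) it drives.
cfg.peripherals.push_back({
    .instance_name = "my_dma",
    .factory = [](const tero::PeripheralContext&) {
        // MyDma is the user's own IPeripheral implementation.
        return std::make_unique<MyDma>(tero::PhysAddr{0x80000800});
    },
    .irqs = {tero::IrqLine{10}},
});

// emu is dereferenced twice: once for the Result, once for the owning
// pointer. Production code should check the Result for an error first.
auto emu = tero::runtime::Emulator::create(std::move(cfg));
(*emu)->initialize();
```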

Emulator::add_peripheral(IPeripheral, IrqLine) is kept as sugar for tests and REPL usage that inserts peripherals after initialize(). References: examples/custom-board/ (multi-IRQ + ISignalPort wiring), examples/demo-dma/ (DMA-capable peripheral, both registration patterns).
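
To give a feel for what a peripheral body looks like, here is a self-contained sketch against a reduced, hypothetical contract. IToyPeripheral is not the real IPeripheral: the real interface also declares mmio_range, attach, publish and the optional find_port, and its exact signatures are not reproduced on this page. Countdown and its register offsets are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// HYPOTHETICAL reduced contract; see the lead-in for what the real one adds.
struct IToyPeripheral {
    virtual ~IToyPeripheral() = default;
    virtual std::uint32_t mmio_read(std::uint32_t offset) = 0;
    virtual void mmio_write(std::uint32_t offset, std::uint32_t value) = 0;
    virtual void tick(std::uint64_t elapsed_ns) = 0;
};

// A countdown device: the guest writes a duration to register 0x0, and the
// status register at 0x4 reads 1 once that much simulated time has elapsed.
class Countdown final : public IToyPeripheral {
public:
    std::uint32_t mmio_read(std::uint32_t offset) override {
        switch (offset) {
            case kRemaining: return remaining_ns_;
            case kStatus:    return expired_ ? 1u : 0u;
            default:         return 0u;  // unmapped registers read as zero
        }
    }

    void mmio_write(std::uint32_t offset, std::uint32_t value) override {
        if (offset == kRemaining) {
            remaining_ns_ = value;  // arming the countdown clears the status
            expired_ = false;
        }
    }

    // Simulated time arrives as a parameter; the device never reads a clock.
    void tick(std::uint64_t elapsed_ns) override {
        if (remaining_ns_ == 0) return;  // not armed
        if (elapsed_ns >= remaining_ns_) {
            remaining_ns_ = 0;
            expired_ = true;
        } else {
            remaining_ns_ -= static_cast<std::uint32_t>(elapsed_ns);
        }
    }

private:
    static constexpr std::uint32_t kRemaining = 0x0;
    static constexpr std::uint32_t kStatus = 0x4;
    std::uint32_t remaining_ns_ = 0;
    bool expired_ = false;
};
```

A real peripheral would additionally report its MMIO range and raise its IRQ line through whatever attach provides; those parts are left out because their signatures are not given here.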

7. Testing and conventions

Test strategy

  • Framework: Catch2 v3 (FetchContent-pinned).
  • Unit tests (tests/unit/): individual decoupled modules (CpuState, SystemBus, instruction categories).
  • Integration tests (tests/integration/): full Emulator orchestration; test_demo_dma_device.cpp, test_bare_metal.cpp.
  • Assembly execution (tests/guest-programs/asm/): real .S SPARC programs compiled with the RCC cross-toolchain.
  • RTEMS system tests (tests/guest-programs/rtems/): real, unmodified RTEMS binaries; the acceptance gate for core changes.
```shell
ctest --test-dir build --output-on-failure
```

The IR/JIT path additionally validates itself against the oracle at block granularity (Emulator::run_oracle_lockstep) and across full RTEMS boots (the Tero-vs-SIS lockstep comparator).
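
The idea of block-granularity lockstep validation can be sketched generically: run a reference engine and a candidate engine one block at a time from the same initial state, and compare the full state after each block. The code below is an illustration of that technique only; State, BlockFn and lockstep are hypothetical, and it does not reflect how Emulator::run_oracle_lockstep is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using State = std::array<std::uint32_t, 4>;   // toy register file
using BlockFn = std::function<void(State&)>;  // executes one block on a state

// Runs both engines block by block. Returns the index of the first block
// after which the states differ, or std::nullopt if they never diverge.
inline std::optional<std::size_t> lockstep(const std::vector<BlockFn>& oracle,
                                           const std::vector<BlockFn>& candidate,
                                           State initial) {
    assert(oracle.size() == candidate.size());
    State a = initial;
    State b = initial;
    for (std::size_t i = 0; i < oracle.size(); ++i) {
        oracle[i](a);
        candidate[i](b);
        if (a != b) {
            return i;  // first divergence: the bug is in this block
        }
    }
    return std::nullopt;
}
```

Comparing after every block, rather than only at the end of a run, pins a mismatch to the block that introduced it instead of to some later symptom.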

Style and naming

  • Namespaces: tero, tero::core, tero::bus, tero::peripherals, tero::runtime.
  • Types PascalCase; functions snake_case; members trailing_underscore_.
  • #pragma once in every header; includes ordered project → third-party → stdlib.
  • [[nodiscard]] aggressively on Result<T> and value-returning getters.
  • The whole tree builds with zero warnings under a strict -Werror set (Decision 6).

See also