Architecture overview¶
This page is the canonical entry point for developers working on the Lince source code. If you are a user looking to run the emulator or consume it as a C++ library, see the user guide.
For the project's current state and the judgment calls taken during the latest work sessions, read development/status. The sister pages in this section drill into individual concerns:
- Layers and modules: directory and CMake target layout
- Execution model: `run_for` loop, decode/execute, delay slots
- Multicore and timing: round-robin, idle skip, quantum
- Memory and bus: `SystemBus`, RAM, BE handling, DMA
- Traps and interrupts: TBR, sampling, ErrorMode
- Design principles: non-negotiable invariants
- Design decisions: judgment calls taken during implementation
1. Overview and Layers¶
Lince is a C++20 functional emulator for the SPARC V8 architecture, specifically targeting the GR712RC (LEON3FT dual-core) and GR740 (LEON4FT quad-core) system-on-chips.
What Lince is:
- A fetch-decode-execute interpreter designed for correctness and peripheral coverage over raw performance.
- A foundation for an SMP2-compliant simulation model (ECSS-E-ST-40-07) intended for space industry simulation environments.
- Capable of booting RTEMS uniprocessor and SMP configurations.
What Lince is NOT (yet):
- A cycle-accurate simulator.
- An emulator with MMU or FPU support (both are out of the MVP scope and stubbed).
- A block-translating JIT.
1.1 Layer Diagram¶
The project is structured as a stack of CMake modules with strict unidirectional dependencies:
flowchart TD
lince_app[lince_app <br> CLI executable] --> lince_runtime
demo_dma[demo_dma_device <br> examples] --> lince_runtime
lince_runtime[lince_runtime <br> Emulator, ElfLoader, Scheduler] --> lince_defaults
lince_runtime --> lince_peripherals
lince_defaults[lince_defaults <br> Loggers, CharDevices] -.-> lince_interfaces
lince_peripherals[lince_peripherals <br> IrqMP, GPTimer, ApbUart] --> lince_bus
lince_bus[lince_bus <br> SystemBus, RAM] --> lince_core
lince_core[lince_core <br> SPARC V8 ISA, CpuState, Decoders] -.-> lince_interfaces
lince_interfaces[lince_interfaces <br> Header-only API]
classDef default fill:#f9f9f9,stroke:#333,stroke-width:1px;
class lince_interfaces default;
1.2 Non-Negotiable Design Principles¶
- Zero singletons / Zero global mutable state. You must be able to instantiate multiple `Emulator` objects in the same process without interference.
- Errors as values. The public boundary uses `Result<T>` (which maps to `tl::expected<T, ErrorCode>`). Exceptions are allowed internally but must be caught before crossing out.
- Zero direct I/O from the core. No `printf` or `std::cout` inside the emulator. All output goes through injected interfaces (`ILogger`, `ICharacterDevice`).
- Time is a parameter. The emulator never reads the host system clock. Time is driven from the outside via `SimTimeNs`.
- Strong types everywhere. Raw `uint32_t` is avoided in favor of `PhysAddr`, `VirtAddr`, `CoreId`, etc.
- Simple execution model. Round-robin single-threaded execution across cores means Total Store Order (TSO) and atomics are trivially correct.
2. The src/ Modules in Detail¶
2.1 lince_interfaces (src/interfaces/)¶
A header-only vocabulary library defining the core types and contracts without logic dependencies.
- Types (`types.hpp`): `PhysAddr`, `VirtAddr`, `CoreId`, `SimTimeNs`, `IrqLine`, `AccessSize`, `ErrorCode`, `Result<T>`.
- Ranges (`address_range.hpp`): `AddressRange` represents a half-open `[base, base+size)` physical segment.
- Peripheral Contract (`iperipheral.hpp`, `peripheral_context.hpp`): The unified contract for all devices on the bus (see §4).
- Bus Contracts (`ibus_master.hpp`, `icpu_bus.hpp`): Interfaces for DMA and core-side bus access.
- Services: `ILogger`, `ICharacterDevice`, `IInterruptSource`, `IScheduler`, `IPublisher`, `IFaultInjector`, `ITimeSource`.
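The half-open convention for `AddressRange` is easiest to see in code. The sketch below uses a hypothetical `Range` struct (not the real `lince::AddressRange`) and assumes 64-bit arithmetic so that `base + size` cannot wrap at the top of the 32-bit space. The 0x100-byte sizes are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a half-open [base, base+size) segment.
struct Range {
    std::uint64_t base;
    std::uint64_t size;

    constexpr std::uint64_t end() const { return base + size; }

    // The end address itself is NOT part of the range.
    constexpr bool contains(std::uint64_t addr) const {
        return addr >= base && addr < end();
    }

    // Two half-open ranges overlap only if each starts before the other ends.
    constexpr bool overlaps(const Range& other) const {
        return base < other.end() && other.base < end();
    }
};

inline void demo_range() {
    constexpr Range uart{0x80000100, 0x100};   // size is an assumption
    constexpr Range irqmp{0x80000200, 0x100};  // size is an assumption
    assert(uart.contains(0x800001FF));
    assert(!uart.contains(0x80000200));  // first byte past the end
    assert(!uart.overlaps(irqmp));       // adjacent ranges do not collide
}
```

With half-open ranges, back-to-back peripherals need no off-by-one adjustment: one range's `end()` is exactly the next range's `base`.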
2.2 lince_core (src/core/)¶
Implements the SPARC V8 architecture (Integer Unit only).
- `CpuState`: The full architectural state of a core. Features an 8-window register file (128 physical slots + 8 globals), PSR/WIM/TBR, Y register, PC/nPC, and implements SPARC §5.1.2.3 delayed writes for the PSR pipeline. Also tracks ASR19 power-down state and `error_mode`.
- `Decoder` and `DecodedInsn` (`decoder.cpp`): Unpacks raw 32-bit SPARC words (Format 1, 2, and 3) into an `InsnKind` enum and all relevant operand/immediate fields.
- `handlers_*.cpp`: The execution units. Separated into ALU, Branch (CTI), Load/Store, RegWin (SAVE/RESTORE/RETT), and Special instructions.
- `step.hpp`: The single-cycle fetch/decode/execute driver. Responsible for branch delay slot dispatch and trap injection.
- `trap.hpp`: SPARC V8 trap constant definitions and `ExecStatus` to trap translation.
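The "128 physical slots" figure follows from 8 windows of 16 unique registers each (8 locals plus 8 shared in/out registers). A common textbook way to index such a file is sketched below. The `physical_slot` function and its layout are assumptions for illustration; `CpuState` may arrange its slots differently.

```cpp
#include <cassert>

constexpr unsigned kWindows = 8;
constexpr unsigned kWindowedSlots = kWindows * 16;  // 128

// Maps a windowed register number (r8..r31) in window `cwp` to a physical
// slot. Globals (r0..r7) live in a separate array and are not handled here.
// Outs are r8..r15, locals r16..r23, ins r24..r31.
constexpr unsigned physical_slot(unsigned cwp, unsigned reg) {
    return (cwp * 16 + (reg - 8)) % kWindowedSlots;
}

inline void demo_register_windows() {
    // SAVE decrements CWP, so the ins of window w are the outs of window w+1.
    assert(physical_slot(0, 24) == physical_slot(1, 8));
    // The file is circular: the ins of window 7 alias the outs of window 0.
    assert(physical_slot(7, 24) == physical_slot(0, 8));
    // Locals are private to their window.
    assert(physical_slot(0, 16) != physical_slot(1, 16));
}
```

Because the overlap is expressed purely through indexing, SAVE and RESTORE only need to change CWP; no register contents are copied.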
2.3 lince_bus (src/bus/)¶
Physical memory routing.
- `Ram`: A raw `std::vector<std::byte>` block natively lacking endianness logic.
- `SystemBus`: The central physical-address router. Manages multiple RAM configurations (owning) and an arbitrary number of MMIO regions (non-owning).
  - Implements Big-Endian semantics. Typed accessors (`read_physical_u32`) use `encode_be`/`decode_be`.
  - Implements `IBusMaster`: Peripherals use the same bus instances for DMA, meaning DMA shares the exact same memory map as CPU accesses.
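As a rough picture of what "physical-address router" means, the sketch below resolves an address to the mapping that covers it. `AddressRouter`, `Mapping`, and the RAM base and sizes are all hypothetical; this is not the `SystemBus` interface.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

enum class Target { Ram, Mmio };

// Hypothetical mapping record: a half-open [base, base+size) window.
struct Mapping {
    std::uint64_t base;
    std::uint64_t size;
    Target target;
    std::string_view name;
};

class AddressRouter {
public:
    void map(Mapping m) { mappings_.push_back(m); }

    // Returns the mapping covering `addr`, or nullptr for an unmapped access.
    // The pointer is only valid until the next call to map().
    const Mapping* route(std::uint64_t addr) const {
        for (const Mapping& m : mappings_) {
            if (addr >= m.base && addr < m.base + m.size) return &m;
        }
        return nullptr;
    }

private:
    std::vector<Mapping> mappings_;
};

inline void demo_router() {
    AddressRouter bus;
    bus.map({0x40000000, 0x01000000, Target::Ram, "ram"});   // assumed RAM window
    bus.map({0x80000100, 0x100, Target::Mmio, "apbuart"});   // assumed size
    assert(bus.route(0x80000104)->name == "apbuart");
    assert(bus.route(0x40000010)->target == Target::Ram);
    assert(bus.route(0x90000000) == nullptr);  // unmapped
}
```

Since CPU accesses and DMA would both go through the same lookup, a peripheral doing DMA sees exactly the memory map the cores see.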
2.4 lince_peripherals (src/peripherals/)¶
Core GR712RC peripherals required to boot RTEMS:
| Class | MMIO Base | IRQ | Description |
|---|---|---|---|
| `IrqMP` | `0x80000200` | - | Multi-core Interrupt Controller (maskable IFORCE/IPEND, broadcasting). |
| `GPTimer` | `0x80000300` | 8 | 4 sub-timers + prescaler. Timer 4 is initialized as an active watchdog. |
| `ApbUart` | `0x80000100` | 3 | Primary serial port. Has an 8-byte RX FIFO; TX is immediate via `ICharacterDevice`. |
| `MemCtrl` | `0x80000000` | - | FTMCTRL stub. Ignores writes, returns 0 on reads. |
2.5 lince_defaults (src/defaults/)¶
Standalone implementations of lince_interfaces services intended for use without an SMP2 environment:
| Class | Implements | Behavior |
|---|---|---|
| `StdoutLogger` | `ILogger` | `fprintf(stderr, ...)` with configurable severity levels. |
| `StdoutCharDevice` | `ICharacterDevice` | `putchar()` on TX, `getchar()` on RX. |
| `NullFaultInjector` | `IFaultInjector` | No-op; assumes hardware is always healthy. |
| `DebugPublisher` | `IPublisher` | Dumps registered SMP2-style observable fields to JSON on stderr. |
2.6 lince_runtime (src/runtime/)¶
Orchestration and execution loop.
- `Emulator`: The public entry point. Owns the event scheduler, cores, bus, bridge, and peripherals. Round-robin multi-core execution and trap dispatch.
- `EmulatorConfig`: Configuration primitive; `gr712rc_config()` provides defaults.
- `ElfLoader`: Parses SPARC BE ET_EXEC binaries. Identifies the `%sp` base, and initializes secondary cores into a power-down state.
- `CpuBusBridge`: Adapts `SystemBus` into `ICpuBus` for instruction fetches and CPU-driven load/stores.
- `EventScheduler`: Processes timed tasks (e.g. GPTimer ticks). Feeds `next_event_time()` back to the idle skipper loop.
2.7 lince_app (src/app/)¶
The CLI entrypoint: lince-emu. Parses arguments, constructs an EmulatorConfig, sets the UART terminal options, and handles ErrorMode post-mortem reporting (dumping PC, PSR, TBR, missing global/in registers).
3. The Execution Cycle¶
3.1 The run_for Loop¶
A simplified representation of the main scheduling loop in Emulator::run_for():
sequenceDiagram
participant E as Emulator
participant S as Scheduler
participant C as CpuState
participant P as Peripherals
loop While sim_time < deadline
E->>S: fire_pending(sim_time)
loop For each Core
alt is_powered_down()
E-->>E: Skip quantum
else Active
E->>E: sample_interrupts()
loop For Quantum Instructions
E->>C: step()
Note right of C: Fetch word<br/>Decode -> DecodedInsn<br/>Execute -> ExecStatus<br/>Trap handle<br/>Advance PC/nPC
end
end
end
E->>P: tick(sim_time)
alt All cores powered down
E->>S: next_event_time()
E->>E: sim_time = min(next, now + 1ms)
else
E->>E: sim_time += quantum * ns_per_insn
end
end
3.2 Branch Delay Slots and Annul¶
In SPARC V8, control-transfer instructions (CTIs) do not mutate PC and nPC directly. Instead:
1. JMPL, CALL, Bicc set branch_taken_ and compute branch_target_ on CpuState.
2. The executing loop fetches the instruction at nPC for the next cycle (the delay slot).
3. After the delay slot executes, PC adopts branch_target_ instead of nPC.
If a Bicc specifies the annul bit (,a), and the branch is NOT taken, CpuState sets annul_next_ = true. The step() cycle skips the execution of the next instruction entirely but still advances PC/nPC. Hardware interrupts clear annul_next_ upon TRAP entry (SPARC V8 §5.1.2.2).
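The delay-slot and annul rules can be traced with a small PC/nPC model. The sketch below uses the classic formulation in which a taken CTI writes its target into nPC; this is behaviorally equivalent to the `branch_taken_`/`branch_target_` description above but is not Lince's code, and `PcState`, `Insn`, and `step` here are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class Kind { Plain, TakenBranch, UntakenBranchAnnul };

struct Insn {
    Kind kind;
    std::uint32_t target;  // only meaningful for TakenBranch
};

struct PcState {
    std::uint32_t pc;
    std::uint32_t npc;
    bool annul_next = false;
};

// Processes the instruction at s.pc. Returns false if it was annulled
// (skipped), true if it was executed.
inline bool step(PcState& s, const Insn& insn) {
    if (s.annul_next) {
        // Annulled slot: no execution, but PC/nPC still advance.
        s.annul_next = false;
        s.pc = s.npc;
        s.npc += 4;
        return false;
    }
    s.pc = s.npc;
    if (insn.kind == Kind::TakenBranch) {
        s.npc = insn.target;  // takes effect after the delay slot
    } else {
        s.npc += 4;
        if (insn.kind == Kind::UntakenBranchAnnul) s.annul_next = true;
    }
    return true;
}

inline void demo_delay_slot() {
    PcState s{0x100, 0x104};
    bool ran = step(s, {Kind::TakenBranch, 0x200});  // CTI at 0x100
    assert(ran && s.pc == 0x104 && s.npc == 0x200);
    ran = step(s, {Kind::Plain, 0});                 // delay slot at 0x104
    assert(ran && s.pc == 0x200 && s.npc == 0x204);  // now at the target
}
```

For an untaken `Bicc,a`, the same `step` marks the following instruction as annulled, so the next call returns `false` while still moving PC and nPC forward by one instruction.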
3.3 PSR Write Pipeline Delivery¶
As mandated by SPARC V8 §5.1.2.3, the immediate fields ICC and PIL update immediately on WRPSR. Delayed fields S, ET, PS, and CWP are buffered in pending_psr_.
The step() loop calls commit_psr_pipeline() each cycle, meaning three cycles must pass before CWP updates actually impact window calculations. This avoids edge cases in the SPARC architecture that can cause infinite loops or data corruption.
Note: Unit tests that invoke execute() directly must manually call commit_psr_pipeline() × 3 to assert on delayed fields.
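The three-commit requirement in the note can be pictured with a countdown model. This is only a sketch under the assumption that a delayed field becomes visible on the third `commit` after the write; `PsrPipelineModel` is a hypothetical type and the real `pending_psr_` bookkeeping may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one delayed PSR field (CWP).
struct PsrPipelineModel {
    std::uint32_t cwp = 0;          // architecturally visible value
    std::uint32_t pending_cwp = 0;  // value buffered by the last write
    int cycles_left = 0;            // 0 means nothing is pending

    // A write buffers the new value instead of applying it.
    void write_cwp(std::uint32_t value) {
        pending_cwp = value;
        cycles_left = 3;
    }

    // Called once per cycle; the buffered value lands on the third call.
    void commit() {
        if (cycles_left > 0 && --cycles_left == 0) cwp = pending_cwp;
    }
};

inline void demo_psr_pipeline() {
    PsrPipelineModel psr;
    psr.write_cwp(5);
    psr.commit();
    assert(psr.cwp == 0);  // still the old window
    psr.commit();
    assert(psr.cwp == 0);
    psr.commit();
    assert(psr.cwp == 5);  // visible after the third commit
}
```

A unit test that bypasses `step()` follows the same shape: perform the write, commit three times, then assert on the delayed field.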
3.4 Interrupt Sampling and Idle Time Skipping¶
Before a core begins its quantum, sample_interrupts() queries the IrqMP context for its core index. If a pending IRQ level is higher than PSR.PIL, and traps are enabled (PSR.ET=1), a hardware TRAP is injected (tt=0x10 + level).
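The acceptance rule reduces to a small predicate. The sketch below encodes exactly the condition stated above; `interrupt_trap_type` is a hypothetical helper, and the SPARC V8 special case where level 15 is non-maskable is deliberately left out because the text does not say how Lince treats it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the trap type to inject, or nullopt if the interrupt is not taken.
// `pending_level` is the highest pending IRQ level (1..15), `pil` is PSR.PIL,
// and `et` is PSR.ET.
constexpr std::optional<std::uint8_t> interrupt_trap_type(unsigned pending_level,
                                                          unsigned pil,
                                                          bool et) {
    if (!et) return std::nullopt;                        // traps disabled
    if (pending_level == 0 || pending_level > 15) return std::nullopt;
    if (pending_level <= pil) return std::nullopt;       // masked by PIL
    return static_cast<std::uint8_t>(0x10 + pending_level);
}

inline void demo_interrupt_sampling() {
    // GPTimer on IRQ 8 with PIL=0 and traps enabled: tt = 0x18.
    assert(interrupt_trap_type(8, 0, true) == std::uint8_t{0x18});
    // Masked: level 3 does not exceed PIL=5.
    assert(!interrupt_trap_type(3, 5, true).has_value());
    // Traps disabled: nothing is injected.
    assert(!interrupt_trap_type(8, 0, false).has_value());
}
```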
Idle skip mechanism: Cores halt locally by writing a non-zero value to ASR19 (is_powered_down_ == true). When all active cores are halted, simulating their quantum is effectively wasted latency. The emulator skips SimTimeNs forward to the lesser of scheduler.next_event_time() or now + 1ms (kMaxIdleNs). This allows GPTimer interrupts to fire almost instantly in simulation time, dramatically improving idle-heavy workloads like RTEMS sp04.
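The time-advance decision can be written as a pure function. This sketch assumes `SimTimeNs` is a plain 64-bit nanosecond count and that the scheduler reports "no event" as an empty optional; `next_sim_time` is a hypothetical helper, and clamping to `now` so time never moves backwards is an added safety assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

using SimTimeNs = std::uint64_t;            // assumption: raw nanoseconds
constexpr SimTimeNs kMaxIdleNs = 1'000'000; // 1 ms, as stated in the text

constexpr SimTimeNs next_sim_time(SimTimeNs now,
                                  bool all_cores_powered_down,
                                  std::optional<SimTimeNs> next_event,
                                  std::uint64_t quantum,
                                  std::uint64_t ns_per_insn) {
    if (!all_cores_powered_down) {
        // Normal case: charge one full quantum of instructions.
        return now + quantum * ns_per_insn;
    }
    const SimTimeNs cap = now + kMaxIdleNs;
    if (!next_event) return cap;
    // Jump to the next event, but never backwards and never past the cap.
    return std::clamp(*next_event, now, cap);
}

inline void demo_idle_skip() {
    // Idle with a timer event 4 us away: jump straight to it.
    assert(next_sim_time(1'000, true, SimTimeNs{5'000}, 100, 10) == 5'000);
    // Idle with a far-away event: capped at now + 1 ms.
    assert(next_sim_time(1'000, true, SimTimeNs{10'000'000}, 100, 10) == 1'001'000);
    // At least one core active: advance by quantum * ns_per_insn.
    assert(next_sim_time(1'000, false, SimTimeNs{5'000}, 100, 10) == 2'000);
}
```

The cap bounds how far simulated time can jump in one iteration, so the host regains control at least once per simulated millisecond even when no event is queued.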
4. How to Add a Custom Peripheral¶
This section is a complete guide to plugging an arbitrary user-defined peripheral into Lince. Code is based on the reference example shipped under examples/demo-dma/.
4.1 Interface Implementation¶
Peripherals must implement IPeripheral.
#include "lince/iperipheral.hpp"
#include <cstdint>
#include <string_view>
class DemoDmaDevice : public lince::IPeripheral {
public:
    explicit DemoDmaDevice(lince::PhysAddr base_addr)
        : base_{base_addr} {}
    // Identification
    std::string_view name() const override { return "demo_dma"; }
    // Bounds check enforced by the SystemBus
    lince::AddressRange mmio_range() const override {
        return {base_, 0x10}; // 16 bytes of MMIO space
    }
    // Wiring of services (DMA, IRQ, Scheduler)
    void attach(const lince::PeripheralContext& ctx) override {
        ctx_ = ctx; // Copy the context!
    }
    // Return to power-on state: deassert the IRQ and clear status
    void reset() override {
        if (ctx_.irq) ctx_.irq->lower();
        status_ = 0;
    }
    // Only word-sized accesses are accepted
    lince::Result<uint32_t> mmio_read(lince::PhysAddr addr,
                                      lince::AccessSize size) override {
        if (size != lince::AccessSize::Word) {
            return lince::make_error(lince::ErrorCode::AlignmentError);
        }
        // Map relative offset to registers
        // ...
        return status_;
    }
    lince::Result<void> mmio_write(lince::PhysAddr addr,
                                   lince::AccessSize size, uint32_t val) override {
        if (size != lince::AccessSize::Word) {
            return lince::make_error(lince::ErrorCode::AlignmentError);
        }
        // ...
        return {};
    }
    void tick(lince::SimTimeNs now) override { /* polling state machines here */ }
    void publish(lince::IPublisher& pub) override { /* SMP2 properties */ }
private:
    lince::PhysAddr base_;
    lince::PeripheralContext ctx_{};
    uint32_t status_{0};
};
4.2 Handling DMA and Big-Endian Reads¶
The ctx_.bus->dma_read() operations transfer raw arrays of std::byte. Since SPARC is Big-Endian, manual composition of a uint32_t via byte extraction ([0]<<24 | [1]<<16 | ...) is required when reading arbitrary CPU memory structures representing integers. See examples/demo-dma/demo_dma_device.cpp for endianness-safe DMA transactions.
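The byte composition described above can be sketched as a pair of helpers. `load_be32` and `store_be32` are hypothetical names chosen for this example (they are not the `encode_be`/`decode_be` used by the bus), and they operate on a fixed four-byte array rather than on a real DMA buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Composes a 32-bit value from four big-endian bytes:
// byte 0 is the most significant.
constexpr std::uint32_t load_be32(const std::array<std::byte, 4>& b) {
    return (std::to_integer<std::uint32_t>(b[0]) << 24) |
           (std::to_integer<std::uint32_t>(b[1]) << 16) |
           (std::to_integer<std::uint32_t>(b[2]) << 8) |
            std::to_integer<std::uint32_t>(b[3]);
}

// Splits a 32-bit value into four big-endian bytes.
constexpr std::array<std::byte, 4> store_be32(std::uint32_t v) {
    return {static_cast<std::byte>((v >> 24) & 0xFF),
            static_cast<std::byte>((v >> 16) & 0xFF),
            static_cast<std::byte>((v >> 8) & 0xFF),
            static_cast<std::byte>(v & 0xFF)};
}

inline void demo_big_endian() {
    constexpr std::array<std::byte, 4> raw{std::byte{0x12}, std::byte{0x34},
                                           std::byte{0x56}, std::byte{0x78}};
    assert(load_be32(raw) == 0x12345678u);
    assert(store_be32(0x12345678u) == raw);
}
```

Composing with shifts, rather than `memcpy` into a `uint32_t`, gives the same result on little-endian and big-endian hosts.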
4.3 Triggering Interrupts¶
If your peripheral raises an IRQ, simply call:
The emulator wires the underlying `IrqBridge` correctly so this line maps to the IrqMP's broadcast matrix handling.
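As a sketch of the raise/lower pattern, the example below assumes the interrupt handle offers a `raise()` counterpart to the `lower()` call used in `reset()`. That is an assumption: `IrqHandle`, `CountingIrq`, and `DoneFlagDevice` are hypothetical stand-ins written for this example, so check `IInterruptSource` for the actual method names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical interrupt handle; NOT the real lince interface.
struct IrqHandle {
    virtual ~IrqHandle() = default;
    virtual void raise() = 0;
    virtual void lower() = 0;
};

// Test double that records the current line state.
struct CountingIrq final : IrqHandle {
    bool asserted = false;
    int raises = 0;
    void raise() override { asserted = true; ++raises; }
    void lower() override { asserted = false; }
};

// Minimal device: sets a DONE bit and asserts its IRQ when work completes.
class DoneFlagDevice {
public:
    explicit DoneFlagDevice(IrqHandle* irq) : irq_{irq} {}

    void complete_transfer() {
        status_ |= 1u;
        if (irq_) irq_->raise();  // guard: the line may not be wired
    }

    void acknowledge() {
        status_ &= ~1u;
        if (irq_) irq_->lower();
    }

    std::uint32_t status() const { return status_; }

private:
    IrqHandle* irq_;
    std::uint32_t status_{0};
};

inline void demo_irq() {
    CountingIrq line;
    DoneFlagDevice dev{&line};
    dev.complete_transfer();
    assert(line.asserted && dev.status() == 1u);
    dev.acknowledge();
    assert(!line.asserted && dev.status() == 0u);
}
```

Keeping the null check mirrors `reset()` in §4.1, which also guards `ctx_.irq` before using it.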
4.4 Registering with the Emulator¶
Pick a base address outside the GR712RC default APB space (e.g., 0x80000800) and an IRQ line unused by defaults (e.g., 10):
auto demo = std::make_unique<DemoDmaDevice>(lince::PhysAddr{0x80000800});
emu.add_peripheral(std::move(demo), lince::IrqLine{10});
add_peripheral automatically allocates the IRQ line across the IrqMP, verifies the address window doesn't overlap existing mappings, assigns the ILogger and IScheduler, and delegates the ownership of the device.
5. Testing and Conventions¶
5.1 Test Strategy¶
- Framework: Catch2 v3.
- Unit Tests (`tests/unit/`): Validates individual decoupled modules (`CpuState`, `SystemBus`, Instruction categories).
- Integration Tests (`tests/integration/`): Tests full Emulator orchestration. `test_demo_dma_device.cpp` validates the custom peripheral hooks; `test_bare_metal.cpp` runs tiny hex blobs end-to-end.
- Assembly Execution (`tests/asm/`): Real `.S` SPARC assembler pipelines compiled using the RCC cross-compiler (BCC 1.3.2 typically).
- RTEMS System Tests (`tests/rtems/`): Real unmodified RTEMS binary tests ensuring 100% integration passing rate. Verified locally using CTest timeouts.
Execution:
5.2 Style and Naming¶
- Namespaces: `lince`, `lince::core`, `lince::bus`, `lince::peripherals`, `lince::runtime`.
- Type Names: `PascalCase`.
- Function Names: `snake_case`. Private members: `trailing_underscore_`.
- Pragmas: `#pragma once` across all headers.
- Includes: `lince/` project headers first, followed by third-party deps, followed by `<stdlib>` (separated by newlines).
- Use `[[nodiscard]]` aggressively on all `Result<T>` or object retrieval methods.