Architecture overview

This page is the canonical entry point for developers working on the Lince source code. If you are a user looking to run the emulator or consume it as a C++ library, see the user guide.

For the project's current state and the judgment calls taken during the latest work sessions, read development/status. The sister pages in this section drill into individual concerns.

1. Overview and Layers

Lince is a C++20 functional emulator for the SPARC V8 architecture, specifically targeting the GR712RC (LEON3FT dual-core) and GR740 (LEON4FT quad-core) system-on-chips.

What Lince is:

  • A fetch-decode-execute interpreter designed for correctness and peripheral coverage over raw performance.
  • A foundation for an SMP2-compliant simulation model (ECSS-E-ST-40-07) intended for space industry simulation environments.
  • Capable of booting RTEMS uniprocessor and SMP configurations.

What Lince is NOT (yet):

  • A cycle-accurate simulator.
  • An emulator with MMU or FPU support (both are out of the MVP scope and stubbed).
  • A block-translating JIT.

1.1 Layer Diagram

The project is structured as a stack of CMake modules with strict unidirectional dependencies:

flowchart TD
    lince_app[lince_app <br> CLI executable] --> lince_runtime
    demo_dma[demo_dma_device <br> examples] --> lince_runtime

    lince_runtime[lince_runtime <br> Emulator, ElfLoader, Scheduler] --> lince_defaults
    lince_runtime --> lince_peripherals

    lince_defaults[lince_defaults <br> Loggers, CharDevices] -.-> lince_interfaces

    lince_peripherals[lince_peripherals <br> IrqMP, GPTimer, ApbUart] --> lince_bus

    lince_bus[lince_bus <br> SystemBus, RAM] --> lince_core

    lince_core[lince_core <br> SPARC V8 ISA, CpuState, Decoders] -.-> lince_interfaces

    lince_interfaces[lince_interfaces <br> Header-only API]

    classDef default fill:#f9f9f9,stroke:#333,stroke-width:1px;
    class lince_interfaces default;

1.2 Non-Negotiable Design Principles

  • Zero singletons / Zero global mutable state. You must be able to instantiate multiple Emulator objects in the same process without interference.
  • Errors as values. The public boundary uses Result<T> (which maps to tl::expected<T, ErrorCode>). Exceptions are allowed internally but must be caught before they cross the public boundary.
  • Zero direct I/O from the core. No printf or std::cout inside the emulator. All output goes through injected interfaces (ILogger, ICharacterDevice).
  • Time is a parameter. The emulator never reads the host system clock. Time is driven from the outside via SimTimeNs.
  • Strong types everywhere. Raw uint32_t is avoided in favor of PhysAddr, VirtAddr, CoreId, etc.
  • Simple execution model. Round-robin single-threaded execution across cores means Total Store Order (TSO) and atomics are trivially correct.
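
Two of these principles, strong types and errors as values, can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the real Result<T> maps to tl::expected, whereas this version is hand-rolled on std::variant to stay self-contained, and checked_word_addr is an illustrative helper that is not part of Lince.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical stand-ins; the real definitions live in types.hpp.
struct PhysAddr { std::uint32_t value; };
struct VirtAddr { std::uint32_t value; };
enum class ErrorCode { AlignmentError, OutOfRange };

// Simplified Result<T>: holds either a value or an ErrorCode.
template <typename T>
class Result {
public:
    Result(T value) : data_{value} {}
    Result(ErrorCode error) : data_{error} {}

    [[nodiscard]] bool has_value() const { return std::holds_alternative<T>(data_); }
    [[nodiscard]] const T& value() const { assert(has_value()); return std::get<T>(data_); }
    [[nodiscard]] ErrorCode error() const { assert(!has_value()); return std::get<ErrorCode>(data_); }

private:
    std::variant<T, ErrorCode> data_;
};

// Word accesses must be 4-byte aligned. Misalignment is reported as a value,
// never as an exception.
[[nodiscard]] inline Result<PhysAddr> checked_word_addr(PhysAddr addr) {
    if (addr.value % 4 != 0) return ErrorCode::AlignmentError;
    return addr;
}
```

Because PhysAddr and VirtAddr are distinct types, passing a VirtAddr to checked_word_addr fails to compile, which is the point of avoiding raw uint32_t.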

2. The src/ Modules in Detail

2.1 lince_interfaces (src/interfaces/)

A header-only vocabulary library defining the core types and contracts without logic dependencies.

  • Types (types.hpp): PhysAddr, VirtAddr, CoreId, SimTimeNs, IrqLine, AccessSize, ErrorCode, Result<T>.
  • Ranges (address_range.hpp): AddressRange represents a half-open [base, base+size) physical segment.
  • Peripheral Contract (iperipheral.hpp, peripheral_context.hpp): The unified contract for all devices on the bus (see §4).
  • Bus Contracts (ibus_master.hpp, icpu_bus.hpp): Interfaces for DMA and core-side bus access.
  • Services: ILogger, ICharacterDevice, IInterruptSource, IScheduler, IPublisher, IFaultInjector, ITimeSource.
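
The half-open convention of AddressRange can be sketched as follows. This is an assumed, simplified mirror of the type: it uses raw integers instead of PhysAddr, and the member names contains and overlaps are illustrative.

```cpp
#include <cstdint>

// Hypothetical mirror of AddressRange: half-open [base, base + size).
struct AddressRange {
    std::uint32_t base;
    std::uint32_t size;

    // A 64-bit end avoids wrap-around for ranges that reach 0xFFFFFFFF.
    [[nodiscard]] constexpr std::uint64_t end() const {
        return std::uint64_t{base} + size;
    }

    // The last valid address is end() - 1; end() itself is excluded.
    [[nodiscard]] constexpr bool contains(std::uint32_t addr) const {
        return addr >= base && std::uint64_t{addr} < end();
    }

    // Two half-open ranges overlap only if each starts before the other ends.
    [[nodiscard]] constexpr bool overlaps(const AddressRange& other) const {
        return base < other.end() && other.base < end();
    }
};

static_assert(AddressRange{0x80000100, 0x100}.contains(0x800001FF));
static_assert(!AddressRange{0x80000100, 0x100}.contains(0x80000200));
```

With half-open ranges, two adjacent windows such as [0x80000100, 0x80000200) and [0x80000200, 0x80000300) do not overlap, so back-to-back MMIO regions need no special casing.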

2.2 lince_core (src/core/)

Implements the SPARC V8 architecture (Integer Unit only).

  • CpuState: The full architectural state of a core. Features an 8-window register file (128 physical slots + 8 globals), PSR/WIM/TBR, Y register, PC/nPC, and implements SPARC §5.1.2.3 delayed writes for the PSR pipeline. Also tracks ASR19 power-down state and error_mode.
  • Decoder and DecodedInsn (decoder.cpp): Unpacks raw 32-bit SPARC words (Format 1, 2, and 3) into an InsnKind enum and all relevant operand/immediate fields.
  • handlers_*.cpp: The execution units. Separated into ALU, Branch (CTI), Load/Store, RegWin (SAVE/RESTORE/RETT), and Special instructions.
  • step.hpp: The single-cycle fetch/decode/execute driver. Responsible for branch delay slot dispatch and trap injection.
  • trap.hpp: SPARC V8 trap constant definitions and ExecStatus to trap translation.
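
To make the decoder's job concrete, here is a minimal sketch of unpacking a SPARC V8 Format 3 word. The bit positions follow the SPARC V8 instruction formats; the struct and function names are assumptions for illustration and do not reflect the actual DecodedInsn layout.

```cpp
#include <cstdint>

// Illustrative field container; Lince's DecodedInsn also carries an InsnKind.
struct Format3Fields {
    std::uint32_t op;        // bits 31:30
    std::uint32_t rd;        // bits 29:25
    std::uint32_t op3;       // bits 24:19
    std::uint32_t rs1;       // bits 18:14
    bool          uses_imm;  // bit 13 (the "i" bit)
    std::uint32_t rs2;       // bits 4:0, meaningful when uses_imm is false
    std::int32_t  simm13;    // bits 12:0 sign-extended, meaningful when uses_imm is true
};

// Sign-extends a 13-bit field: flip the sign bit (bit 12), then subtract its weight.
constexpr std::int32_t sign_extend13(std::uint32_t raw) {
    return static_cast<std::int32_t>((raw & 0x1FFFu) ^ 0x1000u) - 0x1000;
}

constexpr Format3Fields decode_format3(std::uint32_t word) {
    return Format3Fields{
        (word >> 30) & 0x3u,
        (word >> 25) & 0x1Fu,
        (word >> 19) & 0x3Fu,
        (word >> 14) & 0x1Fu,
        ((word >> 13) & 0x1u) != 0,
        word & 0x1Fu,
        sign_extend13(word),
    };
}
```

For example, decode_format3(0x84006005) yields op=2, op3=0, rd=2, rs1=1 and simm13=5, which corresponds to add %g1, 5, %g2.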

2.3 lince_bus (src/bus/)

Physical memory routing.

  • Ram: A raw std::vector<std::byte> block with no endianness logic of its own.
  • SystemBus: The central physical-address router. Manages multiple RAM configurations (owning) and an arbitrary number of MMIO regions (non-owning).
    • Implements Big-Endian semantics. Typed accessors (read_physical_u32) use encode_be/decode_be.
    • Implements IBusMaster: Peripherals use the same bus instances for DMA, meaning DMA shares the exact same memory map as CPU accesses.
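
The big-endian conversion behind the typed accessors can be sketched as below. The names encode_be and decode_be come from the text above, but the signatures shown here are assumptions; the real helpers may operate on spans or on other access sizes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Splits a 32-bit value into bytes, most significant byte first.
constexpr std::array<std::byte, 4> encode_be(std::uint32_t value) {
    return std::array<std::byte, 4>{
        static_cast<std::byte>((value >> 24) & 0xFFu),
        static_cast<std::byte>((value >> 16) & 0xFFu),
        static_cast<std::byte>((value >> 8) & 0xFFu),
        static_cast<std::byte>(value & 0xFFu),
    };
}

// Reassembles a 32-bit value from four bytes stored most significant first.
constexpr std::uint32_t decode_be(const std::array<std::byte, 4>& bytes) {
    return (std::to_integer<std::uint32_t>(bytes[0]) << 24) |
           (std::to_integer<std::uint32_t>(bytes[1]) << 16) |
           (std::to_integer<std::uint32_t>(bytes[2]) << 8) |
            std::to_integer<std::uint32_t>(bytes[3]);
}

static_assert(decode_be(encode_be(0xDEADBEEFu)) == 0xDEADBEEFu);
```

Shifting explicitly, rather than casting the buffer to uint32_t*, keeps the result independent of the host's byte order.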

2.4 lince_peripherals (src/peripherals/)

Core GR712RC peripherals required to boot RTEMS:

Class   | MMIO Base  | IRQ | Description
--------|------------|-----|------------
IrqMP   | 0x80000200 | -   | Multi-core Interrupt Controller (maskable IFORCE/IPEND, broadcasting).
GPTimer | 0x80000300 | 8   | 4 sub-timers + prescaler. Timer 4 is initialized as an active watchdog.
ApbUart | 0x80000100 | 3   | Primary serial port. Has an 8-byte RX FIFO; TX is immediate via ICharacterDevice.
MemCtrl | 0x80000000 | -   | FTMCTRL stub. Ignores writes, returns 0 on reads.

2.5 lince_defaults (src/defaults/)

Standalone implementations of lince_interfaces services intended for use without an SMP2 environment:

Class             | Implements       | Behavior
------------------|------------------|---------
StdoutLogger      | ILogger          | fprintf(stderr, ...) with configurable severity levels.
StdoutCharDevice  | ICharacterDevice | putchar() on TX, getchar() on RX.
NullFaultInjector | IFaultInjector   | No-op; assumes hardware is always healthy.
DebugPublisher    | IPublisher       | Dumps registered SMP2-style observable fields to JSON on stderr.

2.6 lince_runtime (src/runtime/)

Orchestration and execution loop.

  • Emulator: The public entry point. Owns the event scheduler, cores, bus, bridge, and peripherals, and drives round-robin multi-core execution and trap dispatch.
  • EmulatorConfig: Configuration primitive; gr712rc_config() provides defaults.
  • ElfLoader: Parses SPARC big-endian ET_EXEC binaries, identifies the %sp base, and initializes secondary cores into a power-down state.
  • CpuBusBridge: Adapts SystemBus into ICpuBus for instruction fetches and CPU-driven load/stores.
  • EventScheduler: Processes timed tasks (e.g. GPTimer ticks). Feeds next_event_time() back to the idle skipper loop.
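
The role of the EventScheduler can be sketched with a priority queue keyed on simulation time. The method names fire_pending and next_event_time appear in this document; everything else (MiniScheduler, schedule, the use of std::function and a plain integer for SimTimeNs) is an assumption made for this sketch.

```cpp
#include <cstdint>
#include <functional>
#include <optional>
#include <queue>
#include <utility>
#include <vector>

using SimTimeNs = std::uint64_t;  // assumption: Lince uses a strong type here

class MiniScheduler {
public:
    void schedule(SimTimeNs when, std::function<void()> action) {
        queue_.push(Event{when, next_seq_++, std::move(action)});
    }

    // Runs every event whose deadline is <= now, earliest first.
    void fire_pending(SimTimeNs now) {
        while (!queue_.empty() && queue_.top().when <= now) {
            auto action = queue_.top().action;  // copy out before pop()
            queue_.pop();
            action();
        }
    }

    // Earliest outstanding deadline, or nothing if the queue is empty.
    [[nodiscard]] std::optional<SimTimeNs> next_event_time() const {
        if (queue_.empty()) return std::nullopt;
        return queue_.top().when;
    }

private:
    struct Event {
        SimTimeNs when;
        std::uint64_t seq;  // tie-breaker keeps same-time events in FIFO order
        std::function<void()> action;
    };
    struct Later {
        bool operator()(const Event& a, const Event& b) const {
            return a.when != b.when ? a.when > b.when : a.seq > b.seq;
        }
    };
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
    std::uint64_t next_seq_{0};
};
```

Note that the scheduler never reads a clock: the caller passes now in, consistent with the "time is a parameter" principle.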

2.7 lince_app (src/app/)

The CLI entrypoint: lince-emu. Parses arguments, constructs an EmulatorConfig, sets the UART terminal options, and handles ErrorMode post-mortem reporting (dumping PC, PSR, TBR, missing global/in registers).


3. The Execution Cycle

3.1 The run_for Loop

A simplified representation of the main scheduling loop in Emulator::run_for():

sequenceDiagram
    participant E as Emulator
    participant S as Scheduler
    participant C as CpuState
    participant P as Peripherals

    loop While sim_time < deadline
        E->>S: fire_pending(sim_time)

        loop For each Core
            alt is_powered_down()
                E-->>E: Skip quantum
            else Active
                E->>E: sample_interrupts()
                loop For Quantum Instructions
                    E->>C: step()
                    Note right of C: Fetch word<br/>Decode -> DecodedInsn<br/>Execute -> ExecStatus<br/>Trap handle<br/>Advance PC/nPC
                end
            end
        end

        E->>P: tick(sim_time)

        alt All cores powered down
            E->>S: next_event_time()
            E->>E: sim_time = min(next, now + 1ms)
        else
            E->>E: sim_time += quantum * ns_per_insn
        end
    end

3.2 Branch Delay Slots and Annul

In SPARC V8, control-transfer instructions (CTIs) do not mutate PC and nPC directly. Instead:

  1. JMPL, CALL, Bicc set branch_taken_ and compute branch_target_ on CpuState.
  2. The executing loop fetches the instruction at nPC for the next cycle (the delay slot).
  3. After the delay slot executes, PC adopts branch_target_ instead of nPC.

If a Bicc specifies the annul bit (,a), and the branch is NOT taken, CpuState sets annul_next_ = true. The step() cycle skips the execution of the next instruction entirely but still advances PC/nPC. Hardware interrupts clear annul_next_ upon TRAP entry (SPARC V8 §5.1.2.2).
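
The observable effect of delay slots and annulment can be reproduced with a small model. This sketch uses the classic PC/nPC update rule (PC takes nPC, nPC takes the branch target or nPC + 4) rather than Lince's branch_taken_ and branch_target_ fields; the instruction kinds, MiniCpu and run are hypothetical names.

```cpp
#include <cstdint>
#include <map>
#include <vector>

enum class Kind { Plain, BranchTaken, BranchUntakenAnnul };

struct Insn {
    Kind kind{Kind::Plain};
    std::uint32_t target{0};
};

struct MiniCpu {
    std::uint32_t pc;
    std::uint32_t npc;
    bool annul_next{false};
    std::vector<std::uint32_t> executed;  // addresses that were really executed

    void step(const Insn& insn) {
        std::uint32_t next_npc = npc + 4;
        if (annul_next) {
            annul_next = false;  // skip execution, but still advance PC/nPC
        } else {
            executed.push_back(pc);
            if (insn.kind == Kind::BranchTaken) {
                next_npc = insn.target;  // takes effect after the delay slot
            } else if (insn.kind == Kind::BranchUntakenAnnul) {
                annul_next = true;  // the delay slot will be skipped
            }
        }
        pc = npc;
        npc = next_npc;
    }
};

// Runs `steps` cycles; addresses missing from `program` behave as plain instructions.
inline std::vector<std::uint32_t> run(const std::map<std::uint32_t, Insn>& program,
                                      std::uint32_t entry, int steps) {
    MiniCpu cpu{entry, entry + 4};
    for (int i = 0; i < steps; ++i) {
        const auto it = program.find(cpu.pc);
        cpu.step(it != program.end() ? it->second : Insn{});
    }
    return cpu.executed;
}
```

A taken branch at 0x100 targeting 0x200 executes 0x100, then the delay slot at 0x104, then 0x200. An untaken branch with the annul bit at 0x100 executes 0x100, skips 0x104, and continues at 0x108.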

3.3 PSR Write Pipeline Delivery

As mandated by SPARC V8 §5.1.2.3, the immediate fields ICC and PIL update immediately on WRPSR. Delayed fields S, ET, PS, and CWP are buffered in pending_psr_.

The step() loop calls commit_psr_pipeline() once per cycle, so three cycles must pass before a CWP update affects window calculations. This avoids edge cases in the SPARC architecture that can cause infinite loops or data corruption. Note: unit tests that invoke execute() directly must call commit_psr_pipeline() three times by hand before asserting on delayed fields.
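
A minimal sketch of such a delayed-write pipeline follows. Only commit_psr_pipeline and the three-cycle delay come from the text; MiniPsr, its fields, and the choice to model a single pending write (a second write replaces the first) are simplifying assumptions.

```cpp
#include <cstdint>
#include <optional>

struct MiniPsr {
    std::uint32_t cwp{0};  // delayed field
    std::uint32_t pil{0};  // immediate field

    // Models WRPSR: PIL is visible at once, CWP only after kDelay commits.
    void write(std::uint32_t new_cwp, std::uint32_t new_pil) {
        pil = new_pil;
        pending_ = Pending{new_cwp, kDelay};
    }

    // Called once per executed cycle by the step loop.
    void commit_psr_pipeline() {
        if (pending_ && --pending_->cycles_left == 0) {
            cwp = pending_->cwp;
            pending_.reset();
        }
    }

private:
    static constexpr int kDelay = 3;
    struct Pending {
        std::uint32_t cwp;
        int cycles_left;
    };
    std::optional<Pending> pending_;
};
```

This also shows why a unit test that bypasses step() sees stale values: after write(5, 12), cwp still reads 0 until commit_psr_pipeline() has run three times.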

3.4 Interrupt Sampling and Idle Time Skipping

Before a core begins its quantum, sample_interrupts() queries the IrqMP context for its core index. If a pending IRQ level is higher than PSR.PIL, and traps are enabled (PSR.ET=1), a hardware TRAP is injected (tt=0x10 + level).
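
The sampling rule can be written as a small pure function. The function name and parameters are hypothetical; the logic follows the rule stated above and does not model any special handling of level 15.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the trap type to inject, or nothing if the interrupt stays pending.
constexpr std::optional<std::uint8_t> interrupt_trap_type(std::uint8_t pending_level,
                                                          std::uint8_t psr_pil,
                                                          bool psr_et) {
    assert(pending_level <= 15 && psr_pil <= 15);
    if (!psr_et) return std::nullopt;                    // traps disabled
    if (pending_level == 0) return std::nullopt;         // nothing pending
    if (pending_level <= psr_pil) return std::nullopt;   // masked by PSR.PIL
    return static_cast<std::uint8_t>(0x10 + pending_level);
}
```

For instance, a level-8 interrupt with PIL=0 and ET=1 yields tt=0x18, while the same interrupt with PIL=8 stays pending.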

Idle skip mechanism: A core halts itself by writing a non-zero value to ASR19 (is_powered_down_ == true). When all cores are halted, simulating their quanta is wasted work. The emulator instead advances SimTimeNs to the lesser of scheduler.next_event_time() and now + 1ms (kMaxIdleNs). GPTimer interrupts therefore fire after almost no host time, which dramatically speeds up idle-heavy workloads like RTEMS sp04.
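
The time-advance decision can be sketched as a pure function. kMaxIdleNs and the two branches come from the text and the sequence diagram in 3.1; the function name, the plain-integer SimTimeNs, and the clamp against overdue events are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

using SimTimeNs = std::uint64_t;             // assumption: a strong type in Lince
constexpr SimTimeNs kMaxIdleNs = 1'000'000;  // 1 ms

constexpr SimTimeNs advance_time(SimTimeNs now,
                                 bool all_cores_idle,
                                 std::optional<SimTimeNs> next_event,
                                 std::uint64_t quantum,
                                 std::uint64_t ns_per_insn) {
    // Normal case: account for the instructions that were just executed.
    if (!all_cores_idle) return now + quantum * ns_per_insn;

    // Idle case: jump to the next event, but never further than kMaxIdleNs.
    const SimTimeNs cap = now + kMaxIdleNs;
    // An overdue event must never move time backwards.
    const SimTimeNs target = next_event ? std::max(*next_event, now) : cap;
    return std::min(target, cap);
}
```

The 1 ms cap bounds the jump even when no event is scheduled, so the loop still re-checks its deadline and ticks peripherals at a predictable rate.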


4. How to Add a Custom Peripheral

This section is a complete guide to plugging an arbitrary user-defined peripheral into Lince. Code is based on the reference example shipped under examples/demo-dma/.

4.1 Interface Implementation

Peripherals must implement IPeripheral.

#include "lince/iperipheral.hpp"

#include <cstdint>
#include <string_view>

class DemoDmaDevice : public lince::IPeripheral {
public:
    explicit DemoDmaDevice(lince::PhysAddr base_addr) 
        : base_{base_addr} {}

    // Identification
    std::string_view name() const override { return "demo_dma"; }

    // Bounds check enforced by the SystemBus
    lince::AddressRange mmio_range() const override {
        return {base_, 0x10};  // 16 bytes of MMIO space
    }

    // Wiring of services (DMA, IRQ, Scheduler)
    void attach(const lince::PeripheralContext& ctx) override {
        ctx_ = ctx; // Copy the context!
    }

    void reset() override {
        if (ctx_.irq) ctx_.irq->lower();
        status_ = 0;
    }

    lince::Result<uint32_t> mmio_read(lince::PhysAddr addr, 
                                      lince::AccessSize size) override {
        if (size != lince::AccessSize::Word) return lince::make_error(lince::ErrorCode::AlignmentError);
        // Map relative offset to registers
        // ...
        return status_;
    }

    lince::Result<void> mmio_write(lince::PhysAddr addr, 
                                   lince::AccessSize size, uint32_t val) override {
        if (size != lince::AccessSize::Word) return lince::make_error(lince::ErrorCode::AlignmentError);
        // ...
        return {};
    }

    void tick(lince::SimTimeNs now) override { /* polling state machines here */ }
    void publish(lince::IPublisher& pub) override { /* SMP2 properties */ }

private:
    lince::PhysAddr base_;
    lince::PeripheralContext ctx_{};
    uint32_t status_{0};
};

4.2 Handling DMA and Big-Endian Reads

The ctx_.bus->dma_read() operations transfer raw arrays of std::byte. Since SPARC is big-endian, any integer read from guest memory must be composed manually from its bytes ([0]<<24 | [1]<<16 | ...) rather than reinterpreted in host byte order. See examples/demo-dma/demo_dma_device.cpp for endianness-safe DMA transactions.

4.3 Triggering Interrupts

If your peripheral raises an IRQ, simply call:

if (ctx_.irq) ctx_.irq->raise();

The emulator wires the underlying IrqBridge correctly so this line maps to the IrqMP's broadcast matrix handling.

4.4 Registering with the Emulator

Pick a base address not occupied by the default GR712RC APB peripherals (e.g., 0x80000800) and an IRQ line unused by defaults (e.g., 10):

auto demo = std::make_unique<DemoDmaDevice>(lince::PhysAddr{0x80000800});
emu.add_peripheral(std::move(demo), lince::IrqLine{10});

add_peripheral registers the IRQ line with the IrqMP, verifies that the address window doesn't overlap existing mappings, assigns the ILogger and IScheduler, and takes ownership of the device.


5. Testing and Conventions

5.1 Test Strategy

  • Framework: Catch2 v3.
  • Unit Tests (tests/unit/): Validates individual decoupled modules (CpuState, SystemBus, Instruction categories).
  • Integration Tests (tests/integration/): Tests full Emulator orchestration. test_demo_dma_device.cpp validates the custom peripheral hooks; test_bare_metal.cpp runs tiny hex blobs end-to-end.
  • Assembly Execution (tests/asm/): Real SPARC .S assembly sources built with the RCC cross-compiler (BCC 1.3.2 typically).
  • RTEMS System Tests (tests/rtems/): Real, unmodified RTEMS test binaries, all of which are expected to pass. Verified locally, with CTest timeouts bounding each run.

Execution:

ctest --test-dir build --output-on-failure

5.2 Style and Naming

  • Namespaces: lince, lince::core, lince::bus, lince::peripherals, lince::runtime.
  • Type Names: PascalCase.
  • Function Names: snake_case. Private members: trailing_underscore_.
  • Pragmas: #pragma once across all headers.
  • Includes: lince/ project headers first, followed by third-party deps, followed by <stdlib> (separated by newlines).
  • Use [[nodiscard]] aggressively on all Result<T> or object retrieval methods.