Debugging with GDB

Lince embeds a GDB Remote Serial Protocol (RSP) stub inside the Emulator. Any RSP-speaking front-end (sparc-rtems5-gdb is the canonical one) can attach over TCP and drive the simulated SPARC guest with normal continue, step, breakpoint, bt, info threads, etc.

The stub lives in src/runtime/src/gdb_stub*.cpp, listens on 127.0.0.1:<port>, is single-client, and is opt-in — by default nothing binds. There is no GDB binary started by Lince; you bring your own.

Quick start

# Terminal A — emulator with stub enabled
$ lince-emu --image hello.rom --gdb-port 1234

# Terminal B — GDB (point it at the .exe for symbols, not the .rom)
$ sparc-rtems5-gdb hello.exe
(gdb) target remote :1234
0x40000020 in _start ()
(gdb) c

If you want GDB to be attached before the first instruction executes, add --gdb-wait to lince-emu. Otherwise the emulator runs free and GDB can attach at any moment (see Attach modes below).

Why two files?

The .rom is what the emulator runs (a bootloader + compressed payload produced by mkprom2). The .exe is the original linked binary with .symtab and DWARF — GDB needs that side for names and source lines. They share the same load address layout, so breakpoints in hello.exe resolve to the right address in the running .rom.

Configuration

EmulatorConfig exposes two fields and a CLI shortcut for each:

Field | CLI flag | Default | Effect
gdb_stub_port | --gdb-port N | 0 (off) | TCP port to listen on. 0 disables the stub entirely: no socket, no thread, and the per-step hot path costs a single null-pointer check.
gdb_stub_wait_for_client | --gdb-wait | false | If true and gdb_stub_port != 0, initialize() blocks the calling thread inside an accept() until GDB connects.

pacing interacts with the stub:

  • Realtime (default): wall-clock paced. GDB sees the simulation advance at roughly real time, including stop replies.
  • Turbo: free-running. Before GDB attaches, the guest can run far past the code you meant to break in, so use --gdb-wait if early code matters. Once a client is attached this is usually harmless, because the guest stays halted while the stub blocks on RSP packets.

Attach modes

There are two ways GDB can become attached to a running stub. Both are supported transparently — you do not need a CLI flag for late-binding.

Early / blocking attach (--gdb-wait)

emu.initialize()
   ├─ binds 127.0.0.1:1234
   └─ blocks in accept() ──── GDB: target remote :1234 ──┐
                          ◀────── accept returns ────────┘
   └─ resumes, sets stop_pending_=true
emu.run_for(...)
   └─ first iteration sees stop_pending_, returns Breakpoint
   └─ CLI hands over to GdbStub::process_until_resume()

Use it when you want to inspect early-boot code: _start, BSP init, the bootloader trampoline. Without --gdb-wait those instructions are gone by the time you connect.

Late-binding attach (default when --gdb-wait is off)

initialize() opens the listening socket but returns immediately. The run loop polls the listening socket once per quantum (poll_accept()) and, when GDB connects, halts with HaltReason::Breakpoint so the CLI can drive process_until_resume(). GDB's initial ? query (sent after its qSupported negotiation) is answered with T05, or with a more specific signal (see Stop signals), as if the program had stopped at a breakpoint.

emu.initialize()              ──► binds 127.0.0.1:1234, returns
emu.run_for(LongBudget)
   ├─ iteration N: poll_accept() → still no client, run instructions
   ├─ ...
   ├─ iteration M: poll_accept() → ACCEPTED
   └─ returns Breakpoint
       └─ CLI: process_until_resume() → first '?' → T05thread:1;

A second connection while one is attached is drained and rejected: the stub accepts the FD just enough to send EOF, then closes it. The backlog stays clear; the original client is undisturbed.
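
The late-binding poll and the drain-and-reject rule can be sketched with plain POSIX calls. This is a minimal illustration, not the stub's actual code: open_loopback_listener, connection_pending and SingleClientListener are hypothetical names, and the real poll_accept() may be structured differently.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: listen on 127.0.0.1:<port>. Port 0 lets the kernel pick.
inline int open_loopback_listener(std::uint16_t port) {
    const int fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    const int one = 1;
    ::setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
        ::listen(fd, 1) != 0) {
        ::close(fd);
        return -1;
    }
    return fd;
}

// Zero-timeout poll: true if a connection is waiting in the accept queue.
inline bool connection_pending(int listen_fd) {
    pollfd p{listen_fd, POLLIN, 0};
    return ::poll(&p, 1, 0) > 0 && (p.revents & POLLIN) != 0;
}

// Hypothetical model of the single-client rule described above.
struct SingleClientListener {
    int listen_fd = -1;
    int client_fd = -1;

    // Meant to be called once per quantum; never blocks.
    // Returns true only when a new client was adopted.
    bool poll_accept() {
        if (!connection_pending(listen_fd)) return false;
        const int fd = ::accept(listen_fd, nullptr, nullptr);
        if (fd < 0) return false;
        if (client_fd >= 0) {   // already attached: drain and reject
            ::close(fd);        // the peer sees EOF
            return false;
        }
        client_fd = fd;
        return true;
    }
};
```

A run loop built on this would treat a true return as the cue to halt and hand control to the packet loop; a false return costs one poll() call with a zero timeout.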

Supported RSP packets

Packet | Meaning | Status
g / G | Read / write all general registers |
m addr,len | Read memory (word path for aligned MMIO) |
M addr,len:bytes | Write memory |
c / s | Continue / single-step |
Z0 addr,kind / z0 | Insert / remove software breakpoint |
? | Halt reason |
vCont? / vCont;c;s | Multi-thread step/continue (basic) |
qSupported | Capability negotiation |
qC | Current thread ID |
qfThreadInfo / qsThreadInfo | Thread list |
qThreadExtraInfo,id | Per-thread display string (name) |
qSymbol:: | Symbol-lookup handshake |
qAttached | "Attached vs. spawned", always 1 |
Hg <id> / Hc <id> | Select thread for g/c operations |
D | Detach |
k | Kill (sets ErrorMode) |

qSupported advertises PacketSize=4000;qXfer:features:read-;swbreak+;qSymbol+.
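
Every packet in the table travels inside the standard RSP frame, $payload#cc, where cc is the modulo-256 sum of the payload bytes written as two hex digits. The sketch below shows that framing only; the helper names are hypothetical (the real codec lives in gdb_codec.cpp), and escaping and run-length encoding are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Modulo-256 sum of the payload bytes (the RSP checksum).
inline std::uint8_t rsp_checksum(std::string_view payload) {
    unsigned sum = 0;
    for (unsigned char c : payload) sum += c;
    return static_cast<std::uint8_t>(sum & 0xffu);
}

// Wrap a payload as "$<payload>#<two lowercase hex digits>".
inline std::string rsp_frame(std::string_view payload) {
    static constexpr char kDigits[] = "0123456789abcdef";
    const std::uint8_t cc = rsp_checksum(payload);
    std::string out = "$";
    out.append(payload);
    out += '#';
    out += kDigits[cc >> 4];
    out += kDigits[cc & 0x0f];
    return out;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
inline int rsp_hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Return the payload of a well-formed frame, or nullopt on bad framing
// or a checksum mismatch.
inline std::optional<std::string> rsp_unframe(std::string_view frame) {
    if (frame.size() < 4 || frame.front() != '$') return std::nullopt;
    const std::size_t hash = frame.size() - 3;
    if (frame[hash] != '#') return std::nullopt;
    const int hi = rsp_hex_digit(frame[hash + 1]);
    const int lo = rsp_hex_digit(frame[hash + 2]);
    if (hi < 0 || lo < 0) return std::nullopt;
    const std::string_view payload = frame.substr(1, hash - 1);
    if (static_cast<std::uint8_t>(hi * 16 + lo) != rsp_checksum(payload)) {
        return std::nullopt;
    }
    return std::string(payload);
}
```

For example, rsp_frame("OK") yields $OK#9a, since 'O' (0x4f) plus 'K' (0x4b) is 0x9a.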

Not yet implemented

  • Z2/Z3/Z4 — hardware watchpoints (read/write/access). SPARC V8 has no real watchpoint hardware; we can emulate via per-load/store hooks on the bus, but it's not wired up.
  • qXfer:features:read:target.xml — explicit register description. GDB falls back to its built-in sparc arch which matches our layout.
  • Non-stop / async mode.
  • Per-thread register context (see Thread-awareness B1 and what it does not do).

RTEMS thread-awareness (B1)

When the guest is RTEMS, the stub reports the executing thread on each core as a real RTEMS Objects_Id (e.g. 0x0a010001) with its 4-byte ASCII name (e.g. INIT) instead of a generic Thread 1. This is the B1 scope: the currently-running thread per core; blocked and ready threads are not enumerated.

Handshake — qSymbol

The stub advertises qSymbol+. GDB opens the handshake with qSymbol:: on attach (and on every file reload):

GDB ──► qSymbol::
stub ◄── qSymbol:5f5065725f4350555f496e666f726d6174696f6e   (hex of "_Per_CPU_Information")
GDB ──► qSymbol:40028440:5f5065725f...                       (the resolved address)
stub ◄── OK                                                  (handshake done)

The address is stored in per_cpu_addr_. If GDB replies with an empty address section (qSymbol::<name_hex>), the stub leaves thread-awareness disabled and falls back to the legacy "one thread per core" model — so bare-metal ELFs without RTEMS just work without changes.
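
Both directions of the handshake are plain hex text, which the following sketch makes concrete. The function names are hypothetical and the stub's own parser may differ; the packet shapes are the ones shown in the exchange above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hex-encode a symbol name the way it appears after "qSymbol:".
inline std::string hex_encode_name(std::string_view s) {
    static constexpr char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(s.size() * 2);
    for (unsigned char c : s) {
        out += kDigits[c >> 4];
        out += kDigits[c & 0x0f];
    }
    return out;
}

// Parse GDB's answer "qSymbol:<addr_hex>:<name_hex>".
// Returns the address, or nullopt when the address section is empty
// (symbol not found) or the packet is malformed.
inline std::optional<std::uint32_t> parse_qsymbol_reply(std::string_view pkt) {
    constexpr std::string_view prefix = "qSymbol:";
    if (pkt.substr(0, prefix.size()) != prefix) return std::nullopt;
    pkt.remove_prefix(prefix.size());
    const auto colon = pkt.find(':');
    if (colon == std::string_view::npos || colon == 0 || colon > 8) {
        return std::nullopt;
    }
    std::uint32_t addr = 0;
    for (char c : pkt.substr(0, colon)) {
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9') digit = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else return std::nullopt;
        addr = (addr << 4) | digit;
    }
    return addr;
}
```

A nullopt result corresponds to the fallback described above: thread-awareness stays off and the stub keeps reporting one thread per core.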

Layout assumptions

The stub knows three offsets into RTEMS 5 internal structs. All three are verified against RCC 1.3.2 / RTEMS 5 / leon3 + leon3_smp BSPs via DWARF inspection of hello-world.elf:

Symbol or field | Value | Source
Per_CPU_Control_envelope size | 128 (1<<7) | PER_CPU_CONTROL_SIZE_LOG2 in score/percpu.h
Per_CPU_Control.executing offset | 32 | DWARF
Thread_Control.Object.id offset | 8 | DWARF (after Chain_Node)
Thread_Control.Object.name offset | 12 | DWARF

They live in src/runtime/include/lince/runtime/gdb_stub.hpp under namespace lince::runtime::rtems_layout. If RCC bumps to a newer RTEMS and these drift, the validation step below catches it.
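
To make the arithmetic concrete, here is a sketch of how those constants turn a resolved _Per_CPU_Information address into the guest addresses the stub has to read. The constant and function names are hypothetical (the real ones are in rtems_layout), and the sketch assumes the symbol is an array of envelopes indexed by core.

```cpp
#include <cassert>
#include <cstdint>

namespace layout_sketch {

// Values copied from the table above.
constexpr std::uint32_t kPerCpuEnvelopeSize = 1u << 7;  // 128 bytes per core
constexpr std::uint32_t kExecutingOffset    = 32;       // Per_CPU_Control.executing
constexpr std::uint32_t kObjectIdOffset     = 8;        // Thread_Control.Object.id
constexpr std::uint32_t kObjectNameOffset   = 12;       // Thread_Control.Object.name

// Address of the "executing" pointer for a given core.
constexpr std::uint32_t executing_ptr_addr(std::uint32_t per_cpu_base,
                                           std::uint32_t core) {
    return per_cpu_base + core * kPerCpuEnvelopeSize + kExecutingOffset;
}

// Addresses of the id and name words inside a Thread_Control block.
constexpr std::uint32_t object_id_addr(std::uint32_t tcb) {
    return tcb + kObjectIdOffset;
}
constexpr std::uint32_t object_name_addr(std::uint32_t tcb) {
    return tcb + kObjectNameOffset;
}

}  // namespace layout_sketch
```

With the address from the handshake example (0x40028440), core 0's executing pointer would be read from 0x40028460 and core 1's from 0x400284e0.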

Validation and fallback

Every thread read goes through try_read_executing_id(core), which gates on three checks before reporting an ID:

  1. executing pointer is non-zero. A zero means RTEMS has not scheduled anything on this core (pre-boot, or no thread bound yet).
  2. The pointer dereferences without BusError. The SystemBus rejects reads outside RAM/peripheral ranges, so a garbage pointer never produces garbage data — it fails the read.
  3. Object.id has a valid API field. Bits [26:24] must be one of 0x01 (Internal/idle), 0x02 (Classic tasks), 0x03 (POSIX threads). Any other value means the offset drifted or we're reading from a non-RTEMS image.

If any check fails, the stub silently falls back to the legacy per-core TID model (m1, m2, ...) and emits no error packet — qfThreadInfo still returns a list, just with synthetic IDs. The fallback is re-evaluated on every qfThreadInfo and every stop reply, so a guest that boots RTEMS mid-run gets thread-aware output as soon as the relevant memory is populated, without reconnecting.
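
The three checks and the fallback can be sketched as follows. This is an illustration under assumptions, not the body of try_read_executing_id: the ReadWord callback is a hypothetical stand-in for a SystemBus read that yields nullopt on BusError, and the function names are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical bus accessor: returns nullopt where the real bus would
// report a BusError.
using ReadWord = std::function<std::optional<std::uint32_t>(std::uint32_t)>;

// Check 3: API field in bits [26:24] must be 1 (Internal), 2 (Classic)
// or 3 (POSIX).
inline bool has_valid_api_field(std::uint32_t object_id) {
    const std::uint32_t api = (object_id >> 24) & 0x7u;
    return api >= 1u && api <= 3u;
}

// Returns the executing thread's Objects_Id, or nullopt if any check fails.
inline std::optional<std::uint32_t> read_executing_id_sketch(
        const ReadWord& read, std::uint32_t per_cpu_base, std::uint32_t core) {
    const auto tcb = read(per_cpu_base + core * 128u + 32u);
    if (!tcb || *tcb == 0u) return std::nullopt;             // check 1
    const auto id = read(*tcb + 8u);
    if (!id) return std::nullopt;                            // check 2
    if (!has_valid_api_field(*id)) return std::nullopt;      // check 3
    return id;
}

// Thread ID to report: the RTEMS id when valid, else the legacy core + 1.
inline std::uint32_t reported_tid_sketch(
        const ReadWord& read, std::uint32_t per_cpu_base, std::uint32_t core) {
    return read_executing_id_sketch(read, per_cpu_base, core).value_or(core + 1u);
}
```

Because nothing is cached, calling reported_tid_sketch again after the guest has booted RTEMS picks up the real ID without a reconnect, which mirrors the re-evaluation behaviour described above.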

How it looks in GDB

(gdb) target remote :1234
0x40005c40 in _CPU_Thread_Idle_body ()
(gdb) info threads
  Id   Target Id                       Frame
* 1    Thread 0x0a010001 (INIT)        0x40005c40 in _CPU_Thread_Idle_body ()
(gdb) # raw RSP, for debugging the stub itself (reply payload shown)
(gdb) maintenance packet qfThreadInfo
m0a010001

Without B1 the same output would be Thread 1 with no name.

Thread-awareness B1 and what it does not do

B1 covers the executing thread per core. Out of scope for now:

  • B2 — enumerate blocked / ready threads. Walking _Objects_Information_table[OBJECTS_CLASSIC_API][OBJECTS_RTEMS_TASKS].local_table[] is straightforward but requires another offset.
  • B3 — context switch via Thread_Control.Registers. Today g on a non-current thread reads the host core's registers, which is wrong for blocked threads but correct for the executing one.
  • B4 — qThreadExtraInfo with priority / state. We return the packed 4-byte ASCII name; we don't yet decode current_state into (Blocked on semaphore #42) and friends.

Stop signals

A stop reply has the shape T<sig>thread:<tid>;. The <sig> field is not the host Linux signal table — it's GDB's internal table from gdb/include/gdb/signals.def. Notably, SIGBUS = 10 (not 7).

Signal | Value | When the stub emits it
SIGINT | 2 | Ctrl-C from GDB while the run loop is active
SIGILL | 4 | error_mode with TT ∈ {0x02 illegal_instruction, 0x03 privileged, 0x04 fp_disabled, 0x24 cp_disabled, 0x25 unimpl_FLUSH}
SIGTRAP | 5 | Software breakpoint, single-step completion, attach, or error_mode from a ta instruction (TT ≥ 0x80)
SIGFPE | 8 | error_mode with TT ∈ {0x08 fp_exception, 0x28 cp_exception, 0x2A division_by_zero}
SIGBUS | 10 | error_mode with TT = 0x07 mem_address_not_aligned
SIGSEGV | 11 | error_mode with TT ∈ {0x01 instruction_access, 0x05/0x06 window over/underflow, 0x09 data_access, 0x20/0x21/0x29/0x2B access_error}; also the default for unmapped TTs

The mapping lives in signal_from_tt(uint32_t) in gdb_stub_transport.cpp. The TT is read from TBR[11:4] after the core enters error_mode; SPARC V8 §7.3 guarantees that the offending trap type stays latched there.
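
Transcribed from the table above, the mapping might look like the sketch below. The _sketch names and the enum are hypothetical so they are not mistaken for the real StopSignal and signal_from_tt; only the TT values, the signal numbers and the TBR[11:4] field come from this page.

```cpp
#include <cassert>
#include <cstdint>

// GDB signal numbers used by the stub (values from the table above).
enum class StopSignalSketch : std::uint8_t {
    Int = 2, Ill = 4, Trap = 5, Fpe = 8, Bus = 10, Segv = 11
};

// Extract the trap type from TBR bits [11:4].
constexpr std::uint32_t tt_from_tbr(std::uint32_t tbr) {
    return (tbr >> 4) & 0xffu;
}

// Map a SPARC V8 trap type to the signal reported in the stop reply.
constexpr StopSignalSketch signal_from_tt_sketch(std::uint32_t tt) {
    if (tt >= 0x80u) return StopSignalSketch::Trap;  // ta instruction
    switch (tt) {
        case 0x02: case 0x03: case 0x04: case 0x24: case 0x25:
            return StopSignalSketch::Ill;
        case 0x08: case 0x28: case 0x2a:
            return StopSignalSketch::Fpe;
        case 0x07:
            return StopSignalSketch::Bus;
        default:  // access errors, window traps, and any unmapped TT
            return StopSignalSketch::Segv;
    }
}
```

For instance, a TBR of 0x40000070 carries TT 0x07, which maps to SIGBUS (10) and therefore to a T0a stop reply.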

Thread ID in stop replies

When thread-awareness is active (qSymbol handshake completed and the core's executing thread reads validly), stop replies carry the RTEMS Objects_Id, not the synthetic core+1:

attach (proactive, pre-handshake)   ──► T05thread:1;
after handshake, then '?'           ──► T05thread:0a010001;
breakpoint hit mid-run              ──► T05thread:0a010001;
error_mode from alignment fault     ──► T0athread:0a010001;

The first reply uses the legacy TID because the qSymbol handshake happens after ? in GDB's standard sequence; that one packet is expected to be ignored by GDB once it resyncs.
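
When testing the stub from a scripted client, it helps to split a stop reply back into its signal and thread ID. The parser below is a hypothetical client-side sketch that handles only the T<sig>thread:<tid>; shape shown above, not the full RSP stop-reply grammar.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

struct StopReplySketch {
    std::uint32_t signal;     // GDB signal number, e.g. 5 or 10
    std::uint32_t thread_id;  // legacy TID or RTEMS Objects_Id
};

// Parse 1 to 8 hex digits into a 32-bit value.
inline std::optional<std::uint32_t> parse_hex32(std::string_view s) {
    if (s.empty() || s.size() > 8) return std::nullopt;
    std::uint32_t v = 0;
    for (char c : s) {
        std::uint32_t d = 0;
        if (c >= '0' && c <= '9') d = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') d = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') d = static_cast<std::uint32_t>(c - 'A' + 10);
        else return std::nullopt;
        v = (v << 4) | d;
    }
    return v;
}

// Parse "T<2 hex digits>thread:<hex tid>;" (payload only, no $...#cc frame).
inline std::optional<StopReplySketch> parse_stop_reply(std::string_view p) {
    constexpr std::string_view key = "thread:";
    if (p.size() < 3 + key.size() + 2 || p[0] != 'T') return std::nullopt;
    const auto sig = parse_hex32(p.substr(1, 2));
    if (!sig || p.substr(3, key.size()) != key) return std::nullopt;
    const std::string_view rest = p.substr(3 + key.size());
    const auto semi = rest.find(';');
    if (semi == std::string_view::npos) return std::nullopt;
    const auto tid = parse_hex32(rest.substr(0, semi));
    if (!tid) return std::nullopt;
    return StopReplySketch{*sig, *tid};
}
```

Applied to the four replies listed above, this gives signal 5 with thread 1 for the pre-handshake attach, signal 5 with thread 0x0a010001 for the next two, and signal 10 with thread 0x0a010001 for the alignment fault.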

Stop-on-ErrorMode

When any core enters SPARC error_mode (a trap delivered while PSR.ET == 0) and a GDB client is attached, the run loop does not return silently with HaltReason::ErrorMode. Instead it:

  1. Reads the offending TT from TBR and maps it through signal_from_tt.
  2. Calls GdbStub::report_error_mode(core) which arms a stop reply with the chosen signal.
  3. Returns HaltReason::Breakpoint so the CLI hands control to process_until_resume().
  4. GDB receives a stop reply like T0athread:0a010001; and the user sees:
Program received signal SIGBUS, Bus error.
0x40005c40 in some_unaligned_load ()
(gdb) bt

The "already reported" state is latched per client, so issuing continue from GDB after the crash returns HaltReason::ErrorMode (no Breakpoint), the resume loop in lince-emu/main.cpp exits, and the post-mortem dump is printed to stderr.

When no client is attached, the legacy behaviour applies: the run loop returns HaltReason::ErrorMode straight to the caller (the CLI prints the post-mortem and exits with status 5).
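
The report-once behaviour amounts to a small latch. The sketch below models only that decision; HaltSketch and ErrorModeLatch are hypothetical names, and the real run loop also arms the stop reply through report_error_mode.

```cpp
#include <cassert>

// Hypothetical stand-in for the two HaltReason values involved here.
enum class HaltSketch { Breakpoint, ErrorMode };

// Models the per-client "already reported" latch described above.
struct ErrorModeLatch {
    bool reported = false;

    // Called when a core is found in error_mode.
    HaltSketch on_error_mode(bool client_attached) {
        if (client_attached && !reported) {
            reported = true;                 // GDB gets one stop reply
            return HaltSketch::Breakpoint;   // CLI talks to GDB
        }
        return HaltSketch::ErrorMode;        // post-mortem path
    }

    // Called when a new client attaches: the latch is per client.
    void on_new_client() { reported = false; }
};
```

The first call with a client attached yields Breakpoint so the user can inspect the crash; a continue from GDB leads to a second call, which yields ErrorMode and lets the caller print the post-mortem.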

Example session

$ lince-emu --image sptest.rom --gdb-port 1234 &
[1] 12345
[INFO] [emulator] Initialized with 1 core(s), 16777216 bytes RAM at 0x40000000
[INFO] [gdb_stub] listening on 127.0.0.1:1234

$ sparc-rtems5-gdb sptest.exe
GNU gdb (...) 12.1
Reading symbols from sptest.exe...
(gdb) target remote :1234
Remote debugging using :1234
0x40005c40 in _CPU_Thread_Idle_body ()
(gdb) info threads
  Id   Target Id                Frame
* 1    Thread 0x0a010001 (UI1 ) 0x40005c40 in _CPU_Thread_Idle_body ()
(gdb) break Init
Breakpoint 1 at 0x40002340: file init.c, line 28.
(gdb) c
Continuing.

Breakpoint 1, Init (argument=0) at init.c:28
28      directive_failed( status, "rtems_initialize_executive" );
(gdb) bt
#0  Init (argument=0) at init.c:28
#1  0x40004b18 in _Thread_Handler ()
#2  0x00000000 in ?? ()
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x40003088 in rtems_shutdown_executive (result=0) at ...

CLI integration in lince-emu

The standalone front-end wraps the resume loop:

auto result = emu.run_for(SimTimeNs{budget});
while (auto* stub = emu.gdb_stub()) {
    if (result.reason != HaltReason::Breakpoint
        || !stub->client_connected()) {
        break;
    }
    const auto action = stub->process_until_resume();
    if (action == ResumeAction::Detach || action == ResumeAction::Kill) {
        break;
    }
    const auto remaining = /* deduct elapsed from budget */;
    result = emu.run_for(SimTimeNs{remaining});
}

In short: run_for returns whenever the stub wants the CLI to talk to GDB (breakpoint, single-step, ErrorMode-with-stub, late-binding accept, Ctrl-C); process_until_resume() blocks on RSP until GDB says continue / step / detach / kill.

Programmatic embedding

If you embed Emulator as a library, the same flow is available without the CLI wrapper. EmulatorConfig::gdb_stub_port controls the listener; emu.gdb_stub() returns the GdbStub* (or nullptr if disabled).

EmulatorConfig cfg = gr712rc_config();
cfg.gdb_stub_port = 1234;
cfg.gdb_stub_wait_for_client = true;   // block initialize() until attached

auto emu = *Emulator::create(cfg);
emu->initialize();   // blocks here on accept()
emu->load_elf("hello.exe");

auto r = emu->run_for(SimTimeNs{1'000'000'000ULL});
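// Keep resuming while the stub wants to talk to GDB. Only detach and kill
// end the session, matching the CLI loop above.
while (r.reason == HaltReason::Breakpoint && emu->gdb_stub()->client_connected()) {
    const auto action = emu->gdb_stub()->process_until_resume();
    if (action == ResumeAction::Detach || action == ResumeAction::Kill) {
        break;
    }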
    r = emu->run_for(SimTimeNs{1'000'000'000ULL});
}

Limitations and caveats

  • Single-client. RSP is single-client by design; a second connection is accepted-and-closed to drain the kernel backlog.
  • Linux only. Uses POSIX sockets directly (socket/bind/listen/accept/recv). Porting to macOS would be trivial; porting to Windows requires WinSock.
  • No qXfer:features. GDB uses its built-in SPARC arch description. Works for vanilla SPARC V8 / LEON3 — if you need %asr17 and friends in info registers, those are not yet exposed beyond what's in the standard g-packet.
  • No floating-point in g/G yet. The 32 FP register slots are zero-padded. Adequate for bt; insufficient for inspecting FP state mid-fault.
  • RTEMS layout assumption. B1 is verified against RCC 1.3.2; if you update RCC, run ./build/tests/lince_tests "[gdb_stub][rtems-aware]" and watch for the [!mayfail] live-guest test. A skip is fine; a fail means the offsets drifted and you need to refresh rtems_layout::* against new DWARF.

Where to look in the code

File | Responsibility
src/runtime/include/lince/runtime/gdb_stub.hpp | Public API of GdbStub, codec helpers, StopSignal, signal_from_tt, RTEMS layout constants
src/runtime/src/gdb_codec.cpp | Packet framing ($payload#cc), hex parsing, run-length expansion
src/runtime/src/gdb_stub.cpp | Packet dispatch, command handlers (g/m/Z/q.*/H/?/c/s/D/k)
src/runtime/src/gdb_stub_transport.cpp | TCP listener (start_listening, wait_for_client, poll_accept), packet I/O, RTEMS reads (try_read_executing_id, try_read_thread_name), report_error_mode, send_stop_reply
src/runtime/src/emulator.cpp | run_until_unpaced hooks: poll_accept for late-binding, report_error_mode for stop-on-ErrorMode
tests/unit/test_gdb_stub_codec.cpp | Codec unit tests + signal_from_tt table coverage
tests/integration/test_gdb_stub_protocol.cpp | End-to-end RSP tests (handshake, attach modes, RTEMS-aware against live guest)
tests/integration/test_gdb_stub_rtems.cpp | Optional sparc-rtems5-gdb round-trip (skipped if GDB is missing)