Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Diagnostics and Observability

lean-rs and lean-rs-host project errors to the same stable [lean_rs::LeanDiagnosticCode] taxonomy and emit structured tracing spans against the lean_rs target. Long-running host session operations can also report live progress through lean_rs_host::LeanProgressSink; see architecture/13-structured-progress.md. lean-rs-worker uses its own LeanWorkerError surface for process-boundary outcomes such as child exit, request timeout, cancellation, data-sink panic, diagnostic-sink panic, and typed command decode failures. A downstream caller gets:

  • a stable identifier (err.code()) to react by family, independent of internal stage tags that may grow new variants;
  • visibility into where time is spent and where failures originate, without rebuilding the crates.

The taxonomy is unified across lean-rs and lean-rs-host (every recoverable failure on either side maps to a stable code); the span catalogue is split by emitting crate so the layer boundary is visible at the log line. Lean internal panics are outside this error taxonomy: there is no SessionPoisoned code. A Lean runtime panic during a LeanSession call may terminate the process; see architecture/06-panic-containment.md. When the same failure happens inside lean-rs-worker, the parent observes a typed worker exit instead of an in-process LeanDiagnosticCode.

Diagnostic codes

Codeas_str()MeaningCommon fix
RuntimeInitlean_rs.runtime_initLean runtime init failed (panic in lean_initialize_*, task-manager failure, thread-attach floor).Pin one lean-toolchain across the workspace. Restart the process if an earlier crash left thread-local state behind.
Linkinglean_rs.linkingA linkable artefact was missing or mismatched: invalid Lake package/module identifier, missing initializer symbol, header-digest mismatch.Verify the lib_name matches what lake build emitted (inspect .lake/build/lib/). Rebuild the capability against the same lean-toolchain as the host process.
ModuleInitlean_rs.module_initA capability dylib could not be opened, parsed, or its root initializer raised. Includes Lake project root not existing.For shipped crates, rebuild through CargoLeanCapability and inspect LeanCapabilityPreflight. For LeanHost::from_lake_project, verify the Lake root contains lakefile.lean / lakefile.toml and re-run lake build. Only low-level LeanLibrary callers should repair loader search paths manually.
SymbolLookuplean_rs.symbol_lookupA function or global symbol was not present in the loaded dylib (dlsym miss or arity mismatch).Add the Lean module exporting the symbol to the capability’s lean_lib. Check @[export] on the Lean side. Check that the Rust call site’s expected arity matches the export.
AbiConversionlean_rs.abi_conversionWrong Lean kind for the requested Rust type, integer out of range, invalid UTF-8, or a queried declaration was missing from the environment.For query_declaration: compare against session.list_declarations(None). For numeric overflow: widen the Rust target type or split the call. For String/ByteArray: check the Lean encoder.
LeanExceptionlean_rs.lean_exceptionLean raised through its IO error channel. Inspect LeanException::kind() for the IO.Error constructor.Branch on exc.kind(). For noFileOrDirectory / permissionDenied, adjust the path or working directory of the Lean code.
Elaborationlean_rs.elaborationTerm parsing or elaboration produced diagnostics. Payload is LeanElabFailure with typed diagnostics.Walk failure.diagnostics(): each carries a severity, bounded message, optional position, and file_label. If failure.truncated() == true, raise LeanElabOptions::diagnostic_byte_limit and rerun.
Unsupportedlean_rs.unsupportedThe host shim returned unsupported for the requested service, or an optional meta-service symbol was absent at load.Rebuild the bundled host shims and the consumer capability with the same Lean toolchain. The fixture in this repo exercises all four meta services (infer_type, whnf, heartbeat_burn, is_def_eq).
Cancelledlean_rs.cancelledA lean-rs-host cooperative cancellation token was observed before a host-controlled FFI dispatch.Treat the operation as aborted and discard partial work. Create a fresh token before retrying.
Internallean_rs.internalA pub(crate) invariant tripped, or a callback panicked inside the safe boundary.File a bug; include the bounded message and the as_str() id.

The enum is #[non_exhaustive]; new variants may be added. Variant names and as_str() ids are stable across patch releases.

Loader preflight

Manifest-backed shipped capabilities can be checked before LeanCapability opens any dylib:

#![allow(unused)]
fn main() {
let report = lean_rs::LeanCapabilityPreflight::new(
    lean_rs::LeanBuiltCapability::manifest_path(env!("LEAN_RS_CAPABILITY_MY_CAPABILITY_MANIFEST")),
)
.check();

if let Some(check) = report.first_error() {
    eprintln!("{}: {}", check.code(), check.repair_hint());
}
}

LeanCapability::from_build_manifest runs the same preflight internally when that produces a clearer failure. The broad LeanError still reports LeanDiagnosticCode::ModuleInit, but the bounded message includes the stable loader code and repair hint.

Loader codeas_str()MeaningCommon fix
MissingManifestlean_rs.loader.missing_manifestThe manifest path was absent, unreadable, or pointed at a missing file.Rebuild through CargoLeanCapability and package the emitted manifest.
MalformedManifestlean_rs.loader.malformed_manifestThe manifest was not valid JSON or missed required fields.Rebuild the Lean capability with the same lean-rs toolchain helper.
UnsupportedManifestSchemalean_rs.loader.unsupported_manifest_schemaThe manifest schema is not understood by this runtime crate.Rebuild with matching lean-toolchain and lean-rs versions.
MissingPrimaryDyliblean_rs.loader.missing_primary_dylibThe primary capability dylib named by the manifest is missing.Rebuild and package the built shared library.
MissingTransitiveDependencylean_rs.loader.missing_transitive_dependencyA dependency dylib named by the manifest is missing.Package every manifest dependency, including generic interop shims when used.
UnsupportedArchitecturelean_rs.loader.unsupported_architectureA dylib is not a native object for this target architecture.Rebuild the Lean capability for the current Cargo target.
UnsupportedToolchainFingerprintlean_rs.loader.unsupported_toolchain_fingerprintThe manifest was built with a mismatched or unsupported Lean toolchain fingerprint.Rebuild the Lean capability with the same Lean toolchain as the Rust binary.
StaleManifestlean_rs.loader.stale_manifestThe manifest appears older than the primary dylib it describes.Rebuild through CargoLeanCapability so the manifest and dylib match.
MissingInitializerlean_rs.loader.missing_initializerThe primary dylib does not export the package/module initializer named by the manifest.Check package/module names and rebuild the shared target.
MissingImportedSymbollean_rs.loader.missing_imported_symbolA Lean/imported symbol is not supplied by the manifest dependency set.Rebuild through CargoLeanCapability so dependency dylibs are recorded.

Normal users should not repair shipped apps by setting LD_LIBRARY_PATH or DYLD_*. A missing search path in the canonical flow is a packaging or artifact-manifest problem, not a caller-managed loader-flag problem.

Worker bootstrap checks

Packaged worker applications can check deployment state before running a real command:

#![allow(unused)]
fn main() {
let report = builder.check();
if let Some(check) = report.first_error() {
    eprintln!("{}: {}", check.code(), check.repair_hint());
}
}

LeanWorkerCapabilityBuilder::check() validates the app-owned worker child, manifest-backed capability artifact, protocol handshake, import session, and optional metadata expectation. It returns LeanWorkerBootstrapReport, not raw child pids, pipe handles, protocol frames, loader flags, or std::process configuration.

Worker bootstrap codeas_str()MeaningCommon fix
WorkerChildUnresolvedlean_rs.worker.bootstrap.child_unresolvedThe configured child locator did not resolve to a file.Ship an app-owned worker child binary beside the application or set the documented worker-child override.
WorkerChildNotExecutablelean_rs.worker.bootstrap.child_not_executableThe child path is missing or is not executable on this platform.Build the worker child binary and package it with executable permissions.
CapabilityPreflight { code: ... }lean_rs.loader.*Manifest-backed loader preflight failed.Follow the loader-code repair hint: rebuild through CargoLeanCapability, package missing artifacts, or use a compatible Lean toolchain.
WorkerHandshakeFailedlean_rs.worker.bootstrap.handshake_failedThe child started but did not speak the worker protocol.Ensure the binary calls lean_rs_worker::run_worker_child_stdio() and matches the parent crate version.
CapabilityMetadataMismatchlean_rs.worker.bootstrap.metadata_mismatchThe capability metadata export did not match the caller’s expectation.Select or rebuild the intended capability package.
WorkerStartupFailedlean_rs.worker.bootstrap.startup_failedStartup failed outside a more specific bootstrap class.Run the bootstrap check in the deployment environment and inspect the bounded message.

Internal covers Rust callback panics caught before they unwind across C or Lean. It does not mean a Lean kernel/runtime panic was contained. Those failures require a worker-process boundary.

L1 callback shims also return LeanCallbackStatus for callback-local outcomes. Ok = 0 continues the Lean-side callback loop; StaleHandle = 1 means Lean called a dropped handle; Panic = 2 means Rust contained a callback panic and recorded LeanError::Host on the live handle; WrongPayload = 3 means Lean used a handle with the wrong payload trampoline; and Stopped = 4 means the Rust callback asked Lean to stop cleanly.

lean-toolchain uses a separate LinkDiagnostics enum for build-script work. Those diagnostics cover Lean discovery, unsupported toolchains, missing lake, missing Lake targets, Lake build failures, and unresolved Lake outputs. They render as one line so a build.rs can either return them or print them through cargo:warning=.

Cancelled is cooperative. It is returned only when lean-rs-host regains control and checks the token; it does not pre-empt an in-flight Lean call. See architecture/07-cooperative-cancellation.md.

Progress sink panics are caught at the Rust callback boundary and surfaced as LeanError::Host with stage CallbackPanic, code lean_rs.internal.

Worker process diagnostics and pool snapshots

lean-rs-worker deliberately uses a separate worker error surface for process boundary failures. Child panic/abort, request timeout, cancellation-triggered cycle, stale lease use, row sink panic, diagnostic sink panic, and typed command decode failure are worker outcomes, not LeanDiagnosticCode values from the in-process host stack.

For large local runs, use LeanWorkerPoolSnapshot and LeanWorkerSessionLease::snapshot() for operational state. Snapshots summarize queue depth, active workers, warm leases, restart reasons, best-effort child RSS, stream outcomes, delivered row counts, payload bytes, elapsed stream time, and backpressure counters. They intentionally do not expose child pids, worker ids, pipe handles, protocol frames, or scheduling internals.

Matching on codes

LeanError, LeanElabFailure, and LeanMetaResponse all project to the same taxonomy via .code():

#![allow(unused)]
fn main() {
use lean_rs::{LeanDiagnosticCode, LeanError};

fn report(err: &LeanError) {
    match err.code() {
        LeanDiagnosticCode::Linking => eprintln!("rebuild the capability: {err}"),
        LeanDiagnosticCode::ModuleInit => eprintln!("check `lake build` produced the dylib: {err}"),
        LeanDiagnosticCode::LeanException => {
            if let LeanError::LeanException(exc) = err {
                eprintln!("Lean raised {:?}: {}", exc.kind(), exc.message());
            }
        }
        LeanDiagnosticCode::Cancelled => eprintln!("caller cancelled the operation"),
        other => eprintln!("unhandled {other}: {err}"),
    }
}
}

LeanMetaResponse::code() returns an Option: None on Ok, Some(Unsupported) when the capability lacked the requested service, and Some(Elaboration) on the other two failure shapes (which carry a LeanElabFailure). LeanElabFailure::code() always returns Elaboration.

Tracing

lean-rs and lean-rs-host declare spans against the lean_rs target. Neither installs a subscriber—pick one downstream, or use DiagnosticCapture for tests.

Recommended RUST_LOG scopes:

WorkloadRUST_LOG
Production defaultlean_rs=info,lean_toolchain=info
Dev iterationlean_rs=debug,lean_toolchain=debug
Full dispatch tracelean_rs=trace,lean_toolchain=trace

info covers init, library open, and session import (once-per-cycle events). debug adds per-session-method dispatch (query_declaration, elaborate, kernel_check, bulk methods, pool acquire/release). trace adds per-dispatch (LeanExported::call) and per-decoder events.

A typical tracing-subscriber wire-up:

#![allow(unused)]
fn main() {
tracing_subscriber::fmt()
    .with_env_filter(
        tracing_subscriber::EnvFilter::try_from_default_env()
            .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("lean_rs=info")),
    )
    .init();
}

Spans emitted by lean-rs (L1)

FFI-primitive spans: init, library open, module initializer, typed-export dispatch, ABI decode. Fire whether the caller is lean-rs-host or any other downstream of lean-rs.

SpanLevelFields
lean_rs.runtime.initinfo
lean_rs.module.library.opendebugpath (shortened)
lean_rs.module.library.initializedebugpackage, module
lean_rs.module.initializer.calldebuginitializer
lean_rs.module.exported.calltracearity
lean_rs.abi.decode (event)traceshape, len (bytes for strings; element count otherwise)

Spans emitted by lean-rs-host (L2)

Host-stack session and pool spans. Fire only if the caller opted into the L2 stack and is driving a session.

SpanLevelFields
lean_rs.host.session.importinfoimports_len (count of &str imports)
lean_rs.host.session.query_declarationdebugname
lean_rs.host.session.list_declarationsdebug
lean_rs.host.session.list_declarations_filtereddebuginclude_private, include_generated, include_internal
lean_rs.host.session.declaration_source_rangedebugname
lean_rs.host.session.declaration_type / _kind / _namedebugname
lean_rs.host.session.declaration_type_bulk / _kind_bulk / _name_bulkdebugbatch_size
lean_rs.host.session.elaboratedebugsource_len (chars), heartbeats, diagnostic_byte_limit
lean_rs.host.session.kernel_checkdebugsource_len, heartbeats, diagnostic_byte_limit
lean_rs.host.session.check_evidencedebug
lean_rs.host.session.summarize_evidencedebug
lean_rs.host.session.run_metadebugservice (name), heartbeats, diagnostic_byte_limit
lean_rs.host.session.call_capabilitydebugsymbol, arity
lean_rs.host.session.query_declarations_bulkdebugbatch_size
lean_rs.host.session.elaborate_bulkdebugbatch_size, heartbeats, diagnostic_byte_limit
lean_rs.host.pool.acquiredebugimports_len
lean_rs.host.pool.acquire.result (event)debughit (bool)
lean_rs.host.pool.release (event)tracekept (bool)

Capturing diagnostics in tests

DiagnosticCapture is an in-process buffer for spans and events emitted against the lean_rs target. Thread-local and bounded; the guard restores the previous subscriber on Drop.

#![allow(unused)]
fn main() {
use lean_rs::{DiagnosticCapture, LeanDiagnosticCode, LeanRuntime};
use lean_rs_host::LeanHost;

#[test]
fn rebuild_advice_fires_on_missing_dylib() {
    let capture = DiagnosticCapture::install();

    let runtime = LeanRuntime::init().unwrap();
    let host = LeanHost::from_lake_project(runtime, "/path/to/lake").unwrap();
    let err = host
        .load_capabilities("my_pkg", "DefinitelyMissingLib")
        .expect_err("missing dylib must fail");

    assert_eq!(err.code(), LeanDiagnosticCode::ModuleInit);

    let events = capture.events();
    assert!(events.iter().any(|e| e.span.as_deref()
        == Some("lean_rs.module.library.open")));
}
}

The guard is !Send + !Sync: it pins to the installing thread. Default capacity is 256 events; over-capacity drops the oldest entry and bumps capture.overflowed(). Use DiagnosticCapture::with_capacity(N) for larger budgets. Scope is just lean_rs; events from other crates pass through untouched.

Redaction and bounding

Two values can grow without bound: Lean-authored text (capability messages, diagnostic messages) and filesystem paths. Both are bounded at the construction site:

  • Lean-authored strings pass through bound_message (the same 4 KiB cap that protects LeanError::message). Enforced on the UTF-8 char boundary, so multibyte sequences are never split.
  • Filesystem paths emitted as span fields are shortened to the basename plus up to two parent components (build/lib/lib.dylib). The full absolute path is only emitted at trace level by call sites that explicitly need it.

Paths are not treated as secrets; shortening is a bounding decision, not a redaction decision. If a downstream policy requires full path suppression, install a tracing-subscriber filter that drops the relevant fields.

Cross-references