# oxgraph + oxcode

> oxgraph is a storage-agnostic, zero-copy graph and hypergraph topology substrate for Rust. oxcode is code intelligence for agents, built on top of it.

## Benchmarks

These numbers track the 0.3.0 engine overhaul: O(change) identity-reconcile writes, an intrinsic reverse-adjacency index, and zero-copy / non-O(base) index open. They are measured **end to end through the downstream consumer [oxcode](/oxcode/getting-started)** (tree-sitter code indexing on top of `oxgraph-db`), the realistic workload the engine is designed for.

### Method

* **baseline**: oxcode at its pre-overhaul commit on published **oxgraph 0.2.4** (`apply_delta` wholesale rewrite + O(total-incidences) tombstone).
* **current**: oxcode on **oxgraph 0.3.0** (identity reconcile + zero-copy open).
* Identical source corpora; release builds; sequential runs on an otherwise-idle 16-core machine. Incremental reindex = append one function to a source file and re-index, verified by confirming the new symbol is queryable afterward.
* Measured 2026-06-03.
### storage-hub

328 files · 40,749 elements · 527,999 relations · 1,055,998 incidences

| metric | 0.2.4 baseline | 0.3.0 | change |
| --- | ---: | ---: | ---: |
| incremental reindex (1-file edit) | **> 150 s** (\~62 min, O(n²)) | **4,842 ms** | **≈ 770× faster** |
| symbol query (p50) | 3,902 ms | 988 ms | 3.9× faster |
| cold index | 15,165 ms | 11,968 ms | 1.3× faster |
| reindex, no change | 1,349 ms | 834 ms | 1.6× faster |
| WAL written per reindex | 953 MB | 5.2 MB | O(change) |
| database on disk | 805 MB | 1.18 GB | 1.5× larger\* |

### harnessing

76 files · 11,091 elements · 45,901 relations

| metric | 0.2.4 baseline | 0.3.0 | change |
| --- | ---: | ---: | ---: |
| incremental reindex (1-file edit) | **27,648 ms** | **444 ms** | **62× faster** |
| symbol query (p50) | 457 ms | 154 ms | 3.0× faster |
| cold index | 1,797 ms | 1,313 ms | 1.4× faster |

\* The larger on-disk size is the deliberate tradeoff for O(1)-style index open: 0.3.0 persists the derived index (equality / label / adjacency postings) as zero-copy sections that are borrowed at open instead of rebuilt in RAM.

### What changed

* **Incremental reindex** went from O(n²) to O(change). The 0.2.4 `tombstone_*` primitives were O(total incidences) per call (no reverse adjacency), so a bulk edge replacement was O(n²) — \~62 min on a 528 K-relation graph. 0.3.0 adds an intrinsic reverse-adjacency index (cascade is O(log n + degree)) and identity-keyed reconcile verbs (`upsert_element` / `upsert_relation` / `retain`) whose unchanged subjects emit zero mutations.
* **Query latency** dropped 3–4× from zero-copy index open: the base index is persisted at freeze and borrowed from the memory map at open, rather than decoded and rebuilt from records on every command.
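The identity-keyed reconcile idea described above can be sketched in a few lines. This is an illustrative stand-in built on a plain `BTreeMap`, not the `oxgraph-db` API: the `Mutation` enum, the `reconcile` function, and the `fn:a`-style keys are all hypothetical names chosen for the example. It shows why a one-function edit writes O(change) mutations: entries whose identity and payload are unchanged are skipped entirely.

```rust
use std::collections::BTreeMap;

/// One mutation a reconcile pass would append to a write log (hypothetical type).
#[derive(Debug, PartialEq)]
enum Mutation {
    Insert(String),
    Update(String),
    Remove(String),
}

/// Reconcile `desired` into `stored`, both keyed by a stable identity.
/// Returns only the mutations that were needed; unchanged entries emit nothing.
fn reconcile(
    stored: &mut BTreeMap<String, String>,
    desired: &BTreeMap<String, String>,
) -> Vec<Mutation> {
    let mut log = Vec::new();

    // Upsert step: write only when the identity is new or its payload differs.
    for (id, payload) in desired {
        if stored.get(id) == Some(payload) {
            continue; // unchanged subject: zero mutations
        }
        let existed = stored.insert(id.clone(), payload.clone()).is_some();
        log.push(if existed {
            Mutation::Update(id.clone())
        } else {
            Mutation::Insert(id.clone())
        });
    }

    // Retain step: drop identities that are no longer desired.
    let stale: Vec<String> = stored
        .keys()
        .filter(|id| !desired.contains_key(*id))
        .cloned()
        .collect();
    for id in stale {
        stored.remove(&id);
        log.push(Mutation::Remove(id));
    }
    log
}

fn main() {
    let mut stored = BTreeMap::from([
        ("fn:a".to_string(), "body-a".to_string()),
        ("fn:b".to_string(), "body-b".to_string()),
    ]);
    // The edit appends one function and leaves the others untouched.
    let mut desired = stored.clone();
    desired.insert("fn:c".to_string(), "body-c".to_string());

    let log = reconcile(&mut stored, &desired);
    println!("{log:?}"); // prints [Insert("fn:c")]
}
```

The log length tracks the size of the edit rather than the size of the stored map, which is the property the WAL row in the table reflects.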
Verification: `just ci` (fmt, taplo, clippy, deny, workspace tests) and `just verify` (miri on the zero-copy borrow path; `cargo kani` algebraic proofs — 68 verified, 0 failures), plus a freeze→open differential proptest against the owned-index oracle.

## Concepts

### Topology, kept separate from meaning and storage

oxgraph models graph and hypergraph *topology* — the question of what connects to what — and keeps it separate from what the data means and where it is stored.

* **The foundation** defines connectivity and refuses to interpret it.
* **Layouts** decide the byte representation.
* **Properties and domain meaning** live in layers above.

Algorithms bind to capability traits, not to a concrete container, so the same BFS runs over an in-memory layout, a memory-mapped file, or rows read from a database.

### Why it exists

Graph-shaped data shows up everywhere: compilers, databases, knowledge graphs, provenance, build systems. Most of those systems rebuild large topology into a heap-owned graph before they can traverse it, and most reinvent node and edge IDs, adjacency, serialization, and validation while doing it.

oxgraph splits those concerns apart so one narrow goal drives the design: open a large graph from a snapshot, memory-map it, and start traversing without rebuilding it in memory.

### The pipeline: bytes → validate → view → traverse

A snapshot is a topology-agnostic byte container. You validate it once, then borrow its sections as typed slices and point a layout view at them. From there you traverse directly against the bytes — there is no owned graph rebuilt in RAM.

### Verification

The substrate is `unsafe`-free (`unsafe_code = "forbid"`). Correctness is defended by `proptest` strategies, `miri` on the zero-copy borrow path, and `cargo kani` algebraic proofs (symmetry, totality, roundtrip, merge laws). The 0.3.0 release verifies 68 kani proofs with 0 failures, plus a freeze→open differential proptest against an owned-index oracle.
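To make the validate → view → traverse steps and the capability-trait idea concrete, here is a minimal, self-contained sketch. It does not use the oxgraph crates: the `Neighbors` trait, `CsrView`, and `bfs_order` are hypothetical names, and the example starts from already-typed `u32` slices, so the step that borrows typed sections out of raw snapshot bytes is left out.

```rust
use std::collections::VecDeque;

/// Hypothetical capability trait: the algorithm sees only this, not the container.
trait Neighbors {
    fn node_count(&self) -> usize;
    fn neighbors(&self, node: usize) -> &[u32];
}

/// Borrowed CSR view: `offsets[i]..offsets[i + 1]` is node i's range in `targets`.
struct CsrView<'a> {
    offsets: &'a [u32],
    targets: &'a [u32],
}

impl<'a> CsrView<'a> {
    /// Validate the layout once; traversal afterwards only indexes the slices.
    fn validate(offsets: &'a [u32], targets: &'a [u32]) -> Result<Self, &'static str> {
        let (first, last) = match (offsets.first(), offsets.last()) {
            (Some(f), Some(l)) => (*f, *l),
            _ => return Err("offsets must hold at least one entry"),
        };
        if first != 0 || last as usize != targets.len() {
            return Err("offsets must start at 0 and end at targets.len()");
        }
        if offsets.windows(2).any(|w| w[0] > w[1]) {
            return Err("offsets must be non-decreasing");
        }
        let nodes = offsets.len() - 1;
        if targets.iter().any(|&t| t as usize >= nodes) {
            return Err("target out of range");
        }
        Ok(Self { offsets, targets })
    }
}

impl Neighbors for CsrView<'_> {
    fn node_count(&self) -> usize {
        self.offsets.len() - 1
    }
    fn neighbors(&self, node: usize) -> &[u32] {
        let (start, end) = (self.offsets[node] as usize, self.offsets[node + 1] as usize);
        &self.targets[start..end]
    }
}

/// BFS written against the trait, so any layout implementing it can be traversed.
fn bfs_order<G: Neighbors>(graph: &G, start: usize) -> Vec<usize> {
    if start >= graph.node_count() {
        return Vec::new();
    }
    let mut seen = vec![false; graph.node_count()];
    let mut queue = VecDeque::from([start]);
    let mut order = Vec::new();
    seen[start] = true;
    while let Some(node) = queue.pop_front() {
        order.push(node);
        for &next in graph.neighbors(node) {
            let next = next as usize;
            if !seen[next] {
                seen[next] = true;
                queue.push_back(next);
            }
        }
    }
    order
}

fn main() {
    // Edges: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
    let offsets = [0u32, 2, 3, 4, 4];
    let targets = [1u32, 2, 3, 3];
    let view = CsrView::validate(&offsets, &targets).expect("valid layout");
    println!("{:?}", bfs_order(&view, 0)); // prints [0, 1, 2, 3]
}
```

The view holds only two borrowed slices, so the same code works whether those slices come from a stack array, as here, or from a memory-mapped region.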
## The crate family

oxgraph is a family of small crates. Most users depend on the umbrella crate [`oxgraph`](https://docs.rs/oxgraph) and enable features; you can also depend on the individual crates directly.

### Foundation (`no_std`)

| Crate | Gives you |
| --- | --- |
| [`oxgraph-topology`](https://docs.rs/oxgraph-topology) | Capability traits for discrete topology: elements, relations, incidences, roles, identity. The shared vocabulary everything else builds on. |
| [`oxgraph-graph`](https://docs.rs/oxgraph-graph) | Binary graph vocabulary over topology: nodes, edges, directed traversal, neighbors. |
| [`oxgraph-hyper`](https://docs.rs/oxgraph-hyper) | Hypergraph vocabulary over topology: vertices, hyperedges, participants. |
| [`oxgraph-layout-util`](https://docs.rs/oxgraph-layout-util) | Shared index and word types plus offset-integrity checks that concrete layouts reuse. |
| [`oxgraph-snapshot`](https://docs.rs/oxgraph-snapshot) | Topology-agnostic byte container. Validate a snapshot and borrow its sections as typed slices. |

### Borrowed layouts

| Crate | Gives you |
| --- | --- |
| [`oxgraph-csr`](https://docs.rs/oxgraph-csr) | Compressed-sparse-row graph views over offset and target slices, native or from a snapshot. |
| [`oxgraph-csc`](https://docs.rs/oxgraph-csc) | Inbound (compressed-sparse-column) graph views for reverse traversal. |
| [`oxgraph-hyper-bcsr`](https://docs.rs/oxgraph-hyper-bcsr) | Directed bipartite-CSR hypergraph views, dense in both directions. |

### Algorithms and properties

| Crate | Gives you |
| --- | --- |
| [`oxgraph-algo`](https://docs.rs/oxgraph-algo) | BFS and PageRank over the capability traits, with `no_std`, `alloc`, and `std` tiers. |
| [`oxgraph-property`](https://docs.rs/oxgraph-property) | Arrow-backed named, typed property layers and snapshot identity maps. |

### Built on top

[`oxgraph-db`](https://docs.rs/oxgraph-db) is an embedded OxGraph-native database with catalogs, queries, and durable identity; [`oxgraph-postgres`](https://docs.rs/oxgraph-postgres) is a Postgres-backed engine. `oxgraphd` (server and CLI) and `oxgraph-pgrx` (Postgres extension) ship in the workspace but are not published to crates.io.

## Feature flags

The umbrella crate's default feature set is empty. A feature is the only thing that pulls a layer in:

```bash
cargo add oxgraph --features csr,algo-std
```

Feature names map to layers:

| Feature | Use it for |
| --- | --- |
| `topology`, `graph`, `hyper` | The capability traits for graphs and hypergraphs. |
| `csr`, `csc`, `hyper-bcsr` | Borrowed layouts over slices. |
| `snapshot`, `snapshot-alloc` | Reading snapshots, and building them (`-alloc`). |
| `algo`, `algo-alloc`, `algo-std` | BFS and PageRank at the matching allocation tier. |
| `graph-build`, `hyper-build` | Append/freeze builders that produce a layout, then a snapshot. |
| `property-arrow` | Arrow-backed property layers. |
| `db`, `postgres` | The embedded database and the Postgres engine. |
| `full` | Everything. |

### Allocation tiers

Several layers ship at `no_std`, `alloc`, and `std` tiers so you only pull in an allocator (or `std`) when your target allows it. `oxgraph-algo` is the clearest example: `algo` (`no_std`), `algo-alloc`, and `algo-std`.
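As a rough illustration of what tiering can look like in practice, the sketch below splits one BFS into a core-only function that takes caller-supplied scratch buffers and a convenience wrapper that allocates them. This is an assumption about the general pattern, not a description of how `oxgraph-algo` is implemented; `bfs_count_in` and `bfs_count` are hypothetical names, and the example assumes an already-validated CSR layout (offsets non-decreasing, targets in range).

```rust
/// Core-only tier: the caller supplies scratch space, so no allocator is needed.
/// Returns the number of nodes reached from `start`, or `None` if `start` is
/// out of range or the scratch buffers are too small.
/// Assumes the CSR layout was validated beforehand.
fn bfs_count_in(
    offsets: &[u32],
    targets: &[u32],
    start: usize,
    seen: &mut [bool],
    queue: &mut [u32],
) -> Option<usize> {
    let nodes = offsets.len().checked_sub(1)?;
    if start >= nodes || seen.len() < nodes || queue.len() < nodes {
        return None;
    }
    seen[..nodes].fill(false);

    // `queue` is used as a flat FIFO; each node is enqueued at most once,
    // so `nodes` slots are always enough.
    let (mut head, mut tail) = (0usize, 0usize);
    seen[start] = true;
    queue[tail] = start as u32;
    tail += 1;

    while head < tail {
        let node = queue[head] as usize;
        head += 1;
        let (lo, hi) = (offsets[node] as usize, offsets[node + 1] as usize);
        for &next in &targets[lo..hi] {
            let next = next as usize;
            if !seen[next] {
                seen[next] = true;
                queue[tail] = next as u32;
                tail += 1;
            }
        }
    }
    Some(tail)
}

/// Allocating tier: same algorithm, with the buffers allocated for the caller.
/// In a tiered crate this wrapper would typically sit behind a feature gate.
fn bfs_count(offsets: &[u32], targets: &[u32], start: usize) -> Option<usize> {
    let nodes = offsets.len().checked_sub(1)?;
    let mut seen = vec![false; nodes];
    let mut queue = vec![0u32; nodes];
    bfs_count_in(offsets, targets, start, &mut seen, &mut queue)
}

fn main() {
    // Edges: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
    let offsets = [0u32, 2, 3, 4, 4];
    let targets = [1u32, 2, 3, 3];

    // Fixed-size stack buffers stand in for a target without an allocator.
    let (mut seen, mut queue) = ([false; 4], [0u32; 4]);
    println!("{:?}", bfs_count_in(&offsets, &targets, 0, &mut seen, &mut queue));
    println!("{:?}", bfs_count(&offsets, &targets, 0));
}
```

Both calls reach all four nodes; the only difference is who owns the scratch memory.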
## Getting started

`oxgraph` is a storage-agnostic, zero-copy-friendly graph and hypergraph topology substrate for Rust.

> **Topology here. Meaning elsewhere. Storage anywhere.**

The core is `no_std` and the substrate is `unsafe`-free. The pipeline it is built around is `bytes → validate → view → traverse`: you point a view at a byte slice, it validates the layout, and you walk the graph. No parse step, no heap graph rebuilt from the bytes, no copy of the edges.

### Install

Depend on the umbrella crate and turn on the features you need. Its default feature set is empty, so a feature is the only thing that pulls a layer in.

```bash
cargo add oxgraph --features csr,algo-std
```

You can also depend on the individual crates directly if you want a tighter dependency graph.

### What you reach for, by task

| You want to | Reach for |
| --- | --- |
| Expose your own storage as a graph or hypergraph | the `topology` / `graph` / `hyper` traits |
| Borrow a CSR or bipartite-CSR layout over slices | `oxgraph-csr`, `oxgraph-hyper-bcsr` |
| Open a validated snapshot and traverse it without rebuilding | `oxgraph-snapshot` plus a layout |
| Run BFS or PageRank at a `no_std`, `alloc`, or `std` tier | `oxgraph-algo` |
| Attach Arrow-backed named properties | `oxgraph-property` |
| Run an embedded database or a Postgres engine | `oxgraph-db`, `oxgraph-postgres` |

Every layer ships runnable examples under its crate's `examples/` directory (`cargo run --example -p `).

### Next

* [Concepts](/oxgraph/concepts) — topology vs. meaning vs. storage
* [The crate family](/oxgraph/crates)
* [Feature flags](/oxgraph/features)
* [Benchmarks](/oxgraph/benchmarks)

### Status

Pre-1.0 and still changing. The traits and crates are not stable yet. The snapshot byte format is an internal ABI candidate, not a stable interchange format — treat persisted snapshots as coupled to the crate version that wrote them.
Licensed MIT.

## Benchmarks

### Agent-task benchmark

An agent answers *"How does tokio schedule and run async tasks?"* with and without each tool, on the Tokio codebase, measuring efficiency and blind-judged answer quality. oxcode and codegraph were measured on different agent harnesses, so the comparable unit is each tool's improvement **vs. its own no-tool baseline**, not absolute numbers.

| arm | answer quality | tokens | cost | tool calls | wall time |
| --- | ---: | ---: | ---: | ---: | ---: |
| baseline (no tool) | 0.98 | — | — | — | — |
| oxcode — codex/gpt-5.5, CLI, n=6 | 0.96 (tied) | +15% | +4% | −4% | +14% |
| **oxcode — codex/gpt-5.5, MCP, n=6** | **0.93** | **−74%** | **−57%** | **−84%** | **−60%** |
| codegraph — Opus 4.8, MCP, published | not measured | −38% | even | −57% | −18% |

Percentages are change vs. that tool's own no-tool baseline (negative = reduction, better; quality is the blind LLM-judge score, 0–1). All oxcode rows come from one n=6 release suite on Tokio. Absolute medians: tokens 395k (baseline) → 455k (CLI) → 104k (MCP); cost $0.17 → $0.18 → $0.07; tool calls 28 → 27 → 5; wall time 97s → 111s → 39s.

### The MCP server is the headline

Delivering the same bounded, PageRank-curated context through a one-call [`oxcode_explore`](/oxcode/mcp) MCP tool — instead of a CLI the agent composes — cuts tool calls 84%, tokens 74%, cost 57%, and wall time 60% vs. the no-tool baseline, exceeding codegraph's published reductions (−57% tool calls / −38% tokens). The CLI arm is statistically tied with the baseline: the agent treats a shell binary as a supplement to its own grep/read, not a replacement — so the gap was always **tool delivery, not index quality**.

The one cost the quality gate exposes (and a quality-blind benchmark would hide): MCP answer quality dips to 0.93 vs. 0.98, a completeness trade-off from the leaner exploration.

codegraph numbers are from its README, re-validated 2026-06-02.
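The percentage columns in the agent-task table are relative changes against the no-tool baseline. As a small worked check, the sketch below recomputes the token and wall-time percentages from the absolute medians quoted above; `pct_change` is a helper name introduced here for illustration.

```rust
/// Relative change of `value` against `baseline`, rounded to a whole percent.
/// Negative means a reduction.
fn pct_change(baseline: f64, value: f64) -> i64 {
    ((value - baseline) / baseline * 100.0).round() as i64
}

fn main() {
    // Absolute medians from the text: (baseline, CLI arm, MCP arm).
    let tokens_k = (395.0, 455.0, 104.0);
    let wall_s = (97.0, 111.0, 39.0);

    println!(
        "tokens: CLI {:+}%, MCP {:+}%",
        pct_change(tokens_k.0, tokens_k.1),
        pct_change(tokens_k.0, tokens_k.2)
    ); // prints tokens: CLI +15%, MCP -74%
    println!(
        "wall:   CLI {:+}%, MCP {:+}%",
        pct_change(wall_s.0, wall_s.1),
        pct_change(wall_s.0, wall_s.2)
    ); // prints wall:   CLI +14%, MCP -60%
}
```

Recomputing the cost and tool-call columns from their quoted medians lands within a point or two of the table, presumably because of rounding in the medians or in how per-run percentages were aggregated.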
The engine-level numbers behind these results — incremental reindex, query latency, cold index — live in [oxgraph Benchmarks](/oxgraph/benchmarks).

## CLI

### Navigation commands

| Command | What it does |
| --- | --- |
| `index` | Index a project into `.oxcode/index.oxgdb/`. |
| `status` | Report index state for a project. |
| `context` | Rank entry-point symbols for a task, then expand nearby relationships. Deterministic and graph-derived. |
| `symbols` | Keyword discovery over symbols, with repeatable `--kind` filters. |
| `files` | Find files relevant to a query. |
| `symbol` | Resolve a single selector to its definition, signature, and source. |
| `calls` | Walk the call graph outward from a symbol. |
| `callers` | Walk the call graph inward to a symbol. |
| `query` / `explain` | Execute raw [OxQL](/oxcode/oxql). |

`context` ranks entry-point symbols for the task text, then expands nearby `calls`, `contains`, `references`, and `implements` relationships. For keyword discovery use `symbols`; do not pass plain English phrases to `query`.

### Selectors

Navigation commands accept several selector forms:

* `element:` — a concrete OxGraph element ID.
* an exact crate-qualified name such as `my_crate::auth::tenant_middleware` (qualified names are anchored at the crate, so the first segment is the package name with `-` normalized to `_`).
* `name:` — a simple function name.
* `file::` — the innermost symbol covering a source line.

### Symbol kinds

`symbols` accepts repeatable `--kind ` filters. Valid kinds: `file`, `module`, `namespace`, `package`, `class`, `struct`, `enum`, `trait`, `interface`, `impl_block`, `function`, `method`, `field`, `variable`, `constant`, `type_alias`, `macro`.

## Getting started

`oxcode` indexes source code into a native OxGraph database.
It uses tree-sitter for extraction, resolves code references into graph relations, and stores the result under `.oxcode/index.oxgdb/`. It is the first downstream consumer of the [oxgraph](/oxgraph/getting-started) engine.

The CLI keeps raw OxQL available, but agent navigation should usually start with `context`, `symbols`, `files`, and the call-graph commands, because they expand graph IDs back into function names, definition ranges, signatures, docstrings, source previews, and call-site source context.

### Quick start

```sh
cargo run -p oxcode -- index --path path/to/rust/project
cargo run -p oxcode -- status --path path/to/rust/project
cargo run -p oxcode -- context "How does entry reach helper?" --path path/to/rust/project --limit 8 --json
cargo run -p oxcode -- symbols "entry helper" --path path/to/rust/project --limit 20 --json
cargo run -p oxcode -- symbol crate::entry --path path/to/rust/project --json
cargo run -p oxcode -- calls crate::entry --depth 2 --path path/to/rust/project
cargo run -p oxcode -- callers crate::helper --depth 2 --path path/to/rust/project
```

The generated `.oxcode/` directory writes its own `.gitignore`, so the index is never committed by accident.

### Architecture

The workspace uses a hybrid Rust architecture:

* **`oxcode-model`** — storage-neutral vocabulary: code-graph kinds, identifier newtypes, the graph schema catalog, the selector grammar, the extraction/resolution IR, and agent-facing report DTOs.
* **`oxcode-core`** — indexing, extraction, reference resolution, OxGraph storage, navigation, formatting, and the public `ProjectIndex` facade.
* **`oxcode`** — thin CLI package and binary.
* **`oxcode-mcp`** — MCP server (stdio) exposing oxcode's read-only queries to coding agents.

The model crate's typed schema is the single source of truth from which the storage layer derives property registration, read-key caching, and indexes.
### Next

* [Language support](/oxcode/languages)
* [CLI](/oxcode/cli)
* [MCP server](/oxcode/mcp)
* [OxQL](/oxcode/oxql)
* [Benchmarks](/oxcode/benchmarks)

## Language support

oxcode supports two tiers of language coverage.

### High-fidelity (hand-written extractors)

**Rust**, **Go**, and **TypeScript / JavaScript** (`.ts` / `.tsx` / `.js` / `.jsx` / `.mts` / `.cts`). These resolve receiver-typed method calls, qualified names, and imports precisely.

### Best-effort (generic, query-driven extractor)

**Python**, **Java**, **C**, and **C++**. These extract symbols, containment, and approximate call edges that resolve only at the scoped/simple tiers (no receiver typing), so some edges are marked ambiguous. Qualified names use `::` internally regardless of the language's own separator.

### Adding and promoting languages

Run `oxcode languages` to list the registered extractors.

* A best-effort language is promoted to high fidelity by adding a hand-written extractor.
* A new best-effort language is added with a tree-sitter query plus a profile entry (see `crates/oxcode-core/src/extract/profiles.rs`).

Recognized source files in a language with no extractor yet (e.g. Ruby, PHP, C#) are reported as skipped rather than silently dropped.

## MCP server

`oxcode-mcp` is a stdio MCP server that exposes oxcode's read-only queries to coding agents (Claude, Cursor, and others). It is the recommended way to give an agent code context — delivering the same bounded, PageRank-curated context through a single tool call instead of a CLI the agent has to compose.

### Tools

| Tool | Purpose |
| --- | --- |
| `oxcode_explore` | One-call, PageRank-curated context for a task. The headline tool. |
| `oxcode_search` | Keyword discovery over symbols. |
| `oxcode_callers` / `oxcode_callees` | Walk the call graph in or out from a symbol. |
| `oxcode_symbol` | Resolve a selector to its definition and source. |
| `oxcode_files` | Find files relevant to a query. |
| `oxcode_status` | Report index state. |

### Why MCP is the headline

On the Tokio agent-task benchmark, the one-call `oxcode_explore` MCP tool cuts tool calls 84%, tokens 74%, cost 57%, and wall time 60% versus the no-tool baseline — exceeding codegraph's published reductions. The CLI arm, by contrast, is statistically tied with the baseline: an agent treats a shell binary as a supplement to its own grep/read, not a replacement. The gap was always tool delivery, not index quality.

See [Benchmarks](/oxcode/benchmarks) for the full table.

## OxQL

`query` and `explain` execute raw OxQL (an OxGraph-native, Cypher-flavored query language). For keyword discovery prefer [`symbols`](/oxcode/cli); do not pass plain English phrases to `query`.

### Accepted profile

```text
CATALOG MATCH ELEMENTS MATCH ELEMENTS HAS LABEL