# oxgraph + oxcode

> oxgraph is a storage-agnostic, zero-copy graph and hypergraph topology substrate for Rust. oxcode is code intelligence for agents, built on top of it.

## Benchmarks

These numbers track the 0.3.0 engine overhaul: O(change) identity-reconcile
writes, an intrinsic reverse-adjacency index, and zero-copy index open whose
cost no longer scales with the size of the base index. They are measured **end
to end through the downstream consumer
[oxcode](/oxcode/getting-started)** (tree-sitter code indexing on top of
`oxgraph-db`), the realistic workload the engine is designed for.

### Method

* **baseline**: oxcode at its pre-overhaul commit on published **oxgraph 0.2.4**
  (`apply_delta` wholesale rewrite + O(total-incidences) tombstone).
* **current**: oxcode on **oxgraph 0.3.0** (identity reconcile + zero-copy open).
* Identical source corpora; release builds; sequential runs on an otherwise-idle
  16-core machine. An incremental reindex appends one function to a source file
  and re-indexes; each run is verified by confirming the new symbol is queryable
  afterward.
* Measured 2026-06-03.

### storage-hub

328 files · 40,749 elements · 527,999 relations · 1,055,998 incidences

| metric                            |                0.2.4 baseline |        0.3.0 |            change |
| --------------------------------- | ----------------------------: | -----------: | ----------------: |
| incremental reindex (1-file edit) | **> 150 s** (\~62 min, O(n²)) | **4,842 ms** | **≈ 770× faster** |
| symbol query (p50)                |                      3,902 ms |       988 ms |       3.9× faster |
| cold index                        |                     15,165 ms |    11,968 ms |       1.3× faster |
| reindex, no change                |                      1,349 ms |       834 ms |       1.6× faster |
| WAL written per reindex           |                        953 MB |       5.2 MB | ≈ 183×, O(change) |
| database on disk                  |                        805 MB |      1.18 GB |     1.5× larger\* |

### harnessing

76 files · 11,091 elements · 45,901 relations

| metric                            | 0.2.4 baseline |      0.3.0 |         change |
| --------------------------------- | -------------: | ---------: | -------------: |
| incremental reindex (1-file edit) |  **27,648 ms** | **444 ms** | **62× faster** |
| symbol query (p50)                |         457 ms |     154 ms |    3.0× faster |
| cold index                        |       1,797 ms |   1,313 ms |    1.4× faster |

\* Applies to the storage-hub *database on disk* row. The larger on-disk size is
the deliberate tradeoff for O(1)-style index open: 0.3.0 persists the derived
index (equality / label / adjacency postings) as zero-copy sections that are
borrowed at open instead of rebuilt in RAM.

### What changed

* **Incremental reindex** went from O(n²) to O(change). The 0.2.4 `tombstone_*`
  primitives were O(total incidences) per call (no reverse adjacency), so a bulk
  edge replacement was O(n²) — \~62 min on a 528 K-relation graph. 0.3.0 adds an
  intrinsic reverse-adjacency index (cascade is O(log n + degree)) and
  identity-keyed reconcile verbs (`upsert_element` / `upsert_relation` / `retain`)
  whose unchanged subjects emit zero mutations.
* **Query latency** dropped 3–4× from zero-copy index open: the base index is
  persisted at freeze and borrowed from the memory map at open, rather than
  decoded and rebuilt from records on every command.
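
To make the reconcile idea above concrete, here is a minimal sketch of identity-keyed upsert and retain over an in-memory map. It is not oxgraph's implementation: `Store`, `Mutation`, and `reconcile` are hypothetical names, and a `Vec` stands in for the WAL. What it illustrates is that re-submitting an unchanged record emits no mutation, so write volume tracks the size of the change rather than the size of the index.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A mutation that would be appended to the write-ahead log.
#[derive(Debug, PartialEq)]
enum Mutation {
    Put { key: String, value: String },
    Delete { key: String },
}

/// Hypothetical store keyed by a stable identity (not oxgraph's real type).
#[derive(Default)]
struct Store {
    records: BTreeMap<String, String>,
}

impl Store {
    /// Upsert by identity: emit a mutation only when the value actually changed.
    fn upsert(&mut self, key: &str, value: &str, log: &mut Vec<Mutation>) {
        if self.records.get(key).map(String::as_str) == Some(value) {
            return; // unchanged subject: zero mutations
        }
        self.records.insert(key.to_string(), value.to_string());
        log.push(Mutation::Put { key: key.to_string(), value: value.to_string() });
    }

    /// Retain: delete every record whose identity was not seen in this pass.
    fn retain(&mut self, seen: &BTreeSet<String>, log: &mut Vec<Mutation>) {
        let stale: Vec<String> = self
            .records
            .keys()
            .filter(|k| !seen.contains(*k))
            .cloned()
            .collect();
        for key in stale {
            self.records.remove(&key);
            log.push(Mutation::Delete { key });
        }
    }
}

/// Reconcile the store against a fresh extraction and return the mutations.
fn reconcile(store: &mut Store, fresh: &[(&str, &str)]) -> Vec<Mutation> {
    let mut log = Vec::new();
    let mut seen = BTreeSet::new();
    for (key, value) in fresh {
        store.upsert(key, value, &mut log);
        seen.insert(key.to_string());
    }
    store.retain(&seen, &mut log);
    log
}

fn main() {
    let mut store = Store::default();
    reconcile(&mut store, &[("a::f", "fn f()"), ("a::g", "fn g()")]);
    // Re-index after appending one function: only the new symbol is written.
    let log = reconcile(
        &mut store,
        &[("a::f", "fn f()"), ("a::g", "fn g()"), ("a::h", "fn h()")],
    );
    println!("{log:?}");
}
```

In this sketch a second pass over identical input returns an empty log, which is the behavior the `upsert_*` / `retain` verbs are described as having for unchanged subjects.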

Verification: `just ci` (fmt, taplo, clippy, deny, workspace tests) and
`just verify` (miri on the zero-copy borrow path; `cargo kani` algebraic proofs —
68 verified, 0 failures), plus a freeze→open differential proptest against the
owned-index oracle.


## Concepts

### Topology, kept separate from meaning and storage

oxgraph models graph and hypergraph *topology* — the question of what connects to
what — and keeps it separate from what the data means and where it is stored.

* **The foundation** defines connectivity and refuses to interpret it.
* **Layouts** decide the byte representation.
* **Properties and domain meaning** live in layers above.

Algorithms bind to capability traits, not to a concrete container, so the same
BFS runs over an in-memory layout, a memory-mapped file, or rows read from a
database.
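
As an illustration of binding an algorithm to a capability rather than a container, the sketch below defines one small trait and runs the same BFS over an owned adjacency list and over borrowed CSR slices. The trait `OutNeighbors` and both container types are hypothetical stand-ins, not the real `oxgraph-topology` or `oxgraph-csr` APIs.

```rust
use std::collections::VecDeque;

/// Hypothetical capability: "I can enumerate the out-neighbors of a node".
trait OutNeighbors {
    fn node_count(&self) -> usize;
    fn for_each_neighbor(&self, node: usize, visit: &mut dyn FnMut(usize));
}

/// Heap-owned adjacency list.
struct AdjList(Vec<Vec<usize>>);

impl OutNeighbors for AdjList {
    fn node_count(&self) -> usize {
        self.0.len()
    }
    fn for_each_neighbor(&self, node: usize, visit: &mut dyn FnMut(usize)) {
        for &next in &self.0[node] {
            visit(next);
        }
    }
}

/// Borrowed CSR view: node `i` owns `targets[offsets[i]..offsets[i + 1]]`.
struct Csr<'a> {
    offsets: &'a [usize],
    targets: &'a [usize],
}

impl OutNeighbors for Csr<'_> {
    fn node_count(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }
    fn for_each_neighbor(&self, node: usize, visit: &mut dyn FnMut(usize)) {
        for &next in &self.targets[self.offsets[node]..self.offsets[node + 1]] {
            visit(next);
        }
    }
}

/// BFS written once against the trait; it never names a concrete container.
fn bfs_order<G: OutNeighbors>(graph: &G, start: usize) -> Vec<usize> {
    let mut seen = vec![false; graph.node_count()];
    let mut order = Vec::new();
    let mut queue = VecDeque::new();
    seen[start] = true;
    queue.push_back(start);
    while let Some(node) = queue.pop_front() {
        order.push(node);
        graph.for_each_neighbor(node, &mut |next: usize| {
            if !seen[next] {
                seen[next] = true;
                queue.push_back(next);
            }
        });
    }
    order
}

fn main() {
    // The same graph twice: 0→1, 0→2, 1→3, 2→3.
    let owned = AdjList(vec![vec![1, 2], vec![3], vec![3], vec![]]);
    let offsets: [usize; 5] = [0, 2, 3, 4, 4];
    let targets: [usize; 4] = [1, 2, 3, 3];
    let borrowed = Csr { offsets: &offsets, targets: &targets };
    println!("{:?}", bfs_order(&owned, 0));
    println!("{:?}", bfs_order(&borrowed, 0));
}
```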

### Why it exists

Graph-shaped data shows up everywhere: compilers, databases, knowledge graphs,
provenance, build systems. Most of those systems rebuild large topology into a
heap-owned graph before they can traverse it, and most reinvent node and edge
IDs, adjacency, serialization, and validation while doing it.

oxgraph splits those concerns apart so one narrow goal drives the design: open a
large graph from a snapshot, memory-map it, and start traversing without
rebuilding it in memory.

### The pipeline: bytes → validate → view → traverse

A snapshot is a topology-agnostic byte container. You validate it once, then
borrow its sections as typed slices and point a layout view at them. From there
you traverse directly against the bytes — there is no owned graph rebuilt in RAM.
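
The pipeline can be sketched with nothing but the standard library. The byte layout below is invented for illustration and is not oxgraph's snapshot format; `CsrView`, `word`, and `encode` are hypothetical helpers. The view validates the bytes once, keeps only borrowed sub-slices, and decodes each word on demand, so no owned graph is ever built.

```rust
/// Hypothetical layout (NOT oxgraph's real format), all little-endian u32:
/// [node_count][offsets: node_count + 1 words][targets: remaining words]
struct CsrView<'a> {
    node_count: usize,
    offsets: &'a [u8],
    targets: &'a [u8],
}

/// Decode the `index`-th little-endian u32 from a byte section.
fn word(bytes: &[u8], index: usize) -> u32 {
    let s = index * 4;
    u32::from_le_bytes([bytes[s], bytes[s + 1], bytes[s + 2], bytes[s + 3]])
}

impl<'a> CsrView<'a> {
    /// Validate once up front so traversal can trust the offsets.
    fn validate(bytes: &'a [u8]) -> Result<Self, &'static str> {
        if bytes.len() < 8 || bytes.len() % 4 != 0 {
            return Err("truncated or misaligned snapshot");
        }
        let node_count = word(bytes, 0) as usize;
        let offsets_len = (node_count + 1) * 4;
        let body = &bytes[4..];
        if body.len() < offsets_len {
            return Err("offset section is truncated");
        }
        let (offsets, targets) = body.split_at(offsets_len);
        let target_count = targets.len() / 4;
        if word(offsets, 0) != 0 || word(offsets, node_count) as usize != target_count {
            return Err("offsets do not span the target section");
        }
        for i in 0..node_count {
            if word(offsets, i) > word(offsets, i + 1) {
                return Err("offsets are not monotonic");
            }
        }
        for i in 0..target_count {
            if word(targets, i) as usize >= node_count {
                return Err("target is out of range");
            }
        }
        Ok(Self { node_count, offsets, targets })
    }

    /// Out-neighbors of `node`, decoded lazily from the borrowed bytes.
    fn neighbors(&self, node: usize) -> impl Iterator<Item = u32> + '_ {
        let start = word(self.offsets, node) as usize;
        let end = word(self.offsets, node + 1) as usize;
        (start..end).map(move |i| word(self.targets, i))
    }
}

/// Test helper: serialize words as little-endian bytes.
fn encode(words: &[u32]) -> Vec<u8> {
    words.iter().flat_map(|w| w.to_le_bytes()).collect()
}

fn main() {
    // 3 nodes; edges 0→1, 0→2, 1→2.
    let bytes = encode(&[3, 0, 2, 3, 3, 1, 2, 2]);
    let view = CsrView::validate(&bytes).expect("valid snapshot");
    for node in 0..view.node_count {
        println!("{node} -> {:?}", view.neighbors(node).collect::<Vec<_>>());
    }
}
```

Decoding words on the fly keeps the sketch free of `unsafe`. How a real layout obtains typed slices from bytes is a design choice this example does not try to reproduce.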

### Verification

The substrate is `unsafe`-free (`unsafe_code = "forbid"`). Correctness is
defended by `proptest` strategies, `miri` on the zero-copy borrow path, and
`cargo kani` algebraic proofs (symmetry, totality, roundtrip, merge laws). The
0.3.0 release verifies 68 kani proofs with 0 failures, plus a freeze→open
differential proptest against an owned-index oracle.


## The crate family

oxgraph is a family of small crates. Most users depend on the umbrella crate
[`oxgraph`](https://docs.rs/oxgraph) and enable features; you can also depend on
the individual crates directly.

### Foundation (`no_std`)

| Crate                                                        | Gives you                                                                                                                                   |
| ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
| [`oxgraph-topology`](https://docs.rs/oxgraph-topology)       | Capability traits for discrete topology: elements, relations, incidences, roles, identity. The shared vocabulary everything else builds on. |
| [`oxgraph-graph`](https://docs.rs/oxgraph-graph)             | Binary graph vocabulary over topology: nodes, edges, directed traversal, neighbors.                                                         |
| [`oxgraph-hyper`](https://docs.rs/oxgraph-hyper)             | Hypergraph vocabulary over topology: vertices, hyperedges, participants.                                                                    |
| [`oxgraph-layout-util`](https://docs.rs/oxgraph-layout-util) | Shared index and word types plus offset-integrity checks that concrete layouts reuse.                                                       |
| [`oxgraph-snapshot`](https://docs.rs/oxgraph-snapshot)       | Topology-agnostic byte container. Validate a snapshot and borrow its sections as typed slices.                                              |

### Borrowed layouts

| Crate                                                      | Gives you                                                                                   |
| ---------------------------------------------------------- | ------------------------------------------------------------------------------------------- |
| [`oxgraph-csr`](https://docs.rs/oxgraph-csr)               | Compressed-sparse-row graph views over offset and target slices, native or from a snapshot. |
| [`oxgraph-csc`](https://docs.rs/oxgraph-csc)               | Inbound (compressed-sparse-column) graph views for reverse traversal.                       |
| [`oxgraph-hyper-bcsr`](https://docs.rs/oxgraph-hyper-bcsr) | Directed bipartite-CSR hypergraph views, dense in both directions.                          |
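
For readers new to these layouts, the following sketch builds the offset and target arrays for both directions from one edge list using a counting sort. It is a generic illustration of CSR (outbound) and CSC (inbound), with hypothetical function names, and does not use the `oxgraph-csr` or `oxgraph-csc` APIs.

```rust
/// Compress an edge list into (offsets, targets).
/// `by_source = true` groups edges by source (CSR, outbound);
/// `by_source = false` groups them by destination (CSC, inbound).
fn compress(
    node_count: usize,
    edges: &[(usize, usize)],
    by_source: bool,
) -> (Vec<usize>, Vec<usize>) {
    // Pass 1: count entries per row, shifted by one slot.
    let mut offsets = vec![0usize; node_count + 1];
    for &(src, dst) in edges {
        let row = if by_source { src } else { dst };
        offsets[row + 1] += 1;
    }
    // Prefix sum turns counts into row start positions.
    for i in 0..node_count {
        offsets[i + 1] += offsets[i];
    }
    // Pass 2: scatter each edge into its row, preserving input order.
    let mut cursor = offsets.clone();
    let mut targets = vec![0usize; edges.len()];
    for &(src, dst) in edges {
        let (row, value) = if by_source { (src, dst) } else { (dst, src) };
        targets[cursor[row]] = value;
        cursor[row] += 1;
    }
    (offsets, targets)
}

/// The neighbors of `node` are one contiguous slice of `targets`.
fn row<'a>(offsets: &[usize], targets: &'a [usize], node: usize) -> &'a [usize] {
    &targets[offsets[node]..offsets[node + 1]]
}

fn main() {
    let edges = [(0, 1), (0, 2), (1, 2), (2, 0)];
    let (out_offsets, out_targets) = compress(3, &edges, true);
    let (in_offsets, in_targets) = compress(3, &edges, false);
    println!("out(2) = {:?}", row(&out_offsets, &out_targets, 2));
    println!("in(2)  = {:?}", row(&in_offsets, &in_targets, 2));
}
```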

### Algorithms and properties

| Crate                                                  | Gives you                                                                             |
| ------------------------------------------------------ | ------------------------------------------------------------------------------------- |
| [`oxgraph-algo`](https://docs.rs/oxgraph-algo)         | BFS and PageRank over the capability traits, with `no_std`, `alloc`, and `std` tiers. |
| [`oxgraph-property`](https://docs.rs/oxgraph-property) | Arrow-backed named, typed property layers and snapshot identity maps.                 |
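
As background for the PageRank entry above, here is a minimal power-iteration sketch over plain CSR slices. It is a textbook formulation, not the `oxgraph-algo` implementation or its signature; the function name, the damping value of 0.85, and the uniform handling of dangling nodes are all assumptions made for the example.

```rust
/// PageRank by power iteration over a CSR graph.
/// Node `i` links to `targets[offsets[i]..offsets[i + 1]]`.
fn pagerank(offsets: &[usize], targets: &[usize], damping: f64, iterations: usize) -> Vec<f64> {
    let n = offsets.len() - 1;
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iterations {
        // Every node starts with the teleport share.
        let mut next = vec![(1.0 - damping) / n as f64; n];
        let mut dangling = 0.0;
        for node in 0..n {
            let out = &targets[offsets[node]..offsets[node + 1]];
            if out.is_empty() {
                // No out-links: remember the mass and spread it below.
                dangling += rank[node];
            } else {
                let share = damping * rank[node] / out.len() as f64;
                for &t in out {
                    next[t] += share;
                }
            }
        }
        // Redistribute dangling mass uniformly so ranks keep summing to 1.
        let spread = damping * dangling / n as f64;
        for r in next.iter_mut() {
            *r += spread;
        }
        rank = next;
    }
    rank
}

fn main() {
    // Edges: 0→2, 1→2, 2→0, 3→2. Node 2 collects the most inbound weight.
    let offsets = [0, 1, 2, 3, 4];
    let targets = [2, 2, 0, 2];
    let rank = pagerank(&offsets, &targets, 0.85, 50);
    for (node, r) in rank.iter().enumerate() {
        println!("node {node}: {r:.4}");
    }
}
```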

### Built on top

[`oxgraph-db`](https://docs.rs/oxgraph-db) is an embedded OxGraph-native database
with catalogs, queries, and durable identity;
[`oxgraph-postgres`](https://docs.rs/oxgraph-postgres) is a Postgres-backed
engine. `oxgraphd` (server and CLI) and `oxgraph-pgrx` (Postgres extension) ship
in the workspace but are not published to crates.io.


## Feature flags

The umbrella crate's default feature set is empty. A feature is the only thing
that pulls a layer in:

```bash
cargo add oxgraph --features csr,algo-std
```

Feature names map to layers:

| Feature                          | Use it for                                                     |
| -------------------------------- | -------------------------------------------------------------- |
| `topology`, `graph`, `hyper`     | The capability traits for graphs and hypergraphs.              |
| `csr`, `csc`, `hyper-bcsr`       | Borrowed layouts over slices.                                  |
| `snapshot`, `snapshot-alloc`     | Reading snapshots, and building them (`-alloc`).               |
| `algo`, `algo-alloc`, `algo-std` | BFS and PageRank at the matching allocation tier.              |
| `graph-build`, `hyper-build`     | Append/freeze builders that produce a layout, then a snapshot. |
| `property-arrow`                 | Arrow-backed property layers.                                  |
| `db`, `postgres`                 | The embedded database and the Postgres engine.                 |
| `full`                           | Everything.                                                    |

### Allocation tiers

Several layers ship at `no_std`, `alloc`, and `std` tiers so you only pull in an
allocator (or `std`) when your target allows it. `oxgraph-algo` is the clearest
example: `algo` (`no_std`), `algo-alloc`, and `algo-std`.


## Getting started

`oxgraph` is a storage-agnostic, zero-copy-friendly graph and hypergraph
topology substrate for Rust.

> **Topology here. Meaning elsewhere. Storage anywhere.**

The core is `no_std` and the substrate is `unsafe`-free. The pipeline it is built
around is `bytes → validate → view → traverse`: you point a view at a byte slice,
it validates the layout, and you walk the graph. No parse step, no heap graph
rebuilt from the bytes, no copy of the edges.

### Install

Depend on the umbrella crate and turn on the features you need. Its default
feature set is empty, so a feature is the only thing that pulls a layer in.

```bash
cargo add oxgraph --features csr,algo-std
```

You can also depend on the individual crates directly if you want a tighter
dependency graph.

### What you reach for, by task

| You want to                                                  | Reach for                                 |
| ------------------------------------------------------------ | ----------------------------------------- |
| Expose your own storage as a graph or hypergraph             | the `topology` / `graph` / `hyper` traits |
| Borrow a CSR or bipartite-CSR layout over slices             | `oxgraph-csr`, `oxgraph-hyper-bcsr`       |
| Open a validated snapshot and traverse it without rebuilding | `oxgraph-snapshot` plus a layout          |
| Run BFS or PageRank at a `no_std`, `alloc`, or `std` tier    | `oxgraph-algo`                            |
| Attach Arrow-backed named properties                         | `oxgraph-property`                        |
| Run an embedded database or a Postgres engine                | `oxgraph-db`, `oxgraph-postgres`          |

Every layer ships runnable examples under its crate's `examples/` directory
(`cargo run --example <name> -p <crate>`).

### Next

* [Concepts](/oxgraph/concepts) — topology vs. meaning vs. storage
* [The crate family](/oxgraph/crates)
* [Feature flags](/oxgraph/features)
* [Benchmarks](/oxgraph/benchmarks)

### Status

Pre-1.0 and still changing. The traits and crates are not stable yet. The
snapshot byte format is an internal ABI candidate, not a stable interchange
format — treat persisted snapshots as coupled to the crate version that wrote
them. Licensed MIT.


## Benchmarks

### Agent-task benchmark

An agent answers *"How does tokio schedule and run async tasks?"* with and
without each tool, on the Tokio codebase, measuring efficiency and blind-judged
answer quality. oxcode and codegraph were measured on different agent harnesses,
so the comparable unit is each tool's improvement **vs. its own no-tool
baseline**, not absolute numbers.

| arm                                  | answer quality |   tokens |     cost | tool calls | wall time |
| ------------------------------------ | -------------: | -------: | -------: | ---------: | --------: |
| baseline (no tool)                   |           0.98 |        — |        — |          — |         — |
| oxcode — codex/gpt-5.5, CLI, n=6     |    0.96 (tied) |     +15% |      +4% |        −4% |      +14% |
| **oxcode — codex/gpt-5.5, MCP, n=6** |       **0.93** | **−74%** | **−57%** |   **−84%** |  **−60%** |
| codegraph — Opus 4.8, MCP, published |   not measured |     −38% |     even |       −57% |      −18% |

Percentages are change vs. that tool's own no-tool baseline (negative = reduction,
better; quality is the blind LLM-judge score, 0–1). All oxcode rows come from one
n=6 release suite on Tokio.

Absolute medians: tokens 395k (baseline) → 455k (CLI) → 104k (MCP);
cost $0.17 → $0.18 → $0.07; tool calls 28 → 27 → 5; wall 97s → 111s → 39s.

### The MCP server is the headline

Delivering the same bounded, PageRank-curated context through a one-call
[`oxcode_explore`](/oxcode/mcp) MCP tool — instead of a CLI the agent composes —
cuts tool calls 84%, tokens 74%, cost 57%, and wall 60% vs. the no-tool baseline,
exceeding codegraph's published reductions (−57% tool calls / −38% tokens).

The CLI arm is statistically tied with the baseline: the agent treats a shell
binary as a supplement to its own grep/read, not a replacement, so the gap was
always **tool delivery, not index quality**. The quality gate exposes one cost
that a quality-blind benchmark would hide: MCP answer quality dips to 0.93 from
the 0.98 baseline, a completeness trade-off from the leaner exploration. The
codegraph numbers are from its README, re-validated 2026-06-02.

The engine-level numbers behind these results — incremental reindex, query
latency, cold index — live in [oxgraph Benchmarks](/oxgraph/benchmarks).


## CLI

### Navigation commands

| Command             | What it does                                                                                            |
| ------------------- | ------------------------------------------------------------------------------------------------------- |
| `index`             | Index a project into `.oxcode/index.oxgdb/`.                                                            |
| `status`            | Report index state for a project.                                                                       |
| `context`           | Rank entry-point symbols for a task, then expand nearby relationships. Deterministic and graph-derived. |
| `symbols`           | Keyword discovery over symbols, with repeatable `--kind` filters.                                       |
| `files`             | Find files relevant to a query.                                                                         |
| `symbol`            | Resolve a single selector to its definition, signature, and source.                                     |
| `calls`             | Walk the call graph outward from a symbol.                                                              |
| `callers`           | Walk the call graph inward to a symbol.                                                                 |
| `query` / `explain` | Execute raw [OxQL](/oxcode/oxql).                                                                       |

`context` ranks entry-point symbols for the task text, then expands nearby
`calls`, `contains`, `references`, and `implements` relationships. For keyword
discovery use `symbols`; do not pass plain English phrases to `query`.

### Selectors

Navigation commands accept several selector forms:

* `element:<id>` — a concrete OxGraph element ID.
* an exact crate-qualified name such as `my_crate::auth::tenant_middleware`
  (qualified names are anchored at the crate, so the first segment is the package
  name with `-` normalized to `_`).
* `name:<name>` — a simple function name.
* `file:<path>:<line>` — the innermost symbol covering a source line.
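
The selector forms above map naturally onto a small enum. The parser below is a sketch of that grammar only, under stated assumptions: the type and function names are hypothetical, element IDs are assumed to fit in a `u64`, and normalizing `-` to `_` in the first segment of user input is an assumption drawn from the naming rule rather than confirmed CLI behavior.

```rust
#[derive(Debug, PartialEq)]
enum Selector {
    Element(u64),
    Name(String),
    FileLine { path: String, line: u32 },
    Qualified(String),
}

/// Normalize `-` to `_` in the crate (first) segment only.
fn normalize_crate_segment(qualified: &str) -> String {
    match qualified.split_once("::") {
        Some((krate, rest)) => format!("{}::{}", krate.replace('-', "_"), rest),
        None => qualified.replace('-', "_"),
    }
}

fn parse_selector(input: &str) -> Result<Selector, String> {
    if let Some(id) = input.strip_prefix("element:") {
        return id
            .parse::<u64>()
            .map(Selector::Element)
            .map_err(|e| format!("bad element id: {e}"));
    }
    if let Some(name) = input.strip_prefix("name:") {
        return Ok(Selector::Name(name.to_string()));
    }
    if let Some(rest) = input.strip_prefix("file:") {
        // Split on the last ':' so the path itself may contain colons.
        let (path, line) = rest
            .rsplit_once(':')
            .ok_or_else(|| "expected file:<path>:<line>".to_string())?;
        let line = line
            .parse::<u32>()
            .map_err(|e| format!("bad line number: {e}"))?;
        return Ok(Selector::FileLine { path: path.to_string(), line });
    }
    // Anything without a recognized prefix is treated as a qualified name.
    Ok(Selector::Qualified(normalize_crate_segment(input)))
}

fn main() {
    for input in [
        "element:42",
        "name:helper",
        "file:src/lib.rs:17",
        "my-crate::auth::tenant_middleware",
    ] {
        println!("{input} => {:?}", parse_selector(input));
    }
}
```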

### Symbol kinds

`symbols` accepts repeatable `--kind <kind>` filters. Valid kinds:

`file`, `module`, `namespace`, `package`, `class`, `struct`, `enum`, `trait`,
`interface`, `impl_block`, `function`, `method`, `field`, `variable`,
`constant`, `type_alias`, `macro`.


## Getting started

`oxcode` indexes source code into a native OxGraph database. It uses tree-sitter
for extraction, resolves code references into graph relations, and stores the
result under `.oxcode/index.oxgdb/`. It is the first downstream consumer of the
[oxgraph](/oxgraph/getting-started) engine.

The CLI keeps raw OxQL available, but agent navigation should usually start with
`context`, `symbols`, `files`, and the call-graph commands, because they expand
graph IDs back into function names, definition ranges, signatures, docstrings,
source previews, and call-site source context.

### Quick start

```sh
cargo run -p oxcode -- index --path path/to/rust/project
cargo run -p oxcode -- status --path path/to/rust/project
cargo run -p oxcode -- context "How does entry reach helper?" --path path/to/rust/project --limit 8 --json
cargo run -p oxcode -- symbols "entry helper" --path path/to/rust/project --limit 20 --json
cargo run -p oxcode -- symbol crate::entry --path path/to/rust/project --json
cargo run -p oxcode -- calls crate::entry --depth 2 --path path/to/rust/project
cargo run -p oxcode -- callers crate::helper --depth 2 --path path/to/rust/project
```

oxcode writes a `.gitignore` into the generated `.oxcode/` directory, so the
index is not committed by accident.

### Architecture

The workspace uses a hybrid Rust architecture:

* **`oxcode-model`** — storage-neutral vocabulary: code-graph kinds, identifier
  newtypes, the graph schema catalog, the selector grammar, the
  extraction/resolution IR, and agent-facing report DTOs.
* **`oxcode-core`** — indexing, extraction, reference resolution, OxGraph
  storage, navigation, formatting, and the public `ProjectIndex` facade.
* **`oxcode`** — thin CLI package and binary.
* **`oxcode-mcp`** — MCP server (stdio) exposing oxcode's read-only queries to
  coding agents.

The model crate's typed schema is the single source of truth that the storage
layer derives property registration, read-key caching, and indexes from.

### Next

* [Language support](/oxcode/languages)
* [CLI](/oxcode/cli)
* [MCP server](/oxcode/mcp)
* [OxQL](/oxcode/oxql)
* [Benchmarks](/oxcode/benchmarks)


## Language support

oxcode supports two tiers of language coverage.

### High-fidelity (hand-written extractors)

**Rust**, **Go**, and **TypeScript / JavaScript**
(`.ts` / `.tsx` / `.js` / `.jsx` / `.mts` / `.cts`).

These resolve receiver-typed method calls, qualified names, and imports
precisely.

### Best-effort (generic, query-driven extractor)

**Python**, **Java**, **C**, and **C++**.

These extract symbols, containment, and approximate call edges that resolve only
at the scoped/simple tiers (no receiver typing), so some edges are marked
ambiguous. Qualified names use `::` internally regardless of the language's own
separator.

### Adding and promoting languages

Run `oxcode languages` to list the registered extractors.

* A best-effort language is promoted to high fidelity by adding a hand-written
  extractor.
* A new best-effort language is added with a tree-sitter query plus a profile
  entry (see `crates/oxcode-core/src/extract/profiles.rs`).

Recognized source files in a language with no extractor yet (e.g. Ruby, PHP, C#)
are reported as skipped rather than silently dropped.


## MCP server

`oxcode-mcp` is a stdio MCP server that exposes oxcode's read-only queries to
coding agents (Claude, Cursor, and others). It is the recommended way to give an
agent code context: it delivers the same bounded, PageRank-curated context as
the CLI, but through a single tool call instead of commands the agent has to
compose.

### Tools

| Tool                                | Purpose                                                           |
| ----------------------------------- | ----------------------------------------------------------------- |
| `oxcode_explore`                    | One-call, PageRank-curated context for a task. The headline tool. |
| `oxcode_search`                     | Keyword discovery over symbols.                                   |
| `oxcode_callers` / `oxcode_callees` | Walk the call graph in or out from a symbol.                      |
| `oxcode_symbol`                     | Resolve a selector to its definition and source.                  |
| `oxcode_files`                      | Find files relevant to a query.                                   |
| `oxcode_status`                     | Report index state.                                               |

### Why MCP is the headline

On the Tokio agent-task benchmark, the one-call `oxcode_explore` MCP tool cuts
tool calls 84%, tokens 74%, cost 57%, and wall time 60% versus the no-tool
baseline — exceeding codegraph's published reductions. The CLI arm, by contrast,
is statistically tied with the baseline: an agent treats a shell binary as a
supplement to its own grep/read, not a replacement. The gap was always tool
delivery, not index quality.

See [Benchmarks](/oxcode/benchmarks) for the full table.


## OxQL

`query` and `explain` execute raw OxQL (an OxGraph-native, Cypher-flavored query
language). For keyword discovery prefer [`symbols`](/oxcode/cli); do not pass
plain English phrases to `query`.

### Accepted profile

```text
CATALOG
MATCH ELEMENTS
MATCH ELEMENTS HAS LABEL <label>
MATCH ELEMENTS WHERE <property> = '<value>'
MATCH RELATIONS TYPE <type>
GRAPH calls WALK FROM <element-id> DEPTH <n> [DIRECTION outgoing|incoming|both] [LIMIT <n>]
```

### Examples

```sh
cargo run -p oxcode -- query "MATCH ELEMENTS WHERE qualified_name = 'crate::entry'" --path path/to/rust/project
cargo run -p oxcode -- query "MATCH RELATIONS TYPE calls" --format expand --path path/to/rust/project
cargo run -p oxcode -- query "GRAPH calls WALK FROM 12 DEPTH 2 DIRECTION both LIMIT 100" --path path/to/rust/project
```

Use `--format expand` to expand graph IDs back into function names, definition
ranges, signatures, and source context.
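
To show how the `GRAPH ... WALK` form in the accepted profile decomposes, the sketch below parses it into a struct. This is an illustration of the documented syntax only, not oxcode's parser: the `Walk` type is hypothetical, clause order is assumed to be fixed as written in the profile, and omitted optional clauses are left as `None` because the defaults are not stated here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Direction {
    Outgoing,
    Incoming,
    Both,
}

#[derive(Debug, PartialEq)]
struct Walk {
    graph: String,
    from: u64,
    depth: u32,
    direction: Option<Direction>,
    limit: Option<usize>,
}

fn parse_walk(query: &str) -> Result<Walk, String> {
    let tokens: Vec<&str> = query.split_whitespace().collect();
    // Fixed prefix: GRAPH <name> WALK FROM <element-id> DEPTH <n>
    let (graph, from, depth) = match tokens.as_slice() {
        ["GRAPH", graph, "WALK", "FROM", from, "DEPTH", depth, ..] => (graph, from, depth),
        _ => return Err("expected GRAPH <name> WALK FROM <id> DEPTH <n>".to_string()),
    };
    let mut walk = Walk {
        graph: graph.to_string(),
        from: from.parse().map_err(|_| format!("bad element id: {from}"))?,
        depth: depth.parse().map_err(|_| format!("bad depth: {depth}"))?,
        direction: None,
        limit: None,
    };
    // Optional clauses, in the order the profile lists them.
    let mut rest = &tokens[7..];
    if let ["DIRECTION", dir, tail @ ..] = rest {
        walk.direction = Some(match *dir {
            "outgoing" => Direction::Outgoing,
            "incoming" => Direction::Incoming,
            "both" => Direction::Both,
            other => return Err(format!("bad direction: {other}")),
        });
        rest = tail;
    }
    if let ["LIMIT", n, tail @ ..] = rest {
        walk.limit = Some(n.parse().map_err(|_| format!("bad limit: {n}"))?);
        rest = tail;
    }
    if !rest.is_empty() {
        return Err(format!("unexpected trailing tokens: {rest:?}"));
    }
    Ok(walk)
}

fn main() {
    let query = "GRAPH calls WALK FROM 12 DEPTH 2 DIRECTION both LIMIT 100";
    println!("{:?}", parse_walk(query));
}
```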