Capability-bound infrastructure

Lease layer for disaggregated fabrics

Firmware for fabric resources

Fail-closed infrastructure

Capabilities for hardware. Leases instead of root.

The lease layer for disaggregated infrastructure.

BIOS for the disaggregated era.

Everything you allocate has a deadline — and a fence behind it.

Tenura is the firmware boundary for disaggregated GPU, memory, and storage fabrics. Tenants share metal under capability tokens that are TTL-bounded and audience-scoped, and any resource whose teardown fails is fenced.

Every GPU, memory pool, and storage namespace becomes a time-bounded lease. One capability model, one teardown contract, one audit log — across compute, memory, fabric, and block.

PC BIOS made hardware addressable to operating systems. fabricBIOS makes disaggregated fabric resources addressable to runtimes — with leases and capability tokens built in from day one.

Hardware-enforced teardown, capability-scoped access, no leaked GPU time and no leaked memory pages. The infrastructure layer that fails closed.


Private beta · self-hosted today · managed cells case-by-case

grafos · acme tenant

$ grafos deploy run inference.wasm \
    --tenant acme --mem 80G --lease-secs 300

✓ admitted   tenant=acme    cell=us-east-1
✓ leased     mem=80GiB     node=node-7
✓ running    program=inference.wasm

  lease expires in 5m
  auto-teardown on expiry · revocable on demand

What "lease-bound" actually buys you

Short-lived by default

Every capability token has a TTL of 300 seconds or less. Long-lived secrets are not the unit of access.

Cleanup is enforcement, not hope

If teardown fails on lease expiry, the resource enters a FENCED state. No new lease lands on dirty hardware.

mTLS is the floor

The control plane is QUIC with mutual TLS by default. There is no plaintext mode to opt in or out of.

How it works

Intent on top, capability tokens at the boundary, deterministic teardown at the metal.

01

A program declares intent

A grafOS program asks for resources by shape, not by identity: a GPU, 80 GiB of memory, a queue pair, a 5-minute TTL.
fabric.alloc_gpu().lease_secs(300).acquire()?
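
To make the one-liner above concrete, here is a minimal mock of a builder that would support that call chain. Only the chain `alloc_gpu().lease_secs(..).acquire()` comes from this page. The types `Fabric`, `GpuRequest`, `Lease`, and `AllocError`, the default TTL, and the rejection of out-of-range TTLs are assumptions made for illustration, not the real grafOS API.

```rust
// Hypothetical sketch: these types are NOT the real grafOS API.
// Only the call chain alloc_gpu().lease_secs(..).acquire() appears on the page.

/// Upper bound on a lease TTL, taken from the "300 seconds or less" claim.
const MAX_LEASE_SECS: u64 = 300;

#[derive(Debug, PartialEq)]
pub enum AllocError {
    /// The requested TTL is zero or exceeds MAX_LEASE_SECS.
    InvalidTtl(u64),
}

/// A granted lease: a resource shape plus a deadline, no device identity.
#[derive(Debug, PartialEq)]
pub struct Lease {
    pub resource: &'static str,
    pub ttl_secs: u64,
}

/// Handle a program would hold to request resources (assumed name).
pub struct Fabric;

/// Builder that accumulates the request before it is submitted.
pub struct GpuRequest {
    ttl_secs: u64,
}

impl Fabric {
    /// Start a request for "a GPU", described by shape rather than by device id.
    pub fn alloc_gpu(&self) -> GpuRequest {
        GpuRequest { ttl_secs: MAX_LEASE_SECS }
    }
}

impl GpuRequest {
    /// Set the lease duration in seconds.
    pub fn lease_secs(mut self, secs: u64) -> Self {
        self.ttl_secs = secs;
        self
    }

    /// Submit the request. A real implementation would go through scheduler
    /// admission; this mock only validates the TTL.
    pub fn acquire(self) -> Result<Lease, AllocError> {
        if self.ttl_secs == 0 || self.ttl_secs > MAX_LEASE_SECS {
            return Err(AllocError::InvalidTtl(self.ttl_secs));
        }
        Ok(Lease { resource: "gpu", ttl_secs: self.ttl_secs })
    }
}

/// Example caller using the `?` operator, as in the one-liner above.
pub fn run(fabric: &Fabric) -> Result<u64, AllocError> {
    let lease = fabric.alloc_gpu().lease_secs(300).acquire()?;
    Ok(lease.ttl_secs)
}
```

In this mock, a request for 301 seconds is rejected before any resource is touched, which is one plausible way a short-lived-by-default rule could surface to the caller.
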
02

fabricBIOS mints and enforces the scoped capability

After scheduler admission, the target fabricBIOS node issues a signed, audience-bound token for exactly that lease and hardware. Replay-protected, fenced by epoch.
cap.audience = node-7 · cap.epoch = 42 · cap.ttl ≤ 300s
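
The checks named above (audience, epoch, TTL, replay protection) can be sketched as a single admission function. This is a simplified illustration under stated assumptions: signature verification is omitted entirely, the `nonce` field and the in-memory set of seen nonces are a hypothetical stand-in for whatever replay protection fabricBIOS actually uses, and the names `Capability`, `Node`, and `Reject` are invented for this example.

```rust
use std::collections::HashSet;

/// TTL ceiling from the "cap.ttl ≤ 300s" line above.
const MAX_TTL_SECS: u64 = 300;

/// Hypothetical token layout; signature bytes are left out of this sketch.
#[derive(Debug, Clone)]
pub struct Capability {
    pub audience: String, // node the token is bound to, e.g. "node-7"
    pub epoch: u64,       // fencing epoch the token was minted under
    pub issued_at: u64,   // seconds on the node's clock
    pub ttl_secs: u64,
    pub nonce: u64,       // assumed replay-protection field
}

#[derive(Debug, PartialEq)]
pub enum Reject {
    WrongAudience,
    StaleEpoch,
    TtlTooLong,
    Expired,
    Replayed,
}

/// The node that receives and checks tokens (assumed shape).
pub struct Node {
    pub name: String,
    pub epoch: u64,
    seen: HashSet<u64>,
}

impl Node {
    pub fn new(name: &str, epoch: u64) -> Self {
        Node { name: name.to_string(), epoch, seen: HashSet::new() }
    }

    /// Accept the token only if every check passes; `now` is in seconds.
    pub fn admit(&mut self, cap: &Capability, now: u64) -> Result<(), Reject> {
        if cap.audience != self.name {
            return Err(Reject::WrongAudience);
        }
        // A token minted under another epoch is refused outright.
        if cap.epoch != self.epoch {
            return Err(Reject::StaleEpoch);
        }
        if cap.ttl_secs > MAX_TTL_SECS {
            return Err(Reject::TtlTooLong);
        }
        if now >= cap.issued_at.saturating_add(cap.ttl_secs) {
            return Err(Reject::Expired);
        }
        // HashSet::insert returns false when the nonce was already present.
        if !self.seen.insert(cap.nonce) {
            return Err(Reject::Replayed);
        }
        Ok(())
    }
}
```

The nonce is recorded last so that a token rejected for another reason does not consume its nonce. That ordering is a design choice of this sketch, not something the page specifies.
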
03

Expiry triggers teardown — automatically

When the lease ends or is revoked, the data plane is torn down. A failed teardown is not retried: the resource is fenced instead.
lease expired → teardown(ok) | teardown(fail) → FENCED
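
The transition above can be modeled as a small state machine. The sketch below is illustrative only: the page names the `FENCED` state, while `Free` and `Leased`, the `Resource` type, and the method names are assumptions. How a fenced resource is eventually recovered is not described on this page, so the sketch leaves it out.

```rust
/// Resource states. Only `Fenced` is named on the page; the rest are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResourceState {
    Free,
    Leased,
    Fenced,
}

/// Outcome of the single teardown attempt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Teardown {
    Ok,
    Failed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LeaseError {
    /// The resource is leased or fenced, so no new lease may land on it.
    NotFree(ResourceState),
}

pub struct Resource {
    state: ResourceState,
}

impl Resource {
    pub fn new() -> Self {
        Resource { state: ResourceState::Free }
    }

    pub fn state(&self) -> ResourceState {
        self.state
    }

    /// Grant a lease only on clean, unleased hardware.
    pub fn lease(&mut self) -> Result<(), LeaseError> {
        match self.state {
            ResourceState::Free => {
                self.state = ResourceState::Leased;
                Ok(())
            }
            other => Err(LeaseError::NotFree(other)),
        }
    }

    /// Called on expiry or revocation with the teardown outcome.
    /// There is one attempt and no retry: failure fences the resource.
    pub fn end_lease(&mut self, outcome: Teardown) {
        if self.state != ResourceState::Leased {
            return; // nothing to tear down
        }
        self.state = match outcome {
            Teardown::Ok => ResourceState::Free,
            Teardown::Failed => ResourceState::Fenced,
        };
    }
}
```

Because `lease` refuses anything that is not `Free`, a fenced resource stays out of circulation without any extra bookkeeping, which mirrors the "no new lease lands on dirty hardware" rule.
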

Why it exists

GPU infrastructure is too valuable to be trapped in static boxes.

A 4-GPU node serving eight tenants today usually pins one quarter-GPU per tenant for the day — and runs at 30–40% utilization while everyone holds capacity they aren't using. Tenura replaces that with leases minted on demand and reclaimed automatically when the workload finishes or the TTL elapses.

The split between grafOS and fabricBIOS keeps policy out of the hardware boundary. Programs say what they need; the scheduler admits the request; fabricBIOS mints and enforces the scoped capability at the resource boundary.

Multi-tenant inference

Slice one GPU between tenants by lease, not by VM. Idle capacity returns to the pool in seconds.

Burst training

Reserve a fabric of GPUs for a job, hold it only for the run, release it on completion or failure.

Shared research clusters

Replace static "GPU per grad student" with leases that expire when a notebook idles.

Disaggregated memory and storage

Lease memory and block storage from anywhere on the fabric. Teardown follows the lease, not the physical wiring.

Read deeper

How Tenura works, in four reads.

Read the docs

grafOS and fabricBIOS concepts, the CLI flow, and operator notes.

Architecture

The programming model behind leases, capabilities, placement, recovery.

Economics

How finer-grained leases change utilization and capacity planning.

Use cases

Concrete fits: multi-tenant inference, burst training, shared research, disaggregated memory.

Common questions

Three things people ask first.

Is this a scheduler, a hypervisor, or a runtime?

A control layer beneath all three. grafOS handles program intent and scheduler admission; fabricBIOS mints, validates, revokes, and enforces capabilities at the resource boundary.

What hardware does it run on today?

NVIDIA GPUs (L4 silicon-validated), CXL/RDMA fabrics, NVMe-oF block, and Raspberry Pi 5 bare metal as a development target.

Self-hosted or managed?

Self-hosted today. Managed cells are in private beta; the form below is how you apply for one.

See the full FAQ →

Beta access

Tell us what you want to build.

We review applications weekly. Approved teams get a one-click signup link by email and a short onboarding call to size the workload. A workload note helps us route access to teams that can exercise the fabric meaningfully.

Self-hosted install — Linux + an NVIDIA GPU is enough to start.
Managed cells available case-by-case during private beta.
Capability tokens, mTLS, and audit logs are on by default.

After approval

One line installs the CLI and joins your cell.

curl -fsSL https://get.tenura.systems/install.sh | sh

macOS arm64 / Linux x86_64 / Linux arm64. The installer drops grafos on your PATH and registers your beta token.

On Windows? Same release, manual extract.

grafos-0.2.8-x86_64-pc-windows-gnu.tar.gz @ releases.tenura.systems/0.2.8/

Download the tarball, verify the minisig against the same public key used for every other platform, extract grafos.exe, and put it on your PATH. A PowerShell installer is on the roadmap. Quickstart →