ARAIL — EXPERIMENTS & EDUCATION
A rail for AI experimentation & education.
ARAIL ships as a blueprint: bring an agent and an IDE, then tailor and set it up yourself. It includes a Python runtime, AeroLLM, a starter model, and supporting agents, including Buddy. Everything you need to start a session offline, on a single box.
If you have a smaller machine, choose Minimalist mode. Otherwise, set up Maximum mode to build out the full feature set: a Rust router, AeroLLM, an agent swarm, AutoResearch, and frontier teacher models.
QuKaiZen · ARAIL · v0.4 · airgapped
02 — ORIGIN
Why I built this.
First there was Buddy. Then a place for him to live.
I wanted to learn AI — properly. I was curious how the models were built. I didn't know where to begin, so I started with a companion: Buddy. Then I built an environment for him. That environment became ARAIL.
Then I found agentic AI, then autoresearch — and couldn't get enough. The lab needed to be pluggable, and I had to be able to see what was happening inside it.
03 — THE RAIL
AutoResearch AI Lab.
The name is the goal. A rail — straight, fast, repeatable — that you mount things onto. The rail is the scaffolding. What rides on it can change month to month — the operating surface stays the same.
- same dashboard for every experiment
- same activity stream for every agent
- same knowledge base behind every loop
- same airgapped → hybrid switch for every call
THE LAB — MINIMALIST CORE · MAXIMUM REACH
IN THE LAB · MISSION CONTROL
Not a mockup — the real operating surface. Every experiment, agent, and knowledge loop on one screen, server-rendered and airgapped.
05 — DEFINITION
Three commands to a local AI lab.
A clone-and-run research bench. Dashboard, chat, autoresearch loop, knowledge base, agents — server-rendered, no SPA, no telemetry. Default mode is airgapped. Zero network calls until you flip LAB_MODE=hybrid.
$ git clone cdarnell/arail
cloned to ~/arail
$ ./arail setup
✓ venv created
✓ deps pinned
✓ portal ready
$ ./arail start
▶ serving on 127.0.0.1:8080
⬤ airgapped
⬤ buddy · sre · researcher
$ open http://127.0.0.1:8080
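The "zero network calls until you flip LAB_MODE=hybrid" rule can be pictured as a small gate in front of every outbound call. This is a minimal sketch, assuming the mode is read from a `LAB_MODE` environment variable and that anything other than `hybrid` means airgapped. The names `lab_mode`, `guard_network_call`, and `AirgappedError` are hypothetical and are not ARAIL's actual API.

```python
import os


class AirgappedError(RuntimeError):
    """Raised when code attempts an outbound call while the lab is airgapped."""


def lab_mode(env=None):
    """Return 'hybrid' only when LAB_MODE says so; everything else is airgapped."""
    env = os.environ if env is None else env
    value = env.get("LAB_MODE", "airgapped").strip().lower()
    return "hybrid" if value == "hybrid" else "airgapped"


def guard_network_call(target, env=None):
    """Allow loopback always; allow other hosts only in hybrid mode."""
    if target in ("127.0.0.1", "localhost"):
        return target
    if lab_mode(env) != "hybrid":
        raise AirgappedError(f"blocked call to {target}: LAB_MODE is airgapped")
    return target
```

Defaulting to airgapped when the variable is missing or misspelled means a typo fails closed rather than open.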
04 — COST POSTURE
Pay for frontier models. Just not by accident.
Simulate before you spend.
LOCAL MODE — AIRGAPPED
$0
Iterate freely. Run a thousand experiments overnight. Zero network calls until you say so.
HYBRID MODE — METERED
$$
Every cloud call is logged with provider, model, and tokens. Same surface. Same agents. Real receipts.
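A receipt log of this kind could look roughly like the following. It is a sketch under assumptions: the record fields mirror the three items named above (provider, model, tokens), and the `CloudCall` and `Ledger` names are hypothetical. No prices are included because the page does not state any.

```python
from dataclasses import dataclass, field


@dataclass
class CloudCall:
    """One metered call: who served it, which model, how many tokens."""
    provider: str
    model: str
    input_tokens: int
    output_tokens: int

    @property
    def tokens(self):
        return self.input_tokens + self.output_tokens


@dataclass
class Ledger:
    """Append-only list of cloud calls with a per-model token rollup."""
    calls: list = field(default_factory=list)

    def record(self, provider, model, input_tokens, output_tokens):
        call = CloudCall(provider, model, input_tokens, output_tokens)
        self.calls.append(call)
        return call

    def tokens_by_model(self):
        totals = {}
        for call in self.calls:
            key = (call.provider, call.model)
            totals[key] = totals.get(key, 0) + call.tokens
        return totals
```

In airgapped mode the ledger simply stays empty, which is one way to read the "$0" above.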
06 — TIERS
Two tiers, one surface.
TIER · STARTER · MINIMALIST
The lab
- Dashboard
- Chat — local + cloud
- Autoresearch loop
- Knowledge Base
- Agents — Buddy · SRE · Researcher
- Airgapped by default
TIER · OPERATOR · MAXIMUM
The operator's console
- everything in Minimalist +
- Admin · Docs · Notebooks
- 405B local inference via AeroLLM
- Cloud SDKs · provider routing
- Still airgapped unless you flip LAB_MODE=hybrid
BUDDY TUNNEL
Hybrid only — gateway required
The dashed branch in the diagram. Telegram, Slack, Discord, and Signal connect directly; iMessage and WhatsApp go through bridges. The tunnel needs internet access and a gateway. Airgapped mode blocks it by design; see docs/BUDDY.md.
07 — PLUGGABILITY
What plugs into the rail.
Four mounting points. Swap any one. Same protocol, same activity stream, same token accounting.
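One way to read "same protocol, same activity stream, same token accounting" is a single interface that every mounted component satisfies, with the rail doing the bookkeeping. The sketch below assumes that shape. `Mount`, `Rail`, `EchoMount`, and the slot name are hypothetical, and the whitespace word count is only a stand-in for real token counting.

```python
from typing import Protocol


class Mount(Protocol):
    """Anything that plugs into the rail exposes a name and a handle() call."""
    name: str

    def handle(self, prompt: str) -> tuple[str, int]:
        """Return (reply, tokens_used)."""
        ...


class EchoMount:
    """Toy mount: upper-cases the prompt and counts whitespace-separated words."""
    name = "echo"

    def handle(self, prompt):
        return prompt.upper(), len(prompt.split())


class Rail:
    """Holds one mount per slot and keeps shared activity and token totals."""

    def __init__(self):
        self.mounts = {}
        self.activity = []
        self.tokens = 0

    def plug(self, slot, mount):
        self.mounts[slot] = mount  # swapping a mount is just replacing the slot

    def call(self, slot, prompt):
        mount = self.mounts[slot]
        reply, used = mount.handle(prompt)
        self.activity.append((slot, mount.name, used))
        self.tokens += used
        return reply
```

Because the rail, not the mount, writes to the activity list and the token total, swapping a component does not change what gets recorded.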
08 — COMPARE
Same prompt. Two minds. One screen.
Local vs cloud. Small vs large. Base vs fine-tuned. Real comparison, not benchmarks.
parse_goal> explain the bias-variance tradeoff like I write firmware
QWEN-3-8B
local · MLX
Bias is your model's stuck-at fault — it always answers a little wrong. Variance is marginal noise — different inputs ring differently. You tune capacity to balance them, same as a low-pass filter…
CLAUDE-OPUS-4.1
cloud · anthropic
Think of bias as systematic miscalibration baked into your model class, and variance as jitter from the specific training samples you happened to draw. They trade off through model capacity…
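A comparison view like the one above boils down to sending one prompt to several backends and laying the replies out in columns. This is a rough sketch, assuming each backend is a plain callable that takes a prompt and returns text. `compare` and `side_by_side` are hypothetical helper names, and real backends would replace the stand-in callables.

```python
import textwrap
from itertools import zip_longest


def compare(prompt, backends):
    """Send the same prompt to every backend; keep replies in insertion order."""
    return [(label, generate(prompt)) for label, generate in backends.items()]


def side_by_side(results, width=30):
    """Render labeled replies as fixed-width columns separated by ' | '."""
    columns = []
    for label, reply in results:
        columns.append([label.upper()] + textwrap.wrap(reply, width))
    rows = zip_longest(*columns, fillvalue="")
    return "\n".join(
        " | ".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )
```

Usage is along the lines of `print(side_by_side(compare(prompt, {"local": run_local, "cloud": run_cloud})))`, where `run_local` and `run_cloud` are whatever callables wrap your models.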
09 — OBSERVABILITY
If you can't see it, you can't trust it.
Every agent action — timestamped, attributed, token-counted, streamed live.
Activity stream
tail -f · live
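An activity stream that can be followed with `tail -f` is often just an append-only file with one JSON object per line. The sketch below assumes that format and the four fields named above (timestamp, agent, action, token count). The function names and the file layout are illustrative assumptions, not ARAIL's actual schema.

```python
import json
import time


def make_event(agent, action, tokens, now=None):
    """Build one timestamped, attributed, token-counted event."""
    return {
        "ts": time.time() if now is None else now,
        "agent": agent,
        "action": action,
        "tokens": tokens,
    }


def append_event(path, event):
    """Append the event as a single JSON line so the file can be tailed live."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(event) + "\n")


def read_events(path):
    """Load every event back, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

One line per event keeps writes atomic enough for a single writer and makes the log readable with ordinary shell tools.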
10 — MEMORY
Knowledge base — the gem.
The lab gets smarter the longer you use it. Drop data in. Agents drop data in. Relevant pieces get RAG'd back into context — for you and for them.
KNOWLEDGE_BASE/
RAG · retrieve · augment · generate
YOU
papers · notes · code
BUDDY
distilled summaries
RESEARCHER
experiment notes
SRE
crash transcripts
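The retrieve and augment steps can be sketched with nothing more than word counts. This toy version assumes notes are `(source, text)` pairs and ranks them by bag-of-words cosine similarity; a real knowledge base would more likely use embeddings, and nothing here reflects how ARAIL actually indexes its data. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, notes, k=2):
    """Return up to k (source, text) notes that share words with the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), source, text) for source, text in notes]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(source, text) for score, source, text in scored[:k] if score > 0]


def augment(query, notes, k=2):
    """Prepend the retrieved notes to the question, tagged by who wrote them."""
    context = "\n".join(f"[{source}] {text}" for source, text in retrieve(query, notes, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Tagging each retrieved note with its source keeps the attribution visible, whether the note came from you or from an agent.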
11 — CONSTRAINT
The hardware wall.
Even a $5,000 GPU still settles.
The compromise: < 100B parameters · small context · loud fans · hot rack. That wall is real. You can buy your way closer to it. You can't buy your way through it.
So I stopped trying to buy through it. I built around it.
WHAT YOU ACTUALLY WANT TO RUN
405B
WHAT FITS — AT SMALL CONTEXT
~70B
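The gap between those two numbers follows from simple arithmetic: weight memory is parameter count times bits per weight, divided by eight. The sketch below counts weights only, and it leaves out the KV cache, activations, and runtime overhead, so real requirements are higher. The 64 GB figure is the Maximum-tier memory listed in the hardware table; the bit widths are common choices used here as assumptions.

```python
def weight_gigabytes(params_billion, bits_per_weight):
    """GB needed for the weights alone (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, memory_gb):
    """True if the weights alone fit in the given memory."""
    return weight_gigabytes(params_billion, bits_per_weight) <= memory_gb


if __name__ == "__main__":
    for params in (70, 405):
        for bits in (16, 4):
            size = weight_gigabytes(params, bits)
            verdict = "fits" if fits(params, bits, 64) else "does not fit"
            print(f"{params}B at {bits}-bit: {size} GB, {verdict} in 64 GB")
```

At 4 bits per weight a 70B model needs 35 GB for weights and a 405B model needs 202.5 GB, which is why the larger model cannot simply be loaded whole on a 64 GB machine.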
12 — INFERENCE
AeroLLM — built around the wall.
Layered inference for MLX and CUDA. Same idea AirLLM proved — stream weights in, compute, evict — rebuilt for unified memory and quiet enclosures.
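The "stream weights in, compute, evict" loop can be shown with a toy network whose layers live on disk. This is only an illustration of the general technique, assuming tiny dense layers stored as JSON files; it is not AeroLLM's or AirLLM's code, and real implementations stream tensor shards to an accelerator rather than Python lists.

```python
import json
import os


def save_layers(directory, layers):
    """Write each layer's weight matrix to its own file; return paths in order."""
    paths = []
    for index, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{index:03d}.json")
        with open(path, "w", encoding="utf-8") as handle:
            json.dump(weights, handle)
        paths.append(path)
    return paths


def apply_layer(weights, vector):
    """Dense layer followed by ReLU; weights is a list of rows."""
    return [max(0.0, sum(w * x for w, x in zip(row, vector))) for row in weights]


def stream_forward(paths, vector):
    """Run the forward pass while holding only one layer's weights at a time."""
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            weights = json.load(handle)        # stream in
        vector = apply_layer(weights, vector)  # compute
        del weights                            # evict before loading the next layer
    return vector
```

Peak memory is bounded by the largest single layer plus the activations, at the cost of re-reading weights from storage on every pass.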
# INSPIRATION
▸ AirLLM
- + proved the idea
- − unstable on long runs
- − no MLX support
- − CUDA-only · 3090 sounded like a leaf blower
# THE ANSWER
▸ AeroLLM — same idea, made for laptops
- ✓ MLX + CUDA
- ✓ stable on hour-long runs
- ✓ ships with ARAIL Maximum
Thanks to the AirLLM team for charting the path.
13 — HARDWARE
We meet you where you are.
No pretending. Minimalist runs almost anywhere. Maximum needs the rig.
MINIMALIST
| cpu | any modern x86_64 / Apple silicon |
| ram | 16 GB |
| gpu | optional |
| disk | 20 GB |
| os | macOS · Linux · WSL2 |
| net | none required |
MAXIMUM
| cpu | M-series MacBook Pro · serious gaming rig |
| ram | 64 GB unified / 64 GB sys + 24 GB VRAM |
| gpu | RTX 3090 / 4090 · M3 Max / M4 Pro+ |
| disk | 500 GB NVMe |
| os | macOS 14+ · Linux |
| net | required only for hybrid mode |
14 — NEXT FRONTIER
Project Nucleus — a model that learns from your knowledge base.
Nucleus reads the world's published research on distillation and efficient fine-tuning, then applies it to a base model + your KB. You get a smaller, sharper, you-shaped model.
01 · INGEST
knowledge_base/
papers · notes · experiments
02 · SURVEY
techniques
distillation · LoRA · DPO
03 · TRAIN
recipe
picked, not guessed
04 · SHIP
you-shaped model
smaller · sharper · local
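Of the techniques listed in the survey step, distillation is the easiest to show in a few lines. The sketch below is the textbook temperature-scaled form (KL divergence between softened teacher and student distributions, scaled by the temperature squared). It is offered as background on the technique and makes no claim about the recipe Nucleus will actually pick; the function names and the default temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, times temperature squared."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the signal a smaller model trains against.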
⬤ IN DEVELOPMENT · EXPECTED v0.6 — EARLY LAB ACCESS VIA DISCORD
15 — AVAILABLE
A QuKaiZen product. Built the kaizen way.
Available by request.
MINIMALIST
⬤ AVAILABLE
Dashboard · Chat · Autoresearch · Knowledge Base · Agents. Airgapped by default · hybrid opt-in. Runs on any modern laptop.
MAXIMUM
⬤ AVAILABLE
Everything in Minimalist + Admin · Docs · Notebooks. AeroLLM · 405B local inference. Cloud SDKs · provider routing. Serious rig or M-series MacBook Pro.
NUCLEUS — ADD-ON
⬤ PREVIEW
Fine-tuning pipeline. A smaller, sharper, you-shaped model distilled from your knowledge base. See /nucleus.
ACCESS
By request · qukaizen.com
Not open source · Pull requests not accepted · Source available under select arrangements