POC configuration — tunables locked in superskill.yaml for production (air-gapped)

PIPELINE IDLE

No active runs. Hit Start to begin a pipeline run.

Awaiting hardware probe... | Memory: 85% | GPU: 0% | I/O: 0.0 / 0.0 MB/s
Memory Budget: 20.5 / 24 GB (85%)
OS 3.0G
Student 6.0G
Teacher 9.0G
LoRA Grads 2.5G
Free 3.5G
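
The budget figures above can be checked with a few lines of arithmetic. This is a minimal sketch, not the dashboard's own code: the names `BUDGET_GB`, `TOTAL_GB`, and `memory_summary` are hypothetical, and only the component sizes and the 24 GB total come from the panel.

```python
# Component sizes in GB, copied from the Memory Budget panel.
BUDGET_GB = {"OS": 3.0, "Student": 6.0, "Teacher": 9.0, "LoRA Grads": 2.5}
TOTAL_GB = 24.0  # total unified memory shown in the panel


def memory_summary(components, total_gb):
    """Return (used GB, free GB, used percent rounded to an integer)."""
    used = sum(components.values())
    free = total_gb - used
    percent = round(100 * used / total_gb)
    return used, free, percent


used, free, percent = memory_summary(BUDGET_GB, TOTAL_GB)
print(f"{used:.1f} / {TOTAL_GB:.0f} GB ({percent}%), free {free:.1f} GB")
```

With the panel's values, the four components sum to 20.5 GB, which leaves 3.5 GB free and rounds to 85% of 24 GB.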

Tunables

LoRA Rank (r): 16 (range 4 to 64)
Decomposition rank. Higher = more capacity, more memory.
Memory: linear

LoRA Alpha: 32 (range 8 to 128)
Scaling factor. Typically 2x rank.

LoRA Dropout: 0.05 (range 0 to 0.2)
Regularization. Higher for small datasets.

LoRA Layers: 16 (range 4 to 32)
Transformer layers with LoRA adapters. Fewer = less memory.
Memory: linear

Batch Size: 4 (range 1 to 16)
Micro-batch size. Limited by unified memory.
Memory: linear

Grad Accumulation: 4 (range 1 to 16)
Effective batch = batch_size x this. Free quality boost.

Max Seq Length: 2048 tokens (range 256 to 8192)
Max token length. Memory scales quadratically with attention.
Memory: quadratic

Learning Rate: 1.0e-4 (range 0.00001 to 0.0005)
Peak LR for cosine schedule. Lower for larger rank.

Epochs / Cycle: 3 (range 1 to 10)
Full passes per swarm cycle. More = more fitting.

Warmup Steps: 100 (range 10 to 500)
Linear warmup before peak LR.

Checkpoint Every: 500 steps (range 25 to 2000)
Save frequency. More frequent saves = safer but more disk I/O.

Grad Checkpoint
Trade compute for memory. Useful on 16GB machines.
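
The relationships described in the tunables above (slider ranges, effective batch size, the alpha-to-rank ratio, and linear warmup into a cosine schedule) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the contents of superskill.yaml or the pipeline's implementation. The field names, the `Tunables` class, the `total_steps` argument, and the decay-to-zero floor are all hypothetical. The `alpha / r` scale follows the common LoRA convention, which this page does not itself confirm.

```python
import math
from dataclasses import dataclass

# (min, max) slider ranges as shown on the page; key names are hypothetical.
RANGES = {
    "lora_rank": (4, 64),
    "lora_alpha": (8, 128),
    "lora_dropout": (0.0, 0.2),
    "lora_layers": (4, 32),
    "batch_size": (1, 16),
    "grad_accumulation": (1, 16),
    "max_seq_length": (256, 8192),
    "learning_rate": (1e-5, 5e-4),
    "epochs_per_cycle": (1, 10),
    "warmup_steps": (10, 500),
    "checkpoint_every": (25, 2000),
}


@dataclass
class Tunables:
    # Defaults are the current values displayed on the page.
    lora_rank: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.05
    lora_layers: int = 16
    batch_size: int = 4
    grad_accumulation: int = 4
    max_seq_length: int = 2048
    learning_rate: float = 1.0e-4
    epochs_per_cycle: int = 3
    warmup_steps: int = 100
    checkpoint_every: int = 500

    def validate(self):
        """Raise ValueError if any value falls outside its slider range."""
        for name, (lo, hi) in RANGES.items():
            value = getattr(self, name)
            if not lo <= value <= hi:
                raise ValueError(f"{name}={value} outside [{lo}, {hi}]")

    @property
    def effective_batch(self):
        # "Effective batch = batch_size x this" from the Grad Accumulation note.
        return self.batch_size * self.grad_accumulation

    @property
    def lora_scale(self):
        # Assumption: the common LoRA convention of scaling updates by alpha / r.
        return self.lora_alpha / self.lora_rank

    def lr_at(self, step, total_steps):
        """Linear warmup to the peak LR, then cosine decay to zero (assumed floor)."""
        if step < self.warmup_steps:
            return self.learning_rate * (step + 1) / self.warmup_steps
        span = max(1, total_steps - self.warmup_steps)
        progress = min(1.0, (step - self.warmup_steps) / span)
        return 0.5 * self.learning_rate * (1 + math.cos(math.pi * progress))
```

With the page defaults, `Tunables().effective_batch` is 16 (4 x 4) and `Tunables().lora_scale` is 2.0, matching the "typically 2x rank" note for alpha. Calling `validate()` on out-of-range values, such as a rank of 128, raises a `ValueError`.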

Presets

Development

Simulated phases with stub delays. Good for testing the dashboard and pipeline flow.

Standard

Real training. Balanced quality settings. Pause anytime, resume when ready.

Machine will be busy during training. Pause if you need it back.

Maximum Quality

Full convergence. Runs until the swarm gives up trying to break it.

Machine will be saturated. Let it run — pause if needed.
