Digit Recognizer
Legacy MNIST kept as the canonical scaffold — the smoke-test that proves challenge.json, the dashboard registry, the remote bootstrap, and the submit-to-Kaggle flow all still work.
The role this entry plays
Every new competition in this repo is registered the same way:
challenge.json on disk, an entry in the operator dashboard, a
handful of shell scripts wired through the Makefile (sync, train,
pull, submit), and an AGENTS.md describing the Codex flow. When
any of that scaffold breaks, MNIST is the cheapest way to find out.
So digit-recognizer stays in the registry as a known-good smoke test. Its current state is not the score — it’s the contract.
Dataset spotlight
Figure: per-digit class counts from the actual Kaggle digit-recognizer train.csv (canonical MNIST). Roughly balanced across all ten digits; images are 28×28 grayscale.
Figure: stylized glyphs in place of MNIST pixels; the real raster data lives in Kaggle's CSV. The point here isn't classification difficulty; it's that the project's value sits in the operational scaffold, not the score.
Data schema
MNIST (Kaggle digit-recognizer): 42,000 train rows · 28,000 test rows. The smallest possible computer-vision competition, which is what makes it useful as a plumbing check.
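To make the schema concrete, here is a minimal sketch of reading one training row. It assumes the standard Kaggle layout for this competition (a `label` column followed by `pixel0` through `pixel783`); the helper name `parse_train_row` is hypothetical and not part of this repo. An in-memory row stands in for the real train.csv.

```python
import csv
import io

SIDE = 28  # images are 28x28 grayscale

def parse_train_row(row):
    """Split one train.csv row (a dict) into (label, 28x28 nested list)."""
    label = int(row["label"])
    pixels = [int(row[f"pixel{i}"]) for i in range(SIDE * SIDE)]
    # Regroup the flat 784-value vector into 28 rows of 28 pixels.
    image = [pixels[r * SIDE:(r + 1) * SIDE] for r in range(SIDE)]
    return label, image

# Tiny stand-in for train.csv: one all-zero image labeled 7.
header = ["label"] + [f"pixel{i}" for i in range(SIDE * SIDE)]
values = ["7"] + ["0"] * (SIDE * SIDE)
sample = io.StringIO(",".join(header) + "\n" + ",".join(values) + "\n")

label, image = parse_train_row(next(csv.DictReader(sample)))
```

Pointing `csv.DictReader` at the real file instead of `sample` should yield the 42,000 training rows one at a time.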
The scaffold contract
- 01 · scripts/new-competition.sh <slug> creates the workspace. Reads Kaggle metadata via the CLI, lays down competitions/<slug>/ with challenge.json + AGENTS.md.
- 02 · challenge.json describes the project: remote host · session · log paths · action scripts. The schema is intentionally small: only what the dashboard needs to route sync/pull/submit verbs.
- 03 · dashboard auto-registers: the Flask app picks up the new folder. Watch the local-or-remote training log; trigger sync/pull/submit from the browser via auth-gated endpoints.
- 04 · make {sync, train, pull, submit}: the four verbs. The competition=<slug> argument routes commands to the right project. Same surface across every competition in the repo.
- 05 · green submit on MNIST: smoke test passes. If MNIST reaches Kaggle with a valid submission CSV, the new-competition scaffold is healthy.
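Steps 03 and 04 above can be sketched roughly as follows. This is an illustration, not the repo's dashboard code: the function names `discover` and `resolve_action` are hypothetical, and the verb-to-key mapping is an assumption inferred from the `actions` keys in this entry's challenge.json.

```python
import json
from pathlib import Path

# Assumed mapping from user-facing verbs to keys under "actions".
# The real dashboard and Makefile may wire these differently.
VERB_TO_ACTION = {
    "sync": "sync_train",
    "pull": "pull_results",
    "submit": "submit",
}

def discover(root):
    """Return {slug: config} for every competitions/<slug>/challenge.json under root."""
    found = {}
    for path in sorted(Path(root).glob("competitions/*/challenge.json")):
        found[path.parent.name] = json.loads(path.read_text())
    return found

def resolve_action(config, verb):
    """Return the script path registered for a verb, failing loudly if it is missing."""
    if verb not in VERB_TO_ACTION:
        raise ValueError(f"unknown verb: {verb}")
    script = config.get("actions", {}).get(VERB_TO_ACTION[verb])
    if script is None:
        raise KeyError(f"{config.get('id', '?')}: no script wired for '{verb}'")
    return script
```

Failing loudly on a missing script is the design choice that matters here: a verb that silently does nothing is exactly the kind of plumbing fault the smoke test is meant to surface.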
{
  "id": "digit-recognizer",
  "name": "Digit Recognizer",
  "project_dir": "digit-recognizer",
  "kaggle_competition": "digit-recognizer",
  "description": "Legacy MNIST classification smoke-test kept as a standard registered challenge.",
  "outputs_dir": "outputs",
  "training_log": "training_log.jsonl",
  "metrics_log": "training_log.jsonl",
  "submissions_log": "submissions_log.jsonl",
  "remote_host": "vast",
  "remote_session": "training",
  "remote_session_prefix": "",
  "remote_project_dir": "",
  "remote_path_export": "export PATH=\"$HOME/.local/bin:$PATH\"",
  "remote_training_log": "outputs/train.log",
  "actions": {
    "sync_train": "scripts/sync-to-gpu.sh",
    "pull_results": "scripts/pull-results.sh",
    "submit": "scripts/submit-to-kaggle.sh",
    "stop_training": "scripts/stop-training.sh"
  }
}
Why a smoke test matters
When a Kaggle competition takes weeks of remote training, the operational surface is less forgiving than the modeling surface. A typo in a sync script, a stale Kaggle auth token, a dashboard endpoint that quietly stopped routing — these silently waste days. MNIST takes minutes to train and seconds to submit. If it doesn’t go end-to-end, something in the plumbing is wrong, and we know before paying for GPU.
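One cheap pre-flight check in this spirit is validating the submission CSV before it leaves the machine. The sketch below assumes the usual digit-recognizer submission layout (an `ImageId,Label` header and one row per test image); `check_submission` is a hypothetical helper, not a script from this repo.

```python
import csv
import io

EXPECTED_ROWS = 28_000       # test rows in Kaggle digit-recognizer
DIGITS = set("0123456789")   # valid single-character labels

def check_submission(fileobj, expected_rows=EXPECTED_ROWS):
    """Return a list of problems in a submission CSV; an empty list means it looks valid."""
    reader = csv.reader(fileobj)
    problems = []
    header = next(reader, None)
    if header != ["ImageId", "Label"]:
        problems.append(f"unexpected header: {header}")
    rows = 0
    first_bad = None
    for rows, row in enumerate(reader, start=1):
        well_formed = len(row) == 2 and row[0].isdigit() and row[1] in DIGITS
        if not well_formed and first_bad is None:
            first_bad = (rows, row)  # remember only the first offender
    if first_bad is not None:
        problems.append(f"malformed row {first_bad[0]}: {first_bad[1]}")
    if rows != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {rows}")
    return problems

# Two-row stand-in for a real submission file.
sample = io.StringIO("ImageId,Label\n1,3\n2,7\n")
result = check_submission(sample, expected_rows=2)
```

A check like this only catches format errors locally; auth and routing faults still need the real end-to-end submit.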
This entry doesn’t compete. It guards the path that lets real competitions compete.
→ For the operational stack this scaffold lives on, see the workflow page.