Numbers, strategy, and the infra to repeat both.
Jaguar Re-ID
Pairwise jaguar re-identification, scored as image similarity rather than classification. The retrieval-first refactor lifted the public score from a 0.421 baseline to 0.871.
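To make "scored as image similarity" concrete, here is a minimal sketch of pairwise scoring over precomputed embeddings. It assumes each image already has an embedding vector and that cosine similarity is the metric; the image ids, the vectors, and the `score_pairs` helper are hypothetical and are not taken from the project.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as "no evidence of a match"
    return dot / (norm_a * norm_b)

def score_pairs(embeddings, pairs):
    """Return one similarity score per (image_id, image_id) pair."""
    return [cosine_similarity(embeddings[q], embeddings[g]) for q, g in pairs]

# Hypothetical embeddings; a real pipeline would get these from a vision backbone.
embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.9, 0.1, 0.0],
    "img_c": [0.0, 1.0, 0.0],
}
scores = score_pairs(embeddings, [("img_a", "img_b"), ("img_a", "img_c")])
```

The design point is that no class head is involved: a higher score only means "more likely the same individual", which fits a pairwise metric better than per-identity logits would.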
- Computer Vision
- Bioinformatics
- Audio
- Tabular
Stanford RNA 3D Folding 2
Predict the 3D structure of RNA from sequence. A classical nearest-neighbor baseline over sequences beats anything that ignores the closest training neighbors; RibonanzaNet3D refines the candidates; a learned reranker picks the final five.
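The neighbor-lookup step can be sketched as below. This is an illustration under assumptions: it ranks training sequences with the standard library's `difflib.SequenceMatcher` ratio, whereas the project may well use a proper alignment score. The sequences and the `nearest_train_sequences` helper are hypothetical.

```python
from difflib import SequenceMatcher

def nearest_train_sequences(query, train, k=5):
    """Rank training sequences by string similarity to the query.

    `train` maps a training id to its RNA sequence. Returns up to k
    (train_id, similarity) tuples, most similar first.
    """
    scored = [
        (SequenceMatcher(None, query, seq, autojunk=False).ratio(), name)
        for name, seq in train.items()
    ]
    # Sort by descending similarity, breaking ties by id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:k]]

# Hypothetical toy data, not from the competition.
train = {"t1": "GGGAAACCC", "t2": "AUAUAUAUA", "t3": "GGGAAACCU"}
top = nearest_train_sequences("GGGAAACCC", train, k=2)
```

In a pipeline like the one described, the structures attached to the top neighbors would serve as the starting candidates that later stages refine and rerank.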
BirdCLEF 2026
Acoustic species ID in Pantanal soundscapes — research-track Kaggle. Pipeline live with a class-prior baseline; deep audio modeling is next.
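A class-prior baseline can be sketched in a few lines. This assumes one primary label per training recording and predicts the same empirical class frequencies for every clip; the species codes and helper names are hypothetical, and the project's actual baseline may differ.

```python
from collections import Counter

def class_priors(train_labels, classes):
    """Empirical frequency of each class in the training labels."""
    counts = Counter(train_labels)
    total = len(train_labels)
    if total == 0:
        return {c: 0.0 for c in classes}
    return {c: counts[c] / total for c in classes}

def predict_all(priors, n_clips):
    """Predict the same prior distribution for every clip, ignoring the audio."""
    return [dict(priors) for _ in range(n_clips)]

# Hypothetical species codes and labels.
classes = ["sp_a", "sp_b", "sp_c", "sp_d"]
priors = class_priors(["sp_a", "sp_a", "sp_b", "sp_c"], classes)
predictions = predict_all(priors, n_clips=3)
```

Because it never looks at the audio, a baseline like this mainly checks that the pipeline runs end to end and gives a floor for later models to beat.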
AI Adoption · Fortune 500
EDA + clustering on a synthetic Fortune-500 AI-adoption panel. Random Forest ROI predictor, K-means personas, PCA projection, correlation atlas, use-case trajectories.
Student Performance EDA
EDA on student grades — Random Forest predictor of overall_score, behavioral K-means personas, sleep / study correlations, A→F grade distributions.
Digit Recognizer
Legacy MNIST kept as the canonical scaffold — the smoke-test that proves challenge.json, the dashboard registry, the remote bootstrap, and the submit-to-Kaggle flow all still work.