01 kaggle work · shivam bhardwaj

Kaggle Challenges

FEATURED · FILE 01 / Computer Vision Case study

Jaguar Re-ID

Pairwise jaguar re-identification — scored as image similarity, not classification. The retrieval-first refactor moved a 0.421 public baseline to 0.871.
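A minimal sketch of what "scored as image similarity" can look like: each image pair gets one cosine-similarity score between embedding vectors. The embeddings, image ids, and function names here are illustrative assumptions, not the project's actual code.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def score_pairs(embeddings, pairs):
    """Return one similarity score per (image_id, image_id) pair."""
    return [cosine_similarity(embeddings[q], embeddings[g]) for q, g in pairs]

# Hypothetical 2-D embeddings; a real backbone would emit far more dimensions.
embeddings = {
    "img_a": [1.0, 0.0],
    "img_b": [2.0, 0.0],  # same direction as img_a
    "img_c": [0.0, 3.0],  # orthogonal to img_a
}
scores = score_pairs(embeddings, [("img_a", "img_b"), ("img_a", "img_c")])
```

Because the score depends only on direction, no identity label is needed at prediction time, which is what separates this framing from classification.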

result: Leaderboard case study
confidence: High — public score and fixed validation both agree
next: Test higher-resolution EVA/Swin backbones only after preserving the retrieval validation split.

baseline 0.421

best public 0.871

mAP (retrieval) +106.9% improvement

EVA-02 Large 448 · GeM pooling · ArcFace head · TTA flip · AQE · k-reciprocal rerank
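Of the components listed, GeM (generalized-mean) pooling is the simplest to show in isolation. The sketch below is a plain-Python illustration, assuming the common default exponent p = 3 and a small clamp value; the project's actual settings are not stated here.

```python
def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over spatial positions, one value per channel.

    feature_map: list of channels, each a flat list of spatial activations.
    p=1 reduces to average pooling; large p approaches max pooling.
    """
    pooled = []
    for channel in feature_map:
        clamped = [max(v, eps) for v in channel]  # keep the power well-defined
        mean_pow = sum(v ** p for v in clamped) / len(clamped)
        pooled.append(mean_pow ** (1.0 / p))
    return pooled

# Hypothetical 2-channel map with 4 spatial positions each.
feature_map = [
    [1.0, 1.0, 1.0, 1.0],  # uniform channel: pooled value stays 1.0
    [0.0, 0.0, 0.0, 2.0],  # one strong activation: result sits between mean and max
]
pooled = gem_pool(feature_map)
```

For the second channel the pooled value lands between the average (0.5) and the maximum (2.0), which is the behavior that makes GeM popular for retrieval descriptors.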

sbl1 · local vast.ai · remote GPU


02 · catalog · 6 entries

Computer Vision
Bioinformatics
Audio
Tabular

FILE 02 / Bioinformatics Active

Stanford RNA 3D Folding 2

Predict the 3D structure of RNA from sequence. A classical sequence-NN baseline beats anything that ignores the closest train neighbors; RibonanzaNet3D refines the candidates; a learned reranker picks the final five.

result: Active hybrid baseline
next: Tighten the learned reranker over the 128-candidate pool before more compute-heavy refinement.

33.98 → 33.89 top-1 RMSD

5,716 train targets · MSAs up to 14k seqs · up to 125k nt
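The last stage of the RNA pipeline, picking the final five structures from the candidate pool, can be sketched as a top-k selection over reranker scores. The candidate ids and the scoring rule below are placeholders, and "higher score is better" is an assumption; the learned reranker itself is not reproduced.

```python
import heapq

def pick_final(candidates, scores, k=5):
    """Return the k candidates with the highest reranker scores, best first."""
    ranked = heapq.nlargest(k, zip(scores, range(len(candidates))))
    return [candidates[i] for _, i in ranked]

# Hypothetical pool of 128 candidates with deterministic stand-in scores.
candidates = [f"cand_{i:03d}" for i in range(128)]
scores = [(i * 37) % 128 for i in range(128)]  # placeholder for model output

final_five = pick_final(candidates, scores)
```

Keeping selection separate from scoring means the reranker can be retrained or swapped without touching the submission step.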

FILE 03 / Audio Baseline

BirdCLEF 2026

Acoustic species ID in Pantanal soundscapes — research-track Kaggle. Pipeline live with a class-prior baseline; deep audio modeling is next.

result: Submission-shape baseline
next: Confirm the official metric, then build grouped validation around soundscape windows.

TBA per Kaggle eval

weak labels at recording level · dense per-window submission
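A class-prior baseline of the kind described for BirdCLEF can be sketched as follows: estimate each species' frequency from recording-level labels, then emit the same prior vector for every window. The species names, the `row_id` format, and the one-label-per-recording simplification are assumptions for illustration.

```python
from collections import Counter

def class_priors(recording_labels, classes):
    """Fraction of training recordings carrying each class label."""
    counts = Counter(recording_labels)
    total = len(recording_labels)
    return {c: counts.get(c, 0) / total for c in classes}

def dense_rows(window_ids, priors):
    """One submission row per window, each filled with the same priors."""
    return [{"row_id": w, **priors} for w in window_ids]

# Hypothetical weak labels: one species per training recording.
labels = ["sp_a", "sp_a", "sp_b", "sp_a"]
classes = ["sp_a", "sp_b", "sp_c"]

priors = class_priors(labels, classes)
rows = dense_rows(["rec1_5", "rec1_10"], priors)
```

Such a baseline ignores the audio entirely, so its main value is confirming that the submission shape is accepted before any model is trained.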

FILE 04 / Tabular Active

AI Adoption · Fortune 500

EDA + clustering on a synthetic Fortune-500 AI-adoption panel. Random Forest ROI predictor, K-means personas, PCA projection, correlation atlas, use-case trajectories.

result: Synthetic EDA reference
next: Replace synthetic panel assumptions with a real longitudinal adoption source.

Random Forest R² (runtime)

572 KB · 2020–2025 · Fortune 500

FILE 04 / Computer Vision Case study

CSIRO Pasture Biomass Estimation

Multi-task ResNet50 + NDVI/Height fusion + constraint loss for 5 dry-matter targets on pasture imagery. Roughly 13% lower Dry_Total RMSE than the single-target baseline (~52 to 45.41).

result: Archived case study
next: Run image-stratified cross-validation at 512px+ input resolution.

single-target ~52 (Dry_Total) → multi-task w/ constraint 45.41 (Dry_Total) · RMSE per target

1,785 samples · 357 unique pasture images · field measurements (NDVI + height)
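With 1,785 samples over 357 unique images (five rows per image), a cross-validation split needs to keep all rows of an image in the same fold to avoid leakage. The sketch below is one simple way to do that, a grouped round-robin assignment; the image ids and fold count are assumptions, and this is not necessarily the split the project uses.

```python
def grouped_folds(image_ids, n_folds=5):
    """Assign a fold to every row so that rows sharing an image share a fold."""
    unique = sorted(set(image_ids))
    fold_of = {img: i % n_folds for i, img in enumerate(unique)}
    return [fold_of[img] for img in image_ids]

# Hypothetical long-format table: 357 images, 5 target rows each.
image_ids = [f"img_{i:03d}" for i in range(357) for _ in range(5)]
folds = grouped_folds(image_ids)

# Collect the folds seen per image to check that no image is split.
folds_per_image = {}
for img, fold in zip(image_ids, folds):
    folds_per_image.setdefault(img, set()).add(fold)
```

A stratified variant would additionally balance a property such as biomass range across folds; the grouping constraint shown here is the part that prevents the same image from appearing in both train and validation.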

FILE 05 / Tabular Active

Student Performance EDA

EDA on student grades — Random Forest predictor of overall_score, behavioral K-means personas, sleep / study correlations, A→F grade distributions.

result: Synthetic EDA reference
next: Separate predictive features from policy levers before presenting causal takeaways.

Random Forest R² (runtime)

792 KB · grades A · B · C · D · F

FILE 06 / Computer Vision Active

Digit Recognizer

Legacy MNIST kept as the canonical scaffold — the smoke-test that proves challenge.json, the dashboard registry, the remote bootstrap, and the submit-to-Kaggle flow all still work.

result: Scaffold smoke test
next: Keep the run tiny and use it to verify auth, dashboard actions, and submit flow.

accuracy

42k train · 28k test · 28×28 grayscale