CSIRO Pasture Biomass Estimation · kaggle

The challenge

CSIRO + Meat & Livestock Australia + Google Australia ($75K prize, deadline 2026-01-28). Given a pasture image, predict 5 dry-matter targets: Dry_Clover_g, Dry_Dead_g, Dry_Green_g, Dry_Total_g, GDM_g. Field measurements (NDVI, height) come paired with each image.

The point isn't just to fit one regressor — it's that the 5 targets are physically related: total dry matter equals the sum of the three component classes. Any model that doesn't enforce this is leaving signal on the table.

What I noticed in the data

Three structural facts that drove everything else:

Dry_Total = Dry_Clover + Dry_Dead + Dry_Green is exact in the labels. Not a noisy approximation — a constraint. The model should know.
357 unique images × 5 targets each = 1,785 rows. Splitting by row leaks images across train and validation; the split must be by image_path.
State and species dominate. Pasture in NSW averages 39.7 g, WA 18.8 g; Phalaris species runs 59.2 g, Clover 18.9 g. Image-only models miss this.
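
The first two facts can be checked and enforced with a few lines of plain Python. This is a minimal sketch, assuming the training table is loaded as a list of dicts in long format. Only `image_path` comes from the write-up; the `target_name` and `target` keys are hypothetical column names used for illustration.

```python
import random
from collections import defaultdict

COMPONENTS = ("Dry_Clover_g", "Dry_Dead_g", "Dry_Green_g")


def split_by_image(rows, val_fraction=0.2, seed=0):
    """Split long-format rows so every image lands wholly on one side."""
    images = sorted({r["image_path"] for r in rows})
    random.Random(seed).shuffle(images)
    n_val = max(1, round(len(images) * val_fraction))
    val_images = set(images[:n_val])
    train = [r for r in rows if r["image_path"] not in val_images]
    val = [r for r in rows if r["image_path"] in val_images]
    return train, val


def max_constraint_gap(rows):
    """Largest |Dry_Total_g - sum of the three components| over all images."""
    by_image = defaultdict(dict)
    for r in rows:
        # "target_name" / "target" are assumed column names, not from the source.
        by_image[r["image_path"]][r["target_name"]] = r["target"]
    return max(
        abs(t["Dry_Total_g"] - sum(t[c] for c in COMPONENTS))
        for t in by_image.values()
    )
```

A gap of zero (or within floating-point noise) from `max_constraint_gap` confirms the sum constraint, and comparing the sets of `image_path` values in the two splits confirms that no image is shared.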

The architecture

ResNet50 (ImageNet pretrained) backbone — shared feature extractor
Linear projection 2048 → 1024 → 512
Concatenate NDVI + Height (2-d) onto image features → joint vector
5 task-specific heads (one per target)
Constraint loss term: λ · |Dry_Total − (Dry_Clover + Dry_Dead + Dry_Green)|²
Weighted MSE: Dry_Total carries 1.5× weight (it's the headline metric)
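
The fusion and loss pieces above can be sketched in PyTorch roughly as follows. This is an illustration, not the archived training code. The ResNet50 backbone is left out, so the module takes already-pooled 2048-d image features. The ReLU activations, the single-linear-layer heads, the mean reductions, and the default `lam=0.1` are all assumptions; the write-up does not state them.

```python
import torch
import torch.nn as nn

TARGETS = ["Dry_Clover_g", "Dry_Dead_g", "Dry_Green_g", "Dry_Total_g", "GDM_g"]


class FusionHeads(nn.Module):
    """Projection 2048 -> 1024 -> 512, fuse NDVI + height, 5 task heads."""

    def __init__(self, feat_dim=2048, n_aux=2):
        super().__init__()
        self.project = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),  # activation is an assumption
            nn.Linear(1024, 512), nn.ReLU(),
        )
        # One head per target; a single linear layer each is an assumption.
        self.heads = nn.ModuleList(nn.Linear(512 + n_aux, 1) for _ in TARGETS)

    def forward(self, image_feats, aux):
        # aux holds [NDVI, height] per sample, shape (batch, 2).
        joint = torch.cat([self.project(image_feats), aux], dim=1)
        return torch.cat([head(joint) for head in self.heads], dim=1)  # (batch, 5)


def biomass_loss(pred, target, lam=0.1, total_weight=1.5):
    """Weighted MSE plus lam * squared violation of the sum constraint."""
    idx = {name: i for i, name in enumerate(TARGETS)}
    weights = torch.ones(len(TARGETS), device=pred.device)
    weights[idx["Dry_Total_g"]] = total_weight
    mse = (weights * (pred - target) ** 2).mean()
    parts = pred[:, idx["Dry_Clover_g"]] + pred[:, idx["Dry_Dead_g"]] + pred[:, idx["Dry_Green_g"]]
    constraint = ((pred[:, idx["Dry_Total_g"]] - parts) ** 2).mean()
    return mse + lam * constraint
```

The constraint term is computed on predictions only, so it penalises inconsistent heads even when each head is individually close to its label.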

Training: H100 PCIe on vast.ai, batch size 16, AdamW with learning rate 1e-4, cosine annealing, early-stopped at 67 epochs. Input resolution is 224×224 (source images are 2000×1000; revisiting at higher resolution is the obvious next experiment).
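
For reference, the optimizer, schedule, and early stopping described here could be wired up along these lines. It is a sketch under assumptions: the epoch budget, the patience, and the `run_epoch` callback (a hypothetical function that trains for one epoch and returns the validation loss) are not given in the write-up.

```python
import torch


def train_with_early_stopping(model, run_epoch, max_epochs=100, patience=10):
    """AdamW at 1e-4 with cosine annealing; stop when validation loss stalls.

    max_epochs and patience are assumed values. run_epoch(model, optimizer)
    is a hypothetical callback returning the epoch's validation loss.
    """
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=max_epochs)
    best, best_epoch = float("inf"), -1
    for epoch in range(max_epochs):
        val_loss = run_epoch(model, optimizer)
        scheduler.step()  # one cosine step per epoch
        if val_loss < best:
            best, best_epoch = val_loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best, epoch + 1  # best validation loss, epochs actually run
```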

Results

| Target       | Val RMSE            |
|--------------|---------------------|
| Dry_Total_g  | 45.41 (main metric) |
| Dry_Green_g  | 26.47               |
| GDM_g        | 32.63               |
| Dry_Dead_g   | 15.92               |
| Dry_Clover_g | 13.64               |
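
Per-target RMSE figures like these come from grouping squared errors by target before taking the root. A minimal sketch, assuming validation predictions are available as `(target_name, y_true, y_pred)` tuples (a hypothetical layout chosen for illustration):

```python
import math
from collections import defaultdict


def rmse_per_target(records):
    """records: iterable of (target_name, y_true, y_pred) tuples."""
    squared_errors = defaultdict(list)
    for name, y_true, y_pred in records:
        squared_errors[name].append((y_true - y_pred) ** 2)
    return {name: math.sqrt(sum(v) / len(v)) for name, v in squared_errors.items()}
```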

13% improvement over a single-target baseline (separate regressors per target, no constraint loss, no feature fusion). The constraint loss contributed roughly half of the lift — it's the cheapest, highest-impact modeling choice on this dataset.

What I'd do differently

Higher input resolution. 2000×1000 source → 224×224 input throws away most of the spatial signal. A 512×512 or 768×768 crop schedule (with random crops in training) should help the dense-fine-grass cases.
Per-state / per-species heads. State and species are categorical features with strong per-class biomass priors; conditioning the model on these (either as one-hots concatenated into the joint vector, or as separate task groups) is probably worth another few %.
Cross-validation, not a single split. With only 357 unique images, the validation estimate from one 80/20 split is noisy. 5-fold CV with splits made at the image level would tighten the estimate.
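
The three follow-ups above reduce to small helpers, sketched here in plain Python so they stay independent of any training framework. All function names are hypothetical. The crop helper only returns box coordinates (usable with any image library), and the one-hot helper assumes the category vocabulary is known up front.

```python
import random


def random_crop_box(width, height, size, rng):
    """Return (left, top, right, bottom) of a random size x size crop."""
    if size > width or size > height:
        raise ValueError("crop is larger than the image")
    left = rng.randint(0, width - size)  # randint is inclusive at both ends
    top = rng.randint(0, height - size)
    return left, top, left + size, top + size


def one_hot(value, vocabulary):
    """Encode a categorical value; unseen values map to all zeros."""
    return [1.0 if value == v else 0.0 for v in vocabulary]


def image_kfold(image_paths, k=5, seed=0):
    """Yield (train_idx, val_idx) row indices with no image shared across sides."""
    images = sorted(set(image_paths))
    random.Random(seed).shuffle(images)
    folds = [images[i::k] for i in range(k)]
    for fold in folds:
        val_images = set(fold)
        train_idx = [j for j, p in enumerate(image_paths) if p not in val_images]
        val_idx = [j for j, p in enumerate(image_paths) if p in val_images]
        yield train_idx, val_idx
```

For example, `random_crop_box(2000, 1000, 512, random.Random(0))` picks a 512×512 window inside a source-sized image, and `one_hot("WA", ["NSW", "WA"])` gives a vector that can be concatenated onto the joint feature vector alongside NDVI and height. The two-state vocabulary here only covers the states named in this write-up.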

Source archive

Full pipeline (notebooks + training scripts + submission notebooks + model checkpoints pointer) lives at github.com/Shivam-Bhardwaj/csiro-kaggle-pasture-biomass (archived 2026-05-22). Unarchive to resume.