CSIRO Pasture Biomass Estimation
Multi-task ResNet50 + NDVI/Height fusion + constraint loss for 5 dry-matter targets on pasture imagery. 13% RMSE improvement over single-target baseline.
The challenge
CSIRO + Meat & Livestock Australia + Google Australia ($75K prize, deadline
2026-01-28). Given a pasture image, predict 5 dry-matter targets:
Dry_Clover_g, Dry_Dead_g, Dry_Green_g, Dry_Total_g, GDM_g. Field
measurements (NDVI, height) come paired with each image.
The point isn’t just to fit one regressor — it’s that the 5 targets are physically related: total dry matter equals the sum of the three component classes. Any model that doesn’t enforce this is leaving signal on the table.
What I noticed in the data
Three structural facts that drove everything else:
- Dry_Total = Dry_Clover + Dry_Dead + Dry_Green is exact in the labels. Not a noisy approximation: a constraint. The model should know.
- 357 unique images × 5 targets each = 1,785 rows. Splitting by row leaks images across train and validation; the split must be by image_path.
- State and species dominate. Pasture in NSW averages 39.7 g, WA 18.8 g; Phalaris species runs 59.2 g, Clover 18.9 g. Image-only models miss this.
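The image-level split can be sketched in a few lines of plain Python. This is a minimal illustration, not the code from the repo: it assumes long-format rows keyed by image_path, and the target_name / target keys, the 20% validation fraction, and the toy rows are hypothetical stand-ins.

```python
import random

TARGETS = ["Dry_Clover_g", "Dry_Dead_g", "Dry_Green_g", "Dry_Total_g", "GDM_g"]

def split_by_image(rows, val_fraction=0.2, seed=0):
    """Split long-format rows so that no image_path lands on both sides."""
    # Unique images in a stable order, then a deterministic shuffle.
    images = sorted({row["image_path"] for row in rows})
    random.Random(seed).shuffle(images)
    n_val = max(1, round(len(images) * val_fraction))
    val_images = set(images[:n_val])
    # Every row follows its image, so all 5 targets of an image stay together.
    train = [r for r in rows if r["image_path"] not in val_images]
    val = [r for r in rows if r["image_path"] in val_images]
    return train, val

# Hypothetical toy data: 10 images x 5 targets, mirroring the 357 x 5 layout.
rows = [
    {"image_path": f"img_{i:03d}.jpg", "target_name": t, "target": 0.0}
    for i in range(10)
    for t in TARGETS
]
train_rows, val_rows = split_by_image(rows)
```

Shuffling the unique image paths rather than the rows is what keeps all five target rows of an image on the same side of the split.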
The architecture
- ResNet50 (ImageNet pretrained) backbone — shared feature extractor
- Linear projection 2048 → 1024 → 512
- Concatenate NDVI + Height (2-d) onto image features → joint vector
- 5 task-specific heads (one per target)
- Constraint loss term: λ · |Dry_Total − (Dry_Clover + Dry_Dead + Dry_Green)|²
- Weighted MSE: Dry_Total carries 1.5× weight (it’s the headline metric)
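The list above maps onto PyTorch roughly as follows. Treat this as a sketch under assumptions rather than the actual training code. The ReLU activations, the single-linear-layer heads, averaging the loss over the batch, and lam=0.1 are my placeholders, since the write-up does not pin them down. The names BiomassNet, biomass_loss and make_resnet50_backbone are hypothetical.

```python
import torch
import torch.nn as nn

TARGETS = ["Dry_Clover_g", "Dry_Dead_g", "Dry_Green_g", "Dry_Total_g", "GDM_g"]

def make_resnet50_backbone():
    """ImageNet-pretrained ResNet50 with the classifier removed (2048-d output)."""
    from torchvision import models  # lazy import; weights download on first use
    net = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    net.fc = nn.Identity()
    return net

class BiomassNet(nn.Module):
    def __init__(self, backbone, feat_dim=2048, n_aux=2):
        super().__init__()
        self.backbone = backbone
        # 2048 -> 1024 -> 512 projection; the ReLUs are an assumption.
        self.proj = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 512), nn.ReLU(),
        )
        # One head per target on the 512 + 2 joint vector (single layer assumed).
        self.heads = nn.ModuleDict({t: nn.Linear(512 + n_aux, 1) for t in TARGETS})

    def forward(self, image, aux):
        feats = self.proj(self.backbone(image))
        joint = torch.cat([feats, aux], dim=1)  # aux columns: [NDVI, height]
        return {t: head(joint).squeeze(1) for t, head in self.heads.items()}

def biomass_loss(pred, target, lam=0.1, total_weight=1.5):
    """Weighted MSE over the 5 targets plus the sum-constraint penalty."""
    mse = 0.0
    for t in TARGETS:
        w = total_weight if t == "Dry_Total_g" else 1.0
        mse = mse + w * torch.mean((pred[t] - target[t]) ** 2)
    # Penalise predictions whose components do not add up to the predicted total.
    parts = pred["Dry_Clover_g"] + pred["Dry_Dead_g"] + pred["Dry_Green_g"]
    constraint = torch.mean((pred["Dry_Total_g"] - parts) ** 2)
    return mse + lam * constraint
```

The constraint term only looks at predictions, never at labels, so it acts as a consistency regulariser across the heads: it is zero whenever the three component predictions sum to the predicted total.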
Training: H100 PCIe on vast.ai, batch 16, AdamW @ 1e-4, cosine annealing, 67 epochs (early-stopped). Image resolution 224×224 (source is 2000×1000; revisiting at higher res is the obvious next experiment).
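For reference, the optimiser side of that recipe looks something like the sketch below. AdamW at 1e-4 and cosine annealing come from the run described above. The max_epochs budget, the patience value, and the EarlyStopping helper are assumptions of mine; the write-up only says the run stopped early at epoch 67.

```python
import torch
import torch.nn as nn

def make_optimizer_and_scheduler(model, max_epochs=100, lr=1e-4):
    """AdamW with a cosine learning-rate schedule over max_epochs (assumed budget)."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=max_epochs)
    return optimizer, scheduler

class EarlyStopping:
    """Stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, scheduler.step() would be called once per epoch after the optimiser updates, and the loop would break as soon as EarlyStopping.step returns True.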
Results
| Target | Val RMSE |
|---|---|
| Dry_Total_g | 45.41 (main metric) |
| Dry_Green_g | 26.47 |
| GDM_g | 32.63 |
| Dry_Dead_g | 15.92 |
| Dry_Clover_g | 13.64 |
13% improvement over a single-target baseline (separate regressors per target, no constraint loss, no feature fusion). The constraint loss contributed roughly half of the lift — it’s the cheapest, highest-impact modeling choice on this dataset.
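For clarity on how numbers like these are typically computed, here is a small NumPy sketch of per-target RMSE and relative improvement. The helper names are hypothetical, and whether the 13% figure refers to Dry_Total_g alone or to an average across targets is not something this sketch settles.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def relative_improvement(baseline_rmse, model_rmse):
    """Fractional RMSE reduction relative to the baseline (0.13 means 13%)."""
    return (baseline_rmse - model_rmse) / baseline_rmse
```

Both RMSE values should be measured on the same validation images, otherwise the ratio mixes model quality with split difficulty.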
What I’d do differently
- Higher input resolution. 2000×1000 source → 224×224 input throws away most of the spatial signal. A 512×512 or 768×768 crop schedule (with random crops in training) should help the dense-fine-grass cases.
- Per-state / per-species heads. State and species are categorical features with strong per-class biomass priors; conditioning the model on these (either as one-hots concatenated into the joint vector, or as separate task groups) is probably worth another few %.
- Cross-validation, not a single split. With only 357 unique images, a single 80/20 split gives a noisy validation estimate. 5-fold CV with folds grouped by image (so no image appears in both train and validation) would tighten the estimate.
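The cross-validation idea in the last bullet could look like this. It is a stdlib-only sketch that groups folds by image; the grouped_kfold name and the toy paths are hypothetical. scikit-learn's GroupKFold does the same job if that dependency is available, and stratifying by state or species on top of the grouping would be a further step not shown here.

```python
import random

def grouped_kfold(image_paths, n_splits=5, seed=0):
    """Yield (train_images, val_images); each unique image is in exactly one val fold."""
    images = sorted(set(image_paths))
    random.Random(seed).shuffle(images)
    # Deal the shuffled images round-robin into n_splits folds of near-equal size.
    folds = [images[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        val = set(folds[k])
        train = set(images) - val
        yield train, val

# Hypothetical paths: 357 images, each repeated once per target row.
paths = [f"img_{i:03d}.jpg" for i in range(357) for _ in range(5)]
folds = list(grouped_kfold(paths))
```

Averaging the validation RMSE over the five folds, and reporting its spread, gives a steadier number than any single split on a dataset this small.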
Source archive
Full pipeline (notebooks + training scripts + submission notebooks + model checkpoints pointer) lives at github.com/Shivam-Bhardwaj/csiro-kaggle-pasture-biomass (archived 2026-05-22). Unarchive to resume.