Runs the complete species distribution modeling (SDM/ENM) pipeline: occurrence preparation, model fitting (MaxEnt, ensemble), thresholding, projection under climate scenarios, and interpretation. Use this skill when the user mentions habitat suitability, niche modeling, MaxEnt, biomod2, potential distribution, range maps, suitable area mapping, climate projections, invasion risk, range shift analysis, suitability mapping, ENM, ecological niche model, or calibration area definition.
Domain: SDM · ENM · MaxEnt · Ensemble · Projection
Phase: 2 — Modeling
Used by: run-sdm-study
Guides the agent through the complete species distribution / ecological niche modeling pipeline: from occurrence and predictor preparation to model fitting, ensemble building, thresholding, projection, and interpretation.
| Input | Format | Required |
|---|---|---|
| Occurrence records (cleaned) | CSV with lat/lon | Yes |
| Environmental predictor stack | GeoTIFF (multiband or stack) | Yes |
| Study area / calibration area | SHP, GPKG | Yes |
| Future/alternative scenario rasters | GeoTIFF | Optional |
| Background / pseudo-absence points | CSV | Optional |

| Output | Description |
|---|---|
| `suitability_current.tif` | Continuous suitability map (current) |
| `suitability_binary.tif` | Thresholded binary map |
| `suitability_scenarios/` | Projected maps per scenario |
| `ensemble_sd.tif` | Uncertainty (SD across algorithms) |
| `variable_importance.csv` | Predictor contributions |
| `response_curves.png` | Marginal response per predictor |
| `sdm_report.md` | Full methodological narrative |
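
As a rough illustration of how the ensemble mean and the `ensemble_sd.tif` uncertainty layer relate, the sketch below stacks per-algorithm suitability grids and takes the per-cell mean and standard deviation. The function name `ensemble_summary`, the toy 2x2 grids, and the choice of population SD (`ddof=0`) are assumptions made for illustration; the skill does not specify them, and real grids would be read from GeoTIFFs.

```python
import numpy as np


def ensemble_summary(predictions):
    """Return per-cell mean and SD across algorithm suitability grids.

    `predictions` is a list of equally shaped 2-D arrays, one per algorithm.
    NaN (NoData) in any algorithm propagates to the output cell.
    """
    stack = np.stack([np.asarray(p, dtype=float) for p in predictions], axis=0)
    mean = stack.mean(axis=0)
    # Population SD (ddof=0) is an assumption; use ddof=1 for a sample SD.
    sd = stack.std(axis=0, ddof=0)
    return mean, sd


# Toy grids standing in for three algorithms (hypothetical values).
maxent = np.array([[0.2, 0.8], [0.5, np.nan]])
gbm = np.array([[0.4, 0.6], [0.5, np.nan]])
rf = np.array([[0.6, 0.7], [0.5, np.nan]])

mean, sd = ensemble_summary([maxent, gbm, rf])
print(mean)
print(sd)
```

Cells where the algorithms agree get an SD of zero, while cells where they disagree stand out in the uncertainty layer.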

predictive-modeling-best-practices skill for collinearity reduction

| Condition | Diagnosis | Recommended Action |
|---|---|---|
| n_occurrences < 10 | Insufficient data for reliable model fitting | Do not fit model; use literature-based range map with explicit caveat |
| 10 ≤ n_occurrences < 30 | Low sample size — model may be unreliable | Proceed with caution; apply high regularisation (RM ≥ 2); report uncertainty |
| AUC_test < 0.7 | Potentially poor discriminative ability, OR species has a genuinely narrow niche | First, diagnose the cause: (1) Plot marginal response curves — if presences cluster in a narrow environmental range (< 10% of available gradient), low AUC may reflect ecological reality (narrow-niche species), NOT a poor model. Document as "narrow-niche species; AUC expected to be low". (2) If presences span the full gradient and AUC is still low, the model is genuinely poor — revise predictor set, expand calibration grid, check coordinate quality and spatial autocorrelation. See: Lobo et al. 2008 (Glob. Ecol. Biogeogr.), Warren & Seifert 2011 |
| MESS/MOP extrapolation > 20% of projection area | Model projecting into novel environmental conditions | Mask novel-condition areas in final map; report extrapolation extent in report |
| ΔAICc between top models < 2 | Top model is not clearly best | Use ensemble of top models; report Akaike weights alongside mean suitability map |
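
The conditions in the table above can be sketched as a small triage helper. This is a minimal sketch, not part of the skill itself: the function names `triage` and `akaike_weights`, the argument names, and the flag strings are hypothetical. The thresholds are taken from the table, and the Akaike weight formula is the standard one, w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2).

```python
import math


def akaike_weights(aicc_values):
    """Standard Akaike weights from a list of AICc scores."""
    best = min(aicc_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]


def triage(n_occurrences, auc_test=None, extrapolation_fraction=None,
           aicc_values=None):
    """Return a list of flags following the decision table thresholds."""
    flags = []
    if n_occurrences < 10:
        # Hard stop: the table says not to fit a model at all.
        flags.append("insufficient_data: do not fit; use literature-based "
                     "range map with explicit caveat")
        return flags
    if n_occurrences < 30:
        flags.append("low_sample_size: apply high regularisation (RM >= 2) "
                     "and report uncertainty")
    if auc_test is not None and auc_test < 0.7:
        flags.append("low_auc: inspect response curves to separate a "
                     "narrow niche from a poor model")
    if extrapolation_fraction is not None and extrapolation_fraction > 0.20:
        flags.append("extrapolation: mask novel-condition areas and report "
                     "extrapolation extent")
    if aicc_values is not None and len(aicc_values) >= 2:
        top_two = sorted(aicc_values)[:2]
        if top_two[1] - top_two[0] < 2:
            flags.append("ambiguous_top_model: ensemble the top models and "
                         "report Akaike weights")
    return flags


# Hypothetical diagnostics for one species.
aicc = [210.4, 211.1, 219.0]
flags = triage(18, auc_test=0.65, extrapolation_fraction=0.25,
               aicc_values=aicc)
weights = akaike_weights(aicc)
for flag in flags:
    print(flag)
print([round(w, 3) for w in weights])
```

Returning early when fewer than 10 occurrences are available mirrors the table's instruction not to fit a model; the other checks accumulate so that several caveats can be reported together.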
R: biomod2, ENMeval, dismo, maxnet, sdm, kuenm
Python: elapid, pysdm, sklearn
- resources/sdm-checklist.md: SDM reporting checklist (based on ODMAP protocol)
- resources/calibration-area-guide.md: M area selection methods
- resources/algorithm-comparison.md: algorithm strengths and limitations
- examples/sdm/: full worked example

Suitability ≠ probability of occurrence. The continuous output (`suitability_current.tif`) is an index of relative environmental suitability, not a probability. Do not label outputs as "probability of presence" in reports, maps, or captions. Use terms such as "habitat suitability index" or "climatic suitability score".
Bounding-box clip ≠ study area mask. Clipping rasters by a rectangular bounding box (e.g., `-75,-35` to `-30,6`) does not restrict predictions to a political boundary or ecological region. If results must be restricted to a specific territory (e.g., Brazil), load a vector polygon (`st_read()` / `gpd.read_file()`) and mask the raster to that geometry before any further analysis. Failure to do so will inflate apparent suitable area and may produce ecologically misleading maps.
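
The difference between a bounding-box clip and a polygon mask can be illustrated with a toy grid. This is only a sketch under stated assumptions: the triangular `territory` polygon is hypothetical (it is not a real boundary), the suitability values are random, the 0.5 threshold is arbitrary, and the pure-NumPy `point_in_polygon` helper stands in for a real masking step. In an actual workflow the polygon would come from `gpd.read_file()` and the mask would typically be applied with a raster library (for example `rasterio.mask.mask`).

```python
import numpy as np


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting for arrays of points against one simple polygon."""
    inside = np.zeros(x.shape, dtype=bool)
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # True where the edge straddles the horizontal line through the point.
        crosses = (y1 > y) != (y2 > y)
        with np.errstate(divide="ignore", invalid="ignore"):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            inside ^= crosses & (x < x_cross)
    return inside


# Cell centres of a 1-degree grid covering the bounding box -75,-35 to -30,6.
lon = np.arange(-74.5, -30.0, 1.0)
lat = np.arange(-34.5, 6.0, 1.0)
xx, yy = np.meshgrid(lon, lat)

# Hypothetical territory: a triangle inside the bounding box.
territory = [(-75.0, -35.0), (-30.0, -35.0), (-52.5, 6.0)]

rng = np.random.default_rng(0)
suitability = rng.random(xx.shape)      # stand-in for suitability_current.tif
suitable = suitability >= 0.5           # arbitrary illustrative threshold

inside = point_in_polygon(xx, yy, territory)
suitable_bbox = int(suitable.sum())               # bounding-box clip only
suitable_masked = int((suitable & inside).sum())  # masked to the polygon

print("suitable cells, bounding box:", suitable_bbox)
print("suitable cells, polygon mask:", suitable_masked)
```

The bounding-box count includes every suitable cell in the rectangle, so it overstates the suitable area relative to the count restricted to the territory polygon.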
Demo / synthetic predictors. If predictors were generated by a mock script rather than downloaded from WorldClim, CHELSA, or another validated source, all model outputs are for pipeline demonstration only. Do not report metrics (AUC, TSS) or suitable area figures as if they describe real species ecology. Replace synthetic predictors before any scientific use.