Validate simulations before, during, and after execution. Use for pre-flight checks, runtime monitoring, post-run validation, diagnosing failed simulations, checking convergence, detecting NaN/Inf, or verifying mass/energy conservation.
Provide a three-stage validation protocol: pre-flight checks, runtime monitoring, and post-flight validation for materials simulations.
Before running validation scripts, collect from the user:
| Input | Description | Example |
|---|---|---|
| Config file | Simulation configuration (JSON/YAML) | simulation.json |
| Log file | Runtime output log | simulation.log |
| Metrics file | Post-run metrics (JSON) | results.json |
| Required params | Parameters that must exist | dt,dx,kappa |
| Valid ranges | Parameter bounds | dt:1e-6:1e-2 |
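
The required-parameter list and the `name:min:max` range format can be illustrated with a minimal sketch. The helpers `parse_ranges` and `check_params` are hypothetical names introduced here for illustration, not the implementation inside `preflight_checker.py`, and the extra `mobility` parameter is made up to show a missing-parameter report.

```python
import math


def parse_ranges(spec):
    """Parse 'name:min:max,name:min:max' into {name: (min, max)}."""
    ranges = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, lo, hi = item.split(":")
        ranges[name.strip()] = (float(lo), float(hi))
    return ranges


def check_params(config, required, ranges):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name in required:
        if name not in config:
            problems.append(f"missing required parameter: {name}")
    for name, (lo, hi) in ranges.items():
        if name not in config:
            continue  # absence is only reported if the parameter is required
        value = config[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"non-numeric value for {name}: {value!r}")
        elif not math.isfinite(value) or not (lo <= value <= hi):
            problems.append(f"{name}={value} out of range [{lo}, {hi}]")
    return problems


# Hypothetical config: dx violates its range and 'mobility' is missing.
config = {"dt": 1e-3, "dx": 0.5, "kappa": 0.1}
ranges = parse_ranges("dt:1e-6:1e-2,dx:1e-4:1e-1")
problems = check_params(config, ["dt", "dx", "kappa", "mobility"], ranges)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.
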
Is simulation about to start?
├── YES → Run Stage 1: preflight_checker.py
│ └── BLOCK status? → Fix issues, do NOT run simulation
│ └── WARN status? → Review warnings, document if accepted
│ └── PASS status? → Proceed to run simulation
│
Is simulation running?
├── YES → Run Stage 2: runtime_monitor.py (periodically)
│ └── Alerts? → Consider stopping, check parameters
│
Has simulation finished?
├── YES → Run Stage 3: result_validator.py
│ └── Failed checks? → Do NOT use results
│ → Run failure_diagnoser.py
│ └── All passed? → Results are valid
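
The decision tree above can be sketched as a small lookup function. The function name `next_action` and the stage labels are assumptions made for this illustration; the tree does not say what to do when there are no runtime alerts, so "keep monitoring" is an assumed default.

```python
def next_action(stage, status=None, alerts=(), failed_checks=()):
    """Return the action the decision tree prescribes for one stage."""
    if stage == "preflight":
        actions = {
            "BLOCK": "fix issues, do not run simulation",
            "WARN": "review warnings, document if accepted",
            "PASS": "proceed to run simulation",
        }
        if status not in actions:
            raise ValueError(f"unknown preflight status: {status!r}")
        return actions[status]
    if stage == "runtime":
        # Assumed default: the tree only covers the case where alerts exist.
        return "consider stopping, check parameters" if alerts else "keep monitoring"
    if stage == "postflight":
        if failed_checks:
            return "do not use results, run failure_diagnoser.py"
        return "results are valid"
    raise ValueError(f"unknown stage: {stage!r}")


preflight_action = next_action("preflight", status="BLOCK")
postflight_action = next_action("postflight", failed_checks=["mass"])
```
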
| Metric | Conservative | Standard | Relaxed |
|---|---|---|---|
| Mass tolerance | 1e-6 | 1e-3 | 1e-2 |
| Residual growth | 2x | 10x | 100x |
| dt reduction | 10x | 100x | 1000x |
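
As a rough sketch of how these thresholds might be applied, the snippet below encodes the three presets and checks a residual history and a time-step history against them. The definitions of "growth" (latest residual divided by the smallest one seen) and "reduction" (largest dt seen divided by the latest) are assumptions made here; the monitoring script may measure them differently.

```python
THRESHOLDS = {
    "conservative": {"mass_tol": 1e-6, "residual_growth": 2.0, "dt_drop": 10.0},
    "standard": {"mass_tol": 1e-3, "residual_growth": 10.0, "dt_drop": 100.0},
    "relaxed": {"mass_tol": 1e-2, "residual_growth": 100.0, "dt_drop": 1000.0},
}


def residual_growth_factor(residuals):
    """Latest positive residual divided by the smallest positive residual seen."""
    positive = [r for r in residuals if r > 0]
    if not positive:
        return 1.0
    return positive[-1] / min(positive)


def dt_drop_factor(dts):
    """Largest positive time step seen divided by the latest positive one."""
    positive = [d for d in dts if d > 0]
    if not positive:
        return 1.0
    return max(positive) / positive[-1]


def runtime_alerts(residuals, dts, level="standard"):
    """Return the names of the thresholds exceeded at the given strictness level."""
    limits = THRESHOLDS[level]
    alerts = []
    if residual_growth_factor(residuals) > limits["residual_growth"]:
        alerts.append("residual growth")
    if dt_drop_factor(dts) > limits["dt_drop"]:
        alerts.append("dt drop")
    return alerts


# Made-up histories: residual grows about 30x, dt shrinks about 50x.
residuals = [1e-3, 1e-4, 5e-4, 3e-3]
dts = [1e-3, 1e-3, 5e-4, 2e-5]
standard_alerts = runtime_alerts(residuals, dts, "standard")
conservative_alerts = runtime_alerts(residuals, dts, "conservative")
relaxed_alerts = runtime_alerts(residuals, dts, "relaxed")
```

The same history triggers different alerts depending on the preset, which is the point of choosing a strictness level up front.
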
| Script | Output Fields |
|---|---|
| scripts/preflight_checker.py | report.status, report.blockers, report.warnings |
| scripts/runtime_monitor.py | alerts, residual_stats, dt_stats |
| scripts/result_validator.py | checks, confidence_score, failed_checks |
| scripts/failure_diagnoser.py | probable_causes, recommended_fixes |
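
One way to consume these fields is to parse the JSON output and follow dotted paths such as `report.status`. The sketch below assumes the dotted names correspond to nested JSON objects; the sample payload and the helper `get_field` are invented for illustration and are not captured from a real run.

```python
import json

# Hypothetical output; the real JSON layout may differ.
sample_stdout = (
    '{"report": {"status": "WARN", "blockers": [], '
    '"warnings": ["dt close to upper bound"]}}'
)


def get_field(payload, dotted_path, default=None):
    """Follow a dotted path such as 'report.status' through nested dicts."""
    node = payload
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


payload = json.loads(sample_stdout)
status = get_field(payload, "report.status")
warning_list = get_field(payload, "report.warnings", default=[])
absent = get_field(payload, "report.confidence_score")
```

Returning a default for absent paths keeps the caller from crashing when a script omits a field.
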
Run scripts/preflight_checker.py --config simulation.json:
python3 scripts/preflight_checker.py \
--config simulation.json \
--required dt,dx,kappa \
--ranges "dt:1e-6:1e-2,dx:1e-4:1e-1" \
--min-free-gb 1.0 \
--json
Run scripts/runtime_monitor.py --log simulation.log periodically:
python3 scripts/runtime_monitor.py \
--log simulation.log \
--residual-growth 10.0 \
--dt-drop 100.0 \
--json
Run scripts/result_validator.py --metrics results.json:
python3 scripts/result_validator.py \
--metrics results.json \
--bound-min 0.0 \
--bound-max 1.0 \
--mass-tol 1e-3 \
--json
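
To make the bound and mass-tolerance options concrete, here is a minimal sketch of the kind of checks they suggest. The metric keys (`field_min`, `field_max`, `mass_initial`, `mass_final`) are hypothetical, and treating the mass tolerance as a relative drift is an assumption; the actual validator may use different keys or an absolute tolerance.

```python
import math


def validate_metrics(metrics, bound_min=0.0, bound_max=1.0, mass_tol=1e-3):
    """Return (checks, failed) for finiteness, field bounds, and mass drift."""
    keys = ["field_min", "field_max", "mass_initial", "mass_final"]
    checks = {}
    # NaN or Inf anywhere in the summary metrics fails this check.
    checks["finite"] = all(math.isfinite(metrics[k]) for k in keys)
    checks["bounds"] = (
        bound_min <= metrics["field_min"] and metrics["field_max"] <= bound_max
    )
    m0 = metrics["mass_initial"]
    # Relative drift (assumed); the tiny floor avoids division by zero.
    drift = abs(metrics["mass_final"] - m0) / max(abs(m0), 1e-300)
    checks["mass"] = drift <= mass_tol
    failed = [name for name, ok in checks.items() if not ok]
    return checks, failed


# Made-up metrics: the field undershoots 0, mass drifts by about 5e-4.
metrics = {
    "field_min": -0.02,
    "field_max": 0.98,
    "mass_initial": 100.0,
    "mass_final": 100.05,
}
checks, failed = validate_metrics(metrics)
_, failed_strict = validate_metrics(metrics, mass_tol=1e-6)
```

Comparisons against NaN evaluate to false, so a NaN metric also fails the bounds and mass checks rather than slipping through.
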
When validation fails:
python3 scripts/failure_diagnoser.py --log simulation.log --json
User: My phase field simulation crashed after 1000 steps. Can you help me figure out why?
Agent workflow:
python3 scripts/failure_diagnoser.py --log simulation.log --json
python3 scripts/runtime_monitor.py --log simulation.log --json
| Error | Cause | Resolution |
|---|---|---|
| Config not found | File path invalid | Verify config path exists |
| Non-numeric value | Parameter is not a number | Fix config file format |
| out of range | Parameter outside bounds | Adjust parameter or bounds |
| Output directory not writable | Permission issue | Check directory permissions |
| Insufficient disk space | Disk nearly full | Free up space or reduce output |
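
Three of these errors concern the environment rather than the parameters, and they can be reproduced with standard-library calls. The sketch below is an illustration under assumptions: `environment_problems` is a hypothetical helper, and counting a gigabyte as 1e9 bytes is a choice made here, not something the document specifies.

```python
import os
import shutil
import tempfile


def environment_problems(config_path, output_dir, min_free_gb=1.0):
    """Return the error labels from the table that apply to this environment."""
    problems = []
    if not os.path.isfile(config_path):
        problems.append("Config not found")
    if not os.path.isdir(output_dir) or not os.access(output_dir, os.W_OK):
        problems.append("Output directory not writable")
    else:
        # Assumes 1 GB = 1e9 bytes.
        free_gb = shutil.disk_usage(output_dir).free / 1e9
        if free_gb < min_free_gb:
            problems.append("Insufficient disk space")
    return problems


# Usage against a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    cfg = os.path.join(workdir, "simulation.json")
    before = environment_problems(cfg, workdir, min_free_gb=0.0)
    with open(cfg, "w") as fh:
        fh.write("{}")
    after = environment_problems(cfg, workdir, min_free_gb=0.0)
    impossible = environment_problems(cfg, workdir, min_free_gb=1e12)
```
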
| Status | Meaning | Action |
|---|---|---|
| PASS | All checks passed | Proceed with confidence |
| WARN | Non-critical issues found | Review and document |
| BLOCK | Critical issues found | Must fix before proceeding |
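
A plausible way to derive the overall status from the reported blockers and warnings is sketched below. The precedence (any blocker wins over any warning) follows from the meanings in the table, but the function `overall_status` itself is a hypothetical illustration.

```python
def overall_status(blockers, warnings):
    """Map lists of critical and non-critical issues to PASS, WARN, or BLOCK."""
    if blockers:
        return "BLOCK"  # critical issues take precedence over warnings
    if warnings:
        return "WARN"
    return "PASS"


status_clean = overall_status([], [])
status_warn = overall_status([], ["dt close to upper bound"])
status_block = overall_status(["missing required parameter: kappa"], ["low disk"])
```
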
| Score | Meaning |
|---|---|
| 1.0 | All validation checks passed |
| 0.75 to < 1.0 | Most checks passed, minor issues |
| 0.5 to < 0.75 | Significant issues, review carefully |
| < 0.5 | Major problems, do not trust results |
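
The score bands can be turned into a small lookup. The document does not state how `confidence_score` is computed, so the fraction-of-checks-passed formula below is an assumption, as is assigning a score of exactly 0.75 to the "minor issues" band.

```python
def confidence_score(checks):
    """Assumed definition: fraction of checks that passed (0.0 if there are none)."""
    if not checks:
        return 0.0
    return sum(1 for ok in checks.values() if ok) / len(checks)


def interpret(score):
    """Translate a score into the meaning given in the table."""
    if score >= 1.0:
        return "All validation checks passed"
    if score >= 0.75:
        return "Most checks passed, minor issues"
    if score >= 0.5:
        return "Significant issues, review carefully"
    return "Major problems, do not trust results"


# Hypothetical check results: three of four pass.
checks = {"finite": True, "bounds": False, "mass": True, "convergence": True}
score = confidence_score(checks)
meaning = interpret(score)
```
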
| Pattern in Log | Likely Cause | Recommended Fix |
|---|---|---|
| NaN, Inf, overflow | Numerical instability | Reduce dt, increase damping |
| max iterations, did not converge | Solver failure | Tune preconditioner, tolerances |
| out of memory | Memory exhaustion | Reduce mesh, enable out-of-core |
| dt reduced | Adaptive stepping triggered | May be okay if controlled |
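
The pattern table can be expressed as a list of regular expressions scanned against the log text. The expressions below are illustrative guesses based on the table, not the ones used by `failure_diagnoser.py`, and the sample log lines are invented.

```python
import re

# (pattern, likely cause, recommended fix), in the order of the table above.
PATTERNS = [
    (re.compile(r"\b(nan|inf|overflow)\b", re.IGNORECASE),
     "Numerical instability", "Reduce dt, increase damping"),
    (re.compile(r"max iterations|did not converge", re.IGNORECASE),
     "Solver failure", "Tune preconditioner, tolerances"),
    (re.compile(r"out of memory", re.IGNORECASE),
     "Memory exhaustion", "Reduce mesh, enable out-of-core"),
    (re.compile(r"dt reduced", re.IGNORECASE),
     "Adaptive stepping triggered", "May be okay if controlled"),
]


def diagnose(log_text):
    """Return (cause, fix) pairs for every pattern that occurs in the log."""
    return [(cause, fix) for rx, cause, fix in PATTERNS if rx.search(log_text)]


# Invented log excerpt; the word boundary keeps 'INFO' from matching 'inf'.
sample_log = """\
INFO: step 998 residual=1.2e-03 dt=1.0e-04
INFO: step 999 dt reduced to 1.0e-06
ERROR: step 1000 residual=nan
"""
findings = diagnose(sample_log)
causes = [cause for cause, _ in findings]
```

The word boundaries matter: without them, ordinary `INFO` log lines would be flagged as numerical instability.
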
- references/validation_protocol.md - Detailed checklist and criteria
- references/log_patterns.md - Common failure signatures and regex patterns