Interactive Hamilton DAG development via MCP tools. Validate, visualize, scaffold, and execute Hamilton pipelines without leaving the conversation. Use when building or debugging Hamilton dataflows interactively.
The Hamilton MCP server exposes Hamilton's DAG compilation, validation, and execution as interactive tools. It enables a tight feedback loop: write functions, validate the DAG, visualize dependencies, fix errors, and execute -- all without leaving the conversation.
Run via uvx (recommended). Add --with for whichever libraries your code uses:
uvx --from "apache-hamilton[mcp]" hamilton-mcp # minimal
uvx --from "apache-hamilton[mcp]" --with pandas --with numpy hamilton-mcp # pandas/numpy project
uvx --from "apache-hamilton[mcp]" --with polars hamilton-mcp # polars project
Or install and run directly:
pip install "apache-hamilton[mcp]"
hamilton-mcp
Or use programmatically:
from hamilton.plugins.h_mcp import get_mcp_server
mcp = get_mcp_server()
mcp.run()
Always follow this sequence when building Hamilton DAGs interactively:
ask user -> capabilities -> scaffold -> validate -> visualize -> correct -> execute
Before calling any tool, ask the user which data libraries they use (pandas, numpy, polars, etc.). Then pass their answer as preferred_libraries to hamilton_capabilities and hamilton_scaffold. This ensures scaffolds match the user's project, not the server's environment.
// Example: user says "I use pandas"
// Tool call: hamilton_capabilities(preferred_libraries=["pandas"])
{
"libraries": {
"pandas": true,
"numpy": true,
"polars": false,
"graphviz": true
},
"available_scaffolds": [
"basic", "basic_pure_python", "config_based",
"data_pipeline", "parameterized"
]
}
Decision rules:
- If no data libraries are available: use the basic_pure_python scaffold and int/float/str/dict types.
- If graphviz is available: use hamilton_visualize to show the DAG structure.

Use hamilton_scaffold with a pattern name from the capabilities response:
| Pattern | Libraries Required | Use Case |
|---|---|---|
| basic_pure_python | None | Simple pipelines with built-in types |
| basic | pandas | DataFrame cleaning & counting |
| parameterized | pandas | Multiple nodes from one function |
| config_based | pandas | Environment-conditional logic |
| data_pipeline | pandas | ETL: ingest, clean, transform, aggregate |
| ml_pipeline | pandas, numpy | Feature engineering & train/test split |
| data_quality | pandas, numpy | Validation with @check_output |
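As a concrete illustration, a module in the basic_pure_python style might look like the sketch below (the function names here are hypothetical, not the scaffold's literal output). In Hamilton, each function defines a node, and parameter names bind to other functions' names or to external inputs:

```python
# A hypothetical basic_pure_python-style module (illustration only).
# Each function is a DAG node; its parameters resolve to other
# functions in the module or to external inputs.

def raw_numbers(data: list) -> list:
    """Pass through the external input `data`."""
    return data

def cleaned(raw_numbers: list) -> list:
    """Drop None values from the raw list."""
    return [x for x in raw_numbers if x is not None]

def total(cleaned: list) -> float:
    """Sum of the cleaned values."""
    return float(sum(cleaned))

def result(total: float, cleaned: list) -> float:
    """Mean of the cleaned values (0.0 if empty)."""
    return total / len(cleaned) if cleaned else 0.0
```

Here `data` is an external input the driver must supply at execution time; every other parameter resolves to a function defined in the module.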
Always validate before executing. hamilton_validate_dag compiles the DAG without running it, catching problems such as syntax errors and unresolved dependencies before any code executes:
// Success response
{
"valid": true,
"node_count": 5,
"nodes": ["cleaned", "feature_a", "feature_b", "raw_data", "result"],
"inputs": ["data_path"],
"errors": []
}
// Failure response
{
"valid": false,
"node_count": 0,
"nodes": [],
"inputs": [],
"errors": [{"type": "SyntaxError", "message": "...", "detail": "line 5"}]
}
Self-correction loop: If validation fails, read the error, fix the code, and validate again. Do not proceed to execution until validation passes.
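A client-side pre-check in the same spirit can be sketched in pure Python (an illustration only; the real hamilton_validate_dag also compiles the full DAG, which compile() alone cannot do):

```python
def quick_validate(code: str) -> dict:
    """Catch syntax errors before calling the MCP tool, returning a
    dict shaped like hamilton_validate_dag's error responses."""
    try:
        compile(code, "<pipeline>", "exec")
    except SyntaxError as exc:
        return {
            "valid": False,
            "errors": [{
                "type": "SyntaxError",
                "message": str(exc),
                "detail": f"line {exc.lineno}",
            }],
        }
    return {"valid": True, "errors": []}
```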
hamilton_visualize returns DOT graph source. Use this to show the user the pipeline's structure and walk through its dependencies before executing.
hamilton_list_nodes returns structured info for every node. Use this to understand what inputs the DAG needs before execution.
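The rule behind that report can be sketched locally: any parameter that no function in the module produces must be supplied via inputs (a simplification; Hamilton also accounts for decorators, config, and overrides):

```python
import inspect
import types

def required_inputs(module: types.ModuleType) -> set:
    """Names that must come from external inputs: parameters of the
    module's functions that are not themselves functions in the module."""
    funcs = {
        name: fn
        for name, fn in vars(module).items()
        if inspect.isfunction(fn) and not name.startswith("_")
    }
    params = {
        p
        for fn in funcs.values()
        for p in inspect.signature(fn).parameters
    }
    return params - set(funcs)
```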
hamilton_execute runs the DAG with provided inputs and returns results. Key parameters:
- code: the full Python source
- final_vars: list of node names to compute (only these and their dependencies run)
- inputs: dict of external input values
- timeout_seconds: safety limit (default 30s)

WARNING: This executes arbitrary Python code. Always validate first.
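The timeout behavior can be illustrated with a small wrapper (a sketch of the concept only, not the server's implementation; note that a thread-based guard cannot forcibly stop runaway code):

```python
import concurrent.futures

def run_with_timeout(code: str, timeout_seconds: float = 30.0) -> dict:
    """Execute `code` in a worker thread and return its resulting
    namespace, raising TimeoutError past the limit. WARNING: this runs
    arbitrary Python, which is why validation must come first."""
    def _run() -> dict:
        namespace: dict = {}
        exec(code, namespace)  # arbitrary code execution
        return {k: v for k, v in namespace.items() if not k.startswith("__")}

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(_run).result(timeout=timeout_seconds)
```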
"No module named 'X'"
The code imports a library that isn't installed. Call hamilton_capabilities to check availability, then rewrite without the missing library.
"Missing dependencies: ['node_name']"
A function parameter doesn't match any function name or external input. Either rename the parameter to match an existing function, or provide the value via inputs when executing.
"Execution timed out after Ns"
The code takes too long. Reduce data size, simplify computation, or increase timeout_seconds.
Validation passes but execution fails
Validation checks structure, not runtime behavior. Common causes include errors that only surface on real data, such as bad input values, type mismatches inside node functions, or exceptions raised by the underlying libraries.
| Tool | Purpose | When to Use |
|---|---|---|
| hamilton_capabilities | Environment discovery | Always first |
| hamilton_scaffold | Generate starter code | Starting a new pipeline |
| hamilton_validate_dag | Compile-time validation | Before every execution |
| hamilton_list_nodes | Inspect DAG structure | Understanding dependencies |
| hamilton_visualize | DOT graph generation | Explaining structure (requires graphviz) |
| hamilton_execute | Run the DAG | After successful validation |
| hamilton_get_docs | Hamilton documentation | Learning decorators, patterns |
If the MCP server is not running: Fall back to CLI:
# Validate a module
python -c "from hamilton import driver; import my_module; dr = driver.Builder().with_modules(my_module).build(); print('Valid!')"
If Hamilton is not installed: Provide the user with installation instructions:
uvx --from "apache-hamilton[mcp]" hamilton-mcp # Run via uvx (add --with <lib> as needed)
pip install "apache-hamilton[mcp]" # Or install directly
A successful MCP interaction produces:
- a DAG that passes hamilton_validate_dag with zero errors
- node structure confirmed via hamilton_list_nodes
- results from hamilton_execute

Related: /hamilton-core, /hamilton-scale, /hamilton-llm, /hamilton-observability, and hamilton_get_docs("overview") for broader Hamilton documentation.