Discover, use, or subclass existing Dagster integration components (dbt, Looker, PowerBI, Fivetran, etc.). Handles configuration-file-based components (dbt, Sling) and API-based components (Fivetran, PowerBI) appropriately. Use only when an existing Dagster component exists within Dagster's integration libraries.
This skill helps you discover and work with existing Dagster integration components from the 70+ available integrations. It guides you through finding the right component, using it directly, or subclassing it to add custom functionality.
Key distinction: This skill differentiates between configuration-file-based components (like dbt and Sling) that read from local files, and API-based components (like Fivetran, PowerBI, Looker) that call external services. Configuration-based components should be used directly without demo_mode, while API-based components benefit from demo_mode implementation.
The documentation for subclassing components can be found here: https://docs.dagster.io/guides/build/components/creating-new-components/subclassing-components
When invoked, this skill will:
- Discover components via `uv run dg docs integrations --json` or by browsing https://docs.dagster.io/integrations/libraries
- Subclass components, overriding the `execute()` method where needed
- Add custom YAML templating via `get_additional_scope()`
- Validate with `uv run dg check defs`
- Confirm definitions load with `uv run dg list defs`

Before running this skill, ensure:
- `uv` is installed (check with `uv --version`)

If you discover the integration package exists but has NO Component class:
1. Check for a Component class:

   ```shell
   uv run python -c "import dagster_<integration>; print([x for x in dir(dagster_<integration>) if 'Component' in x])"
   ```

2. If the output is `[]`, the integration doesn't have a Component class.
3. Use the `create-custom-dagster-component` skill instead.

This skill is ONLY for integrations that have existing Component classes to subclass.
Help the user find the right integration component:
Run the discovery command:

```shell
uv run dg docs integrations --json
```
Alternative: Browse https://docs.dagster.io/integrations/libraries for visual list
Common integrations include:
- `dagster_dbt.DbtProjectComponent` - dbt projects
- `dagster_fivetran.FivetranComponent` - Fivetran syncs
- `dagster_sling.SlingReplicationCollectionComponent` - Sling replications
- `dagster_powerbi.PowerBIWorkspaceComponent` - PowerBI workspaces
- `dagster_looker.LookerComponent` - Looker instances
- `dagster_airbyte.AirbyteComponent` - Airbyte connections
- `dagster_databricks.DatabricksComponent` - Databricks workflows
- `dagster_snowflake.SnowflakeComponent` - Snowflake resources

IMPORTANT: First determine if this is a configuration-file-based or API-based component:
These components read from local configuration files and do NOT require external API credentials:
- `DbtProjectComponent` - Reads from dbt project files (dbt_project.yml, SQL/Python models)
- `SlingReplicationCollectionComponent` - Reads from replication YAML files

For configuration-file-based components: use them directly, without adding demo_mode.
These components call out to external services and require API credentials:
- `FivetranComponent` - Calls Fivetran API
- `PowerBIWorkspaceComponent` - Calls PowerBI API
- `LookerComponent` - Calls Looker API
- `AirbyteComponent` - Calls Airbyte API
- `CensusComponent` - Calls Census API

For API-based components: consider subclassing to add demo_mode support, so the component can run without live credentials.
Based on the component type, decide whether to use the component directly or to subclass it with customizations. If using directly, skip to Step 7 to create the component instance YAML.
Install the required Dagster integration package:
```shell
uv add dagster-<integration-name>
```
Examples:
```shell
uv add dagster-dbt
uv add dagster-sling
uv add dagster-powerbi
```

Use `dg scaffold defs` to create the component instance directory:
```shell
uv run dg scaffold defs <package>.<ComponentClass> <instance_name>
```
Example:
```shell
uv run dg scaffold defs dagster_sling.SlingReplicationCollectionComponent my_sling_sync
```
This creates a directory structure like:
```
defs/
  my_sling_sync/
    defs.yaml
    # Other config files as needed
```
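The scaffolded `defs.yaml` declares the component type and its attributes. A minimal sketch for the Sling example; the `replications` attribute and the `replication.yaml` path are illustrative and depend on the component's actual schema:

```yaml
# defs.yaml - component instance configuration (illustrative)
type: dagster_sling.SlingReplicationCollectionComponent

attributes:
  replications:
    - path: replication.yaml
```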
Create a `component.py` file in the component instance directory.
IMPORTANT: Only add demo_mode for API-based components!
When extending integration components, you must understand whether they use dataclass or Pydantic BaseModel patterns, as this determines how to add custom fields like demo_mode.
First, check the parent component's implementation:
```shell
# Check if it's a dataclass
uv run python -c "from <package> import <Component>; import dataclasses; print(dataclasses.is_dataclass(<Component>))"
```
Most Dagster integration components (like SlingReplicationCollectionComponent, DbtProjectComponent, FivetranComponent) use dataclass with the Resolvable interface.
This is the most common pattern for Dagster integration components:
```python
from dataclasses import dataclass

import dagster as dg
from <integration_package> import <BaseComponentClass>


@dataclass
class Custom<ComponentName>(BaseComponentClass):
    """Customized component with demo mode support."""

    # New field - will automatically appear in YAML schema via Resolvable
    demo_mode: bool = False

    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        """Build definitions, using demo mode if enabled.

        Note: The parent class fields (like API credentials) are still set from YAML,
        but when demo_mode is True, we bypass the parent's build_defs() method
        and return mocked assets instead, so those credentials are never used.
        """
        if self.demo_mode:
            # Return mock assets for demo mode - parent credentials are ignored
            return self._build_demo_defs(context)
        else:
            # Use real integration with actual credentials from parent fields
            return super().build_defs(context)

    def _build_demo_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        """Build demo mode definitions with mocked assets."""

        @dg.asset(
            key=dg.AssetKey(["mock_asset"]),
            kinds={"integration_name"},  # IMPORTANT: Add the integration kind
        )
        def mock_asset(context: dg.AssetExecutionContext):
            context.log.info("Demo mode: simulating asset execution")
            return {"status": "demo_mode"}

        return dg.Definitions(assets=[mock_asset])
```
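Because the `Resolvable` machinery picks up the new field, `demo_mode` can then be toggled from the component's `defs.yaml`. A sketch; the `type:` module path assumes your subclass lives in a local project module, so adjust it to wherever you define the class:

```yaml
# defs.yaml - illustrative; module path depends on your project layout
type: my_project.components.Custom<ComponentName>

attributes:
  demo_mode: true
```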
Key points for dataclass components:
- Keep the `@dataclass` decorator on your subclass
- Use `field(default_factory=...)` for mutable defaults (lists, dicts)
- The `Resolvable` interface (inherited from parent) handles YAML schema generation

Example with multiple custom fields:
```python
from dataclasses import dataclass, field

import dagster as dg
from dagster_sling import SlingReplicationCollectionComponent


@dataclass
class CustomSlingComponent(SlingReplicationCollectionComponent):
    """Extended Sling component with additional configuration."""

    # New fields - all will appear in YAML schema
    demo_mode: bool = False
    enable_notifications: bool = False
    notification_channel: str = "slack"
    custom_tags: list[str] = field(default_factory=list)

    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        if self.demo_mode:
            return self._build_demo_defs(context)
        return super().build_defs(context)
```
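The field-inheritance behavior relied on here is standard `dataclasses` semantics and can be sanity-checked without Dagster installed. `BaseComponent` below is a hypothetical stand-in, not a Dagster class:

```python
from dataclasses import dataclass, field


@dataclass
class BaseComponent:
    # Stand-in for an integration component's existing config field
    name: str = "sling"


@dataclass
class CustomComponent(BaseComponent):
    # New fields layered on top; mutable defaults need default_factory
    demo_mode: bool = False
    custom_tags: list[str] = field(default_factory=list)


c = CustomComponent(demo_mode=True, custom_tags=["team:data"])
print(c.name, c.demo_mode, c.custom_tags)  # sling True ['team:data']
```

Each instance gets its own `custom_tags` list thanks to `default_factory`; a bare `= []` default would raise a `ValueError` at class definition time.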
Some components may use Pydantic BaseModel. In these cases, inherit from both the parent component and dg.Model:
```python
import dagster as dg
from <integration_package> import <BaseComponentClass>


class Custom<ComponentName>(BaseComponentClass, dg.Model):
    """Customized component with demo mode support."""

    # New field - will appear in YAML schema
    demo_mode: bool = False

    def build_defs(self, context: dg.ComponentLoadContext) -> dg.Definitions:
        if self.demo_mode:
            return self._build_demo_defs(context)
        return super().build_defs(context)
```
For older components that override execute() instead of build_defs():
```python
from collections.abc import Iterator
from dataclasses import dataclass
from typing import Any

from dagster import AssetExecutionContext, Output
from <integration_package> import <BaseComponentClass>


@dataclass
class Custom<ComponentName>(BaseComponentClass):
    """Customized component with demo mode support."""

    demo_mode: bool = False

    def execute(
        self,
        context: AssetExecutionContext,
        **kwargs: Any,
    ) -> Iterator:
        """Custom execution logic with demo mode support."""
        if self.demo_mode:
            context.log.info("Running in demo mode with mocked data")
            yield from self._execute_demo_mode(context, **kwargs)
        else:
            context.log.info("Running with real integration")
            yield from super().execute(context, **kwargs)

    def _execute_demo_mode(
        self,
        context: AssetExecutionContext,
        **kwargs: Any,
    ) -> Iterator:
        """Demo mode implementation."""
        context.log.info("Simulating integration execution locally")
        yield Output(value=None, output_name="result")
```
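The delegation pattern above (branch on a flag, then `yield from` either a mock generator or the parent's) can be sketched with plain classes, no Dagster required; `Base` and `Custom` here are illustrative stand-ins:

```python
from collections.abc import Iterator


class Base:
    def execute(self) -> Iterator[str]:
        # Stand-in for the real integration's execution
        yield "real-run"


class Custom(Base):
    def __init__(self, demo_mode: bool = False):
        self.demo_mode = demo_mode

    def execute(self) -> Iterator[str]:
        if self.demo_mode:
            # Mocked path: yield locally instead of delegating
            yield "demo-run"
        else:
            # Real path: delegate to the parent generator
            yield from super().execute()


print(list(Custom(demo_mode=True).execute()))  # ['demo-run']
print(list(Custom().execute()))                # ['real-run']
```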
Key customization points:
- Override `build_defs()` or `execute()` - check demo_mode and return mock data
- Override `get_additional_scope()` for custom YAML templating

Example with custom templating:
```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any

import dagster as dg
from <integration_package> import <BaseComponentClass>


@dataclass
class Custom<ComponentName>(BaseComponentClass):
    demo_mode: bool = False

    @classmethod
    def get_additional_scope(cls) -> Mapping[str, Any]:
        """Add custom YAML templating functions."""

        def _custom_cron(cron_schedule: str) -> dg.AutomationCondition:
            return (
                dg.AutomationCondition.on_cron(cron_schedule)
                & ~dg.AutomationCondition.in_progress()
            )

        return {"custom_cron": _custom_cron}
```
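A function exposed through `get_additional_scope()` can then be called from the component's YAML via template syntax. A sketch; the `automation_condition` attribute placement is illustrative and depends on the specific component's schema:

```yaml
# defs.yaml - illustrative use of the custom templating function
attributes:
  # ... component-specific attributes ...
  automation_condition: "{{ custom_cron('0 * * * *') }}"
```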
When to do this: If other Dagster components in your pipeline will depend on assets from this component, override get_asset_spec() to generate asset keys that match downstream expectations.
This applies when:
- Downstream components reference this component's assets in `deps`

By default, integration components generate asset keys in their own structure. For example:
- `["fivetran", "raw", "customers"]` (Fivetran)
- `["sling", "replications", "sync_name", "table"]` (Sling)
- `["analytics", "marts", "customer_360"]` (dbt)

The problem: Downstream components may expect different key structures, leading to broken dependencies or requiring per-asset configuration with `meta.dagster.asset_key`.
The solution: Override get_asset_spec() in the upstream component to generate keys that downstream components naturally reference.
```python
def get_asset_spec(self, props) -> dg.AssetSpec:
    """Override to generate asset keys matching downstream component expectations.

    This eliminates the need for meta.dagster.asset_key configuration in downstream
    components by aligning keys at the source.
    """
    base_spec = super().get_asset_spec(props)
    original_key = base_spec.key.path

    # Customize key structure for your pipeline
    # Example: Flatten nested keys for easier consumption
    custom_key = dg.AssetKey([...])  # Your key transformation logic

    return base_spec.replace_attributes(key=custom_key)
```
Problem: Fivetran creates `["fivetran", "raw", "customers"]`, but dbt expects `["fivetran_raw", "customers"]`
Solution:
```python
import dagster as dg
from dagster_fivetran import FivetranAccountComponent
from dagster_fivetran.translator import FivetranConnectorTableProps


class CustomFivetranComponent(FivetranAccountComponent):
    def get_asset_spec(self, props: FivetranConnectorTableProps) -> dg.AssetSpec:
        """Flatten asset keys for dbt compatibility."""
        base_spec = super().get_asset_spec(props)
        original_key = base_spec.key.path

        # Flatten: ["fivetran", "raw", "customers"] -> ["fivetran_raw", "customers"]
        if len(original_key) >= 3:
            flattened_key = dg.AssetKey(["_".join(original_key[:2]), *original_key[2:]])
        else:
            flattened_key = dg.AssetKey(["fivetran_raw", original_key[-1]])

        return base_spec.replace_attributes(key=flattened_key)
```
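The key transformation itself (`["fivetran", "raw", "customers"]` to `["fivetran_raw", "customers"]`) is plain list manipulation, so it can be unit-checked without Dagster. A standalone sketch:

```python
def flatten_fivetran_key(path: list[str]) -> list[str]:
    """Flatten ["fivetran", "raw", "customers"] -> ["fivetran_raw", "customers"]."""
    if len(path) >= 3:
        # Merge the first two segments into a single dbt-style source name
        return ["_".join(path[:2]), *path[2:]]
    # Short keys fall back to a fixed prefix
    return ["fivetran_raw", path[-1]]


print(flatten_fivetran_key(["fivetran", "raw", "customers"]))  # ['fivetran_raw', 'customers']
```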
Result: dbt sources work automatically without meta.dagster configuration:
```yaml
# sources.yml - references work naturally now
```