SSIS package analysis and extraction. Use when config.source.type is "ssis". Parses DTSX packages, extracts data flows, control flows, and maps transformations to target platform equivalents.
This skill is planned but not yet fully implemented. The structure below describes the intended capabilities.
Load this skill when the migration config specifies:
{
  "source": {
    "type": "ssis"
  }
}
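As a rough illustration of the load condition, the check could look like the sketch below. The function name `should_load_ssis_skill` is hypothetical, and the exact, case-sensitive match on `"ssis"` is an assumption rather than documented behavior.

```python
import json


def should_load_ssis_skill(config: dict) -> bool:
    """Return True when the migration config selects SSIS as its source.

    Hypothetical helper: the exact, case-sensitive comparison is an assumption.
    """
    source = config.get("source") or {}
    return isinstance(source, dict) and source.get("type") == "ssis"


# Example using the config shape shown above.
example_config = json.loads('{"source": {"type": "ssis"}}')
print(should_load_ssis_skill(example_config))
```

A missing or malformed `source` block is treated as "do not load" instead of raising, so the check is safe to run against partial configs.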
Parse .dtsx package files to extract:
Map SSIS transformations to target platform equivalents:
| SSIS Component | Target Equivalent |
|---|---|
| OLE DB Source | JDBC/Spark source |
| Derived Column | DataFrame transform |
| Lookup | Join operation |
| Conditional Split | Filter/When clause |
| Sort | orderBy |
| Aggregate | groupBy |
| Merge Join | join |
| Union All | union |
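One way the mapping table might be carried in code is a plain lookup dictionary. This is only a sketch: the names `SSIS_TO_TARGET`, `target_equivalent`, and `split_mapped` are hypothetical, and the values are the table's labels, not generated target code.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Labels copied from the mapping table; the dictionary name is illustrative.
SSIS_TO_TARGET: Dict[str, str] = {
    "OLE DB Source": "JDBC/Spark source",
    "Derived Column": "DataFrame transform",
    "Lookup": "Join operation",
    "Conditional Split": "Filter/When clause",
    "Sort": "orderBy",
    "Aggregate": "groupBy",
    "Merge Join": "join",
    "Union All": "union",
}


def target_equivalent(component: str) -> Optional[str]:
    """Return the target label for an SSIS component, or None if unmapped."""
    return SSIS_TO_TARGET.get(component.strip())


def split_mapped(components: Iterable[str]) -> Tuple[Dict[str, str], List[str]]:
    """Separate components with a known equivalent from those needing review."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for name in components:
        target = target_equivalent(name)
        if target is None:
            unmapped.append(name)
        else:
            mapped[name] = target
    return mapped, unmapped
```

Returning `None` for unknown components, rather than raising, lets a caller collect everything that needs manual review in a single pass.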
Build dependency graph of packages:
# Parse DTSX package
python scripts/parse_dtsx.py --package "path/to/package.dtsx"
# Extract data flows
python scripts/extract_dataflows.py --package "path/to/package.dtsx"
# Generate dependency graph
python scripts/build_dependency_graph.py --folder "path/to/ssis/project"
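Since the scripts are not yet implemented, the following is only a sketch of what the dependency-graph step might do. It assumes that a parent package references a child through a `PackageName` element inside an Execute Package Task; that layout, the function names, and the in-memory input (instead of a `--folder` scan) are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+


def child_packages(dtsx_xml: str) -> set:
    """Collect child package names referenced by one package.

    Assumption: references appear as un-namespaced <PackageName> elements.
    """
    root = ET.fromstring(dtsx_xml)
    return {
        el.text.strip()
        for el in root.iter("PackageName")
        if el.text and el.text.strip()
    }


def build_dependency_graph(packages: dict) -> dict:
    """Map each package name to the set of packages it calls."""
    return {name: child_packages(xml) for name, xml in packages.items()}


def execution_order(graph: dict) -> list:
    """Order packages so that every child comes before its parent."""
    return list(TopologicalSorter(graph).static_order())


# Tiny illustrative packages (not real exports).
PARENT = """<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts">
  <DTS:Executables>
    <DTS:Executable>
      <DTS:ObjectData>
        <ExecutePackageTask><PackageName>Child.dtsx</PackageName></ExecutePackageTask>
      </DTS:ObjectData>
    </DTS:Executable>
  </DTS:Executables>
</DTS:Executable>"""
CHILD = '<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"/>'

graph = build_dependency_graph({"Parent.dtsx": PARENT, "Child.dtsx": CHILD})
print(execution_order(graph))
```

`TopologicalSorter` raises `graphlib.CycleError` when packages call each other in a loop, which gives a natural place to report circular references.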
DTSX files are XML-based. Key namespaces:
- DTS - Data Transformation Services elements
- SQLDTS - SQL Server specific elements

Package structure:
<DTS:Executable>
  <DTS:ConnectionManagers/>
  <DTS:Variables/>
  <DTS:Executables>          <!-- Control Flow -->
    <DTS:Executable>         <!-- Data Flow Task -->
      <pipeline>             <!-- Data Flow components -->
      </pipeline>
    </DTS:Executable>
  </DTS:Executables>
</DTS:Executable>
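A minimal sketch of walking this structure with `xml.etree.ElementTree` follows. It assumes the `DTS` prefix is bound to `www.microsoft.com/SqlServer/Dts` and that task names live in a `DTS:ObjectName` attribute; the function name `summarize_package` and the sample document are hypothetical, and nested containers are not handled.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the DTS prefix.
DTS_NS = "www.microsoft.com/SqlServer/Dts"
NS = {"DTS": DTS_NS}


def summarize_package(dtsx_xml: str) -> dict:
    """Count connection managers and variables, and list top-level tasks."""
    root = ET.fromstring(dtsx_xml)
    tasks = []
    # Only direct children of the package-level DTS:Executables are visited.
    for task in root.findall("DTS:Executables/DTS:Executable", NS):
        tasks.append({
            # Prefixed attributes are stored under their "{uri}name" key.
            "name": task.get(f"{{{DTS_NS}}}ObjectName"),
            # A <pipeline> descendant marks the task as a Data Flow Task.
            "has_data_flow": next(task.iter("pipeline"), None) is not None,
        })
    return {
        "connection_managers": len(
            root.findall("DTS:ConnectionManagers/DTS:ConnectionManager", NS)
        ),
        "variables": len(root.findall("DTS:Variables/DTS:Variable", NS)),
        "tasks": tasks,
    }


# Illustrative package, shaped like the outline above.
SAMPLE = """<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts">
  <DTS:ConnectionManagers>
    <DTS:ConnectionManager DTS:ObjectName="SourceDb"/>
  </DTS:ConnectionManagers>
  <DTS:Variables/>
  <DTS:Executables>
    <DTS:Executable DTS:ObjectName="Copy rows">
      <pipeline/>
    </DTS:Executable>
  </DTS:Executables>
</DTS:Executable>"""

print(summarize_package(SAMPLE))
```

Searching with `task.iter("pipeline")` instead of a fixed path keeps the sketch working whether the `pipeline` element sits directly under the task or inside an intermediate wrapper element.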
To implement this skill:
scripts/parse_dtsx.py using Python's xml.etree.ElementTree