TMDL semantic model parsing, schema extraction, context building, and offline analysis for Power BI
| Property | Value |
|---|---|
| Domain | Power BI Semantic Models |
| Category | Model Understanding & Schema Extraction |
| Trifecta | SKILL.md + pbi-tmdl-analysis.instructions.md + tmdl-analysis.prompt.md |
| Depends | pbi-model-health (readiness scoring), pbi-business-qa (query generation) |
Parse TMDL (Tabular Model Definition Language) files into structured schema, extract tables, columns, measures, relationships, and hierarchies, then build compact LLM context for downstream Q&A, measure creation, and visual recommendation. This is the foundation skill -- every PBI capability depends on understanding the model first.
The cardinal rule: ground everything in the model. Never invent columns, measures, relationships, or tables that don't exist in the parsed TMDL. If the model lacks something, say so.
TMDL files live in a definition/ folder with a predictable layout.
| File/Pattern | Contains | Required |
|---|---|---|
model.tmdl | Model-level properties, data sources | Yes |
tables/*.tmdl | One file per table: columns, measures, partitions | Yes |
relationships.tmdl | All relationship definitions | No |
roles/*.tmdl | RLS role filters | No |
cultures/*.tmdl | Translation definitions | No |
perspectives/*.tmdl | Perspective membership | No |
expressions/*.tmdl | Shared M expressions | No |
The parser checks three locations (in order):
<root>/model.tmdl<root>/definition/model.tmdl<root>/*.SemanticModel/definition/model.tmdlTMDL uses indent-based nesting (one tab or four spaces per level).
model, table, column, measure, partition, relationship, role, culture, perspective, expression, hierarchy, level, calculationgroup, calculationitem, annotation, extendedproperty, datasource, querygroup
table 'Sales Data' ← quoted name with spaces
column Amount ← unquoted simple name
dataType: double ← property: value
formatString: #,##0.00 ← format metadata
measure 'Total Revenue' = SUM('Sales Data'[Amount])
description: Total revenue across all regions
relationship 'Sales to Product'
fromColumn: 'Sales Data'.ProductKey
toColumn: Product.ProductKey
crossFilteringBehavior: bothDirections
All TMDL property keys are normalized to camelCase during parsing (per TMDL spec, properties are case-insensitive). DataType → dataType, CrossFilteringBehavior → crossFilteringBehavior.
The parser converts TMDL nodes into a ModelInfo structure:
| Component | Extracted Fields |
|---|---|
| Tables | name, description, columns[], measures[], hierarchies[] |
| Columns | name, dataType, description, formatString, isHidden, expression |
| Measures | name, expression (DAX), description, formatString, displayFolder |
| Hierarchies | name, levels[] (name + column ref) |
| Relationships | fromTable.column → toTable.column, isActive, crossFilteringBehavior |
Relationship endpoints use two formats:
Sales.Amount'Sales Data'.'Amount USD' (with '' escaping for embedded quotes)Once parsed, the model is serialized to compact markdown for LLM context.
# Model: Adventure Works
## Tables (5)
### Sales
Columns:
OrderDate (dateTime) - Date the order was placed
Amount (double) - Revenue in USD
Measures:
Total Revenue = SUM(Sales[Amount]) - Total revenue across all regions
## Relationships (3)
Sales.ProductKey -> Product.ProductKey [active] (singleDirection)
Approximate: Math.ceil(text.length / 4) (~4 chars per token).
Large models may exceed the LLM token budget. Five progressive passes compress the context:
| Pass | What Gets Removed | Impact |
|---|---|---|
| 1 | Column descriptions | Low -- columns still named |
| 2 | Table/measure descriptions | Medium -- names carry semantics |
| 3 | Measure expressions | Medium -- DAX logic hidden |
| 4 | Columns beyond 10/table | High -- wide tables truncated |
| 5 | All columns | High -- only table names + measures |
Each pass checks if the token count is within budget before proceeding to the next.
| Rule | Check |
|---|---|
| Column exists | Every column referenced in DAX must appear in model context |
| Table exists | Every table referenced must be in the parsed schema |
| Relationship exists | TREATAS/USERELATIONSHIP only suggested when actual path is missing |
| Data type match | Filter arguments match the column's data type |
| Date table present | Time-intelligence functions flagged if no date table detected |
| No fabrication | If model lacks required data, say "the model does not include X" |
| Pattern | Why It Fails | Do This Instead |
|---|---|---|
| Assume column names from business question | Column may not exist or may be named differently | Parse TMDL first, match by name |
| Skip relationship check before DAX | CALCULATE filters require valid paths | Verify relationship chain exists |
| Ignore hidden columns/tables | Hidden objects exist for calculation, not display | Include in context, mark as hidden |
| Hardcode model structure | Every model is different | Always parse, never assume |