Consult PostgreSQL's parser and grammar (gram.y) to understand SQL syntax, DDL statement structure, and parsing rules when implementing pgschema features
Use this skill when you need to understand PostgreSQL's SQL syntax, DDL statement structure, or how PostgreSQL parses specific SQL constructs. This is essential for correctly parsing SQL files and generating valid DDL in pgschema.
Invoke this skill when:
ir/parser.gointernal/diff/*.goMain parser directory: https://github.com/postgres/postgres/blob/master/src/backend/parser/
Key files to reference:
gram.y - Main grammar file - Yacc/Bison grammar defining PostgreSQL SQL syntaxscan.l - Lexical scanner (Flex/Lex) - tokenization ruleskeywords.c - Reserved and non-reserved keywordsparse_clause.c - Parsing of clauses (WHERE, GROUP BY, ORDER BY, etc.)parse_expr.c - Expression parsing (operators, function calls, etc.)parse_type.c - Type name parsing and resolutionparse_relation.c - Table and relation parsingparse_target.c - Target list parsing (SELECT list, etc.)parse_func.c - Function call parsingparse_utilcmd.c - Utility commands (DDL statements like CREATE, ALTER, DROP)analyze.c - Post-parse analysisparse_node.c - Parse node creation utilitiesDetermine what kind of SQL you're working with:
| Statement Type | gram.y Section | parse_utilcmd.c Function |
|---|---|---|
| CREATE TABLE | CreateStmt | transformCreateStmt() |
| ALTER TABLE | AlterTableStmt | transformAlterTableStmt() |
| CREATE INDEX | IndexStmt | transformIndexStmt() |
| CREATE TRIGGER | CreateTrigStmt | transformCreateTrigStmt() |
| CREATE FUNCTION | CreateFunctionStmt | transformCreateFunctionStmt() |
| CREATE PROCEDURE | CreateFunctionStmt | (procedures are functions) |
| CREATE VIEW | ViewStmt | transformViewStmt() |
| CREATE MATERIALIZED VIEW | CreateMatViewStmt | - |
| CREATE SEQUENCE | CreateSeqStmt | transformCreateSeqStmt() |
| CREATE TYPE | CreateEnumStmt, CreateDomainStmt, CompositeTypeStmt | - |
| CREATE POLICY | CreatePolicyStmt | transformCreatePolicyStmt() |
| COMMENT ON | CommentStmt | - |
Search gram.y for the statement's production rule:
Example - Finding CREATE TRIGGER syntax:
# In the postgres repository
grep -n "CreateTrigStmt:" src/backend/parser/gram.y
What to look for:
CreateTrigStmt:)| branches)opt_* rules)*_list rules)gram.y uses Yacc/Bison syntax:
CreateTrigStmt:
CREATE opt_or_replace TRIGGER name TriggerActionTime TriggerEvents ON
qualified_name TriggerReferencing TriggerForSpec TriggerWhen
EXECUTE FUNCTION_or_PROCEDURE func_name '(' TriggerFuncArgs ')'
{
CreateTrigStmt *n = makeNode(CreateTrigStmt);
n->trigname = $4;
n->relation = $8;
n->funcname = $13;
/* ... */
$$ = (Node *)n;
}
Key elements:
CREATE, TRIGGER, ONname, qualified_name{ ... }): C code that builds the parse tree|): Different ways to write the same statementopt_Follow the grammar rules to understand the complete syntax:
Example - Understanding trigger events:
TriggerEvents:
TriggerOneEvent
| TriggerEvents OR TriggerOneEvent
TriggerOneEvent:
INSERT
| DELETE
| UPDATE
| UPDATE OF columnList
| TRUNCATE
This shows:
OF columnListAfter understanding the grammar, check how PostgreSQL transforms the parsed statement:
Example - How CREATE TRIGGER is processed:
// In parse_utilcmd.c
static void
transformCreateTrigStmt(CreateTrigStmt *stmt, const char *queryString)
{
// Validation and transformation logic
// - Check trigger name conflicts
// - Validate trigger function exists
// - Process WHEN condition
// - Handle constraint triggers
}
Use this understanding in pgschema:
For parsing (ir/parser.go):
pg_query_go which wraps libpg_query (based on PostgreSQL's parser)For DDL generation (internal/diff/*.go):
Grammar rules prefixed with opt_ are optional:
opt_or_replace:
OR REPLACE { $$ = true; }
| /* EMPTY */ { $$ = false; }
This means CREATE OR REPLACE TRIGGER ... and CREATE TRIGGER ... are both valid.
Lists are typically defined recursively:
columnList:
columnElem { $$ = list_make1($1); }
| columnList ',' columnElem { $$ = lappend($1, $3); }
Use | to show different syntax options:
TriggerActionTime:
BEFORE { $$ = TRIGGER_TYPE_BEFORE; }
| AFTER { $$ = TRIGGER_TYPE_AFTER; }
| INSTEAD OF { $$ = TRIGGER_TYPE_INSTEAD; }
Operator precedence is defined at the top of gram.y:
%left OR
%left AND
%right NOT
%nonassoc IS ISNULL NOTNULL
%nonassoc '<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
Most CREATE statements follow this pattern:
CreateSomethingStmt:
CREATE opt_or_replace SOMETHING name definition_elements
AlterSomethingStmt:
ALTER SOMETHING name alter_action
| ALTER SOMETHING IF_P EXISTS name alter_action
DropSomethingStmt:
DROP SOMETHING name opt_drop_behavior
| DROP SOMETHING IF_P EXISTS name opt_drop_behavior
columnDef:
ColId Typename opt_column_storage ColQualList
| ColId Typename opt_column_storage GeneratedConstraintElem
| ColId Typename opt_column_storage GENERATED generated_when AS IDENTITY_P OptParenthesizedSeqOptList
This covers:
column_name typecolumn_name type GENERATED ALWAYS AS (expr) STOREDcolumn_name type GENERATED ALWAYS AS IDENTITYTriggerWhen:
WHEN '(' a_expr ')' { $$ = $3; }
| /* EMPTY */ { $$ = NULL; }
index_elem:
ColId opt_collate opt_class opt_asc_desc opt_nulls_order
| func_expr_windowless opt_collate opt_class opt_asc_desc opt_nulls_order
| '(' a_expr ')' opt_collate opt_class opt_asc_desc opt_nulls_order
This shows indexes can be on:
ConstraintAttributeSpec:
ON DELETE key_action
| ON UPDATE key_action
| DEFERRABLE
| NOT DEFERRABLE
| INITIALLY DEFERRED
| INITIALLY IMMEDIATE
Check keywords.c for keyword classification:
Reserved keywords: Cannot be used as identifiers without quoting
SELECT, FROM, WHERE, CREATE, TABLE, etc.Type function name keywords: Can be used as function or type names
CHAR, CHARACTER, VARCHAR, etc.Unreserved keywords: Can be used as identifiers
ABORT, ABSOLUTE, ACCESS, ACTION, etc.Impact on pgschema: When generating DDL, quote identifiers that match reserved keywords.
In gram.y:
TableLikeClause:
LIKE qualified_name TableLikeOptionList
TableLikeOptionList:
TableLikeOptionList:
TableLikeOptionList INCLUDING TableLikeOption
| TableLikeOptionList EXCLUDING TableLikeOption
| /* EMPTY */
TableLikeOption:
TableLikeOption:
COMMENTS | CONSTRAINTS | DEFAULTS | IDENTITY_P | GENERATED | INDEXES | STATISTICS | STORAGE | ALL
This tells us:
LIKE table_name is the basic syntaxINCLUDING ALL, EXCLUDING INDEXES, etc.pgschema usage (ir/parser.go):
// Parse CREATE TABLE ... LIKE statements
if createTableStmt.Inherits != nil {
for _, inherit := range createTableStmt.Inherits {
if inherit.Relpersistence == "l" { // LIKE clause
table.LikeClause = &LikeClause{
Parent: inherit.Relname,
Options: parseLikeOptions(inherit),
}
}
}
}
In gram.y:
ConstraintAttributeSpec:
DEFERRABLE { $$ = CAS_DEFERRABLE; }
| NOT DEFERRABLE { $$ = CAS_NOT_DEFERRABLE; }
| INITIALLY DEFERRED { $$ = CAS_INITIALLY_DEFERRED; }
| INITIALLY IMMEDIATE { $$ = CAS_INITIALLY_IMMEDIATE; }
For constraint triggers:
CreateTrigStmt:
CREATE opt_or_replace CONSTRAINT TRIGGER name ...
This tells us:
CREATE CONSTRAINT TRIGGERDEFERRABLE or NOT DEFERRABLEINITIALLY DEFERRED or INITIALLY IMMEDIATEpgschema DDL generation (internal/diff/trigger.go):
func generateCreateTrigger(trigger *ir.Trigger) string {
var sql strings.Builder
sql.WriteString("CREATE ")
if trigger.IsConstraint {
sql.WriteString("CONSTRAINT ")
}
sql.WriteString("TRIGGER ")
sql.WriteString(quoteIdentifier(trigger.Name))
// ...
if trigger.Deferrable {
sql.WriteString(" DEFERRABLE")
}
if trigger.InitiallyDeferred {
sql.WriteString(" INITIALLY DEFERRED")
}
return sql.String()
}
In gram.y:
index_elem:
ColId opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeIndexElem($1, NULL, NULL, $2, $3, $4, $5, NULL);
}
| func_expr_windowless opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeIndexElem(NULL, $1, NULL, $2, $3, $4, $5, NULL);
}
| '(' a_expr ')' opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeIndexElem(NULL, NULL, $2, $4, $5, $6, $7, NULL);
}
This tells us:
CREATE INDEX idx ON table (column)CREATE INDEX idx ON table (lower(column))CREATE INDEX idx ON table ((column + 1))(( ... ))pgschema parsing consideration:
// When parsing index definitions, handle all three forms:
// 1. Simple column reference
// 2. Function expression
// 3. Arbitrary expression (needs extra parens in DDL)
In gram.y:
GeneratedConstraintElem:
GENERATED generated_when AS '(' a_expr ')' STORED
{
Constraint *n = makeNode(Constraint);
n->contype = CONSTR_GENERATED;
n->generated_when = $2;
n->raw_expr = $5;
n->cooked_expr = NULL;
n->location = @1;
$$ = (Node *)n;
}
generated_when:
ALWAYS { $$ = ATTRIBUTE_IDENTITY_ALWAYS; }
| BY DEFAULT { $$ = ATTRIBUTE_IDENTITY_BY_DEFAULT; }
This tells us:
GENERATED ALWAYS AS (expression) STOREDGENERATED ALWAYS AS IDENTITY or GENERATED BY DEFAULT AS IDENTITYSTORED keyword for computed columnspgschema uses pg_query_go/v6 which provides Go bindings to libpg_query (PostgreSQL parser):
The parse tree from pg_query_go matches gram.y structure:
import "github.com/pganalyze/pg_query_go/v6"
result, err := pg_query.Parse(sqlStatement)
if err != nil {
return err
}
// result.Stmts contains parsed statement nodes
// Structure matches gram.y production rules
for _, stmt := range result.Stmts {
switch node := stmt.Stmt.Node.(type) {
case *pg_query.Node_CreateStmt:
// Handle CREATE TABLE
case *pg_query.Node_CreateTrigStmt:
// Handle CREATE TRIGGER
case *pg_query.Node_IndexStmt:
// Handle CREATE INDEX
}
}
Map gram.y rules to pg_query_go node fields:
gram.y:
CreateTrigStmt:
CREATE TRIGGER name TriggerActionTime TriggerEvents ON qualified_name
pg_query_go:
createTrigStmt := node.CreateTrigStmt
triggerName := createTrigStmt.Trigname // maps to 'name'
timing := createTrigStmt.Timing // maps to 'TriggerActionTime'
events := createTrigStmt.Events // maps to 'TriggerEvents'
relation := createTrigStmt.Relation // maps to 'qualified_name'
Clone postgres and build the parser:
git clone https://github.com/postgres/postgres.git
cd postgres
./configure
make -C src/backend/parser
Test parsing in pgschema:
import "github.com/pganalyze/pg_query_go/v6"
sql := "CREATE TRIGGER ..."
result, err := pg_query.Parse(sql)
if err != nil {
// Invalid syntax
fmt.Println("Parse error:", err)
}
// Valid syntax - examine result.Stmts
Test actual PostgreSQL behavior:
psql -c "CREATE TRIGGER ..."
# If PostgreSQL accepts it, the syntax is valid
# Use \d+ to see how PostgreSQL formats it
gram.y contains helpful comments explaining syntax choices and historical notes.
PostgreSQL's test suite has extensive SQL examples:
# In postgres repo
find src/test/regress/sql -name "*.sql" -exec grep -l "CREATE TRIGGER" {} \;
PostgreSQL syntax evolves across versions:
COMPRESSION clause for tablesMERGE statement, UNIQUE NULLS NOT DISTINCTMERGE enhancements, incremental view maintenanceFor pgschema (supports 14-17):
After consulting gram.y and implementing in pgschema:
Finding syntax in gram.y:
# Search for statement type
grep -n "CreateTrigStmt:" src/backend/parser/gram.y
# Find keyword definitions
grep -n "^TRIGGER" src/backend/parser/gram.y
# Understand an option
grep -A 10 "TriggerWhen:" src/backend/parser/gram.y
Understanding precedence:
# Look at top of gram.y
head -100 src/backend/parser/gram.y | grep -A 50 "%left\|%right\|%nonassoc"
Find utility command handling:
grep -n "transformCreateTrigStmt" src/backend/parser/parse_utilcmd.c