Extend an existing synthetic dataset by adding more rows or new columns while preserving FK integrity, ID continuity, and column distributions. Use this skill when the user wants to "add more rows", "append data", "extend this dataset", "add a new column", "grow my dataset", or needs a larger version of an existing synthetic dataset without regenerating from scratch.
Grow an existing dataset (xlsx, csv, or json) without regenerating — keep the original rows intact and append new ones that follow the same patterns.
pip install openpyxl faker numpy pandas pyyaml --break-system-packages
Ask the user two questions:
{table_name: [records]})<table>"<name><type><table>

Look for a companion schema file next to the dataset:
- `<dataset>.schema.yaml`
- `./schema.yaml`
- `templates/` directory in the synthdata-generate skill

If the original schema isn't available, infer a minimal schema from the existing columns (header names + dtype detection) and proceed.
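The fallback inference can be pictured with a small standard-library sketch. It is only an illustration of "header names + dtype detection", not the logic inside `scripts/extend.py`; the function names `infer_schema` and `infer_type`, and the `string` and `unknown` labels, are assumptions made for this example.

```python
from datetime import date


def _is_iso_date(text):
    """True when text parses as an ISO date such as 2021-03-15."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Return a coarse type label for one column (hypothetical helper)."""
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return "unknown"
    # bool is checked first because isinstance(True, int) is True in Python.
    if all(isinstance(v, bool) for v in present):
        return "bool"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        return "int"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "float"
    if all(isinstance(v, str) and _is_iso_date(v) for v in present):
        return "date"
    return "string"  # assumed fallback label, not one of the CLI column types


def infer_schema(records):
    """Map each header to a type label, keeping first-seen column order."""
    columns = []
    for row in records:
        for name in row:
            if name not in columns:
                columns.append(name)
    return {name: infer_type([row.get(name) for row in records]) for name in columns}


rows = [
    {"employee_id": "E00001", "age": 34, "salary": 91000.5, "active": True, "hired": "2021-03-15"},
    {"employee_id": "E00002", "age": 41, "salary": 120000, "active": False, "hired": "2019-11-02"},
]
schema = infer_schema(rows)
print(schema)
```

A column that mixes integers and floats is labelled `float`, which is the safer choice when new values are synthesized from it.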
# Add 500 rows to the 'orders' table
python scripts/extend.py --input data.xlsx --table orders --add-rows 500 --output data_extended.xlsx
# Add a new column with a lognormal distribution
python scripts/extend.py --input data.xlsx --table employees --add-column salary_2026 \
--col-type float --distribution lognormal --mean 100000 --sigma 0.4
# Use a provided schema file for new-row generation
python scripts/extend.py --input data.xlsx --schema data.schema.yaml --table orders --add-rows 500
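As a rough picture of what a lognormal column with `--mean 100000 --sigma 0.4` could contain, the sketch below draws values with the standard library. It assumes that `--mean` is the target arithmetic mean of the generated values; the script may instead treat it as the mean of the underlying normal, so check its behavior before relying on this. The function name `lognormal_column` is hypothetical.

```python
import math
import random


def lognormal_column(n_rows, mean, sigma, seed=None):
    """Draw n_rows lognormal values whose expected value is `mean` (assumption)."""
    rng = random.Random(seed)
    # For a lognormal, E[X] = exp(mu + sigma^2 / 2), so solve for mu.
    mu = math.log(mean) - sigma ** 2 / 2
    return [round(rng.lognormvariate(mu, sigma), 2) for _ in range(n_rows)]


salary_2026 = lognormal_column(n_rows=5000, mean=100_000, sigma=0.4, seed=42)
sample_mean = sum(salary_2026) / len(salary_2026)
print(f"rows={len(salary_2026)} sample_mean={sample_mean:.0f}")
```

With this parameterization the sample mean lands near 100000 while the median sits lower, which is the right-skewed shape typically wanted for salary-like columns.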
CLI flags:
| Flag | Description |
|---|---|
| `--input` | Existing dataset (xlsx/csv/json) |
| `--output` | Output path (defaults to `<input>_extended.<ext>`) |
| `--table` | Target table name |
| `--add-rows N` | Append N new rows |
| `--add-column NAME` | Add a new column |
| `--col-type` | New column type: id, faker, choice, int, float, bool, date, timestamp, constant |
| `--schema` | Optional YAML schema file (for richer row synthesis) |
| `--seed` | Random seed (default: unique from current timestamp) |
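One way a timestamp-derived default seed might work is sketched below. This is an assumption about the general pattern, not the script's actual code, and `resolve_seed` is a hypothetical name. Printing the chosen seed makes an unseeded run repeatable afterwards.

```python
import random
import time


def resolve_seed(seed=None):
    """Return the given seed, or derive one from the current timestamp."""
    if seed is None:
        # Nanosecond clock folded into 32 bits so it fits common RNG seed ranges.
        seed = time.time_ns() % (2 ** 32)
    return seed


seed = resolve_seed(None)
print(f"seed={seed}")  # keep this value to reproduce the run

# The same seed always yields the same random stream.
first = random.Random(seed).random()
again = random.Random(seed).random()
```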
Confirm row counts and FK integrity (every FK in new rows resolves to a parent ID).
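The check can be done by hand along these lines. The sketch works on plain row dictionaries and is not taken from the skill's scripts; `check_extension` and the sample table and column names are made up for illustration.

```python
def check_extension(parent_rows, child_rows, n_before, n_added, pk, fk):
    """Verify the row count and that every appended row's FK has a parent."""
    parent_ids = {row[pk] for row in parent_rows}
    new_rows = child_rows[n_before:]  # appended rows follow the originals
    orphans = [row for row in new_rows if row[fk] not in parent_ids]
    return {
        "row_count_ok": len(child_rows) == n_before + n_added,
        "orphans": orphans,
    }


customers = [{"customer_id": "C001"}, {"customer_id": "C002"}]
orders = [
    {"order_id": "O001", "customer_id": "C001"},  # original row
    {"order_id": "O002", "customer_id": "C002"},  # appended, valid FK
    {"order_id": "O003", "customer_id": "C009"},  # appended, dangling FK
]

report = check_extension(
    customers, orders, n_before=1, n_added=2, pk="customer_id", fk="customer_id"
)
print(report["row_count_ok"], [row["order_id"] for row in report["orphans"]])
```

An empty `orphans` list together with `row_count_ok` being true is the outcome to expect from a clean extension.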
ID continuity: if existing IDs are E00001-E00500, new rows start at E00501.
--overwrite)
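The E00001-E00500 example suggests continuing from the highest existing ID while keeping its prefix and zero padding. A minimal sketch of that idea follows; it assumes IDs are a non-digit prefix followed by a zero-padded number, and `next_ids` is a hypothetical helper rather than part of the skill.

```python
import re


def next_ids(existing_ids, count):
    """Continue an ID sequence such as E00001..E00500 with `count` new IDs."""
    pattern = re.compile(r"^(\D*)(\d+)$")
    parsed = [pattern.match(value) for value in existing_ids]
    if not parsed or not all(parsed):
        raise ValueError("IDs must look like <prefix><digits>")
    prefix = parsed[0].group(1)
    width = len(parsed[0].group(2))  # preserve the zero padding
    start = max(int(m.group(2)) for m in parsed) + 1
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]


existing = [f"E{n:05d}" for n in range(1, 501)]
new = next_ids(existing, 3)
print(new)  # ['E00501', 'E00502', 'E00503']
```

Taking the maximum rather than the row count keeps new IDs unique even when the original sequence has gaps.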