Reconstruct the state of a Delta Lake table at a specific version. Use when the user wants to understand what the table looked like at a point in time — which files were active, what the schema was, how many rows existed. Triggers on phrases like "time travel", "what did the table look like at version N", "restore version", "table state at version", "go back to version".
Reconstruct the logical state of a Delta Lake table at a specific version by replaying the transaction log. No Spark, no library — pure log replay.
/delta-skills:time-travel path/to/table --version 10
Version is required.
The transaction log is append-only. To reconstruct the table at version N:
1. Start from the nearest checkpoint at or before version N, or from version 0 if there is none.
2. Replay the add and remove actions from that point up to and including version N.

This is exactly what Spark does; we're just doing it manually by reading JSON.
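Commits are individual JSON files in `_delta_log/`, named by zero-padded 20-digit version numbers (version 10 is `00000000000000000010.json`). A minimal sketch of enumerating the files to replay (hypothetical helper name; the skill just reads these files):

```python
import os

def commit_files(table_path: str, target_version: int, start: int = 0):
    """Paths of the commit JSON files for versions start..target_version.
    Delta names commits with zero-padded 20-digit version numbers."""
    log_dir = os.path.join(table_path, "_delta_log")
    return [os.path.join(log_dir, f"{v:020d}.json")
            for v in range(start, target_version + 1)]
```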
if [ ! -d "$TABLE_PATH/_delta_log" ]; then
  echo "ERROR: $TABLE_PATH is not a Delta table (no _delta_log/ directory)." >&2
  exit 1
fi
Check _delta_log/_last_checkpoint:
cat "$TABLE_PATH/_delta_log/_last_checkpoint"
# Returns: {"version":40,"size":123}
If the checkpoint version is ≤ target version, use it as the starting point. Otherwise start from version 0.
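That decision can be sketched as follows (hypothetical helper name; `_last_checkpoint` is the single JSON object shown above):

```python
import json
import os

def replay_start(table_path: str, target_version: int) -> int:
    """Version to start replaying from: the last checkpoint's version if it
    is <= the target version, otherwise 0 (full replay from the first commit)."""
    pointer = os.path.join(table_path, "_delta_log", "_last_checkpoint")
    try:
        with open(pointer) as f:
            checkpoint_version = json.load(f)["version"]
    except (OSError, ValueError, KeyError):
        return 0  # no readable checkpoint pointer: replay from v0
    return checkpoint_version if checkpoint_version <= target_version else 0
```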
Also check for checkpoint files directly:
ls "$TABLE_PATH/_delta_log/"*.checkpoint.parquet 2>/dev/null
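Scanning the names directly lets you find the highest checkpoint at or before the target even when `_last_checkpoint` points past it. A sketch (handles only single-part checkpoints; multi-part checkpoints use a `.checkpoint.<part>.<total>.parquet` naming scheme and are ignored here):

```python
import os
import re

CHECKPOINT_RE = re.compile(r"^(\d{20})\.checkpoint\.parquet$")

def best_checkpoint(table_path: str, target_version: int):
    """Highest single-part checkpoint version <= target_version, or None."""
    log_dir = os.path.join(table_path, "_delta_log")
    versions = []
    for name in os.listdir(log_dir):
        m = CHECKPOINT_RE.match(name)
        if m:
            versions.append(int(m.group(1)))
    usable = [v for v in versions if v <= target_version]
    return max(usable) if usable else None
```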
Starting from the checkpoint (or version 0), read each commit JSON file up to and including the target version.
Maintain a set of active files:
- add action → add the path to the active set, storing its partitionValues and stats
- remove action → remove the path from the active set

Also track:
- the schema (from the latest metaData action at or before version N)

Then report the reconstructed state, for example:

Table state at version 10: tables/customers
─────────────────────────────────────────────────────
Reconstructed from: 11 commit files (v0..v10; no checkpoint at or before v10)
Active files: 84 parquet files
Estimated size: 1.8 GB
Estimated rows: ~2,800,000
Partitioned by: country
Schema at v10:
id long
name string
email string
updated_at timestamp
country string
(note: columns added after version 10 do not appear in this schema)
Active files by partition:
country=FI 18 files ~510,000 rows
country=SE 15 files ~420,000 rows
country=DE 22 files ~640,000 rows
country=NO 12 files ~340,000 rows
... and 4 more partitions
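A report like the one above can be produced by a replay loop along these lines. This is a minimal sketch assuming replay from v0 with no checkpoint; the helper names are hypothetical, and size/row estimation is simplified to summing the per-file stats:

```python
import json
import os

def reconstruct_state(table_path: str, target_version: int):
    """Replay commits v0..target_version. Returns (active, metadata):
    active maps each live data-file path to its add action; metadata is
    the latest metaData action at or before the target version."""
    log_dir = os.path.join(table_path, "_delta_log")
    active, metadata = {}, None
    for version in range(target_version + 1):
        commit = os.path.join(log_dir, f"{version:020d}.json")
        with open(commit) as f:
            for line in f:  # one JSON action per line
                action = json.loads(line)
                if "add" in action:
                    active[action["add"]["path"]] = action["add"]
                elif "remove" in action:
                    active.pop(action["remove"]["path"], None)
                elif "metaData" in action:
                    metadata = action["metaData"]  # latest one <= N wins
    return active, metadata

def estimated_rows(active):
    """Sum numRecords from each add action's embedded stats JSON, if present."""
    total = 0
    for add in active.values():
        stats = add.get("stats")
        if stats:
            total += json.loads(stats).get("numRecords", 0)
    return total
```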
To read these files, you would need:
- The 84 Parquet files listed above
- The schema as defined at v10
- The files to still exist on disk (they may be gone if VACUUM has run since v10)
If the target version is old, the underlying Parquet files may have been deleted by VACUUM. Note:
⚠ Warning: If VACUUM has run since version 10, the underlying Parquet files
for removed versions may no longer exist on disk. Time travel only works
if the files are still physically present.
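A quick existence check over the active set makes the warning concrete (hypothetical helper; assumes table-relative paths, which is what the log normally stores):

```python
import os

def missing_files(table_path: str, active_paths):
    """Active data files that no longer exist on disk (likely VACUUMed)."""
    return [p for p in active_paths
            if not os.path.exists(os.path.join(table_path, p))]
```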
Check the commit log history for VACUUM operations so the warning can be concrete.
Note: remove actions include a deletionTimestamp; this is when the logical delete happened, not when VACUUM will physically remove the file.