Reconstruct the state of a Delta Lake table at a specific version. Use when the user wants to understand what the table looked like at a point in time — which files were active, what the schema was, how many rows existed. Triggers on phrases like "time travel", "what did the table look like at version N", "restore version", "table state at version", "go back to version".
Reconstruct the logical state of a Delta Lake table at a specific version by replaying the transaction log. No Spark, no library — pure log replay.
/delta-skills:time-travel path/to/table --version 10
Version is required.
The transaction log is append-only. To reconstruct the table at version N:
1. Start from the nearest checkpoint at or before version N, or from version 0 if there is none.
2. Replay the add and remove actions from that point up to and including version N.

This is exactly what Spark does; we're just doing it manually by reading JSON.
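Commits are individual JSON files in `_delta_log/`, named by zero-padded 20-digit version numbers (version 10 is `00000000000000000010.json`). A minimal sketch of enumerating the files to replay (hypothetical helper name; the skill just reads these files):

```python
import os

def commit_files(table_path: str, target_version: int, start: int = 0):
    """Paths of the commit JSON files for versions start..target_version.
    Delta names commits with zero-padded 20-digit version numbers."""
    log_dir = os.path.join(table_path, "_delta_log")
    return [os.path.join(log_dir, f"{v:020d}.json")
            for v in range(start, target_version + 1)]
```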
if [ ! -d "$TABLE_PATH/_delta_log" ]; then
  echo "ERROR: $TABLE_PATH is not a Delta table (no _delta_log/ directory)." >&2
  exit 1
fi
Check _delta_log/_last_checkpoint:
cat "$TABLE_PATH/_delta_log/_last_checkpoint"
# Returns: {"version":40,"size":123}
If the checkpoint version is ≤ target version, use it as the starting point. Otherwise start from version 0.
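That decision can be sketched as follows (hypothetical helper name; `_last_checkpoint` is the single JSON object shown above):

```python
import json
import os

def replay_start(table_path: str, target_version: int) -> int:
    """Version to start replaying from: the last checkpoint's version if it
    is <= the target version, otherwise 0 (full replay from the first commit)."""
    pointer = os.path.join(table_path, "_delta_log", "_last_checkpoint")
    try:
        with open(pointer) as f:
            checkpoint_version = json.load(f)["version"]
    except (OSError, ValueError, KeyError):
        return 0  # no readable checkpoint pointer: replay from v0
    return checkpoint_version if checkpoint_version <= target_version else 0
```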
Also check for checkpoint files directly:
ls "$TABLE_PATH/_delta_log/"*.checkpoint.parquet 2>/dev/null
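Scanning the names directly lets you find the highest checkpoint at or before the target even when `_last_checkpoint` points past it. A sketch (handles only single-part checkpoints; multi-part checkpoints use a `.checkpoint.<part>.<total>.parquet` naming scheme and are ignored here):

```python
import os
import re

CHECKPOINT_RE = re.compile(r"^(\d{20})\.checkpoint\.parquet$")

def best_checkpoint(table_path: str, target_version: int):
    """Highest single-part checkpoint version <= target_version, or None."""
    log_dir = os.path.join(table_path, "_delta_log")
    versions = []
    for name in os.listdir(log_dir):
        m = CHECKPOINT_RE.match(name)
        if m:
            versions.append(int(m.group(1)))
    usable = [v for v in versions if v <= target_version]
    return max(usable) if usable else None
```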
Starting from the checkpoint (or version 0), read each commit JSON file up to and including the target version.
Maintain a set of active files:
- add action → add the path to the active set, storing its partitionValues and stats
- remove action → remove the path from the active set

Also track:
- the schema (from the latest metaData action at or before version N)

Then report the reconstructed state, for example:

Table state at version 10: tables/customers
─────────────────────────────────────────────────────
Reconstructed from: 11 commit files (v0..v10; no checkpoint at or before v10)
Active files: 84 parquet files
Estimated size: 1.8 GB
Estimated rows: ~2,800,000
Partitioned by: country
Schema at v10:
id long
name string
email string
updated_at timestamp
country string
(note: columns added after version 10 do not appear in this schema)
Active files by partition:
country=FI 18 files ~510,000 rows
country=SE 15 files ~420,000 rows
country=DE 22 files ~640,000 rows
country=NO 12 files ~340,000 rows
... and 4 more partitions
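A report like the one above can be produced by a replay loop along these lines. This is a minimal sketch assuming replay from v0 with no checkpoint; the helper names are hypothetical, and size/row estimation is simplified to summing the per-file stats:

```python
import json
import os

def reconstruct_state(table_path: str, target_version: int):
    """Replay commits v0..target_version. Returns (active, metadata):
    active maps each live data-file path to its add action; metadata is
    the latest metaData action at or before the target version."""
    log_dir = os.path.join(table_path, "_delta_log")
    active, metadata = {}, None
    for version in range(target_version + 1):
        commit = os.path.join(log_dir, f"{version:020d}.json")
        with open(commit) as f:
            for line in f:  # one JSON action per line
                action = json.loads(line)
                if "add" in action:
                    active[action["add"]["path"]] = action["add"]
                elif "remove" in action:
                    active.pop(action["remove"]["path"], None)
                elif "metaData" in action:
                    metadata = action["metaData"]  # latest one <= N wins
    return active, metadata

def estimated_rows(active):
    """Sum numRecords from each add action's embedded stats JSON, if present."""
    total = 0
    for add in active.values():
        stats = add.get("stats")
        if stats:
            total += json.loads(stats).get("numRecords", 0)
    return total
```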
To read these files, you would need:
- The 84 Parquet files listed above
- The schema as defined at v10
- The files to still exist on disk (they may be gone if VACUUM has run since v10)
If the target version is old, the underlying Parquet files may have been deleted by VACUUM. Note:
⚠ Warning: If VACUUM has run since version 10, the underlying Parquet files
for removed versions may no longer exist on disk. Time travel only works
if the files are still physically present.
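A quick existence check over the active set makes the warning concrete (hypothetical helper; assumes table-relative paths, which is what the log normally stores):

```python
import os

def missing_files(table_path: str, active_paths):
    """Active data files that no longer exist on disk (likely VACUUMed)."""
    return [p for p in active_paths
            if not os.path.exists(os.path.join(table_path, p))]
```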
Check the commit log history for VACUUM operations so the warning can be concrete.
Note: remove actions include a deletionTimestamp; this is when the logical delete happened, not when VACUUM will physically remove the file.