Run batch inference on models in Snowflake Model Registry. Covers BOTH approaches: (1) Native SQL batch using run() on warehouses for SQL pipelines/dbt, and (2) Job-based batch using run_batch() on SPCS compute pools for large-scale/unstructured data. Triggers: batch inference, bulk predictions, run_batch, run(), offline scoring, score dataset, batch predictions on table, image inference, audio transcription, multimodal.
Run inference on registered models for batch workloads. Snowflake offers two batch inference approaches:
| Approach | API | Compute | Best For |
|---|---|---|---|
| Native SQL Batch | mv.run() | Virtual Warehouse | SQL pipelines, dbt, Dynamic Tables, Snowpark |
| Job-based Batch | mv.run_batch() | SPCS Compute Pool | Large-scale processing, unstructured data (images/audio/video) |
Documentation: Model Inference in Snowflake
Before proceeding, check if you already have the environment guide (from machine-learning/SKILL.md → Step 0) in memory. If you do NOT have it or are unsure, go back and load it now. The guide contains essential surface-specific instructions for session setup, code execution, and package management that you must follow.
MANDATORY: Before proceeding, ask the user which batch inference approach they need:
For batch inference, there are two approaches:
1. **Warehouse-based** (`mv.run()`) - Runs on virtual warehouses
- Best for: SQL pipelines, dbt models, Dynamic Tables, Snowpark DataFrames
- Simpler setup, no compute pool required
- Ideal for tabular data and lightweight models
2. **SPCS Job-based** (`mv.run_batch()`) - Runs on SPCS compute pools
- Best for: Large-scale processing, GPU models, unstructured data (images/audio/video)
- Requires compute pool setup
- Supports parallel replicas for high throughput
Which approach do you need?
⚠️ STOP: Wait for user response.
Routing based on response:
- **Warehouse-based** (`mv.run()`) → Load the warehouse-based inference skill, which covers `mv.run()` inference in detail. Do NOT continue with this skill.
- **Job-based** (`mv.run_batch()`) → Continue with this skill: run large-scale inference jobs on SPCS compute pools. Best for unstructured data (images/audio/video), GPU models, large-scale backfills.
Requires `snowflake-ml-python>=1.28.0`.
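To confirm the installed version, a quick check using only the Python standard library (assumes a pip-style install):
from importlib.metadata import version
print(version("snowflake-ml-python"))  # expect >= 1.28.0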
For unstructured data (images, audio, video, multimodal LLMs):
→ Load template/SKILL.md
Ask user:
To run batch inference, I need:
1. **Model name**: What model do you want to use? (from Model Registry)
2. **Database/Schema**: Where is the model registered?
⚠️ STOP: Wait for user response.
After the user responds, verify the model exists (a sketch follows below). If multiple versions exist, ask the user which version to use; otherwise, use the latest.
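A minimal sketch of that verification via the Model Registry API (model and version names are placeholders):
from snowflake.ml.registry import Registry
reg = Registry(session=session)
m = reg.get_model("<MODEL_NAME>")  # raises if the model doesn't exist
m.show_versions()  # lists available versions
mv = m.version("<VERSION>")  # or m.default for the default version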
Get available functions:
mv.show_functions()
Note the function names (e.g., `predict`, `encode`, `__call__`). If the model has multiple functions, you'll need to specify which one to use in `JobSpec`. If the model has only one function, you can omit `function_name` from `JobSpec`.
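To print just the names, a small sketch (assumes each entry returned by `show_functions()` exposes a name field):
for fn in mv.show_functions():
    print(fn["name"])  # e.g., predict, encode, __call__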
Ask user:
What data do you want to run inference on?
1. **Snowflake table** - Tabular data (e.g., MY_DB.SCHEMA.INPUT_TABLE)
2. **Inline data** - Small dataset to create as DataFrame
3. **Unstructured data (non-template)** - Images/audio/video for models expecting raw bytes
- Use with: Whisper, ViT, ResNet, YOLO, custom image/audio models
- Best for: Focused tasks like image classification, audio transcription, object detection
4. **Unstructured data (template/LLM)** - Multimodal LLMs with OpenAI chat format
- Use with: Qwen-VL, LLaVA, MedGemma, other vision-language LLMs
- Best for: Image captioning, visual Q&A, multimodal reasoning
⚠️ STOP: Wait for user response.
Routing based on response:
- Option 3 (non-template) → Load non-template/SKILL.md
- Option 4 (template/LLM) → Load template/SKILL.md
For Snowflake table:
input_df = session.table("<DATABASE>.<SCHEMA>.<TABLE_NAME>")
For inline data:
input_df = session.create_dataframe([
(5.1, 3.5, 1.4, 0.2),
(4.9, 3.0, 1.4, 0.2),
], schema=["feature_1", "feature_2", "feature_3", "feature_4"])
Batch inference writes results as Parquet files to a Snowflake stage. The user must provide an output stage location.
Ask user:
Where should I write the inference results?
Provide a stage location (e.g., @MY_DB.MY_SCHEMA.OUTPUT_STAGE/results/)
⚠️ STOP: Wait for user response.
If user doesn't have a stage, create one:
⚠️ IMPORTANT: The stage must use SNOWFLAKE_SSE encryption (server-side encryption). Client-side encryption is not supported for batch inference output.
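A minimal sketch of creating a compatible stage (database, schema, and stage names are placeholders):
session.sql("""
CREATE STAGE IF NOT EXISTS <DATABASE>.<SCHEMA>.OUTPUT_STAGE
    ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE')
""").collect()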
Output location format:
@<DATABASE>.<SCHEMA>.<STAGE_NAME>/<optional_path>/
Examples:
- @MY_DB.ML_SCHEMA.INFERENCE_STAGE/predictions/
- @MY_DB.ML_SCHEMA.OUTPUT_STAGE/batch_2024_01/
Query available compute pools:
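For example (requires privileges to view compute pools):
session.sql("SHOW COMPUTE POOLS").show()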
You can view available compute pool families at https://docs.snowflake.com/en/sql-reference/sql/create-compute-pools if needed.
Recommend a compute pool based on the model type (a CPU instance family for tabular/lightweight models, a GPU instance family for deep learning and multimodal models).
Ask user to confirm or create compute pool:
Based on your model, I recommend:
- **Compute Pool**: <POOL_NAME> (<INSTANCE_FAMILY>)
Do you want to use this pool, or specify a different one?
If user needs a new compute pool: offer to create one with an appropriate instance family (CPU vs GPU), as sketched below.
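A sketch of creating a pool (the pool name is a placeholder; pick an instance family from the documentation linked above):
session.sql("""
CREATE COMPUTE POOL IF NOT EXISTS <POOL_NAME>
    MIN_NODES = 1
    MAX_NODES = 1
    INSTANCE_FAMILY = CPU_X64_M  -- e.g., GPU_NV_S for GPU models
""").collect()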
Configure JobSpec for scaling:
from snowflake.ml.model.batch import JobSpec
# Basic (single replica, model has only one function)
job_spec = JobSpec()
# Basic (single replica, model has multiple functions - must specify which one)
job_spec = JobSpec(function_name="<FUNCTION_NAME>")
# Scaled (multiple replicas)
job_spec = JobSpec(
function_name="<FUNCTION_NAME>", # Optional if model has only one function
replicas=2, # Number of replicas / instances
num_workers=2, # Workers per replica
)
Note: `function_name` is only required when the model has multiple functions. If the model has a single function, it is used automatically.
⚠️ MANDATORY CHECKPOINT: Before submitting, present summary:
I will submit a batch inference job with these settings:
- **Model**: <DATABASE>.<SCHEMA>.<MODEL_NAME> (version: <VERSION>)
- **Function**: <FUNCTION_NAME or "default (only one function)">
- **Input**: <INPUT_SOURCE> (<ROW_COUNT> rows)
- **Compute Pool**: <POOL_NAME>
- **Output**: @<DATABASE>.<SCHEMA>.<STAGE>/output/
- **Replicas**: <N>
Ready to submit? (Yes/No)
⚠️ STOP: Wait for explicit user approval.
Set up the session following your loaded environment guide, then generate the batch inference code.
Template: Basic Tabular Inference
from snowflake.ml.registry import Registry
from snowflake.ml.model.batch import JobSpec, OutputSpec, SaveMode
# Session setup per environment guide
# e.g., create_snowpark_session() or get_active_session()
session = <SESSION_SETUP>
session.use_database("<DATABASE>")
session.use_schema("<SCHEMA>")
reg = Registry(session=session)
mv = reg.get_model("<MODEL_NAME>").version("<VERSION>")
input_df = session.table("<INPUT_TABLE>")
output_location = "@<DATABASE>.<SCHEMA>.<STAGE>/output/"
job = mv.run_batch(
X=input_df,
compute_pool="<COMPUTE_POOL>",
output_spec=OutputSpec(
stage_location=output_location,
mode=SaveMode.OVERWRITE,
),
job_spec=JobSpec(), # Omit function_name if model has only one function
)
print(f"Job submitted. Waiting for completion...")
job.wait()
print(f"Job completed with status: {job.status}")
Template: Scaled Inference with Multiple Replicas
job = mv.run_batch(
X=input_df,
compute_pool="<COMPUTE_POOL>",
output_spec=OutputSpec(
stage_location=output_location,
mode=SaveMode.OVERWRITE,
),
job_spec=JobSpec(
function_name="<FUNCTION_NAME>", # Optional if model has only one function
replicas=<N>,
num_workers=2,
),
)
After job completes, show output location:
LS @<DATABASE>.<SCHEMA>.<STAGE>/output/;
Read results as DataFrame:
results_df = session.read.option("pattern", ".*\\.parquet").parquet(output_location)
results_df.show(10)
Save results to table (optional):
output_table = "<OUTPUT_TABLE_NAME>"
results_df.write.mode("overwrite").save_as_table(output_table)
print(f"Results saved to {output_table}")
Present to user:
Batch inference completed!
- **Status**: DONE
- **Output Location**: @<DATABASE>.<SCHEMA>.<STAGE>/output/
- **Files**: <N> parquet files
Would you like me to:
1. Show sample results
2. Save results to a table
3. Clean up resources
Example: Tabular Model
# Input: DataFrame with feature columns matching model signature
input_df = session.table("MY_DB.MY_SCHEMA.FEATURES_TABLE")
job = mv.run_batch(
X=input_df,
compute_pool="CPU_POOL",
output_spec=OutputSpec(stage_location=output_location, mode=SaveMode.OVERWRITE),
job_spec=JobSpec(),
)
Example: SentenceTransformer Embeddings
# Input: DataFrame with text column
input_df = session.create_dataframe([
("The quick brown fox",),
("Snowflake is great",),
], schema=["input_feature_0"])
job = mv.run_batch(
X=input_df,
compute_pool="CPU_POOL",
output_spec=OutputSpec(stage_location=output_location, mode=SaveMode.OVERWRITE),
job_spec=JobSpec(function_name="encode"), # SentenceTransformer uses encode
)
Batch inference writes results as Parquet files to the specified output stage location.
A job can fail midway, leaving partial data. Batch inference writes a _SUCCESS sentinel file upon completion.
Best practices:
- Verify the `_SUCCESS` file exists before consuming results (see the sketch after the code below)
- Use `SaveMode.ERROR` to fail if the output directory is not empty (safer for production)
# Safe production pattern
output_spec=OutputSpec(
stage_location=output_location,
mode=SaveMode.ERROR, # Fail if output exists (prevents overwriting)
)
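A sketch of the `_SUCCESS` check (assumes the stage listing exposes a name column, as LS output does):
files = session.sql(f"LS {output_location}").collect()
if any(row["name"].endswith("_SUCCESS") for row in files):
    print("Output is complete.")
else:
    print("No _SUCCESS file found; output may be partial.")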
| SaveMode | Behavior |
|---|---|
| OVERWRITE | Replace existing output |
| ERROR | Fail if output directory not empty |
The output contains prediction columns such as `output_feature_0`, `predictions`, etc. The exact output column name depends on the model's signature. Common patterns:
| Model Type | Output Column | Format |
|---|---|---|
| XGBoost/sklearn classifiers | output_feature_0 | Integer (class label) |
| XGBoost/sklearn regressors | output_feature_0 | Float (predicted value) |
| SentenceTransformer | output_feature_0 | Array of floats (embedding vector) |
# List output files
session.sql(f"LS {output_location}").show()
# Read all parquet files
results_df = session.read.option("pattern", ".*\\.parquet").parquet(output_location)
results_df.show()
# Save to table for easier access
results_df.write.mode("overwrite").save_as_table("PREDICTION_RESULTS")
from snowflake.ml.jobs import list_jobs, delete_job, get_job
# View logs to troubleshoot
job.get_logs()
# Cancel a running job
job.cancel()
# List all jobs
list_jobs().show()
# Get handle to existing job by name
job = get_job("my_db.my_schema.job_name")
# Delete a job
delete_job(job)
Note: The `result()` function from ML Job APIs is not supported for Batch Inference Jobs.
Check job status programmatically:
print(f"Status: {job.status}")
print(f"Job ID: {job.id}")
| Issue | Cause | Solution |
|---|---|---|
| Model not found | Wrong model name or schema | Verify with SHOW MODELS IN SCHEMA |
| Compute pool not ready | Pool is starting/suspended | Wait or run ALTER COMPUTE POOL ... RESUME |
| Permission denied | Missing grants | Grant usage on compute pool and stage |
| Column mismatch | Input doesn't match model signature | Check mv.show_functions() for expected inputs |
# View model functions and their signatures
mv.show_functions()