Name: Ml Development
Author: Heath-Moose

搵技能.../

Ml Development | Skills Pool

# Get row count
row_count = session.table("MY_TABLE").count()

# Preview first 5 rows
sample = session.table("MY_TABLE").limit(5).to_pandas()

from snowflake.snowpark.functions import col

# PREFERRED: Filter and aggregate in Snowflake
df = session.table("MY_TABLE").filter(col("STATUS") == "ACTIVE").select(["COL1", "COL2"]).limit(10000).to_pandas()

# AVOID: Loading entire large tables
# df = session.table("MY_TABLE").to_pandas()  # Only for small tables (<100k rows)

Would you like to track this experiment using Snowflake's experiment tracking framework?
1. Yes - Track this model training experiment
2. No - Just train and evaluate

SHOW EXPERIMENTS IN SCHEMA DATABASE.SCHEMA;

What experiment name should be used for this experiment?
1. EXAMPLE_EXP_1
2. EXAMPLE_EXP_2
3. EXAMPLE_EXP_3
...
N. Other - You will be prompted to provide a name

Would you like to save the trained model to a file (using `ask_user_question` tool if available)?
1. Yes - Save as pickle file (.pkl) for later use
2. No - Just train and evaluate

If yes, where should I save it? (default: ./model.pkl)

DESCRIBE TABLE <table_name>;
SELECT COUNT(*) FROM <table_name>;
SELECT * FROM <table_name> LIMIT 10;

# Session setup per environment guide
# ...

# Load data using Snowpark
df = session.table("MY_TABLE").to_pandas()
# OR with filtering
df = session.table("MY_TABLE").select(["COL1", "COL2"]).filter(...).to_pandas()

I've written the complete script with:
- [Summary of what it does]
- [Data: X rows, Y columns]
- [Model: algorithm choice]
- [Expected output: metrics to report]
- [Model serialization: Yes/No, path if yes]

Ready to execute? (Yes/No)

Report details:

Model saved successfully:
- File path: /absolute/path/to/model.pkl
- Framework: sklearn/xgboost/lightgbm/pytorch/tensorflow
- Sample input schema: [columns and types]

Offer next step:

The model has been saved locally. Would you like to register it to Snowflake Model Registry?

If user says yes:
- Load model-registry/SKILL.md
- Pass along context: model file path, framework, sample input schema
- Tell model-registry: "User just trained this model, use this context"

Ml Development

Data Science Expert Skill

⚠️ CRITICAL: Environment Guide Check

Core Workflow

1. UNDERSTAND the Request

Ml Development

Data Science Expert Skill

⚠️ CRITICAL: Environment Guide Check

Core Workflow

1. UNDERSTAND the Request

2. PLAN Your Approach

3. EXECUTE Incrementally

4. ITERATE When Needed

5. COMPLETE with Quality

Data Access Patterns

Efficient Data Access

CLI Workflow Steps

Step 1: Ask About Experiment Tracking (for model training)

Step 2: Ask About Model Serialization (for model training)

Step 3: Analyze Data First

Step 4: Plan and Present

Step 5: Write Complete Code

Data Visualization Notes

Step 6: Ask Before Executing

Step 7: Execute

Step 8: Report Model Artifacts and Offer Next Steps

For Model Tasks

Memory and Context

Track Your Progress

Reference Previous Work

Continuous Learning V2

Continuous Learning V2

Continuous Learning V2

Continuous Learning

Continuous Learning

Pytorch Patterns