Investigate datasets from HuggingFace, CSV, or JSON files to understand their structure, fields, and data quality. Trigger whenever you need to explore or inspect a dataset yourself without using pre-written scripts.
This workflow helps you explore and understand datasets used in evaluations. It covers HuggingFace datasets, CSV files, and JSON/JSONL files.
For detailed information on Inspect's dataset types (datasets.Dataset vs inspect_ai.dataset.Dataset), the hf_dataset() pipeline, caching behaviour, and test utilities, see references/inspect-dataset-patterns.md.
Evals typically define:
- DATASET_PATH: HuggingFace repo path (e.g., "qiaojin/PubMedQA")
- DATASET_REVISION: Optional git revision/tag for reproducibility
- record_to_sample(): Function converting raw records to Sample objects

You will need datasets, pandas, and inspect_ai installed.

Look for these patterns in the evaluation code:
# HuggingFace dataset
DATASET_PATH = "org/dataset-name"
DATASET_REVISION = "v1.0" # optional
hf_dataset(path=DATASET_PATH, name="subset", split="train", ...)
# CSV dataset
csv_dataset("path/to/file.csv", ...)
load_csv_dataset("https://example.com/file.csv", eval_name="myeval", ...)
# JSON/JSONL dataset
json_dataset("path/to/file.json", ...)
load_json_dataset("https://example.com/file.jsonl", eval_name="myeval", ...)
For investigation, load the raw data directly (not through Inspect's sample_fields transformation). Use standard datasets.load_dataset() for HuggingFace, pd.read_csv() for CSV, or pd.read_json() for JSON/JSONL. For gated datasets, ensure HF_TOKEN is set or run huggingface-cli login.
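A minimal offline sketch of the raw loaders (the inline data and field names here are made up; for HuggingFace, datasets.load_dataset() takes a repo path the same way):

```python
import io
import pandas as pd

# For HuggingFace (network required), the raw-data equivalent is:
#   from datasets import load_dataset
#   ds = load_dataset("qiaojin/PubMedQA", "pqa_labeled", split="train")

# CSV: inline data stands in for a real file path or URL
csv_data = io.StringIO("question,answer\nWhat is 2+2?,4\nCapital of France?,Paris\n")
df_csv = pd.read_csv(csv_data)

# JSONL: one JSON object per line; lines=True is required for JSONL
jsonl_data = io.StringIO('{"question": "Q1", "target": "A"}\n{"question": "Q2", "target": "B"}\n')
df_jsonl = pd.read_json(jsonl_data, lines=True)

print(df_csv.shape)               # (rows, columns)
print(df_jsonl.columns.tolist())  # field names
```

Loading raw like this lets you compare against what record_to_sample expects, field by field.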
Use standard pandas/datasets methods to explore:
- Field types: ds.features (HF) or df.dtypes (pandas)
- Size and columns: len(ds), ds.column_names (HF) or df.info(), df.columns (pandas)
- First records: ds[:3] (HF) or df.head() (pandas)
- Missing values: None, empty strings, empty lists
- Distributions: value_counts() for categorical columns, length stats for text fields

For converting an Inspect Dataset (which has no .to_pandas()) to a DataFrame, see references/inspect-dataset-patterns.md.
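A sketch of these checks against a toy DataFrame (the question/label fields are placeholders, not a real dataset's schema):

```python
import pandas as pd

# Toy frame standing in for a loaded dataset
df = pd.DataFrame({
    "question": ["What is 2+2?", "Capital of France?", None],
    "label": ["yes", "no", "yes"],
})

print(df.dtypes)                            # field types
print(len(df), df.columns.tolist())         # size and columns
print(df.head())                            # first records
print(df.isna().sum())                      # missing values per column
print(df["label"].value_counts())           # categorical distribution
print(df["question"].str.len().describe())  # text length stats
```

The same checks on a HuggingFace dataset use ds.features, ds.column_names, and ds[:3].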
Look at the record_to_sample function to understand how raw data maps to Inspect samples. Key questions:
- Which fields become the input? Are they combined/formatted?
- What is the target format? (letter, text, JSON, etc.)
- Are there choices for multiple choice?
- What ends up in metadata?

See references/inspect-dataset-patterns.md for the pattern to load through Inspect's hf_dataset() and verify sample conversion works correctly.
# View HF dataset info without downloading
uv run python -c "from datasets import load_dataset_builder; b = load_dataset_builder('org/name'); print(b.info)"
# List available configs/subsets
uv run python -c "from datasets import get_dataset_config_names; print(get_dataset_config_names('org/name'))"
# List available splits
uv run python -c "from datasets import get_dataset_split_names; print(get_dataset_split_names('org/name'))"
For cache locations (HuggingFace native, Inspect AI, Inspect Evals), force re-download commands, and test utilities, see references/inspect-dataset-patterns.md.
- Gated datasets: run huggingface-cli login or set HF_TOKEN
- Flaky downloads: the hf_dataset wrapper in inspect_evals.utils.huggingface has built-in retry with backoff
- Large datasets: use streaming=True or split="train[:1000]" for sampling