Import datasets from HuggingFace and convert them to Coval test sets. Use when the user wants to create test cases from a HuggingFace dataset or repository.
Import $ARGUMENTS from HuggingFace and convert it into Coval test sets with properly structured test cases.
Coval is an AI evaluation platform for testing voice and conversational AI agents. It runs simulations against AI agents and measures performance with configurable metrics.
| Concept | Description |
|---|---|
| Test Set | A collection of test cases, grouped by category or evaluation purpose |
| Test Case | A single evaluation scenario with input (prompt) and optional metadata |
| Persona | High-level user character (system prompt) - separate from test cases |
| Agent | The AI system being evaluated |
Key distinction: personas are configured separately in Coval — this import creates test sets and test cases, not personas.
Base URL: https://api.coval.dev/v1
Fetch the OpenAPI spec before making API calls:
```
# List specs (no auth)
GET https://api.coval.dev/v1/openapi

# Fetch specific spec
GET https://api.coval.dev/v1/openapi/{spec_name}
```
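The two endpoints above can be wrapped in a small helper. A minimal sketch — the URL shapes come from the docs above, but the spec name `test-sets` is a hypothetical example, and the actual fetch is left commented out so you can plug in your preferred HTTP client:

```python
# Build Coval OpenAPI endpoint URLs (URL shapes from the docs above).
BASE_URL = "https://api.coval.dev/v1"

def openapi_url(spec_name=None):
    """Return the spec-list URL, or a specific spec's URL if named."""
    if spec_name is None:
        return f"{BASE_URL}/openapi"
    return f"{BASE_URL}/openapi/{spec_name}"

# Example fetch (requires network; uncomment to run):
# import json, urllib.request
# with urllib.request.urlopen(openapi_url()) as resp:
#     specs = json.load(resp)
```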
If $ARGUMENTS is provided, navigate to it. Otherwise ask:

What is the HuggingFace repository, space, or dataset you want to import?

Then inspect the dataset and report its structure (fields, splits, approximate size) to the user.
Ask these questions to map HuggingFace data to Coval format:
Q1: Input Field
Which field contains the question/prompt for the test case `input`?
Q2: Categorization
How should test cases be organized into test sets?
- By existing category field
- Single test set
- Custom logic
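The first two Q2 options can be sketched as one grouping function. A minimal sketch, assuming rows are plain dicts and that the sample field names (`question`, `category`) are illustrative:

```python
from collections import defaultdict

def group_into_test_sets(rows, category_field=None, default_name="imported"):
    """Group dataset rows into test sets.

    category_field set  -> one test set per distinct category value
    category_field None -> a single test set named default_name
    """
    sets = defaultdict(list)
    for row in rows:
        key = row.get(category_field, default_name) if category_field else default_name
        sets[str(key)].append(row)
    return dict(sets)

# Illustrative rows (field names are assumptions, not a real dataset schema)
rows = [
    {"question": "2+2?", "category": "math"},
    {"question": "Capital of France?", "category": "geography"},
]
```

Custom logic (the third option) would replace the `key` expression with whatever rule the user specifies.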
Q3: Metadata
Which fields should be preserved in the `metadata` JSON? (Recommended: preserve original IDs like `question_id`.)
Q4: Multi-turn (if applicable)
How to handle multi-turn conversations?
- First turn only
- Concatenate turns
- Separate test cases per turn
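The three multi-turn strategies above can be sketched as a single function. A minimal sketch, assuming a conversation is already extracted as a list of user-turn strings:

```python
def flatten_turns(turns, mode="first"):
    """Reduce a multi-turn conversation to a list of test-case inputs.

    mode="first"    -> first turn only (one test case)
    mode="concat"   -> all turns joined into one prompt
    mode="per_turn" -> one test case per turn
    """
    if mode == "first":
        return [turns[0]]
    if mode == "concat":
        return ["\n".join(turns)]
    if mode == "per_turn":
        return list(turns)
    raise ValueError(f"unknown mode: {mode}")
```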
Create Coval-compatible CSVs:
```csv
input,metadata
"Your question here","{""question_id"": ""123"", ""source"": ""mt-bench""}"
```
Requirements:
- `input` column MUST be first
- `metadata` must be a valid JSON string
- Naming: `{source}_{category}.csv`
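The requirements above (column order, JSON-encoded metadata, CSV quote escaping) can be satisfied with the standard library. A minimal sketch, assuming each test case is a dict with `input` and an optional `metadata` dict:

```python
import csv
import io
import json

def to_coval_csv(cases):
    """Serialize test cases to a Coval-compatible CSV string.

    `input` is always the first column; `metadata` is serialized as a
    JSON string, which csv.writer quotes and escapes automatically.
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["input", "metadata"])
    for case in cases:
        writer.writerow([case["input"], json.dumps(case.get("metadata", {}))])
    return buf.getvalue()
```

Letting `csv.writer` handle quoting avoids hand-building the doubled `""` escapes shown in the example row.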
Manual: Upload CSVs via Coval dashboard test sets page.
API: Fetch OpenAPI spec and use test set endpoints programmatically.
| Dataset | Description |
|---|---|
| `cais/mmlu` | 15k+ multiple-choice questions across 57 subjects (STEM, humanities, law) |
| `nyu-mll/glue` | Sentence-level tasks: sentiment, entailment, linguistic acceptability |
| `tau/commonsense_qa` | Reasoning tests for everyday world knowledge |
| `Rowan/hellaswag` | Common-sense inference and sentence completion |
| Dataset | Description |
|---|---|
| `openai/gsm8k` | ~8k grade-school math word problems (multi-step arithmetic) |
| `ucinlp/drop` | Reading comprehension with discrete operations |
| `lukaemon/bbh` | BIG-Bench Hard: a challenging reasoning subset |
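For a dataset like `openai/gsm8k`, the mapping to a Coval test case is mostly field renaming. A minimal sketch — the `question`/`answer` field names follow the public gsm8k schema but should be verified against the actual split, and the sample row is made up for illustration:

```python
import json

def gsm8k_row_to_case(row, index):
    """Map one gsm8k-style row to a Coval test case dict.

    The question becomes the `input`; the reference answer and the
    original row index are preserved in `metadata` (see Q3 above).
    """
    return {
        "input": row["question"],
        "metadata": json.dumps({
            "source": "gsm8k",
            "row_index": index,
            "reference_answer": row["answer"],
        }),
    }

# Illustrative placeholder row, not real gsm8k data
sample = {"question": "A farmer has 12 apples and gives away 5. How many remain?",
          "answer": "7"}
```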