Use this skill when the user wants examples where the final answer must be backed by concrete evidence, such as “make it cite what the tools found,” “test whether it hallucinates in the summary,” or “give me tasks where every claim has to come from retrieved results.” Trigger it for requests about grounded reports, evidence-backed answers, factual synthesis, source-sensitive summarization, or multi-tool briefs that must stay faithful to intermediate results. Example triggers: “no hand-wavy summaries,” “make it justify every number,” “test grounded synthesis,” and “give me realistic research or planning tasks that need traceable evidence.”
Extract 1 to 3 concrete cases from the actual benchmark papers to demonstrate this capability. CRITICAL HANDLING FOR TEMPLATES: If the paper provides concrete examples, extract them directly. If the paper provides ONLY abstract templates, you MUST NOT output the template as-is. Instead, you MUST instantiate the template into a highly specific, realistic, and logically coherent example grounded in the papers: invent a specific entity, a specific environment, a concrete trajectory, and a final answer that exactly reflects the benchmark's intent and difficulty. Do NOT state that the cases are extracted from papers or benchmarks, and absolutely do NOT mention or reference the source papers.
[Case 1]
[Case 2]
To synthesize data for this capability, you must strictly follow a 3-phase pipeline. Do not invent or skip steps. Read the reference file for each phase in order:
Phase 1: Environment Exploration
Read the exploration guidelines to discover raw knowledge seeds:
references/EXPLORATION.md
Phase 2: Trajectory Selection
Once Phase 1 is complete, read the selection criteria to evaluate the trajectory:
references/SELECTION.md
Phase 3: Data Synthesis
Once a trajectory passes Phase 2, read the synthesis instructions to generate the final data:
references/SYNTHESIS.md