Use this skill when the user needs to create or curate labeled datasets for fine-tuning, instruction tuning, or RLHF alignment.

Scope

Alpaca: {&quot;instruction&quot;: &quot;...&quot;, &quot;input&quot;: &quot;...&quot;, &quot;output&quot;: &quot;...&quot;}
ShareGPT: {&quot;conversations&quot;: [{&quot;from&quot;: &quot;human&quot;, &quot;value&quot;: &quot;...&quot;}, ...]}
OpenAI chat: {&quot;messages&quot;: [{&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: &quot;...&quot;}, ...]}
DPO pairs: {&quot;prompt&quot;: &quot;...&quot;, &quot;chosen&quot;: &quot;...&quot;, &quot;rejected&quot;: &quot;...&quot;}

Dataset Formats

Support output in standard formats:

Always validate schema before export. Report and quarantine malformed entries.

Use this skill when the user needs to create or curate labeled datasets for fine-tuning, instruction tuning, or RLHF alignment.

Support output in standard formats:

Always validate schema before export. Report and quarantine malformed entries.