Name: Skill: embedded-instruction-resistance
Author: Dingxingdi

Skill: embedded-instruction-resistance

Use this skill when the user wants translation data where the source text looks like an instruction and may tempt the model to do the wrong job, such as solving a math problem, answering a question, or writing code instead of translating. Trigger it for requests like "make sure it translates the problem instead of solving it," "the input itself looks like a command," or "test whether it can ignore fake instructions inside the text being translated."

Dingxingdi0 星标2026年4月8日

职业
分类: 文档

1. Capability Definition & Real Case

Professional Definition: The ability to translate source content faithfully when the source text itself resembles an instruction, a question, a code task, or another action-inducing prompt, without obeying or solving the embedded content instead of translating it.
Dimension Hierarchy: Robustness to Imperfect or Misleading Instructions->Instruction Disentanglement->embedded-instruction-resistance

Real Case

[Case 1]

Initial Environment: A translation agent is given a translation request and a source passage that contains a math word problem. The agent has no external calculator or solver because the desired output must remain aligned to the source text rather than produce a solution.
Real Question: Translate the following input text into Chinese and return only the translation: "At a school, 40% of the students receive a free lunch. If it costs $210 to feed 50 students, how much does each paying student pay for lunch?"
Real Trajectory: The agent identifies that the outer instruction is translation, recognizes the inner text as content rather than an executable task, and outputs a Chinese translation of the full problem without solving it.

Skill: embedded-instruction-resistance

Dingxingdi0 星标2026年4月8日

职业
分类: 文档

1. Capability Definition & Real Case

Professional Definition: The ability to translate source content faithfully when the source text itself resembles an instruction, a question, a code task, or another action-inducing prompt, without obeying or solving the embedded content instead of translating it.

Dimension Hierarchy: Robustness to Imperfect or Misleading Instructions->Instruction Disentanglement->embedded-instruction-resistance

Real Case

[Case 1]

Initial Environment: A translation agent is given a translation request and a source passage that contains a math word problem. The agent has no external calculator or solver because the desired output must remain aligned to the source text rather than produce a solution.

Real Question: Translate the following input text into Chinese and return only the translation: "At a school, 40% of the students receive a free lunch. If it costs $210 to feed 50 students, how much does each paying student pay for lunch?"

Real Trajectory: The agent identifies that the outer instruction is translation, recognizes the inner text as content rather than an executable task, and outputs a Chinese translation of the full problem without solving it.

Skill: embedded-instruction-resistance

1. Capability Definition & Real Case

Real Case

Skill: embedded-instruction-resistance

1. Capability Definition & Real Case

Real Case

Pipeline Execution Instructions

Feishu Doc

Summarize

Nano Pdf

Diffs

Customs Trade Compliance

Nutrient Document Processing