Use when Phase 1 (research-direction-exploration) is complete and G1 gate has passed — subjects the research problem to adversarial defense-style questioning, classifies project intent, re-evaluates venue target, and produces a final feasibility ruling before any method design begins
"It's just data analysis, no need for validation" is the #1 reason Type D projects fail. DO NOT SKIP THIS PHASE. </IRON-LAW>
A direction that survived Phase 1 is a candidate, not a commitment. This skill stress-tests the research problem through adversarial questioning before resources are invested. Passing this skill is required to enter method design.
<HARD-GATE> Do NOT proceed to method design, experiment planning, or any Phase 3 activity until every item in the Feasibility Gate at the end of this skill is ✅ and the user has explicitly approved. No exceptions. </HARD-GATE>

digraph problem_validation {
"G1 Passed" [shape=doublecircle];
"Pre-Step:\nKnown answer?\nUser-specified?" [shape=diamond, style=filled, fillcolor="#f0f0f0"];
"Tell user &\noffer alternatives" [shape=box];
"Novelty\nLitmus Test" [shape=box, style=filled, fillcolor="#ffe0e0"];
"Multi-Agent\nDeliberation\n(up to 5 rounds)" [shape=box, style=filled, fillcolor="#ffe0e0"];
"Converged?" [shape=diamond];
"Escalation:\nUser decides" [shape=box, style=filled, fillcolor="#ff9999"];
"Project Intent\nClassification" [shape=box, style=filled, fillcolor="#e0e0ff"];
"Venue Target\nPrecision" [shape=box, style=filled, fillcolor="#e0ffe0"];
"Feasibility\nAssessment" [shape=box, style=filled, fillcolor="#fff0d0"];
"All passed?" [shape=diamond];
"Phase 3:\nMethod Design" [shape=doublecircle];
"Return to\nPhase 1" [shape=doublecircle, style=filled, fillcolor="#ffcccc"];
"G1 Passed" -> "Pre-Step:\nKnown answer?\nUser-specified?";
"Pre-Step:\nKnown answer?\nUser-specified?" -> "Tell user &\noffer alternatives" [label="known answer"];
"Pre-Step:\nKnown answer?\nUser-specified?" -> "Novelty\nLitmus Test" [label="proceed"];
"Tell user &\noffer alternatives" -> "Novelty\nLitmus Test" [label="user chooses\nnew question"];
"Novelty\nLitmus Test" -> "Multi-Agent\nDeliberation\n(up to 5 rounds)";
"Multi-Agent\nDeliberation\n(up to 5 rounds)" -> "Converged?";
"Converged?" -> "Project Intent\nClassification" [label="all PASS"];
"Converged?" -> "Escalation:\nUser decides" [label="persistent FAIL\nor 5 rounds"];
"Escalation:\nUser decides" -> "Return to\nPhase 1" [label="return to\nexploration"];
"Escalation:\nUser decides" -> "Multi-Agent\nDeliberation\n(up to 5 rounds)" [label="try new\nformulation"];
"Escalation:\nUser decides" -> "Project Intent\nClassification" [label="proceed\nas-is"];
"Project Intent\nClassification" -> "Venue Target\nPrecision";
"Venue Target\nPrecision" -> "Feasibility\nAssessment";
"Feasibility\nAssessment" -> "All passed?";
"All passed?" -> "Phase 3:\nMethod Design" [label="all ✅"];
"All passed?" -> "Escalation:\nUser decides" [label="any ✗"];
}
If the user returns to Phase 2 from Phase 4a exploration (because the data or baseline results revealed the research question is wrong), do NOT start from scratch:
Start from the Phase 4a exploration report (docs/05_execution/phase4a-exploration-report.md) and feed its findings into the question refinement in this phase. This return path ensures that exploration insights directly feed into a better research question rather than being wasted.
When the user explicitly states a specific research question (e.g., "help me determine what cell type GM12878 cells are", "help me determine the cell type composition of dataset X"), do NOT silently override their intent. Instead:
Case 1 — The question has a known answer: If the answer is already well-established in the literature (e.g., GM12878 is a well-characterized lymphoblastoid cell line), tell the user directly:
The question "[user's question]" has a well-established answer in
existing literature:
[State the known answer with citations]
This means a paper answering this exact question would not be publishable
— it would be confirming what is already known.
However, there may be related questions that ARE worth researching:
1. [Suggest a novel angle using the same system/data]
2. [Suggest a comparative or mechanistic extension]
3. [Suggest applying the system to a genuinely open question]
Alternatively, if you need this analysis for purposes OTHER than
publishing a research paper (e.g., as part of a larger project,
for educational purposes, or as preliminary exploration), I can
help with that — just let me know and we'll adjust the workflow
accordingly.
What would you prefer?
Case 2 — The question is valid but may be weak: Proceed to Step 0 (Novelty Litmus Test) and the multi-agent deliberation. The agents will evaluate and suggest improvements. The user's original intent is respected as the STARTING POINT for discussion.
Case 3 — The question is strong and specific: Proceed normally. The user's expertise should be respected — they may know their field better than the agents.
KEY PRINCIPLE: The system serves the user, not the other way around. If the user insists on a question after being informed of its limitations, respect their decision (see escalation mechanism in Step 1).
The question to ask yourself: "If this project succeeds perfectly, what NEW biological/scientific knowledge exists that didn't exist before?"
FAIL examples (NOT publishable): "We ran standard tools on a well-characterized dataset and confirmed its documented features"; "We reproduced known results with a newer tool."
PASS examples (potentially publishable): "Do classical and non-classical monocytes show distinct transcriptional programs in untreated vs. treated patients?" — the answer is genuinely unknown, and the question names a system, a comparison, and the new knowledge it would produce.
For Type D specifically: The novelty MUST be biological, not methodological. "We used standard tools in a reproducible way" is NOT novel. The question that the analysis answers must be one whose answer is NOT already known.
If the proposed question fails this test, tell the user honestly:
The current research question — "[question]" — would likely be viewed
as a technical exercise rather than a scientific contribution by
reviewers at [venue]. The expected findings are already well-established.
To make this publishable, we need a question whose answer is genuinely
unknown. Some options:
1. [Suggest a more specific, novel question using the same data/tools]
2. [Suggest different data that could yield novel findings]
3. [Suggest a comparative angle that could produce new insights]
Which direction interests you?
Do NOT proceed to adversarial questioning with a question that fails this test. Restructure first.
CRITICAL: "We'll find the real question in Phase 3" is NOT acceptable.
If the user's request is vague (e.g., "analyze this data and find discoveries"), you MUST help them formulate a SPECIFIC, testable question NOW — before leaving Phase 2. Acceptable approaches:
A question is specific enough when it names: (a) the biological system or phenomenon, (b) the comparison or contrast, and (c) what kind of answer would be new. Example: "Do classical and non-classical monocytes show distinct transcriptional programs in untreated vs. treated patients?" — this names the system (monocytes), the comparison (untreated vs. treated), and the novelty (programs not previously compared in this context).
"Are there interesting subpopulations in this dataset?" is NOT specific — it is a fishing expedition. </IRON-LAW>
Over-Analyzed Dataset Warning List (non-exhaustive — use judgment for unlisted datasets):
If the user wants to use such a dataset, you MUST warn them:
⚠️ DATA NOVELTY WARNING
The dataset "[name]" is one of the most frequently analyzed datasets in
[field]. Standard analysis pipelines have been applied to it hundreds
of times, and its major features are well-documented.
Using this dataset with standard tools is very unlikely to produce
novel findings suitable for a research paper.
Options to proceed:
1. Use a DIFFERENT, less-studied dataset (I can help identify one)
2. Apply a genuinely NOVEL analytical approach to this dataset
3. Use this dataset for a COMPARATIVE study with other datasets
4. Proceed knowing the paper will likely be a technical note /
benchmark report, not a discovery paper
Which option do you prefer?
Do NOT silently proceed with an over-analyzed dataset. The user must make an informed choice. </IRON-LAW>
This step dispatches THREE specialist agents to deliberate on the research question through the amplify:multi-round-deliberation protocol (max 5 rounds). This is NOT optional.
</IRON-LAW>
Three agents with distinct, complementary perspectives (adapted by research type):
Type M / D / H:
| Agent | Role | What They Optimize For |
|---|---|---|
| Senior Professor | Domain authority with strategic vision | Is this question scientifically meaningful? Does it advance the field? |
| Journal/Conference Editor | Venue gatekeeper | Would I send this to reviewers, or desk-reject? Does it meet [venue]'s bar? |
| Research Scientist (PhD/Postdoc) | Hands-on execution expert | Is this feasible? Can I actually answer this question with the available data/tools/time? |
Type C (Tool/Software) — different agent composition:
| Agent | Role | What They Optimize For |
|---|---|---|
| Target User Representative | A domain practitioner who would use this tool | Does this solve a REAL pain point? Would I switch from my current tool to this? |
| Journal/Conference Editor | Venue gatekeeper (same as above) | Does the tool contribution meet [venue]'s bar? Is it more than engineering? |
| Software Architect | Experienced developer of research tools | Is the architecture sound? Can this scale? Is the API well-designed? How does this compare to existing tools technically? |
Prepare the context package (shared by all agents):
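The exact packaging format is not prescribed by this skill; below is a minimal YAML sketch of what the shared context package might look like, with field names chosen for illustration — the required ingredients (venue, research type, question, literature and gap summaries, resource inventory) are taken from the agent prompts that follow.

```yaml
# Illustrative sketch only — field names are assumptions, not a required schema.
# The content mirrors what the three agent prompts below expect to receive.
context_package:
  target_venue: "[venue]"
  research_type: "[M / D / H / C]"
  research_question: "[current formulation under deliberation]"
  literature_review_summary: "[condensed Phase 1 literature review]"
  gap_analysis_summary: "[condensed Phase 1 gap analysis]"
  available_resources:
    data: "[datasets, sizes, quality notes]"
    compute_and_tools: "[compute budget, key software]"
    timeline: "[realistic working window]"
```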
Agent 1 — Senior Professor:
Call Task tool with:
description: "Senior Professor — evaluate research question"
prompt: |
SHARED VALUES:
Target venue: [venue]. Research type: [type].
Optimization target: "Is this question worth publishing at [venue]?"
Scoring: PASS / CONDITIONAL / FAIL
You are a tenured full professor at a top-5 university in [FIELD],
with 20+ years of active publication. You have served on hundreds
of thesis committees and as area chair at [venue] or equivalent.
You have an instinct for which questions lead to impactful papers
and which lead to dead ends.
RESEARCH QUESTION: "[question]"
RESEARCH TYPE: [type]
CONTEXT:
- Literature review: [paste summary]
- Gap analysis: [paste summary]
- Available data/resources: [paste summary]
Evaluate this research question from your perspective:
1. SCIENTIFIC SIGNIFICANCE: Does this question address a genuine
gap in the field? Would answering it change how people think
about [subfield]? Or is it incremental/confirmatory?
2. NOVELTY: What NEW knowledge will exist after this project that
didn't exist before? Be specific. "We confirm known results with
a new tool" is NOT novel.
3. STRATEGIC POSITIONING: Where does this sit in the field's trajectory?
Is the timing right? Will this still matter in 12 months?
4. WORTH-DOING TEST: Is this question worth serious research investment?
If not, why not? What related question WOULD be worth pursuing?
5. SUGGESTED IMPROVEMENTS: If the question is weak, propose 2-3
alternative formulations that would be stronger. Be specific.
6. VERDICT: PASS / CONDITIONAL / FAIL
If CONDITIONAL: what SPECIFIC changes would make this a PASS?
If FAIL: what is fundamentally wrong?
subagent_type: "generalPurpose"
Agent 2 — Journal/Conference Editor:
Call Task tool with:
description: "Journal Editor — evaluate research question"
prompt: |
SHARED VALUES:
Target venue: [venue]. Research type: [type].
Optimization target: "Would I send this to reviewers at [venue]?"
Scoring: PASS / CONDITIONAL / FAIL
You are an associate editor at [TARGET VENUE]. You handle 200+
submissions per year and desk-reject ~40% before they reach
reviewers. You know exactly what [venue]'s reviewers expect
and what makes them excited vs. dismissive.
RESEARCH QUESTION: "[question]"
CONTEXT:
- Literature review: [paste summary]
- Gap analysis: [paste summary]
Evaluate from your editorial perspective:
1. DESK-REJECT TEST: Based on the research question alone, would
you send this to reviewers or desk-reject? Why?
2. VENUE FIT: Does this question match [venue]'s scope, impact
expectations, and recent publication trends? Cite specific
recent papers at this venue for comparison.
3. REVIEWER PREDICTION: What will Reviewer 2 complain about?
What's the most likely "fatal flaw" a reviewer would identify?
4. COMPETITIVE LANDSCAPE: How many similar papers has [venue]
published in the last 2 years? Is the community saturated
with this topic?
5. SUGGESTED IMPROVEMENTS: How could the question be reframed to
better fit [venue]? Would a different venue be more appropriate?
6. VERDICT: PASS / CONDITIONAL / FAIL
subagent_type: "generalPurpose"
Agent 3 — Research Scientist (PhD/Postdoc):
Call Task tool with:
description: "Research Scientist — evaluate feasibility"
prompt: |
SHARED VALUES:
Target venue: [venue]. Research type: [type].
Optimization target: "Can I actually answer this question rigorously?"
Scoring: PASS / CONDITIONAL / FAIL
You are a senior PhD student or postdoc who has actually conducted
research in [FIELD]. You have hands-on experience with the tools,
datasets, and methods involved. You know the difference between
a question that SOUNDS good in a proposal and one that can
actually be ANSWERED with available resources.
RESEARCH QUESTION: "[question]"
CONTEXT:
- Available data: [paste data description]
- Available compute/tools: [paste resource summary]
- Research type: [type]
Evaluate from your execution perspective:
1. FEASIBILITY: Can this question be answered with the available
data, tools, and compute? What's missing?
2. METHODOLOGY GAP: Do we have the right methods to answer this?
Are there methodological challenges the question glosses over?
3. DATA SUFFICIENCY: Is the data adequate in quantity, quality,
and diversity to support the claims this question implies?
4. TIMELINE REALISM: How long would this take to execute properly?
Is the scope reasonable?
5. HIDDEN TRAPS: What practical pitfalls do you foresee that the
professor and editor might miss? (e.g., data quality issues,
confounders, computational bottlenecks, reproducibility concerns)
6. SUGGESTED IMPROVEMENTS: How could the question be scoped better
for feasible execution?
7. VERDICT: PASS / CONDITIONAL / FAIL
subagent_type: "generalPurpose"
REQUIRED SUB-SKILL: Follow the amplify:multi-round-deliberation protocol.
After Round 1 agents return:
Convergence criteria: all three agents return PASS, or any CONDITIONAL verdicts have had their stated conditions addressed in a subsequent round.
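A lightweight way to track this across rounds is a per-round verdict record. The sketch below is illustrative only — the PASS/CONDITIONAL/FAIL values and the 5-round cap come from this skill, while the field names are assumptions.

```yaml
# Illustrative per-round record (not a mandated format).
deliberation_log:
  round: 2                          # capped at 5 rounds
  question: "[formulation evaluated this round]"
  verdicts:
    senior_professor: PASS
    journal_editor: CONDITIONAL     # condition: "[specific change required for PASS]"
    research_scientist: PASS
  converged: false                  # true only when all three verdicts are PASS
  next_action: "revise the question to address the editor's condition"
```

For Type C projects, substitute the Type C agent roles from the table above.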
Non-convergence or persistent failure — Escalation to User:
If the agents still have not reached consensus after 3+ rounds, or if fundamental problems persist (e.g., "not novel", "desk-reject", "infeasible"), present the user with explicit options:
⚠️ RESEARCH QUESTION ESCALATION
═══════════════════════════════════
After [N] rounds of deliberation, the research question has not achieved
full consensus. Here is the situation:
Current question: "[question]"
Agent verdicts: Professor [X], Editor [X], Scientist [X]
Core problem: [summarize the persistent issue]
YOUR OPTIONS:
1. 🔄 RETURN TO PHASE 1 — Go back to direction exploration. The current
direction may not support a strong research question. We can explore
alternative directions or sub-directions.
2. 🔧 CONTINUE REFINING — Stay in Phase 2 and try a different angle on
the question. I'll propose new formulations based on the agents'
suggestions:
[list 2-3 specific alternative questions]
3. ✅ PROCEED AS-IS — Accept the current question with its known
limitations. The paper may target a lower-tier venue or serve
as a technical report / preliminary study.
[State what the realistic outcome would be]
4. ⏸️ PAUSE — Take time to think. You can come back with a new
question or additional context.
Which option do you prefer?
This escalation is MANDATORY when:
If user chooses "Return to Phase 1":
If user chooses "Proceed as-is":
Record the decision and its known limitations in research-anchor.yaml.
Special rules:
| Answer pattern | Verdict |
|---|---|
| "We provide a complete evidence chain" | VAGUE — evidence chain of WHAT? Push for specifics. |
| "Our analysis is reproducible" | NOT A CONTRIBUTION — reproducibility is expected, not novel. |
| "We use standard tools on standard data" | NOT RESEARCH — this is an exercise. |
| "We fill the gap of discovery-oriented analysis" | VAGUE — what specific discovery? |
| "Our contribution is a template for others" | NOT NOVEL — a template is not a scientific finding. |
| "We show results consistent with known biology" | CONFIRMING KNOWN FACTS is not a discovery. |
If the final question still matches these patterns after deliberation, it needs fundamental rethinking. Tell the user. </IRON-LAW>
Phase 2 — Research Question Deliberation Results:
══════════════════════════════════════════════════
Rounds completed: [N] / max 5
FINAL RESEARCH QUESTION:
"[refined question]"
Agent verdicts:
Senior Professor: [PASS/COND/FAIL] — [1-line summary]
Journal Editor: [PASS/COND/FAIL] — [1-line summary]
Research Scientist: [PASS/COND/FAIL] — [1-line summary]
Key improvements from deliberation:
Round 1 → 2: [what changed and why]
Round 2 → 3: [what changed and why]
...
Remaining concerns (if any):
⚠️ [concern] — [which agent, what they recommend]
Original question: "[original]"
Final question: "[refined]"
Explicitly classify the project. Ask the user to confirm ONE primary intent:
Write the confirmed value to research-anchor.yaml field project_intent. If the user selects multiple, require a primary and mark others as secondary.
Re-evaluate the target venue chosen in Phase 1 against the validated problem:
If novelty or scope is insufficient for the original target → recommend a specific downgrade with justification (e.g., "Workshop paper at X" or "Tier-2 venue Y"). Record the decision in research-anchor.yaml field venue_target.
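A sketch of how the two fields might appear in research-anchor.yaml — the field names project_intent and venue_target are specified above, while the example values and surrounding structure are illustrative assumptions, not a fixed schema.

```yaml
# research-anchor.yaml (excerpt) — values and nesting shown here are illustrative.
project_intent: "discovery"            # confirmed PRIMARY intent
secondary_intents: ["method"]          # optional; only if the user named secondary intents
venue_target:
  venue: "[confirmed or downgraded venue]"
  decision: "downgrade"                # keep / downgrade, per the re-evaluation above
  justification: "[why the validated problem fits this venue's bar]"
```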
ALL items must be ✅ to proceed:
Present this checklist to the user. Any ✗ → resolve before advancing.
| Excuse | Reality |
|---|---|
| "The problem is obviously important" | Obvious to you ≠ obvious to reviewers. Articulate the importance with evidence. |
| "We already discussed this in Phase 1" | Phase 1 explored. Phase 2 attacks. Different purpose. Answer the challenges. |
| "Adversarial questioning slows us down" | Discovering a fatal flaw in Phase 2 saves months. Answer now. |
| "The venue choice is fine" | Fine is not justified. Compare against the venue's recent acceptance profile. |
| "I can address reviewer concerns later" | If you can't answer now, you can't answer in a rebuttal either. |
| "Multiple intents make the paper stronger" | Unfocused contributions get rejected. Pick one primary intent. |
| "Resources will probably work out" | Probably ≠ verified. Check compute, data, and timeline concretely. |
Untested conviction is not research confidence.
Survive the defense before you build the method.
Gate passed → invoke method-framework-design skill for Phase 3. Any item unresolved → resolve with user before proceeding.
After presenting the Final Feasibility Assessment checklist, END YOUR RESPONSE IMMEDIATELY.
Do NOT invoke method-framework-design or any other skill in this same response.
Do NOT say "let me proceed to Phase 3."
Do NOT begin method or analysis design.
STOP. WAIT. The user must reply with approval before you do anything else.
Your final output for this phase should be the feasibility checklist followed by: "Phase 2 validation complete. Do you approve proceeding to Phase 3 (Method/Analysis Design)?"
Then STOP. </IRON-LAW>