An assessment engineering skill used by the quiz-maker agent. Provides item type design guides, distractor psychology, rubric construction, and formative/summative assessment strategies. Used for 'quiz design,' 'assessment items,' 'rubrics,' 'exam creation,' and related topics.
Expert knowledge used by the quiz-maker agent when designing formative and summative assessments.
Good assessment accurately measures learner understanding while simultaneously reinforcing learning. Poor assessment rewards memorization; good assessment rewards understanding.
| Element | Rule |
|---|---|
| Stem | Must be a clear question when read independently |
| Key (Correct Answer) | Must be unambiguously the best answer |
| Distractors | Plausible but clearly incorrect (see below) |
| Number of Options | 4 (5 adds meaningless wrong answers; 3 gives a 33% guess rate) |
| Option Length | Keep similar length (longest option being correct is a clue leak) |
| Distractor Type | Purpose | Example (Q: "Which HTTP method deletes a resource in REST?") |
|---|---|---|
| Common Misconception | Diagnose incorrect knowledge | "POST" (confusing create and delete) |
| Partial Answer | Diagnose superficial understanding | "PUT" (confusing update and delete) |
| Related Concept | Diagnose concept discrimination | "PATCH" (partial update, similar concept) |
| Correct: | "DELETE" |
[Scenario Setup]
Write a function that meets the following requirements.
[Input/Output Specification]
- Input: [Type and range]
- Output: [Type and conditions]
- Constraints: [Time/space complexity, restrictions]
[Example]
Input: [Example input]
Output: [Example output]
[Grading Criteria]
- [ ] Correctness: Passes all test cases
- [ ] Code Quality: Readability, naming
- [ ] Efficiency: Time/space complexity
[Scenario]
You are in the following situation. [Concrete situation description]
[Question]
What is the most appropriate [action/design/solution] in this situation?
[Evaluation Perspective]
- Problem identification ability
- Solution strategy appropriateness
- Trade-off awareness
| Attribute | Formative | Summative |
|---|---|---|
| Timing | During learning (every lesson) | After learning (end of module) |
| Purpose | Check understanding, immediate correction | Measure achievement, assign grades |
| Atmosphere | Low-stakes, practice | Formal, final |
| Item Count | 3-5 | 10-20 |
| Feedback | Immediate, detailed | Post-completion, aggregate |
| Retakes | Unlimited | 1-2 |
| Bloom's Level | 1-3 (Remember, Understand, Apply) | 3-6 (Apply through Create) |
| Criterion | Excellent (4) | Good (3) | Satisfactory (2) | Needs Improvement (1) |
|---|---|---|---|---|
| Functionality | All requirements met | 90% met | 70% met | Below 50% |
| Code Quality | Clean, consistent style | Mostly clean | Some inconsistency | Low readability |
| Error Handling | All edge cases | Major cases | Basic cases only | No error handling |
| Documentation | Detailed comments + README | Key function comments | Minimal comments | No comments |
E (Evidence): "Your code lacks error handling"
E (Effect): "In this case, the program will crash if the user enters an empty value"
C (Change): "Wrap it in a try-except block and show the user a guidance message"