Designs and orchestrates a realistic interview simulation platform with voice AI, whiteboard evaluation, gaze-tracking proctoring, and mobile spaced repetition. Use for building mock interview infrastructure, configuring sessions, and adaptive difficulty. Activate on "interview simulator", "mock interview", "practice session", "voice mock". NOT for individual round-type coaching, resume writing, or prep timeline coordination.
Platform architecture and coaching system for realistic mock interview practice. This skill serves two purposes: (1) it coaches candidates on how to structure effective practice sessions, and (2) it specifies the full-stack architecture for building an automated interview simulation platform with voice AI, collaborative whiteboard, gaze-tracking proctoring, and mobile companion.
The other 7 interview skills define WHAT to practice. This skill defines HOW to practice it -- with realistic conditions, adaptive difficulty, and measurable progress.
Use for:
- Building mock interview infrastructure
- Configuring and running practice sessions
- Tuning adaptive difficulty

NOT for:
- Individual round-type coaching (interview-loop-strategist)
- Resume writing (cv-creator or career-biographer)
- Prep timeline coordination

```mermaid
graph TB
    subgraph Client["Client Layer"]
        MOBILE["Mobile App<br/>React Native + Expo<br/>Flash cards, voice drills,<br/>progress dashboard"]
        DESKTOP["Desktop Web<br/>Next.js<br/>Full sessions, whiteboard,<br/>proctoring"]
    end
    subgraph Engines["Engine Layer"]
        VOICE["Voice Engine<br/>Hume AI EVI<br/>Emotion-sensitive<br/>interviewer voice"]
        BOARD["Whiteboard Engine<br/>tldraw + Claude Vision<br/>Diagram evaluation<br/>and scoring"]
        PROCTOR["Proctor Engine<br/>MediaPipe Face Mesh<br/>Gaze tracking,<br/>attention monitoring"]
    end
    subgraph Orchestrator["Session Orchestrator — Node.js"]
        ROUND["Round Selector<br/>Weakness-weighted<br/>random selection"]
        ADAPT["Adaptive Difficulty<br/>Performance-based<br/>question scaling"]
        DEBRIEF["Debrief Generator<br/>Transcript + emotion +<br/>proctor + whiteboard<br/>scored rubric"]
        SM2["SM-2 Scheduler<br/>Spaced repetition<br/>for concepts and stories"]
    end
    subgraph Data["Data Layer — Supabase"]
        SESSIONS[("sessions<br/>recordings, transcripts")]
        SCORES[("scores<br/>per-dimension breakdowns")]
        STORIES[("story_bank<br/>STAR-L entries")]
        CARDS[("flash_cards<br/>SM-2 intervals")]
    end
    MOBILE --> Orchestrator
    DESKTOP --> Orchestrator
    Orchestrator --> VOICE
    Orchestrator --> BOARD
    Orchestrator --> PROCTOR
    Orchestrator --> Data
    VOICE --> DEBRIEF
    BOARD --> DEBRIEF
    PROCTOR --> DEBRIEF
```
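The Round Selector's weakness-weighted draw can be sketched as a weighted random pick, where a round type's weight grows with the candidate's tracked weakness in it. The weakness record's shape and the weighting formula here are illustrative assumptions, not part of the spec:

```typescript
// Weakness-weighted round selection: round types the candidate scores
// poorly on are drawn more often, but no round ever drops out of rotation.
type RoundType =
  | "Coding" | "ML Design" | "Behavioral"
  | "Tech Presentation" | "HM" | "Technical Deep Dive";

// weakness: 0 (strong) .. 1 (weak), e.g. 1 - rollingAvgScore / maxScore.
// rand is injectable so the selection is testable; defaults to Math.random.
function selectRound(
  weakness: Record<RoundType, number>,
  rand: () => number = Math.random,
): RoundType {
  const entries = Object.entries(weakness) as [RoundType, number][];
  // Base weight 1 keeps every round reachable; weakness up-weights by up to 4x.
  const weights = entries.map(([, w]) => 1 + 3 * w);
  const total = weights.reduce((a, b) => a + b, 0);
  let r = rand() * total;
  for (let i = 0; i < entries.length; i++) {
    r -= weights[i];
    if (r <= 0) return entries[i][0];
  }
  return entries[entries.length - 1][0]; // guard against float drift
}
```

The "1 + 3w" weighting is one reasonable curve; any monotone function of weakness works, as long as strong areas keep a nonzero floor so they still get occasional reinforcement.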
```mermaid
flowchart TD
    V{Voice AI?}
    V -->|"Emotion detection needed"| HUME["Hume AI EVI<br/>Emotion callbacks,<br/>adaptive persona,<br/>WebSocket streaming"]
    V -->|"Voice only, no emotion"| ELEVEN["ElevenLabs<br/>Fallback: high-quality<br/>TTS, no affect reading"]
    V -->|"Cost-constrained"| OPENAI_RT["OpenAI Realtime API<br/>Cheaper per minute,<br/>no emotion detection"]
    W{Whiteboard?}
    W -->|"React ecosystem, extensible"| TLDRAW["tldraw<br/>MIT license, React native,<br/>rich API, snapshot export"]
    W -->|"Simpler, self-hosted"| EXCALI["Excalidraw<br/>Good but harder to<br/>integrate programmatic<br/>screenshot capture"]
    P{Proctoring?}
    P -->|"Privacy-first, free"| MEDIAPIPE["MediaPipe Face Mesh<br/>Browser-based, 468 landmarks,<br/>iris tracking, no cloud"]
    P -->|"Commercial accuracy"| COMMERCIAL["Commercial proctoring<br/>Expensive, privacy concerns,<br/>overkill for self-practice"]
    style HUME fill:#2d5016,stroke:#333,color:#fff
    style TLDRAW fill:#2d5016,stroke:#333,color:#fff
    style MEDIAPIPE fill:#2d5016,stroke:#333,color:#fff
```
Why Hume over OpenAI Realtime API: Hume's EVI provides emotion callbacks (nervousness, confidence, hesitation) that enable adaptive interviewer behavior. OpenAI's Realtime API is voice-only with no affect detection. For interview simulation, emotion awareness is the differentiator -- a real interviewer adjusts based on your emotional state.
Why tldraw over Excalidraw: tldraw is a React component with a rich programmatic API. You can call editor.getSnapshot() to capture the canvas state, export to image, and send to Claude Vision for evaluation. Excalidraw's API is more limited for programmatic interaction.
Why MediaPipe over commercial proctoring: This is self-practice, not exam proctoring. MediaPipe runs entirely in the browser (no cloud), processes 468 face landmarks including iris position for gaze estimation, and costs nothing. Commercial proctoring (ProctorU, ExamSoft) is designed for adversarial exam settings with privacy trade-offs that make no sense for personal practice.
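A minimal sketch of how the gaze flag could work: given iris and eye-corner positions from Face Mesh, the iris center's horizontal position between the corners gives a rough gaze ratio, and sustained deviation (e.g. glancing at a second monitor) raises a flag. The landmark plumbing is omitted, and the thresholds here are illustrative assumptions, not calibrated values:

```typescript
// Rough horizontal gaze estimate from face landmarks: 0.5 means the iris
// sits centered between the eye corners; values near 0 or 1 mean the eye
// is turned hard toward one corner.
function gazeRatio(irisX: number, innerCornerX: number, outerCornerX: number): number {
  const span = outerCornerX - innerCornerX;
  if (span === 0) return 0.5; // degenerate frame; treat as centered
  return (irisX - innerCornerX) / span;
}

// Flag a single frame as off-screen. The orchestrator should only report
// sustained runs of flagged frames, so blinks and glances don't trigger
// false positives. The 0.35/0.65 band is an assumed threshold.
function isOffScreen(ratio: number): boolean {
  return ratio < 0.35 || ratio > 0.65;
}
```

In practice you would smooth the ratio over a sliding window of frames and average both eyes before flagging, which this sketch leaves out.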
```mermaid
sequenceDiagram
    participant U as User
    participant O as Orchestrator
    participant V as Voice Engine
    participant W as Whiteboard
    participant P as Proctor
    participant D as Debrief
    U->>O: Start session
    O->>O: Select round type<br/>(weakness-weighted)
    O->>U: Confirm: ML Design, Difficulty 3/5,<br/>Persona: Collaborative
    U->>O: Accept / override
    O->>V: Initialize interviewer persona
    O->>P: Activate gaze tracking
    alt Design or Coding Round
        O->>W: Open whiteboard
    end
    loop During Session (30-45 min)
        V->>U: Ask question / follow-up
        U->>V: Respond (voice)
        V->>O: Emotion data (confidence, hesitation)
        O->>V: Adjust difficulty / tone
        P->>O: Gaze flags (second monitor, notes)
        alt Design Round
            W-->>O: Periodic screenshot (every 30s active)
            O-->>W: Evaluate diagram (Claude Vision)
        end
    end
    U->>O: End session
    O->>D: Compile transcript + emotion<br/>timeline + proctor flags +<br/>whiteboard evaluations
    D->>U: Scored debrief with<br/>strengths, weaknesses,<br/>specific improvement actions
    O->>O: Update weakness tracker,<br/>adjust next session focus
```
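The SM-2 Scheduler named in the orchestrator follows the standard SuperMemo-2 algorithm: each review grades recall 0-5, the ease factor moves with the grade (floored at 1.3), and successful intervals run 1 day, 6 days, then previous interval times ease factor. A compact sketch (the `CardState` shape is an assumption; the formula is SM-2's):

```typescript
interface CardState {
  easeFactor: number;  // starts at 2.5, never drops below 1.3
  interval: number;    // days until the next review
  repetitions: number; // consecutive successful reviews
}

// Standard SM-2 update. quality is the recall grade: 0 (blackout) to 5 (perfect).
function sm2Review(card: CardState, quality: number): CardState {
  const easeFactor = Math.max(
    1.3,
    card.easeFactor + (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)),
  );
  if (quality < 3) {
    // Failed recall: restart the repetition sequence, keep the lowered ease.
    return { easeFactor, interval: 1, repetitions: 0 };
  }
  const repetitions = card.repetitions + 1;
  const interval =
    repetitions === 1 ? 1 :
    repetitions === 2 ? 6 :
    Math.round(card.interval * easeFactor);
  return { easeFactor, interval, repetitions };
}
```

The same update works for both concept flash cards and STAR-L stories: a stumbling retelling of a story is just a low-quality grade, which pulls the story back into tomorrow's queue.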
| Parameter | Options | Default |
|---|---|---|
| Round type | Coding, ML Design, Behavioral, Tech Presentation, HM, Technical Deep Dive | Auto (weakness-weighted) |
| Difficulty | 1 (warm-up) to 5 (adversarial) | 3 |
| Interviewer persona | Friendly, Neutral, Adversarial, Socratic | Neutral |
| Proctor strictness | Off, Training (lenient), Simulation (strict) | Training |
| Session length | 15 / 30 / 45 / 60 min | 45 min |
| Whiteboard | On / Off | Auto (on for design rounds) |
| Recording | Audio only / Audio + Video / Off | Audio only |
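The parameter table above maps directly onto a session configuration object. A sketch with the listed defaults (the type and field names are assumptions; the options and defaults come straight from the table):

```typescript
type RoundType =
  | "Coding" | "ML Design" | "Behavioral"
  | "Tech Presentation" | "HM" | "Technical Deep Dive";

interface SessionConfig {
  roundType: RoundType | "auto";           // "auto" = weakness-weighted selection
  difficulty: 1 | 2 | 3 | 4 | 5;           // 1 warm-up .. 5 adversarial
  persona: "Friendly" | "Neutral" | "Adversarial" | "Socratic";
  proctor: "off" | "training" | "simulation"; // training = lenient, simulation = strict
  lengthMinutes: 15 | 30 | 45 | 60;
  whiteboard: boolean | "auto";            // "auto" = on for design rounds
  recording: "audio" | "audio+video" | "off";
}

const defaultConfig: SessionConfig = {
  roundType: "auto",
  difficulty: 3,
  persona: "Neutral",
  proctor: "training",
  lengthMinutes: 45,
  whiteboard: "auto",
  recording: "audio",
};
```

Encoding the options as union types rather than free strings means an invalid session request fails at the type level, before the orchestrator ever spins up an engine.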