Activate when the user needs to analyze qualitative data — interview transcripts, field notes, or open-ended survey responses. Handles structured summarization, thematic coding, cross-case analysis, theme matrices, and evidence retrieval. Designed to solve the context-window problem: generates compact summaries first, then works from summaries instead of full transcripts. Only loads full text when specific quotes are needed. Supports Gioia, Mayring, Grounded Theory, and general thematic analysis workflows.
**Orchestration Log:** When this skill is activated, append a log entry to `outputs/orchestration_log.md`:

### Skill Activation: Qualitative Engine
**Timestamp:** [current date/time]
**Actor:** AI Agent (qualitative-engine)
**Input:** [brief description of the analysis request]
**Output:** [brief description — e.g., "23 interviews summarized, 14 first-order codes identified"]
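A minimal sketch of the append step, assuming Python; `append_log_entry` is a hypothetical helper name — the skill specifies only the entry format, not a function:

```python
from datetime import datetime
from pathlib import Path

def append_log_entry(input_desc: str, output_desc: str,
                     log_path: str = "outputs/orchestration_log.md") -> str:
    """Append a skill-activation entry in the format above and return it."""
    entry = (
        "\n### Skill Activation: Qualitative Engine\n"
        f"**Timestamp:** {datetime.now():%Y-%m-%d %H:%M}\n"
        "**Actor:** AI Agent (qualitative-engine)\n"
        f"**Input:** {input_desc}\n"
        f"**Output:** {output_desc}\n"
    )
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create outputs/ if missing
    with path.open("a", encoding="utf-8") as f:     # append, never overwrite
        f.write(entry)
    return entry
```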
CRITICAL: Qualitative data (interview transcripts) can easily overflow the context window. Follow this strict protocol:
The `scripts/process_interviews.py` script provides:

```python
from scripts.process_interviews import (
    load_interviews,    # Read all .md files from interviews/
    build_index,        # Generate INDEX.md with metadata
    chunk_interview,    # Split long transcripts into chunks
    search_interviews,  # Keyword search across all interviews
    save_index,         # Save index file
    save_summary,       # Save individual summaries
)
```
Transform each full transcript into a compact structured summary (~300 words) that preserves analytical value while reducing context consumption by 80-90%.
```python
interviews = load_interviews("interviews/")
index = build_index(interviews)
save_index(index, "interviews/INDEX.md")
```
For EACH interview, read the full transcript and produce a summary in this exact format:
# Summary: [Interviewee Name / Title]
**Date:** [date] | **Role:** [professional role] | **Organization:** [org] | **Duration:** [if available]
## Context
[1-2 sentences: Who is this person? Why were they interviewed? What is their relevance?]
## Key Statements (verbatim quotes)
1. "[Direct quote — max 2 sentences]" — on [topic]
2. "[Direct quote — max 2 sentences]" — on [topic]
3. "[Direct quote — max 2 sentences]" — on [topic]
[3-5 quotes that capture the most analytically valuable statements]
## Core Themes Discussed
- **[Theme A]:** [2-3 sentence summary of their position/experience]
- **[Theme B]:** [2-3 sentence summary]
- **[Theme C]:** [2-3 sentence summary]
## Unique Insights
[1-2 sentences: What does this interviewee say that NO other interviewee says?
What is their unique contribution to the data set?]
## Relevance to Research Questions
- **RQ1:** [How does this interview inform RQ1? One sentence.]
- **RQ2:** [How does this interview inform RQ2? One sentence.]
- **RQ3:** [How does this interview inform RQ3? One sentence.]
[Adapt RQs from framing.md]
```python
save_summary(summary_text, "interviews/summaries/[filename]_summary.md")
```
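The summary itself is produced by the model, but the bookkeeping around it can be sketched. `needs_chunking` and `summary_path` are hypothetical helpers, and the 3,000-word chunking threshold is an assumption:

```python
from pathlib import Path

CHUNK_THRESHOLD = 3000  # words; assumed cutoff for routing through chunk_interview

def needs_chunking(text: str, threshold: int = CHUNK_THRESHOLD) -> bool:
    """Long transcripts should be split with chunk_interview before summarizing."""
    return len(text.split()) > threshold

def summary_path(transcript_file: str,
                 out_dir: str = "interviews/summaries") -> str:
    """Map interviews/foo.md -> interviews/summaries/foo_summary.md."""
    return str(Path(out_dir) / f"{Path(transcript_file).stem}_summary.md")
```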
Identify empirical codes grounded in the data — what interviewees actually say.
Load ALL summary files (not full transcripts):
interviews/summaries/*.md
# Codebook v1 — First-Order Codes
**Date:** [date]
**Interviews coded:** [N]
**Total codes:** [N]
| Code ID | Code Label | Description | Example Quote | Frequency |
|---------|-----------|-------------|---------------|-----------|
| C01 | [label] | [what this code captures] | "[short quote]" — [interviewee] | [N interviews] |
| C02 | [label] | [what this code captures] | "[short quote]" — [interviewee] | [N interviews] |
| ... | | | | |
Save to: outputs/codebook_v1.md
| Code | Interview 1 | Interview 2 | Interview 3 | ... | Total |
|------|------------|------------|------------|-----|-------|
| C01 | ✓ | ✓ | | ... | N |
| C02 | | ✓ | ✓ | ... | N |
Save to: outputs/code_matrix.md
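One way to render the presence matrix programmatically; `code_matrix_md` is a hypothetical helper, not part of `scripts/process_interviews.py`:

```python
def code_matrix_md(codes: dict[str, set[str]], interviews: list[str]) -> str:
    """Render the code-by-interview matrix as a markdown table.

    `codes` maps each code ID to the set of interview names it appears in.
    """
    header = "| Code | " + " | ".join(interviews) + " | Total |"
    sep = "|------|" + "------------|" * len(interviews) + "-------|"
    rows = []
    for code_id in sorted(codes):
        present = codes[code_id]
        cells = ["✓" if iv in present else "" for iv in interviews]
        rows.append(f"| {code_id} | " + " | ".join(cells) + f" | {len(present)} |")
    return "\n".join([header, sep] + rows)
```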
Group first-order codes into higher-level analytical themes.
If using the Gioia methodology, produce the three-level data structure:
# Gioia Data Structure
| First-Order Codes (Informant) | Second-Order Themes (Researcher) | Aggregate Dimensions |
|------------------------------|----------------------------------|---------------------|
| "We just got the tool and figured it out" | Ad-hoc AI adoption | **Unstructured Implementation** |
| "Nobody trained us on how to use it" | Missing capability building | |
| "We spent 2 hours on what used to take 2 days" | Efficiency gains from AI | **Process Transformation** |
| "The routine work basically disappeared" | Routine task elimination | |
| ... | | |
Save to: outputs/gioia_data_structure.md
# Theme Map
## Theme 1: [Name]
**Definition:** [What this theme captures]
**Codes included:** C01, C05, C12
**Prevalence:** [N] of [N] interviews
**Key insight:** [1 sentence]
## Theme 2: [Name]
...
Save to: outputs/theme_map.md
# Category System
## Main Category 1: [Name]
### Sub-Category 1.1: [Name]
**Definition:** [precise definition]
**Anchor example:** "[quote]" — [interviewee]
**Coding rule:** [when to apply this code]
### Sub-Category 1.2: [Name]
...
Save to: outputs/category_system.md
Systematic comparison across interviews to identify patterns and outliers.
# Cross-Case Analysis
| Interviewee | Role | Theme 1 | Theme 2 | Theme 3 | Theme 4 | Theme 5 |
|------------|------|---------|---------|---------|---------|---------|
| [Name 1] | [Role] | Strong | Moderate | Absent | Strong | Weak |
| [Name 2] | [Role] | Weak | Strong | Strong | Absent | Moderate |
| ... | | | | | | |
## Pattern Analysis
- **Universal themes** (present in >80% of interviews): [list]
- **Majority themes** (50-80%): [list]
- **Minority/emerging themes** (20-50%): [list]
- **Outlier insights** (<20%, but analytically important): [list]
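The prevalence bands above can be sketched as a small classifier; the exact boundary handling (e.g. whether exactly 80% counts as majority) is an assumption, since the bands overlap at their edges:

```python
def prevalence_band(count: int, total: int) -> str:
    """Classify a theme by the share of interviews it appears in."""
    share = count / total
    if share > 0.8:
        return "universal"
    if share >= 0.5:
        return "majority"
    if share >= 0.2:
        return "minority/emerging"
    return "outlier"
```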
## Role-Based Patterns
- **Analysts** tend to emphasize: [themes]
- **Senior management** tends to emphasize: [themes]
- **Technology roles** tend to emphasize: [themes]
## Contradictions and Tensions
- [Theme X] vs. [Theme Y]: [describe the tension and which interviewees represent each side]
Save to: outputs/cross_case_analysis.md
Go back to FULL transcripts to extract verbatim quotes for specific themes.
```python
results = search_interviews(interviews, ["AI transformation", "copilot", "tool adoption"])
for r in results[:10]:
    print(f"[{r['interview']}]: ...{r['context']}...")
```
For each theme, select 2-3 quotes that are representative (capture the majority view), contrasting (show an alternative perspective), or rich (connect to a theoretical concept):
# Evidence Table
## Theme 1: [Name]
| Quote | Interviewee | Role | Relevance |
|-------|------------|------|-----------|
| "[Verbatim quote]" | [Name] | [Role] | Representative — captures majority view on [topic] |
| "[Verbatim quote]" | [Name] | [Role] | Contrasting — shows alternative perspective |
| "[Verbatim quote]" | [Name] | [Role] | Rich — connects to [theoretical concept] |
## Theme 2: [Name]
...
Save to: outputs/evidence_table.md
When the writing-engine needs qualitative findings, provide:
## Findings
### [Theme 1 Name]
[Interpretive paragraph linking theme to research question. 2-3 sentences.]
Interviewees consistently described [pattern]. As [Interviewee Name], a [role], explained:
> "[Verbatim quote — 1-3 sentences]"
This pattern was echoed across [N] of [N] interviews, particularly among [role group].
[Additional interpretation, 2-3 sentences.]
### [Theme 2 Name]
...
Before presenting qualitative analysis to the user, verify:
- All summaries saved (interviews/summaries/)