Open coding is the first major analytic move in classic grounded theory: you fracture qualitative data into incidents and label them with codes that stand for conceptual meanings. The goal is not to summarize paragraphs but to generate concepts that can be compared, refined, and later integrated around an emergent core category.
Use this skill whenever you are beginning analysis, returning to fresh data after a break, or deliberately re-opening coding after discovering a better fit.
Purpose and outcomes
Purpose: Turn raw data into discrete conceptual indicators that can enter constant comparison.
Typical outputs:
A working code list (codes with short definitions and examples).
Annotated data (tags, margin notes, or software-linked segments).
Memos capturing hypotheses about meaning, relationships, and puzzles.
Early categories (higher-level groupings of codes with properties/dimensions beginning to appear).
Line-by-line and incident-to-incident technique
Line-by-line (dense passages)
For rich narrative segments, move line by line (or sentence by sentence) asking:
What is going on here? (substantive meaning)
What category(ies) does this indicate? (conceptual label)
What other incidents compare/contrast? (comparison cue)
Do not code “the whole paragraph” as one blob unless it truly expresses one incident.
Incident-to-incident comparison
An incident is a meaningful chunk of data that can be compared: an event, action, feeling, statement, turning point, or interactional move.
After coding an incident:
Ask: “What is this incident like? Unlike?”
Pull another incident (same interview, different interview, different source) and compare at the conceptual level.
Refine code definitions when comparisons reveal properties (attributes) and dimensions (ranges).
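The comparison steps above can be sketched in a few lines of Python. This is only an illustration of how one might queue up incident pairs for comparison, not a substitute for the analytic judgment involved. The Incident class, the ID scheme, and the sample incidents are all hypothetical and invented for the sketch.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Incident:
    incident_id: str  # hypothetical ID scheme, e.g. "INT01-03"
    source: str       # interview or document the incident came from
    text: str
    code: str         # provisional code currently attached to the incident


def comparison_pairs(incidents, code):
    """Return every pair of incidents sharing `code`, cross-source pairs first."""
    same_code = [i for i in incidents if i.code == code]
    pairs = list(combinations(same_code, 2))
    # sorted() is stable and False sorts before True, so pairs drawn from
    # different sources come ahead of pairs from the same source.
    return sorted(pairs, key=lambda p: p[0].source == p[1].source)


# Invented incidents, for illustration only.
incidents = [
    Incident("INT01-03", "INT01", "I waited until it looked real.", "delaying disclosure"),
    Incident("INT01-07", "INT01", "I kept it quiet for a month.", "delaying disclosure"),
    Incident("INT02-02", "INT02", "I told her only after the pilot.", "delaying disclosure"),
    Incident("INT02-05", "INT02", "I called it an experiment.", "framing as experiment"),
]

for a, b in comparison_pairs(incidents, "delaying disclosure"):
    print(f"{a.incident_id} vs {b.incident_id}: what is alike? what is unlike?")
```

Putting cross-source pairs first is a design choice for the sketch; the order in which you actually compare should follow whatever the data and your memos suggest.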
Types of codes you will use
Substantive codes
Conceptual labels derived from what the data is about. Examples (illustrative only): shielding credibility, deferring decisions, patching workflow. These should be abstract enough to travel across interviews, yet faithful to data.
In vivo codes
Use participants’ own terms when they condense meaning powerfully. In vivo codes keep you close to emic language while still enabling comparison.
Caution: Not every catchy phrase is conceptual. If an in vivo term is too idiosyncratic, translate it into a more general substantive code while preserving the participant language in a memo.
Gerund coding (“-ing”)
Glaser often recommends gerunds to keep process in view: managing time, containing conflict, signaling competence. This helps you see action/interaction rather than static nouns.
Provisional “hypothesis codes”
Early codes that may be wrong are fine. Classic GT expects code churn. Rename, split, and merge codes as comparison demands, and document each change in a memo.
Fracturing data (practical moves)
Break on shifts: new topic, new actor, new emotion, new tactic, new consequence.
Break on strong lines: especially vivid, repeated, or consequential moments.
Break on comparisons you can already imagine: “This sounds like X, but also unlike X because…”
Tag “don’t know yet”: if unsure, use a temporary label and memo the uncertainty.
Worked example (abbreviated)
Data excerpt (fictional):
“I didn’t tell my manager about the side project at first. I needed to see if it would even work. Once it looked real, I scheduled a short chat and framed it as ‘experiment’ not ‘commitment.’”
Comparison prompts:
Compare to another incident where the participant disclosed immediately. What conditions differ?
Compare to an incident of never disclosing. What consequences differ?
Memo seed:
“Hypothesis: disclosure timing is managed through viability thresholds + linguistic risk reduction. Need more negative cases where disclosure backfired.”
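One way to picture the fracturing step for this excerpt is the rough Python sketch below. It treats each sentence as one incident, which is a simplifying assumption (real incidents do not always align with sentence boundaries). The candidate codes are illustrative guesses chosen to echo the memo seed, not codes stated anywhere in the source.

```python
import re

excerpt = (
    "I didn't tell my manager about the side project at first. "
    "I needed to see if it would even work. "
    "Once it looked real, I scheduled a short chat and framed it as "
    "'experiment' not 'commitment.'"
)

# Naive fracturing: split wherever ., ! or ? is followed by whitespace.
incidents = [s.strip() for s in re.split(r"(?<=[.!?])\s+", excerpt) if s.strip()]

# Hypothetical candidate codes, one list per incident, in gerund form.
candidate_codes = [
    ["delaying disclosure"],
    ["testing viability"],
    ["timing disclosure", "framing as experiment"],
]

for text, codes in zip(incidents, candidate_codes):
    print(f"{' / '.join(codes)}: {text}")
```

The third sentence carries two candidate codes because it combines a timing decision with a framing move; whether those stay separate is something later comparison would settle.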
Output format (recommended)
1) Code entry template
Code name:
Definition (1–3 sentences):
Inclusion criteria:
Exclusion criteria (common confusions):
Example incidents (2–3 brief quotes with source IDs):
Related codes (merge/split notes):
Open questions:
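If you keep the code list in a script rather than a document, the template above could be mirrored as a small record type. The following is a minimal sketch assuming Python 3.9 or later. The class name CodeEntry, its field names, and the sample entry (including the source ID INT03 and the quote) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CodeEntry:
    name: str
    definition: str
    inclusion: str = ""
    exclusion: str = ""
    examples: list[tuple[str, str]] = field(default_factory=list)  # (source ID, quote)
    related: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the entry as plain text in the order of the template fields."""
        lines = [
            f"Code name: {self.name}",
            f"Definition: {self.definition}",
            f"Inclusion criteria: {self.inclusion}",
            f"Exclusion criteria: {self.exclusion}",
            "Example incidents:",
        ]
        lines += [f"  [{sid}] {quote}" for sid, quote in self.examples]
        lines.append("Related codes: " + "; ".join(self.related))
        lines.append("Open questions: " + "; ".join(self.open_questions))
        return "\n".join(lines)


# Invented entry, for illustration only.
entry = CodeEntry(
    name="deferring decisions",
    definition="Postponing a choice until more information is available.",
    inclusion="The participant names a decision and a reason for waiting.",
    exclusion="Forgetting or simple neglect.",
    examples=[("INT03", "I said I'd get back to them after the audit.")],
    related=["delaying disclosure (possible merge)"],
    open_questions=["Is deferring ever imposed by others?"],
)
print(entry.render())
```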
2) Session log (audit trail)
Date / dataset:
Incidents coded (count) + IDs:
New codes added:
Codes renamed/merged/split (why):
Memos written (titles):
Next comparison targets:
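The session log can also be produced from whatever you track during a session. The sketch below is one possible shape, under the assumption that you keep the code list as a simple collection of names before and after the session. The function name render_session_log, its parameters, and the sample values (including the date and dataset name) are placeholders.

```python
from datetime import date


def render_session_log(when, dataset, incident_ids, codes_before, codes_after,
                       changes, memo_titles, next_targets):
    """Format one coding session as plain text, mirroring the log fields above."""
    # Codes present after the session but not before it count as newly added.
    new_codes = sorted(set(codes_after) - set(codes_before))
    return "\n".join([
        f"Date / dataset: {when.isoformat()} / {dataset}",
        f"Incidents coded (count) + IDs: {len(incident_ids)} ({', '.join(incident_ids)})",
        f"New codes added: {', '.join(new_codes)}",
        f"Codes renamed/merged/split (why): {'; '.join(changes)}",
        f"Memos written (titles): {'; '.join(memo_titles)}",
        f"Next comparison targets: {'; '.join(next_targets)}",
    ])


# Invented session, for illustration only.
log = render_session_log(
    when=date(2024, 1, 15),
    dataset="pilot interviews",
    incident_ids=["INT01-03", "INT01-07"],
    codes_before=["deferring decisions"],
    codes_after=["deferring decisions", "patching workflow"],
    changes=["none"],
    memo_titles=["Why patch instead of fix?"],
    next_targets=["incidents where a patch failed"],
)
print(log)
```

Note that a renamed code would appear as "new" under this simple set difference, so renames still need to be recorded by hand in the changes field.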
Quality checks during open coding
Keep codes abstract enough to compare across cases, yet concrete enough to trace back to incidents.
Avoid laundry lists of topics; aim for analytic codes.
Memo whenever you claim a relationship (“because,” “in order to,” “leads to”).
Seek negative cases early: incidents that seem to contradict your “favorite” code.
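The memo check in the list above lends itself to a simple mechanical reminder. The sketch below scans a coding note for the three cue phrases named in that check and reports which ones appear. It is a rough aid under the assumption that your notes are plain strings; the function name needs_memo and the sample notes are hypothetical, and a keyword match cannot tell whether a relationship claim is actually warranted.

```python
import re

# Cue phrases taken from the quality check; extend the tuple as needed.
RELATIONSHIP_CUES = ("because", "in order to", "leads to")
_CUE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(cue) for cue in RELATIONSHIP_CUES) + r")\b",
    re.IGNORECASE,
)


def needs_memo(note: str) -> list[str]:
    """Return the relationship cues found in a coding note, lowercased."""
    return [match.lower() for match in _CUE_PATTERN.findall(note)]


# Invented notes, for illustration only.
notes = [
    "Delays disclosure because the project might fail.",
    "Framing as experiment.",
]
for note in notes:
    cues = needs_memo(note)
    if cues:
        print(f"Memo needed ({', '.join(cues)}): {note}")
```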
When to move toward selective coding
Do not expect to finish open coding in a single week; it matures through repeated comparison. Move toward selective coding when:
A core category candidate recurs, explains a wide swath of variation, and connects many other categories.
You can delimit: additional open coding feels redundant relative to the emerging integrated story.
Theoretical sampling questions become targeted at the core and its related conditions/strategies/consequences.
If you move too early, you risk forcing a core. If you move too late, you risk endless sprawl. Let comparison and memo sorting guide the transition.
Common pitfalls
Paraphrasing instead of conceptualizing.
Over-coding with synonyms; merge aggressively after comparison.
Under-memoing; losing the trail of theoretical reasoning.
Theme sorting without comparison (categories that “sound nice” but don’t explain).
Premature literature import that names your phenomenon before you’ve earned concepts.
Key references
Glaser, B. G. (1978). Theoretical sensitivity: Advances in the methodology of grounded theory. Sociology Press.
Glaser, B. G. (1992). Basics of grounded theory analysis: Emergence vs. forcing. Sociology Press.
Glaser, B. G. (1998). Doing grounded theory: Issues and discussions. Sociology Press.