Questionnaire and survey design with Likert scales and coding
Design valid and reliable survey instruments with proper question types, Likert scale construction, response coding, and data preparation for analysis.
| Type | Example | Best For | Analysis |
|---|---|---|---|
| Likert scale | "Rate your agreement: 1-5" | Attitudes, perceptions | Ordinal/interval statistics |
| Multiple choice | "Select your field" | Demographics, categories | Frequencies, chi-square |
| Ranking | "Rank these 5 options" | Preferences, priorities | Rank correlations |
| Open-ended | "Describe your experience" | Exploratory, rich data | Qualitative coding |
| Matrix/grid | Multiple items, same scale | Efficient battery of items | Factor analysis, reliability |
| Slider/VAS | 0-100 visual analog scale | Continuous measures | Parametric statistics |
| Semantic differential | "Easy __ __ __ __ __ Difficult" | Bipolar attitudes | Factor analysis |
| Points | Scale Example | Recommended Use |
|---|---|---|
| 4-point | Strongly Disagree to Strongly Agree | Forces choice (no neutral), less discriminating |
| 5-point | SD, D, Neutral, A, SA | Most common, good balance of simplicity and discrimination |
| 7-point | SD, D, Somewhat D, Neutral, Somewhat A, A, SA | More discriminating, better for experienced respondents |
| 11-point (0-10) | Not at all to Completely | NPS, continuous-like measures |
5-Point Agreement Scale:
1 = Strongly Disagree
2 = Disagree
3 = Neither Agree nor Disagree
4 = Agree
5 = Strongly Agree
5-Point Frequency Scale:
1 = Never
2 = Rarely
3 = Sometimes
4 = Often
5 = Always
5-Point Satisfaction Scale:
1 = Very Dissatisfied
2 = Dissatisfied
3 = Neutral
4 = Satisfied
5 = Very Satisfied
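To keep label-to-code mappings consistent across the three scales above, one option is to store each anchor list once and derive the numeric code from its position. This is a minimal sketch assuming pandas; `SCALES` and `code_responses` are illustrative names, not part of any library.

```python
import pandas as pd

# Anchor labels in ascending order; the code is the position plus one.
SCALES = {
    "agreement": ["Strongly Disagree", "Disagree", "Neither Agree nor Disagree",
                  "Agree", "Strongly Agree"],
    "frequency": ["Never", "Rarely", "Sometimes", "Often", "Always"],
    "satisfaction": ["Very Dissatisfied", "Dissatisfied", "Neutral",
                     "Satisfied", "Very Satisfied"],
}


def code_responses(responses: pd.Series, scale: str) -> pd.Series:
    """Map text labels to 1..k codes; unrecognized labels become NaN."""
    mapping = {label: code for code, label in enumerate(SCALES[scale], start=1)}
    # Strip stray whitespace that survey exports sometimes add around labels.
    return responses.str.strip().map(mapping)


raw = pd.Series(["Often", "Never", " Always ", "N/A"])
coded = code_responses(raw, "frequency")
print(coded.tolist())
```

Labels outside the scale, such as "N/A", come back as NaN, so they surface in the missing-data checks instead of being silently miscoded.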
Include 2-3 reverse-coded items per construct to detect acquiescence bias:
Regular: "I find research methods interesting." (1-5: SD to SA)
Reversed: "I find research methods tedious and dull." (1-5: SD to SA)
# Recode reversed items before analysis:
# reversed_score = (max_scale + 1) - raw_score
# For a 5-point scale: reversed_score = 6 - raw_score
Construct: Belief in one's ability to conduct academic research
Items (5-point Likert, Strongly Disagree to Strongly Agree):
RSE1: I can formulate clear research questions.
RSE2: I can design an appropriate research methodology.
RSE3: I can analyze data using statistical software.
RSE4: I can write a publishable research paper.
RSE5: I can critically evaluate published research.
RSE6: I can present research findings at a conference.
RSE7R: I struggle to interpret statistical results. [REVERSED]
RSE8R: I find it difficult to synthesize literature. [REVERSED]
import pandas as pd
import numpy as np
# Define coding scheme
likert_coding = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}
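# Apply coding. df is assumed to hold one row per respondent, with text
# labels in Q1_raw; labels missing from the mapping become NaN.
df["Q1_coded"] = df["Q1_raw"].map(likert_coding)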
# Reverse code specific items
reverse_items = ["RSE7R", "RSE8R"]
max_scale = 5
for item in reverse_items:
    df[f"{item}_recoded"] = (max_scale + 1) - df[item]
# Calculate composite score (mean of items)
scale_items = ["RSE1", "RSE2", "RSE3", "RSE4", "RSE5", "RSE6",
               "RSE7R_recoded", "RSE8R_recoded"]
df["RSE_mean"] = df[scale_items].mean(axis=1)
# Check missing data patterns
print(df[scale_items].isnull().sum())
print(f"Complete cases: {df[scale_items].dropna().shape[0]} / {df.shape[0]}")
# Common strategies (choose one; they are alternatives, not sequential steps):
# 1. Listwise deletion (if < 5% missing)
df_complete = df.dropna(subset=scale_items)
# 2. Mean imputation per item (simple, but shrinks variance and biases correlations)
df_item_mean = df.copy()
df_item_mean[scale_items] = df[scale_items].fillna(df[scale_items].mean())
# 3. Person-mean imputation (if < 20% of items missing per person;
#    with 8 items that allows at most 1 missing item)
def person_mean_impute(row, items, max_missing=1):
    if row[items].isnull().sum() <= max_missing:
        return row[items].fillna(row[items].mean())
    return row[items]  # leave as NaN if too many missing
df_person_mean = df.copy()
df_person_mean[scale_items] = df[scale_items].apply(
    lambda r: person_mean_impute(r, scale_items), axis=1)
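The 5% and 20% thresholds above are easier to apply once missingness is expressed as percentages per item and per respondent. The following is a rough sketch on toy data; `missing_summary` and the `toy` frame are hypothetical and only illustrate the idea.

```python
import numpy as np
import pandas as pd


def missing_summary(df: pd.DataFrame, items: list) -> dict:
    """Summarize missingness for a set of scale items."""
    is_missing = df[items].isna()
    return {
        # Share of respondents with no answer, per item (in percent).
        "pct_missing_per_item": is_missing.mean().mul(100).round(1),
        # Share of respondents who would be lost under listwise deletion.
        "pct_rows_incomplete": round(is_missing.any(axis=1).mean() * 100, 1),
        # Count of unanswered items per respondent (for person-mean rules).
        "items_missing_per_person": is_missing.sum(axis=1),
    }


toy = pd.DataFrame({
    "RSE1": [4, 5, np.nan, 3],
    "RSE2": [4, np.nan, np.nan, 2],
    "RSE3": [5, 4, 2, 3],
})
summary = missing_summary(toy, ["RSE1", "RSE2", "RSE3"])
print(summary["pct_missing_per_item"])
print(summary["pct_rows_incomplete"])
```

In this toy frame half of the respondents are incomplete, so listwise deletion would be costly and one of the imputation strategies would be worth considering.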
import pingouin as pg
# Calculate Cronbach's alpha (returns the estimate and its 95% confidence interval)
alpha, ci = pg.cronbach_alpha(data=df[scale_items])
print(f"Cronbach's alpha: {alpha:.3f} (95% CI: {ci[0]:.3f} to {ci[1]:.3f})")
# Interpretation: >= 0.70 acceptable, >= 0.80 good, >= 0.90 excellent
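For readers who want to see what the statistic measures, Cronbach's alpha can also be computed directly from its formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The sketch below is an illustration using only pandas, with a hypothetical helper and toy data; it drops incomplete rows, so it can differ slightly from a library that handles missing values pairwise.

```python
import pandas as pd


def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha from complete cases (rows = respondents, columns = items)."""
    complete = items.dropna()
    k = complete.shape[1]
    item_variances = complete.var(axis=0, ddof=1)      # variance of each item
    total_variance = complete.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)


toy_items = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 2, 3, 4, 5],
    "c": [1, 3, 3, 5, 4],
})
toy_alpha = cronbach_alpha(toy_items)
print(f"alpha = {toy_alpha:.3f}")
```

Reverse-coded items must already be recoded before this calculation; otherwise they lower the total-score variance and can push alpha toward zero or below.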
library(psych)
# Cronbach's alpha with item-level diagnostics
# (psych::alpha avoids a clash with ggplot2::alpha if both packages are loaded)
alpha_result <- psych::alpha(data[, scale_items])
print(alpha_result)
# Corrected item-total correlations (should be > 0.30)
item_stats <- alpha_result$item.stats
print(item_stats[, c("raw.r", "r.drop")])
# r.drop < 0.30: consider removing the item
# Alpha if each item is dropped
print(alpha_result$alpha.drop[, "raw_alpha", drop = FALSE])
# raw_alpha above the overall alpha: the item is weakening the scale
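The same corrected item-total check can be approximated in Python by correlating each item with the sum of the remaining items. This is a sketch with a hypothetical helper and toy data, not a reimplementation of the `psych` output; item `d` is deliberately unrelated to the others.

```python
import pandas as pd


def corrected_item_total(items: pd.DataFrame) -> pd.Series:
    """Correlation of each item with the sum of all other items (complete cases)."""
    complete = items.dropna()
    total = complete.sum(axis=1)
    return pd.Series(
        # Subtract the item from the total so it is not correlated with itself.
        {col: complete[col].corr(total - complete[col]) for col in complete.columns},
        name="r_drop",
    )


toy_scale = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 2, 3, 4, 5],
    "c": [1, 3, 3, 5, 4],
    "d": [3, 1, 4, 2, 3],  # noise item, unrelated to the rest
})
r_drop = corrected_item_total(toy_scale)
flagged = r_drop[r_drop < 0.30].index.tolist()
print(r_drop.round(2))
print("Items to review:", flagged)
```

Items flagged here are candidates for review, not automatic removal; wording and content coverage should be checked before dropping anything.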
| Validity Type | Method | Criterion |
|---|---|---|
| Content validity | Expert panel rating (CVI) | I-CVI >= 0.78, S-CVI/Ave >= 0.90 |
| Construct validity | Exploratory Factor Analysis (EFA) | Eigenvalue > 1, loadings > 0.40 |
| Convergent validity | Correlation with related construct | r > 0.30 |
| Discriminant validity | Correlation with unrelated construct | r < 0.30 |
| Criterion validity | Correlation with external criterion | Significant correlation |
| Test-retest reliability | ICC or Pearson r over 2-4 weeks | ICC > 0.70 |
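The content validity criteria in the table can be computed from an expert rating sheet. The sketch below assumes the common convention that experts rate each item's relevance on a 4-point scale and that ratings of 3 or 4 count as "relevant"; the function name and the ratings are made up for illustration.

```python
import pandas as pd


def content_validity_index(ratings: pd.DataFrame):
    """Return (I-CVI per item, S-CVI/Ave) from a rows=experts, columns=items table."""
    relevant = ratings >= 3          # assumption: 3 or 4 on a 4-point relevance scale
    i_cvi = relevant.mean(axis=0)    # proportion of experts rating each item relevant
    s_cvi_ave = i_cvi.mean()         # average of the item-level indices
    return i_cvi, s_cvi_ave


# Five hypothetical experts rating three items.
expert_ratings = pd.DataFrame({
    "RSE1": [4, 4, 3, 4, 3],
    "RSE2": [4, 3, 2, 4, 3],
    "RSE3": [2, 3, 2, 4, 1],
})
i_cvi, s_cvi_ave = content_validity_index(expert_ratings)
print(i_cvi)
print(f"S-CVI/Ave = {s_cvi_ave:.2f}")
print("Items below 0.78:", i_cvi[i_cvi < 0.78].index.tolist())
```

With these made-up ratings, RSE1 and RSE2 meet the 0.78 threshold and RSE3 would be sent back for revision.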
| Mistake | Example | Fix |
|---|---|---|
| Double-barreled question | "This course is interesting and useful" | Split into two separate items |
| Leading question | "Don't you agree that X is important?" | "How important is X to you?" |
| Absolute terms | "Do you always check citations?" | "How often do you check citations?" |
| Missing option | No "Not Applicable" when needed | Add N/A option or filter logic |
| Inconsistent scale direction | Some items 1=good, others 1=bad | Standardize direction; clearly mark reversed items |
| Too many items | 100-item survey | Aim for 5-8 items per construct, 15-30 min total |
| No pilot test | Skip straight to full deployment | Always pilot with 30-50 respondents |
| Platform | Cost | Features | Best For |
|---|---|---|---|
| Qualtrics | Institutional | Advanced logic, panels, API | Large academic studies |
| SurveyMonkey | Freemium | Easy to use, basic analysis | Quick surveys |
| Google Forms | Free | Simple, integrates with Sheets | Classroom, pilot testing |
| LimeSurvey | Free/self-hosted | Open source, full control | Privacy-sensitive research |
| REDCap | Free (academic) | Clinical data, HIPAA compliant | Medical/clinical research |
| Prolific | Per-response | Participant recruitment | Online experiments |