Name: Questionnaire Design Guide
Author: wentorai

Points	Scale Example	Recommended Use
4-point	Strongly Disagree to Strongly Agree	Forces choice (no neutral), less discriminating
5-point	SD, D, Neutral, A, SA	Most common, good balance of simplicity and discrimination
7-point	SD, D, Somewhat D, Neutral, Somewhat A, A, SA	More discriminating, better for experienced respondents
11-point (0-10)	Not at all to Completely	NPS, continuous-like measures

5-Point Agreement Scale:
1 = Strongly Disagree
2 = Disagree
3 = Neither Agree nor Disagree
4 = Agree
5 = Strongly Agree

5-Point Frequency Scale:
1 = Never
2 = Rarely
3 = Sometimes
4 = Often
5 = Always

5-Point Satisfaction Scale:
1 = Very Dissatisfied
2 = Dissatisfied
3 = Neutral
4 = Satisfied
5 = Very Satisfied

Regular:  "I find research methods interesting."        (1-5: SD to SA)
Reversed: "I find research methods tedious and dull."   (1-5: SD to SA)

# Recode reversed items before analysis:
# reversed_score = (max_scale + 1) - raw_score
# For a 5-point scale: reversed_score = 6 - raw_score
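
The recode formula above assumes the scale starts at 1. As a minimal sketch, the hypothetical helper `reverse_score` below (not part of the original guide) mirrors a response around the scale midpoint as (min + max) - raw, which gives the same result on a 1-5 scale and also covers scales coded from 0, such as the 0-10 scale:

```python
def reverse_score(raw_score, min_scale=1, max_scale=5):
    """Mirror a response around the scale midpoint.

    Hypothetical helper for illustration: (min + max) - raw reduces to
    (max + 1) - raw whenever the scale starts at 1.
    """
    if not min_scale <= raw_score <= max_scale:
        raise ValueError(f"score {raw_score} is outside {min_scale}-{max_scale}")
    return (min_scale + max_scale) - raw_score


# 5-point scale: 1 <-> 5, 2 <-> 4, 3 stays 3
five_point = [reverse_score(s) for s in [1, 2, 3, 4, 5]]
print(five_point)

# 11-point scale coded 0-10: 0 <-> 10, 2 <-> 8
print(reverse_score(2, min_scale=0, max_scale=10))
```

The range check is a design choice: an out-of-range value usually signals a coding error (for example a 0 used for "missing"), and silently reversing it would hide the problem.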

Construct: Belief in one's ability to conduct academic research

Items (5-point Likert, Strongly Disagree to Strongly Agree):
RSE1: I can formulate clear research questions.
RSE2: I can design an appropriate research methodology.
RSE3: I can analyze data using statistical software.
RSE4: I can write a publishable research paper.
RSE5: I can critically evaluate published research.
RSE6: I can present research findings at a conference.
RSE7R: I struggle to interpret statistical results. [REVERSED]
RSE8R: I find it difficult to synthesize literature. [REVERSED]

import pandas as pd

# df is assumed to be a pandas DataFrame of survey responses, one row per respondent

# Define coding scheme
likert_coding = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree nor Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5
}

# Apply coding (labels not found in likert_coding become NaN)
df["Q1_coded"] = df["Q1_raw"].map(likert_coding)

# Reverse code specific items
reverse_items = ["RSE7R", "RSE8R"]
max_scale = 5
for item in reverse_items:
    df[f"{item}_recoded"] = (max_scale + 1) - df[item]

# Calculate composite score (mean of items)
scale_items = ["RSE1", "RSE2", "RSE3", "RSE4", "RSE5", "RSE6",
               "RSE7R_recoded", "RSE8R_recoded"]
df["RSE_mean"] = df[scale_items].mean(axis=1)

# Check missing data patterns
print(df[scale_items].isnull().sum())
print(f"Complete cases: {df[scale_items].dropna().shape[0]} / {df.shape[0]}")

# Common strategies (alternatives: pick one, do not run all three in sequence):
# 1. Listwise deletion (if < 5% missing)
df_complete = df.dropna(subset=scale_items)

# 2. Mean imputation per item (simple, but shrinks variance and biases estimates)
df_item_imputed = df.copy()
df_item_imputed[scale_items] = df[scale_items].fillna(df[scale_items].mean())

# 3. Person-mean imputation (only if < 20% of items are missing for that person)
def person_mean_impute(row, max_missing_frac=0.20):
    # row holds one respondent's scores on the scale items
    if row.isnull().mean() < max_missing_frac:
        return row.fillna(row.mean())
    return row  # leave as NaN if too many items are missing

df_person_imputed = df.copy()
df_person_imputed[scale_items] = df[scale_items].apply(person_mean_impute, axis=1)

import pingouin as pg

# Calculate Cronbach's alpha
alpha = pg.cronbach_alpha(df[scale_items])
print(f"Cronbach's alpha: {alpha[0]:.3f}")
# Interpretation: >= 0.70 acceptable, >= 0.80 good, >= 0.90 excellent
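
To make the statistic less of a black box, here is a minimal sketch of the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), written in plain Python so the arithmetic is visible. The function name and the response data are made up for illustration, and the sketch assumes complete cases with reversed items already recoded.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: list of respondents, each a list of item scores (no missing values)."""
    k = len(rows[0])                          # number of items
    items = list(zip(*rows))                  # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Made-up responses (4 respondents x 3 items), for illustration only
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [4, 3, 4],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why an unrecoded reversed item drags it down.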

library(psych)

# Cronbach's alpha with item-level diagnostics
# (data[, scale_items] must already contain the recoded reversed items)
alpha_result <- psych::alpha(data[, scale_items])
print(alpha_result)
# Check "raw_alpha" under "Reliability if an item is dropped" to identify weak items

# Corrected item-total correlations (should be > 0.30)
print(alpha_result$item.stats[, "r.drop", drop = FALSE])
# r.drop < 0.30: consider removing the item

# Alpha if item dropped (stored in alpha.drop, not in item.stats)
print(alpha_result$alpha.drop[, "raw_alpha", drop = FALSE])
# raw_alpha above the overall alpha: the item is weakening the scale
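
The same two diagnostics can be sketched in plain Python. This is an illustration under stated assumptions, not a replacement for the library output: the helper names and the data are hypothetical, the data are complete cases, and item D is deliberately constructed to run against the other items, as an unrecoded reversed item would.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(rows):
    k = len(rows[0])
    item_var_sum = sum(variance(col) for col in zip(*rows))
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def item_diagnostics(rows, names):
    """Return {item: (corrected item-total r, alpha if item dropped)}."""
    out = {}
    for j, name in enumerate(names):
        item = [r[j] for r in rows]
        # "Corrected" means the item is excluded from the total it is compared with
        rest = [[v for i, v in enumerate(r) if i != j] for r in rows]
        rest_total = [sum(r) for r in rest]
        out[name] = (pearson_r(item, rest_total), cronbach_alpha(rest))
    return out


# Made-up data: items A-C move together, item D runs against them
names = ["A", "B", "C", "D"]
responses = [
    [1, 2, 1, 4],
    [2, 2, 3, 3],
    [4, 3, 4, 3],
    [5, 5, 4, 1],
    [3, 4, 3, 4],
]
overall = cronbach_alpha(responses)
diag = item_diagnostics(responses, names)
print(f"Overall alpha: {overall:.2f}")
for name, (r_drop, alpha_drop) in diag.items():
    flag = "  <- review" if r_drop < 0.30 or alpha_drop > overall else ""
    print(f"{name}: r.drop={r_drop:.2f}, alpha if dropped={alpha_drop:.2f}{flag}")
```

A negative corrected item-total correlation, as item D shows here, is usually a sign that a reversed item was not recoded rather than a reason to delete the item outright.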

Validity Type	Method	Criterion
Content validity	Expert panel rating (CVI)	I-CVI >= 0.78, S-CVI/Ave >= 0.90
Construct validity	Exploratory Factor Analysis (EFA)	Eigenvalue > 1, loadings > 0.40
Convergent validity	Correlation with related construct	r > 0.30
Discriminant validity	Correlation with unrelated construct	r < 0.30
Criterion validity	Correlation with external criterion	Significant correlation
Test-retest reliability	ICC or Pearson r over 2-4 weeks	ICC > 0.70
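
The content validity criteria in the table can be computed directly from expert ratings. The sketch below assumes the conventional setup in which each expert rates each item on a 4-point relevance scale and ratings of 3 or 4 count as "relevant". The ratings and the helper name `item_cvi` are made up for illustration.

```python
def item_cvi(ratings, relevant_min=3):
    """I-CVI: share of experts who rated the item as relevant (>= relevant_min)."""
    return sum(r >= relevant_min for r in ratings) / len(ratings)


# Made-up ratings from 6 hypothetical experts on a 1-4 relevance scale
expert_ratings = {
    "RSE1": [4, 4, 3, 4, 4, 3],
    "RSE2": [4, 3, 3, 4, 2, 4],
    "RSE3": [2, 3, 4, 2, 3, 2],
}

i_cvi = {item: item_cvi(r) for item, r in expert_ratings.items()}
# S-CVI/Ave: the average of the item-level indices
s_cvi_ave = sum(i_cvi.values()) / len(i_cvi)

for item, value in i_cvi.items():
    verdict = "keep" if value >= 0.78 else "revise or drop"
    print(f"{item}: I-CVI = {value:.2f} ({verdict})")
print(f"S-CVI/Ave = {s_cvi_ave:.2f} (target >= 0.90)")
```

In this made-up panel, one weak item is enough to pull S-CVI/Ave below 0.90, which is why the item-level and scale-level indices are usually reported together.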

Mistake	Example	Fix
Double-barreled question	"This course is interesting and useful"	Split into two separate items
Leading question	"Don't you agree that X is important?"	"How important is X to you?"
Absolute terms	"Do you always check citations?"	"How often do you check citations?"
Missing option	No "Not Applicable" when needed	Add N/A option or filter logic
Inconsistent scale direction	Some items 1=good, others 1=bad	Standardize direction; clearly mark reversed items
Too many items	100-item survey	Aim for 5-8 items per construct, 15-30 min total
No pilot test	Skip straight to full deployment	Always pilot with 30-50 respondents

Platform	Cost	Features	Best For
Qualtrics	Institutional	Advanced logic, panels, API	Large academic studies
SurveyMonkey	Freemium	Easy to use, basic analysis	Quick surveys
Google Forms	Free	Simple, integrates with Sheets	Classroom, pilot testing
LimeSurvey	Free/self-hosted	Open source, full control	Privacy-sensitive research
REDCap	Free (academic)	Clinical data, HIPAA compliant	Medical/clinical research
Prolific	Per-response	Participant recruitment	Online experiments

Type	Example	Best For	Analysis
Likert scale	"Rate your agreement: 1-5"	Attitudes, perceptions	Ordinal/interval statistics
Multiple choice	"Select your field"	Demographics, categories	Frequencies, chi-square
Ranking	"Rank these 5 options"	Preferences, priorities	Rank correlations
Open-ended	"Describe your experience"	Exploratory, rich data	Qualitative coding
Matrix/grid	Multiple items, same scale

Questionnaire Design Guide

Survey Design Principles

Question Types

The Four C's of Good Questions

Likert Scale Design

Scale Points

Anchoring Labels

Reverse-Coded Items

Constructing a Multi-Item Scale

Step-by-Step Process

Example: Research Self-Efficacy Scale

Data Coding and Preparation

Coding Scheme

Missing Data Handling

Reliability Analysis

Cronbach's Alpha

Item-Total Correlations

Validity Assessment

Common Design Mistakes

Survey Platform Comparison
