Core framework for evaluating research papers against a structured factor glossary. Supports any domain via DomainConfig JSON files. Use when setting up a new evaluation domain or understanding the framework.
This skill provides background knowledge about the domain-agnostic study evaluation framework. It is not directly invocable — use domain-specific skills (e.g., study-eval-neuro, study-eval-electronics) for actual evaluations.
The framework evaluates research papers by scoring them against a structured glossary of factors, across one or more contexts, organized into theory groups. Everything domain-specific is defined in a DomainConfig JSON file.
A single JSON file that fully parameterizes the framework for a specific research domain. See domain-config-schema.md for the full schema.
Independent evaluation dimensions (e.g., "Local Oddball" and "Global Oddball" in neuroscience, or "High Frequency" and "Low Frequency" in electronics). Each paper is scored separately for each context.
Logical groupings of factors (e.g., H1/H2/H3 hypotheses in neuroscience). Used for:
Individual measurable/scorable items within the domain. Each factor has:
- `id`: Numeric identifier
- `def`: Human-readable definition
- `rel`: Relationship keywords (evidence cues)
- `tag`: "Quantitative", "Qualitative", or "Methodological"
- `modes`: Which contexts the factor applies to (e.g., `["LO", "GO"]`)
- `theory_group`: Which theory group this factor belongs to

Scoring scale: configurable per domain. Default: -1.0 to +1.0, with `np.nan` for not-addressed.
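For illustration, a single factor entry in a DomainConfig might look like the following. The concrete values are hypothetical examples, not from a real config; see domain-config-schema.md for the authoritative schema.

```json
{
  "id": 7,
  "def": "Subtractive Inhibition (SST)",
  "rel": ["SST interneurons", "subtractive suppression"],
  "tag": "Quantitative",
  "modes": ["LO", "GO"],
  "theory_group": "H2"
}
```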
```
study-eval/
├── domains/
│   └── <domain_id>.json          # DomainConfig files
├── core/
│   ├── config.py                 # DomainConfig loader + validator
│   ├── columns.py                # Column naming utilities
│   ├── prompts.py                # Prompt template generation
│   ├── evaluation.py             # AI evaluation pipeline
│   ├── dashboard.py              # Interactive dashboard
│   └── visualization.py          # Plotting functions
└── notebooks/
    └── generic_eval_demo.ipynb   # Demo notebook
```
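A minimal sketch of what the loader in core/config.py might do, assuming the required factor fields listed above. The function name `load_domain_config` and the top-level `"factors"` key are assumptions for illustration, not the framework's actual API.

```python
import json

# Required keys per factor entry, per the factor schema above.
REQUIRED_FACTOR_KEYS = {"id", "def", "rel", "tag", "modes", "theory_group"}


def load_domain_config(path):
    """Load a DomainConfig JSON file and validate its factor entries.

    Hypothetical sketch: assumes factors live under a top-level
    "factors" list in the config file.
    """
    with open(path) as f:
        cfg = json.load(f)
    for factor in cfg.get("factors", []):
        missing = REQUIRED_FACTOR_KEYS - factor.keys()
        if missing:
            raise ValueError(
                f"Factor {factor.get('id')} missing keys: {sorted(missing)}"
            )
    return cfg
```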
DomainConfig files live in `domains/<domain_id>.json` (e.g., `domains/neuroscience_predictive_coding.json`); domain-specific skills live in `.claude/skills/study-eval-<domain>/`.

Factor columns follow the pattern: `{context.column_prefix}_{sanitized_factor_name}`
Sanitization: spaces → underscores, parentheses removed.
Example: Factor "Subtractive Inhibition (SST)" in context "Local Oddball" (prefix: `Local_Oddball`) → `Local_Oddball_Subtractive_Inhibition_SST`
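The naming rule above can be sketched in a few lines. The function name `factor_column` is illustrative and not necessarily what core/columns.py uses; the sanitization (parentheses removed, runs of whitespace collapsed to underscores) follows the stated rule and the example.

```python
def factor_column(prefix, factor_name):
    """Build a factor column name from a context's column prefix
    and a factor name: strip parentheses, turn spaces into
    underscores, then prepend the prefix. Illustrative sketch only.
    """
    sanitized = factor_name.replace("(", "").replace(")", "").strip()
    sanitized = "_".join(sanitized.split())  # spaces -> underscores
    return f"{prefix}_{sanitized}"
```

Usage, reproducing the example above:

```python
factor_column("Local_Oddball", "Subtractive Inhibition (SST)")
# → "Local_Oddball_Subtractive_Inhibition_SST"
```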
The framework selects visualization type based on theory group count: