Use when the user wants to assess how strong the evidence behind a scientific claim is, based on study design, replication status, sample size, venue, and recency.
Evaluates the strength of evidence behind scientific claims based on study design, sample size, replication status, venue quality, and recency. Produces structured evidence grades that help teams know how much weight to put on any given finding.
evidence_grading.grade_paper(
    doi="10.1038/s41586-024-00001-0",
    domain="machine_learning",  # affects the grading rubric
    include_rationale=True,
)
# Returns: {"grade": "B+", "level": "replicated_benchmark", "rationale": "..."}
evidence_grading.grade_claim(
    claim="Scaling laws hold for code generation models",
    supporting_papers=claim_tracker.get_papers("claim_0099"),
    schema="custom",  # e.g. strong | moderate | preliminary | anecdotal
)
evidence_grading.suggest_language(
    claim="Our method outperforms baselines on protein fitness prediction",
    grade="preliminary",
    context="manuscript_results_section",
)
# Returns: "Our method demonstrates promising performance improvements over baselines..."
Returns grade, level label, confidence score, and rationale. For claim-level grading, returns aggregate grade with per-paper breakdown. Hedging language output is ready-to-use prose.
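The per-paper breakdown supports ordinal aggregation on the caller's side. A minimal sketch of one such policy (the tool's actual rubric is internal; `GRADE_ORDER` and `aggregate_grade` are hypothetical names, and the median rule is only an illustrative assumption):

```python
# Hypothetical: fold per-paper grades into a single claim-level grade.
# Assumes a simple ordinal letter scale; not the tool's internal rubric.
GRADE_ORDER = ["D", "C", "C+", "B", "B+", "A", "A+"]

def aggregate_grade(paper_grades: list[str]) -> str:
    """Take the median per-paper grade as a conservative claim-level grade."""
    ranks = sorted(GRADE_ORDER.index(g) for g in paper_grades)
    return GRADE_ORDER[ranks[len(ranks) // 2]]

print(aggregate_grade(["B+", "A", "C+"]))  # -> B+
```

A median resists being dragged up by a single strong outlier paper, which matches the conservative spirit of evidence grading; a minimum rule would be stricter still.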
- claim_auditor: for calibrating claim strength
- experiment_skeptic: when the paper's conclusion seems stronger than its evidence
- contradiction-detection: contradicted claims automatically receive a grade penalty