Use when analyzing FastQC quality reports from sequencing data, identifying quality issues in NGS datasets, or troubleshooting sequencing problems. Interprets quality metrics and provides actionable recommendations for RNA-seq, DNA-seq, and ChIP-seq data.
Analyze FastQC quality control reports for Next-Generation Sequencing (NGS) data to assess data quality and identify issues.
from scripts.fastqc_interpreter import FASTQCInterpreter
interpreter = FASTQCInterpreter()
# Analyze report
analysis = interpreter.analyze("sample_fastqc.html")
print(f"Overall Quality: {analysis.quality_status}")
print(f"Issues Found: {analysis.issues}")
metrics = interpreter.parse_metrics("fastqc_data.txt")
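To show what parsing `fastqc_data.txt` involves, here is a minimal sketch that is independent of `FASTQCInterpreter`. It relies only on the layout FastQC writes: each module opens with `>>Module name<TAB>status`, closes with `>>END_MODULE`, and has its column header on a line starting with `#`. The function name `parse_fastqc_data` and the returned dictionary shape are assumptions made for illustration, not the skill's actual implementation.

```python
def parse_fastqc_data(text):
    """Split the contents of a fastqc_data.txt file into modules.

    Returns {module name: {"status": str, "header": [str], "rows": [[str]]}}.
    Hypothetical helper for illustration; values are kept as strings.
    """
    modules = {}
    current = None
    for line in text.splitlines():
        if line.startswith(">>END_MODULE"):
            current = None
        elif line.startswith(">>"):
            # Module start line: ">>Per base N content<TAB>warn"
            name, _, status = line[2:].partition("\t")
            current = {"status": status.strip(), "header": [], "rows": []}
            modules[name] = current
        elif current is not None and line.startswith("#"):
            # Column header; if a module has several '#' lines, the last one wins.
            current["header"] = line[1:].split("\t")
        elif current is not None and line.strip():
            current["rows"].append(line.split("\t"))
    return modules
```

Lines outside a module, such as the leading `##FastQC` version line, are ignored.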
Key Metrics:
| Metric | Good | Warning | Fail |
|---|---|---|---|
| Per base sequence quality | Q > 28 | Q 20-28 | Q < 20 |
| Per sequence quality scores | Peak at Q30 | Peak Q20-30 | Peak < Q20 |
| Per base N content | < 5% | 5-20% | > 20% |
| Sequence duplication | < 20% | 20-50% | > 50% |
| Adapter content | < 5% | 5-10% | > 10% |
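The percentage and per-base quality rows of the table can be applied mechanically. The sketch below encodes those rows as written; the names `THRESHOLDS`, `classify_percentage`, and `classify_base_quality` are hypothetical, and the handling of values that fall exactly on a boundary (treated here as Warning) is an assumption, since the table does not specify it.

```python
# (warning_from, fail_above) in percent, copied from the table above.
THRESHOLDS = {
    "per_base_n_content": (5.0, 20.0),
    "sequence_duplication": (20.0, 50.0),
    "adapter_content": (5.0, 10.0),
}


def classify_percentage(metric, percent):
    """Classify a 'lower is better' percentage metric."""
    warn_from, fail_above = THRESHOLDS[metric]
    if percent > fail_above:
        return "FAIL"
    if percent >= warn_from:
        return "WARNING"
    return "PASS"


def classify_base_quality(phred):
    """Classify a per-base Phred score: Q > 28 good, Q 20-28 warning, Q < 20 fail."""
    if phred > 28:
        return "PASS"
    if phred >= 20:
        return "WARNING"
    return "FAIL"
```

For example, `classify_percentage("adapter_content", 7.5)` falls in the 5-10% band and yields a warning.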
issues = interpreter.diagnose_issues(metrics)
for issue in issues:
    print(f"{issue.severity}: {issue.description}")
    print(f"Recommendation: {issue.recommendation}")
Common Issues:
Low Quality at Read Ends
Adapter Contamination
High Duplication
Per Base Sequence Content Bias
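For the first issue in this list, low quality at read ends, a rough sketch of how a trim point could be derived from per-position mean qualities follows. The function `suggest_trim_length` and its default cutoff of Q20 are assumptions for illustration. Dedicated trimmers such as Trimmomatic, cutadapt, or fastp decide per read rather than per position, so this only indicates where the aggregate quality tail begins.

```python
def suggest_trim_length(mean_qualities, min_quality=20.0):
    """Return how many leading positions to keep.

    Walks back from the 3' end and drops the trailing run of positions
    whose mean Phred quality is below min_quality.
    """
    keep = len(mean_qualities)
    while keep > 0 and mean_qualities[keep - 1] < min_quality:
        keep -= 1
    return keep
```

A result equal to the read length means no trailing low-quality run was found; a result of 0 means every position is below the cutoff.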
batch_results = interpreter.analyze_batch(
    fastqc_files=["sample1_fastqc.html", "sample2_fastqc.html", ...],
    output_summary="batch_summary.csv"
)
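The layout of `batch_summary.csv` is not documented here. One plausible shape, sketched below under that assumption, is one row per sample and one column per FastQC module. `write_batch_summary` and its input format are hypothetical and do not describe what `analyze_batch` actually writes.

```python
import csv
import io


def write_batch_summary(results, out):
    """Write a sample-by-module status table as CSV.

    results: {sample name: {module name: status}}
    out: any text file-like object opened for writing
    Modules missing for a sample are written as "NA".
    """
    modules = sorted({m for statuses in results.values() for m in statuses})
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sample"] + modules)
    for sample in sorted(results):
        row = [results[sample].get(m, "NA") for m in modules]
        writer.writerow([sample] + row)


# Usage: render the table into a string buffer instead of a file.
buffer = io.StringIO()
write_batch_summary(
    {"sample1": {"Adapter Content": "PASS"}, "sample2": {"Adapter Content": "FAIL"}},
    buffer,
)
```

Sorting samples and modules keeps the output stable across runs, which makes batch summaries easier to diff.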
recommendations = interpreter.get_recommendations(
    analysis,
    application="rna_seq",  # or "dna_seq", "chip_seq"
    quality_threshold="high"
)
Application-Specific Thresholds:
# Analyze single report
python scripts/fastqc_interpreter.py --input sample_fastqc.html
# Batch analysis
python scripts/fastqc_interpreter.py --batch "*fastqc.html" --output report.pdf
# With application-specific thresholds
python scripts/fastqc_interpreter.py --input fastqc.html --application rna_seq
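The flags used in these commands could be declared with `argparse` roughly as follows. This is only a sketch that mirrors the flags shown above (`--input`, `--batch`, `--output`, `--application`). The real script's parser is not shown in this document, so the mutual exclusion of `--input` and `--batch`, the restricted choices, and the function name `build_parser` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the command-line flags shown above (hypothetical)."""
    parser = argparse.ArgumentParser(description="Interpret FastQC reports")
    # Assumption: a run is either a single report or a batch, never both.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="path to a single FastQC report")
    source.add_argument("--batch", help="glob pattern matching several reports")
    parser.add_argument("--output", help="path for the generated report")
    parser.add_argument(
        "--application",
        choices=["rna_seq", "dna_seq", "chip_seq"],
        help="select application-specific thresholds",
    )
    return parser


# Usage: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--input", "fastqc.html", "--application", "rna_seq"])
```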
- PASS (Green): Proceed with analysis
- WARNING (Yellow): Review but likely acceptable
- FAIL (Red): Requires action before downstream analysis
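When a report contains several module statuses, one simple way to reduce them to a single verdict is to let the most severe status win. The sketch below assumes that rule; the skill itself may aggregate differently. `SEVERITY`, `ACTIONS`, and `overall_status` are hypothetical names, and the action strings are the ones given above.

```python
SEVERITY = {"PASS": 0, "WARNING": 1, "FAIL": 2}

ACTIONS = {
    "PASS": "Proceed with analysis",
    "WARNING": "Review but likely acceptable",
    "FAIL": "Requires action before downstream analysis",
}


def overall_status(module_statuses):
    """Return the most severe status in a non-empty list of statuses.

    Raises ValueError on an empty list and KeyError on an unknown status.
    """
    return max(module_statuses, key=lambda status: SEVERITY[status])


# Usage: one failing module is enough to block downstream analysis.
verdict = overall_status(["PASS", "WARNING", "FAIL"])
action = ACTIONS[verdict]
```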
See references/troubleshooting.md for:
Skill ID: 205 | Version: 1.0 | License: MIT