Use this skill when tasked with ensuring quality, scalability, or reliability of an AI system. Covers testing strategies (unit, integration, model quality), scaling patterns, monitoring and observability setup, model drift detection, incident response, and production QA checklists.
This skill provides a comprehensive framework for making AI systems production-ready — covering testing, scaling, monitoring, drift detection, incident response, and quality assurance.
Load only the reference file you need:

- references/instructions.md — audit order, implementation steps, and escalation rules
- references/debug.md — latency, drift, and outage recovery guidance
- references/tests.md — production-readiness validation scenarios

Objective: Understand the current reliability posture.
Assessment Checklist:
## Reliability Audit
### Testing
- [ ] Unit tests exist for core logic
- [ ] Integration tests exist for API
- [ ] Model quality tests exist (accuracy regression)
- [ ] Load tests have been run
- Total test coverage: ____%
### Monitoring
- [ ] Health check endpoint exists
- [ ] Latency tracked (p50, p95, p99)
- [ ] Error rate tracked
- [ ] GPU/CPU utilization tracked
- [ ] Model confidence distribution tracked
- [ ] Alerting configured
### Scaling
- [ ] Horizontal scaling configured
- [ ] Auto-scaling policies defined
- [ ] Resource limits set
- [ ] Load tested at expected peak
### Incident Response
- [ ] Runbook exists
- [ ] Rollback procedure documented
- [ ] On-call schedule defined
- [ ] Post-mortem process established
Output: reliability_audit.md
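The monitoring items in the checklist (latency percentiles, error rate) can be sketched in-process as follows. This is a minimal illustration only — in production these metrics would be exported to a system like Prometheus rather than tracked by hand — and the class and method names are assumptions, not part of any library.

```python
from statistics import quantiles

class LatencyTracker:
    """Minimal in-process latency/error tracker (illustrative sketch;
    real deployments would export to Prometheus/StatsD instead)."""

    def __init__(self):
        self.samples = []
        self.errors = 0
        self.total = 0

    def record(self, seconds, error=False):
        self.total += 1
        self.samples.append(seconds)
        if error:
            self.errors += 1

    def percentile(self, p):
        # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile
        return quantiles(self.samples, n=100)[p - 1]

    def error_rate(self):
        return self.errors / self.total if self.total else 0.0
```

With this in place, an alert rule is just a comparison, e.g. fire when `tracker.percentile(99) > 2.0` or `tracker.error_rate() > 0.01`.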
Testing Hierarchy for AI Systems:
| Test Type | What It Validates | Example |
|---|---|---|
| Unit Tests | Individual functions | Data preprocessing logic |
| Integration Tests | API + model together | POST /predict returns valid response |
| Model Quality Tests | Model accuracy hasn't regressed | F1 > 0.85 on test set |
| Contract Tests | API schema compliance | Response matches OpenAPI spec |
| Load Tests | Performance under stress | p99 < 2s at 100 RPS |
| Shadow Tests | New model vs old model comparison | Run both, compare outputs |
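The shadow-test row above can be sketched as a simple disagreement check: run both models on the same traffic sample and fail if they diverge too often. The function name and the 5% threshold below are illustrative assumptions, not a fixed standard.

```python
def shadow_test(old_model, new_model, inputs, max_disagreement=0.05):
    """Compare old and new model outputs on identical inputs.

    Returns (passed, disagreement_rate). The 5% threshold is an
    illustrative default; tune it to your product's tolerance.
    """
    disagreements = sum(1 for x in inputs if old_model(x) != new_model(x))
    rate = disagreements / len(inputs)
    return rate <= max_disagreement, rate
```

In a real deployment the "inputs" would be mirrored production traffic, with the new model's outputs logged but never served.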
Example Test Suite:
```python
# tests/test_unit.py
def test_preprocess_handles_empty_string():
    result = preprocess("")
    assert result == ""

def test_preprocess_removes_html():
    result = preprocess("<b>hello</b>")
    assert result == "hello"
```

```python
# tests/test_integration.py
def test_predict_valid_input(client):
    response = client.post("/predict", json={"text": "great product"})
    assert response.status_code == 200
    assert "prediction" in response.json()

def test_predict_returns_confidence(client):
    response = client.post("/predict", json={"text": "test"})
    data = response.json()
    assert 0 <= data["confidence"] <= 1
```

```python
# tests/test_model_quality.py
def test_model_accuracy_above_threshold():
    predictions = [model.predict(x) for x in test_data]
    accuracy = sum(p == label for p, label in zip(predictions, labels)) / len(labels)
    assert accuracy >= 0.85, f"Model accuracy {accuracy} below threshold 0.85"

def test_no_accuracy_regression():
    current_f1 = evaluate_model(model, test_set)
    baseline_f1 = load_baseline_metrics()["f1"]
    assert current_f1 >= baseline_f1 - 0.02, \
        f"F1 regressed: {current_f1} < {baseline_f1} - 0.02"
```
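The load-test row from the hierarchy table ("p99 < 2s at 100 RPS") can also be expressed as a test. The sketch below drives a callable through a thread pool and checks the nearest-rank p99; it is an assumption-laden stand-in — a real load test would use a dedicated tool (locust, k6) against the live API, and `call` here is whatever invokes your endpoint.

```python
import concurrent.futures
import time

def load_test(call, n_requests=100, concurrency=10, p99_budget=2.0):
    """Fire n_requests concurrently and check p99 latency against a budget.

    Illustrative only: measures in-process call latency, not true
    network-level RPS. Returns (passed, p99_seconds).
    """
    def timed():
        start = time.perf_counter()
        call()
        return time.perf_counter() - start

    latencies = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(timed) for _ in range(n_requests)]
        for fut in concurrent.futures.as_completed(futures):
            latencies.append(fut.result())

    latencies.sort()
    # Nearest-rank approximation of the 99th percentile
    p99 = latencies[max(int(len(latencies) * 0.99) - 1, 0)]
    return p99 <= p99_budget, p99
```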
Scaling Patterns:
| Pattern | When to Use | Implementation |
|---|---|---|
| Horizontal Scaling | Stateless APIs | Multiple replicas behind load balancer |
| Dynamic Batching | GPU inference | Collect N requests, batch inference |
| Response Caching | Repeated queries | Redis/in-memory LRU cache |
| Async Processing | Long-running inference | Queue (Redis/RabbitMQ) + workers |
| Auto-scaling | Variable traffic | K8s HPA, cloud auto-scale groups |
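The response-caching row from the table can be sketched with the standard-library in-memory variant. The `model_predict` function below is a hypothetical stand-in for the real inference call; a production setup would more likely use Redis with a TTL so the cache survives restarts and is shared across replicas.

```python
from functools import lru_cache

def model_predict(text: str) -> str:
    # Stand-in for the real (expensive) inference call.
    return "positive" if "great" in text else "negative"

@lru_cache(maxsize=10_000)
def cached_predict(text: str) -> str:
    # Identical inputs hit the in-memory LRU cache instead of the model.
    return model_predict(text)
```

Note that exact-match caching only pays off when queries repeat verbatim; `cached_predict.cache_info()` reports the hit rate so you can verify it is actually helping.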
Kubernetes HPA Example:
apiVersion: autoscaling/v2