Write pytest tests that exercise real public interfaces with actual components, no mocking, and precise assertions. MIRA-specific patterns. Use when creating or reviewing tests.
Tests that verify implementation are worse than no tests - they provide false confidence while catching nothing.
Your job is not to confirm the code works. Your job is to verify the contract, try to break the code, and surface design problems.
Tests that always pass are actively harmful. They waste time and provide false security.
ABSOLUTE RULE: Do NOT use @pytest.mark.skip, @pytest.mark.skipif, or pytest.skip()
Tests either pass or fail. There is no third state. A skipped test is worse than a deleted one: it looks like coverage while verifying nothing. If a test can't run, fix whatever prevents it from running, or delete the test.
NEVER commit a skipped test. Either make it pass or delete it.
NEVER write tests by reading implementation. That's how you write tests that mirror what code does instead of what it should do.
Step 1: Read ONLY the module's public interface
```python
# Read THIS (public interface)
class ReminderTool:
    def run(self, operation: str, **kwargs) -> Dict[str, Any]:
        """Execute reminder operations."""
        pass

# DO NOT read implementation details
# DO NOT look at internal methods
# DO NOT read how it's implemented
```
Step 2: Document the contract
Before writing any test, answer these questions in writing:
```
MODULE CONTRACT ANALYSIS
========================

1. What is this module's PURPOSE?
   - What problem does it solve?
   - Why does it exist?

2. What GUARANTEES does it provide?
   - What promises does the API make?
   - What invariants must hold?
   - What post-conditions are guaranteed?

3. What should SUCCEED?
   - Valid inputs
   - Happy path scenarios
   - Boundary cases that should work

4. What should FAIL?
   - Invalid inputs
   - Boundary conditions that should error
   - Security violations
   - Resource constraints

5. What are the DEPENDENCIES?
   - What does this module depend on?
   - Are there too many dependencies?
   - Could this be simpler?

6. ARCHITECTURAL CONCERNS:
   - Is this module doing too much?
   - Is it papering over design failures elsewhere?
   - Does the contract make sense or is it convoluted?
   - Should this module even exist?
```
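A condensed, hypothetical example of a completed analysis (the tool name and details are illustrative, drawn from the search examples used later in this document):

```
MODULE CONTRACT ANALYSIS: SearchTool (hypothetical)
PURPOSE:        Retrieve stored messages relevant to a query.
GUARANTEES:     Returns at most max_results items; confidence in [0.0, 1.0].
SHOULD SUCCEED: Non-empty query with max_results >= 1.
SHOULD FAIL:    Empty query (ValueError); negative max_results (ValueError).
DEPENDENCIES:   Database repository, embedding service.
CONCERNS:       None, provided the dependency count stays small.
```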
Step 3: Design test cases from contract
Based on contract analysis (NOT implementation):
See "CANONICAL EXAMPLE" section below for complete contract analysis walkthrough.
CRITICAL: Do NOT read the implementation file yourself. Use the contract-extractor agent as an abstraction barrier.
You've formed expectations about the contract from the interface. Now verify those expectations against actual implementation WITHOUT seeing the implementation yourself. The agent reads the code and reports ONLY contract facts (not implementation details).
Step 1: Invoke the contract-extractor agent
```python
# Use Task tool to invoke the agent
Task(
    subagent_type="contract-extractor",
    description="Extract contract from module",
    prompt="""Extract the contract from: path/to/module.py

Return:
- Public interface (methods, signatures, types)
- Actual return structures (dict keys, types)
- Exception contracts (what raises what, when)
- Edge cases handled
- Dependencies and architectural concerns"""
)
```
Step 2: Compare your expectations against agent report
Create a comparison:
```
EXPECTATION vs REALITY
======================

Expected return structure:
{
    "status": str,
    "results": list
}

Actual return structure (from agent):
{
    "status": str,
    "confidence": float,  # I MISSED THIS
    "results": list,
    "result_count": int   # I MISSED THIS
}

Expected exceptions:
- ValueError for empty query

Actual exceptions (from agent):
- ValueError for empty query ✓
- ValueError for negative max_results  # I MISSED THIS

Expected edge cases:
- Empty results returns []

Actual edge cases (from agent):
- Empty results returns status="low_confidence", confidence=0.0, results=[]
  # More nuanced than I expected
```
Step 3: Identify discrepancies and their implications
For each discrepancy, ask: Is this part of the contract I must test? Did I misread the interface? Or is it an architectural concern to report?
Example Analysis:
```
DISCREPANCY: Agent reports confidence field in return, I didn't expect it
IMPLICATION: This is part of the contract - add test to verify confidence in [0.0, 1.0]

DISCREPANCY: Agent reports ValueError for negative max_results, I didn't expect it
IMPLICATION: Good edge case handling - add negative test

DISCREPANCY: Agent reports 8 dependencies, I expected 3-4
IMPLICATION: ARCHITECTURAL CONCERN - too many deps, report to human
```
Step 4: Update test plan based on verified contract
Now you know the exact return structures, the complete exception contract, and the real edge-case behavior. Update your test plan to cover every verified guarantee.
Step 5: Design comprehensive test cases
```
# Based on VERIFIED contract (not assumptions):

# Positive tests
- test_search_returns_exact_structure       # Verify all keys agent reported
- test_search_confidence_in_valid_range     # Agent said 0.0-1.0
- test_search_respects_max_results          # Agent confirmed this guarantee

# Negative tests
- test_search_rejects_empty_query           # Agent confirmed ValueError
- test_search_rejects_negative_max_results  # Agent revealed this

# Edge cases
- test_search_empty_results_structure       # Agent showed exact structure
- test_search_with_no_user_data             # Based on RLS info from agent

# Architectural concerns
- Report to human: "Module has 8 dependencies - possible SRP violation"
```
See "CANONICAL EXAMPLE" section below for complete agent invocation, comparison, and gap analysis walkthrough.
Read the implementation only AFTER writing tests based on the verified contract. At that point you can read it for context, debugging, or refactoring - the tests are already protecting the contract.
A test that always passes proves nothing. You must see it fail.
Step 1: Write test based on contract expectations
Don't look at implementation. Write assertions based on what the contract says SHOULD happen.
```python
def test_search_returns_confidence_score(search_tool, authenticated_user):
    """Contract: search must return confidence score between 0.0 and 1.0"""
    user_id = authenticated_user["user_id"]
    set_current_user_id(user_id)

    # Based on contract, not implementation
    result = search_tool.run(
        operation="search",
        query="Python async patterns",
        max_results=5
    )

    # Contract expectations
    assert "confidence" in result
    assert 0.0 <= result["confidence"] <= 1.0
    assert "results" in result
    assert len(result["results"]) <= 5
```
Step 2: Run the test - expect failure or question success
```bash
pytest tests/test_search_tool.py::test_search_returns_confidence_score -v
```
If the test FAILS: good - either the code has a bug or your contract understanding is wrong. Investigate which before changing anything.
If the test PASSES immediately: be suspicious. Proceed to Step 3 and prove the test can actually fail.
Step 3: Verify the test can actually catch bugs
Temporarily break the code and verify the test fails:
```python
# In the actual implementation, temporarily break it:
def run(self, operation, **kwargs):
    return {"confidence": 2.5}  # INTENTIONAL BUG: exceeds 1.0
```
Run test - it should fail. If it doesn't, your assertions are too weak.
Step 4: Remove the intentional bug, test should pass
Now you have confidence the test actually works.
When writing tests, surface design problems - don't paper over them.
| Anti-Pattern | Why It's Wrong | What To Do Instead |
|---|---|---|
| Mocking | Tests mocks, not code. Hides integration issues. | Use real services (`sqlite_test_db`, `test_db`). If hard to test, fix design. |
| Reading implementation first | Tests mirror HOW instead of WHAT. Confirms current behavior, doesn't catch regressions. | Analyze contract WITHOUT reading code. Use contract-extractor agent. |
| Tests that mirror implementation | Testing that method calls BM25 then embeddings (HOW) vs testing returns relevant results (WHAT). | Test observable contract behavior, not internal paths. |
| Weak assertions | `assert result is not None` says nothing. | Precise: `assert 0.0 <= result["confidence"] <= 1.0` |
| Only happy paths | Missing adversarial cases means bugs slip through. | Test failure cases: empty inputs, invalid values, boundary conditions. |
| Missing negative tests | Only testing what should succeed. | Test what should FAIL with `pytest.raises` and `match=` |
| Testing private methods | `tool._internal()` means public interface insufficient. | Report: "Public interface doesn't expose needed contract." |
| Papering over design problems | Mocking 8 dependencies instead of reporting. | Report: "Module has 8 dependencies - violates SRP." |
| Complex test setup | Need 5 fixtures for one test = tight coupling. | Report: "Module too coupled - consider interface segregation." |
| Unclear contract | Can't answer "what SHOULD this return?" | Report: "Contract doesn't specify behavior for None values." |
| Module papering over upstream failures | Tool validates/fixes data from another module. | Report: "Fix upstream module, don't compensate downstream." |
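The weak-assertion and missing-negative-test rows can be made concrete. A minimal, self-contained sketch of precise negative tests with pytest.raises and match= - the SearchTool here is a hypothetical stand-in, not the real MIRA tool:

```python
import pytest

class SearchTool:
    """Hypothetical stand-in that honors the contract described above."""
    def run(self, operation: str, query: str, max_results: int = 10) -> dict:
        if not query:
            raise ValueError("query must be non-empty")
        if max_results < 0:
            raise ValueError("max_results must be non-negative")
        return {"status": "success", "confidence": 0.0, "results": []}

def test_rejects_empty_query():
    # match= pins the failure mode, not just the exception type
    with pytest.raises(ValueError, match="query"):
        SearchTool().run("search", query="")

def test_rejects_negative_max_results():
    with pytest.raises(ValueError, match="max_results"):
        SearchTool().run("search", query="x", max_results=-1)
```

Without `match=`, a ValueError raised for the wrong reason would still pass the test.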
When you find architectural red flags, report them:
```
ARCHITECTURAL CONCERN: ToolName
PROBLEM: [Specific issue]
EVIDENCE: [What you observed]
IMPACT: [Why this matters]
RECOMMENDATION: [Specific fix]
```
MIRA's testing philosophy: NEVER MOCK. Use actual services, real databases, real APIs.
This is a hard rule because mocking is where I slip. The test will seem "hard" without mocks and I'll think "just this once..." DON'T.
Mocks test mocks, not code:
```python
# This tests nothing about real behavior:
@mock.patch('tool.database')
def test_reminder_tool(mock_db):
    mock_db.query.return_value = [{"id": 1, "title": "Test"}]
    result = tool.get_reminders()
    # You have NO IDEA if real code works
```
Real tests catch real bugs:
```python
# This will fail if database schema changes:
def test_reminder_tool(sqlite_test_db):
    tool = ReminderTool()
    tool.run("add_reminder", title="Test", date="2025-01-01")

    # Real database query
    rows = sqlite_test_db.execute("SELECT * FROM reminders WHERE title = ?", ("Test",))
    assert len(rows) == 1
```
If it's hard to test without mocks, the design is wrong. Fix the design, don't mock it away.
The authenticated_user fixture gives you EVERYTHING you need:
```python
def test_anything(authenticated_user):
    # These are ALL set up and ready:
    user_id = authenticated_user["user_id"]            # Test user ID
    continuum_id = authenticated_user["continuum_id"]  # Test user's continuum (ALREADY EXISTS)
    email = authenticated_user["email"]                # [email protected]
    token = authenticated_user["access_token"]         # Valid session token

    # User context is ALREADY SET - just use user_id
    # Continuum ALREADY EXISTS - just add messages to it
    # Cleanup happens AUTOMATICALLY - no manual teardown needed
```
What authenticated_user does for you:
- Creates the test user and a continuum for them
- Sets the user context
- Issues a valid session token
- Cleans up all test data automatically after the test
How to use it (SIMPLE):
```python
# ✅ CORRECT - Just use the continuum_id that's already there
def test_add_messages(authenticated_user, test_db):
    user_id = authenticated_user["user_id"]
    continuum_id = authenticated_user["continuum_id"]  # Use this!
    set_current_user_id(user_id)  # Set context

    repo = get_continuum_repository()
    msg = Message(role="user", content="Test message")
    repo.save_message(msg, continuum_id, user_id)  # Just save directly
    # Message is in the database, ready to test

# ❌ WRONG - Don't create new continuums (test user already has one)
def test_add_messages_wrong(authenticated_user):
    user_id = authenticated_user["user_id"]
    repo = get_continuum_repository()
    continuum = repo.create_continuum(user_id)  # DON'T DO THIS!
    # This creates a SECOND continuum - user should only have ONE
```
Common Patterns:
```python
# Most common: Just add messages to test user's continuum
def test_tool(authenticated_user):
    user_id = authenticated_user["user_id"]
    continuum_id = authenticated_user["continuum_id"]
    set_current_user_id(user_id)

    # Add test data
    repo = get_continuum_repository()
    msg = Message(role="user", content="Test")
    repo.save_message(msg, continuum_id, user_id)

    # Test your code
    result = tool.run("search", query="Test")
    assert result["status"] == "success"

# API testing: authenticated_client has headers pre-set
def test_api(authenticated_client):
    response = authenticated_client.get("/v0/api/endpoint")
    assert response.status_code == 200
```
Test User Constants (for reference):
```python
TEST_USER_EMAIL = "[email protected]"
SECOND_TEST_USER_EMAIL = "[email protected]"
# User IDs vary, always use authenticated_user["user_id"]
```
The second_authenticated_user fixture provides a SECOND fully-configured test user. When you need to verify Row-Level Security (RLS) or multi-user scenarios, use both fixtures together:
```python
def test_user_isolation(authenticated_user, second_authenticated_user):
    """Verify RLS prevents cross-user data access."""
    user1_id = authenticated_user["user_id"]
    user1_continuum_id = authenticated_user["continuum_id"]
    user2_id = second_authenticated_user["user_id"]
    user2_continuum_id = second_authenticated_user["continuum_id"]

    # User 1 creates private data
    set_current_user_id(user1_id)
    repo = get_continuum_repository()
    msg1 = Message(role="user", content="User 1 secret data")
    repo.save_message(msg1, user1_continuum_id, user1_id)

    # User 2 tries to access User 1's data
    set_current_user_id(user2_id)
    result = search_tool.run("search", query="secret", max_results=10)

    # Verify User 2 cannot see User 1's data
    assert len(result["results"]) == 0, "RLS violation: User 2 can see User 1's data"
```
Both fixtures provide identical structure:
```python
authenticated_user = {
    "user_id": str,       # First test user ID
    "continuum_id": str,  # First test user's continuum
    "email": str,         # [email protected]
    "access_token": str   # Valid session token
}

second_authenticated_user = {
    "user_id": str,       # Second test user ID (different UUID)
    "continuum_id": str,  # Second test user's continuum (different UUID)
    "email": str,         # [email protected]
    "access_token": str   # Valid session token
}
```
When to use second_authenticated_user:
- Verifying Row-Level Security (RLS) enforcement
- Any scenario involving more than one user's data
Cleanup is automatic for both users: no manual teardown is needed for either one.
SQLite for Tool Testing:
```python
@pytest.mark.schema_files(['tools/implementations/reminder_tool_schema.sql'])
def test_reminder_tool(sqlite_test_db):
    tool = ReminderTool()
    tool.run("add_reminder", title="Test", date="2025-01-01")
    rows = sqlite_test_db.execute("SELECT * FROM reminders WHERE title = ?", ("Test",))
    assert len(rows) == 1
```
Automatic Cleanup: cleanup_test_user_data() runs after each test (autouse).

Test files MUST mirror the codebase directory structure:
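For example (paths are illustrative, not the actual MIRA layout):

```
tools/implementations/reminder_tool.py
  → tests/tools/implementations/test_reminder_tool.py
memory/continuum_repository.py
  → tests/memory/test_continuum_repository.py
```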