Replaces Phoenix observability with Langfuse Cloud (EU) traceability for pharmaceutical test generation. Adds @observe decorators to existing code, configures LlamaIndex callbacks, propagates GAMP-5 compliance attributes, and removes Phoenix dependencies. Use PROACTIVELY when implementing Task 2.3 (LangFuse setup), migrating observability systems, or ensuring ALCOA+ trace attribution. MUST BE USED for pharmaceutical compliance monitoring requiring persistent cloud storage.
Purpose: Replace Phoenix observability with Langfuse Cloud (EU) for pharmaceutical-grade traceability and monitoring.
Target Architecture:
✅ Use when:
❌ Do NOT use when:
langfuse-extraction skill)
langfuse-dashboard skill)
Before invoking this skill, verify:
Langfuse Cloud (EU) Account:
https://cloud.langfuse.com/project/cmhuwhcfe006yad06cqfub107
Environment Variables:
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
export LANGFUSE_SECRET_KEY="sk-lf-..."
export LANGFUSE_HOST="https://cloud.langfuse.com"
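Before continuing, it can help to confirm that all three variables are visible to the Python process. The following is a minimal sketch: the helper name `missing_langfuse_env` is hypothetical, and the prefix checks are an assumption inferred from the `pk-lf-...` and `sk-lf-...` placeholders above rather than anything the Langfuse SDK enforces.

```python
import os
from typing import Mapping, Optional

# Variable names come from the export lines above. The expected prefixes are
# an assumption based on the placeholder values shown there.
REQUIRED_VARS = {
    "LANGFUSE_PUBLIC_KEY": "pk-lf-",
    "LANGFUSE_SECRET_KEY": "sk-lf-",
    "LANGFUSE_HOST": "https://",
}


def missing_langfuse_env(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return names of variables that are unset or lack the expected prefix."""
    env = os.environ if env is None else env
    problems = []
    for name, prefix in REQUIRED_VARS.items():
        # An unset variable is treated as "" and therefore fails the prefix check.
        if not env.get(name, "").startswith(prefix):
            problems.append(name)
    return problems
```

Calling `missing_langfuse_env()` with no arguments inspects the current environment; an empty list means all three variables look plausible.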
Dependencies:
langfuse Python package (will be installed if missing)
llama-index-core>=0.12.0 (for callback handler)
Objective: Understand current Phoenix instrumentation and identify migration points.
Steps:
Locate Phoenix Configuration:
# Search for Phoenix setup
grep -r "phoenix" main/src/monitoring/ --include="*.py"
grep -r "from phoenix" main/src/ --include="*.py"
grep -r "import phoenix" main/src/ --include="*.py"
Identify Instrumentation Points:
main/src/core/unified_workflow.py - identify workflow entry points
main/src/agents/ - identify agent methods needing tracing
Analyze Compliance Attributes:
Generate Assessment Report:
# Phoenix → Langfuse Migration Assessment
## Current Phoenix Instrumentation
- Configuration file: <path>
- Instrumented files: <count>
- Span count per workflow: <number>
- Compliance attributes: <present/missing>
## Migration Scope
- Files requiring decorator addition: <list>
- Phoenix imports to remove: <count>
- Callback handlers to replace: <list>
- Estimated migration time: <minutes>
## Risk Assessment
- Breaking changes: <yes/no>
- Test coverage: <percentage>
- Rollback complexity: <low/medium/high>
Quality Gate: Assessment report generated with complete file inventory and attribute analysis.
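The file inventory for the assessment report can also be gathered in Python, which avoids false positives from comments or strings that merely mention Phoenix. This is a rough sketch using the standard `ast` module; the function names `phoenix_import_lines` and `inventory` are hypothetical, and matching any module whose name contains "phoenix" (so that the project's own `phoenix_config` is included) is an assumption about what should count.

```python
import ast
from pathlib import Path


def phoenix_import_lines(source: str) -> list[int]:
    """Return line numbers of import statements whose module mentions phoenix."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            names = [node.module or ""]
        else:
            continue
        # Substring match is deliberate: it also catches ".phoenix_config".
        if any("phoenix" in name.lower() for name in names):
            lines.append(node.lineno)
    return sorted(lines)


def inventory(root: str) -> dict[str, list[int]]:
    """Map each .py file under root to the lines where it imports Phoenix."""
    found = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = phoenix_import_lines(path.read_text(encoding="utf-8"))
        if hits:
            found[str(path)] = hits
    return found
```

The number of keys in `inventory("main/src")` would fill the "Instrumented files" field, and the total number of line hits the "Phoenix imports to remove" field. Files that do not parse raise `SyntaxError`, which is left unhandled here.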
Objective: Create Langfuse configuration module and verify cloud connectivity.
Steps:
Install Langfuse SDK:
# Add to pyproject.toml
uv add langfuse
# For LlamaIndex integration
uv add llama-index-instrumentation-langfuse
Create Langfuse Configuration Module:
main/src/monitoring/langfuse_config.py
reference/decorator-patterns.md for template
setup_langfuse(): Initialize client with EU cloud config
get_langfuse_client(): Singleton accessor
get_langfuse_callback_handler(): LlamaIndex integration
add_compliance_attributes(): GAMP-5/ALCOA+ attribute helper
Verify Cloud Connectivity:
# Test script (temporary)
from main.src.monitoring.langfuse_config import setup_langfuse

client = setup_langfuse()
# Open and immediately close a span so that one trace is sent to Langfuse Cloud
with client.start_as_current_span(name="connectivity-test", input={"test": True}):
    pass
client.flush()
# Verify trace appears at:
# https://cloud.langfuse.com/project/cmhuwhcfe006yad06cqfub107/traces
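Of the functions listed for `langfuse_config.py`, `add_compliance_attributes()` is the one that can be sketched without touching the SDK. The version below is only one possible shape: the signature is hypothetical, the keys are the ones used elsewhere in this guide, and tying `attributable` to the presence of a user ID is an assumption about the intended ALCOA+ semantics.

```python
from typing import Any, Optional


def add_compliance_attributes(
    gamp5_category: Optional[int] = None,
    user_id: Optional[str] = None,
    job_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build a flat metadata dict using the attribute keys named in this guide.

    Hypothetical signature; the real helper may accept different arguments.
    """
    if gamp5_category is not None and gamp5_category not in range(1, 6):
        raise ValueError(f"GAMP-5 category must be 1-5, got {gamp5_category!r}")

    metadata: dict[str, Any] = {
        "compliance.gamp5.applicable": True,
        # Assumption: a trace is only attributable when a user is known.
        "compliance.alcoa_plus.attributable": user_id is not None,
    }
    if gamp5_category is not None:
        metadata["compliance.gamp5.category"] = gamp5_category
    if user_id is not None:
        metadata["user.clerk_id"] = user_id
    if job_id is not None:
        metadata["job.id"] = job_id
    return metadata
```

Returning a plain dict keeps the helper independent of the tracing backend, so the same metadata can be passed to whichever update call the instrumented code uses.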
Update Environment Configuration:
.env.example
main/src/config.py to load Langfuse settings
ObservabilityConfig dataclass
Quality Gate:
langfuse_config.py created and tested
Objective: Add @observe decorators and replace Phoenix callbacks with Langfuse.
Steps:
Add Decorators to Workflow Entry Points:
Use the automated script for systematic instrumentation:
python .claude/skills/langfuse-integration/scripts/add_instrumentation.py \
--target main/src/core/unified_workflow.py \
--dry-run # Preview changes first
Manual pattern (if script unavailable):
# main/src/core/unified_workflow.py
from langfuse import observe
from llama_index.core.workflow import Context, StartEvent, StopEvent, Workflow


class UnifiedWorkflow(Workflow):
    @observe(name="unified-workflow-run", as_type="span")
    async def run(self, ctx: Context, ev: StartEvent) -> StopEvent:
        # Existing code unchanged
        ...
Instrument Agent Methods:
Target key agent operations:
# main/src/agents/categorizer.py
from langfuse import get_client, observe


@observe(name="gamp5-categorization", as_type="span")
async def categorize_urs(self, urs_content: str) -> dict:
    langfuse = get_client()
    # Add compliance attributes to the span opened by @observe
    langfuse.update_current_span(metadata={
        "compliance.gamp5.applicable": True,
        "compliance.alcoa_plus.attributable": True,
    })
    # Existing categorization logic
    result = await self._categorize(urs_content)
    # Tag with category
    langfuse.update_current_span(metadata={
        "compliance.gamp5.category": result["category"],
    })
    return result
Replace LlamaIndex Callback Handler:
# main/src/core/unified_workflow.py or main/main.py
import os

# OLD (Phoenix):
# from phoenix.otel import register
# tracer_provider = register()
# NEW (Langfuse):
# Confirm that langfuse.llama_index exists in the installed SDK version before relying on it
from langfuse.llama_index import LlamaIndexCallbackHandler

langfuse_handler = LlamaIndexCallbackHandler(
    public_key=os.getenv("LANGFUSE_PUBLIC_KEY"),
    secret_key=os.getenv("LANGFUSE_SECRET_KEY"),
    host=os.getenv("LANGFUSE_HOST"),
)
# Register with workflow
workflow = UnifiedWorkflow(
    callbacks=[langfuse_handler],
    timeout=600,
)
Propagate User/Session Attributes:
Quality Gate:
@observe decorators added to all workflow entry points
Objective: Remove all Phoenix dependencies without breaking functionality.
Steps:
Remove Phoenix Configuration File:
# Backup first (optional)
cp main/src/monitoring/phoenix_config.py main/src/monitoring/phoenix_config.py.bak
# Remove
rm main/src/monitoring/phoenix_config.py
Update Imports:
Use automated script:
python .claude/skills/langfuse-integration/scripts/remove_phoenix.py \
--target main/src/ \
--dry-run # Preview changes
Manual pattern:
# Remove all instances of:
# - from phoenix.otel import register
# - from phoenix import ...
# - import phoenix
# - Any calls to phoenix.trace(), register(), etc.
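If the removal script is unavailable, the import part of the manual pattern can be previewed in the same dry-run spirit. The sketch below is an illustration, not the contents of `remove_phoenix.py`: the names `strip_phoenix_imports` and `preview` are hypothetical, it assumes one statement per line, and it only handles top-level absolute imports. Calls such as `register()`, mixed imports like `import phoenix, os`, and the relative `.phoenix_config` import still need manual edits.

```python
import ast
import difflib


def strip_phoenix_imports(source: str) -> str:
    """Return source with top-level absolute Phoenix imports removed."""
    drop = set()
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]
        else:
            continue
        # Only drop the statement when every imported module is Phoenix.
        if all(name.split(".")[0] == "phoenix" for name in names):
            # end_lineno covers multi-line, parenthesised imports.
            drop.update(range(node.lineno, node.end_lineno + 1))
    lines = source.splitlines(keepends=True)
    return "".join(line for i, line in enumerate(lines, 1) if i not in drop)


def preview(source: str, filename: str = "<source>") -> str:
    """Unified diff of what would be removed; empty when nothing changes."""
    after = strip_phoenix_imports(source)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=filename,
        tofile=f"{filename} (stripped)",
    )
    return "".join(diff)
```

Printing `preview(path.read_text(), str(path))` for each file shows the pending deletions without writing anything to disk.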
Remove Phoenix from Dependencies:
# Remove from pyproject.toml
uv remove arize-phoenix arize-phoenix-otel
Update Monitoring Module Init:
# main/src/monitoring/__init__.py
# OLD:
# from .phoenix_config import setup_phoenix, PhoenixManager
# NEW:
from .langfuse_config import setup_langfuse, get_langfuse_client
__all__ = ["setup_langfuse", "get_langfuse_client"]
Remove Phoenix Server Command (if applicable):
# Check if phoenix serve is in any scripts
grep -r "phoenix serve" . --include="*.sh" --include="*.py" --include="*.md"
# Remove or comment out
Quality Gate:
phoenix_config.py removed
Objective: Verify Langfuse integration works correctly and traces appear in dashboard.
Steps:
Run Integration Health Check:
python .claude/skills/langfuse-integration/scripts/validate_integration.py
Expected output:
✅ Langfuse SDK installed
✅ API keys configured
✅ Cloud connectivity successful
✅ Test trace created: trace_id=xxx
✅ @observe decorators found: 15
✅ Callback handler configured
❌ No Phoenix imports found (expected)
Run End-to-End Workflow:
# Execute test workflow with real URS
uv run python main/main.py --urs examples/test_urs_001.md
Verify Trace in Dashboard:
https://cloud.langfuse.com/project/cmhuwhcfe006yad06cqfub107/traces
Compare Span Structure:
# If Phoenix baseline available, compare span counts
echo "Phoenix baseline: 131 spans/workflow"
echo "Langfuse actual: <count from dashboard>"
# Acceptable range: 120-140 (some variation expected)
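The comparison itself is simple arithmetic and can be made explicit so that it is applied the same way on every run. A small sketch, using the 131-span baseline and the 120-140 band quoted above; the function names are hypothetical and the span count still has to be read from the dashboard by hand.

```python
def span_count_within_range(actual: int, low: int = 120, high: int = 140) -> bool:
    """True when the Langfuse span count falls inside the accepted band."""
    return low <= actual <= high


def describe_parity(actual: int, baseline: int = 131) -> str:
    """One-line summary comparing a Langfuse span count to the Phoenix baseline."""
    delta = actual - baseline
    status = "OK" if span_count_within_range(actual) else "OUT OF RANGE"
    return f"Langfuse {actual} vs Phoenix {baseline} ({delta:+d}): {status}"
```

For example, `describe_parity(128)` reports a difference of -3 and a status of OK.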
Test Compliance Attributes:
compliance.gamp5.category: 1-5
compliance.alcoa_plus.attributable: true
user.clerk_id: <actual user ID>
job.id: <actual job ID>
Run Existing Tests:
Quality Gate:
Objective: Document the migration and update project references.
Steps:
Update Quick Start Guide:
main/docs/guides/QUICK_START_GUIDE.md
Update README:
Create Migration Notes:
# Phoenix → Langfuse Migration Summary
**Date**: <YYYY-MM-DD>
**Scope**: Complete Phoenix replacement
## Changes Made
- Removed: phoenix_config.py, Phoenix dependencies
- Added: langfuse_config.py, Langfuse SDK
- Instrumented: 15 functions with @observe decorators
- Replaced: LlamaIndex callback handler
## Verification
- Trace count: 131 spans/workflow (matches Phoenix baseline)
- Dashboard URL: https://cloud.langfuse.com/project/cmhuwhcfe006yad06cqfub107
- Compliance: GAMP-5 + ALCOA+ attributes preserved
## Rollback (if needed)
- Restore phoenix_config.py.bak
- Run: uv add arize-phoenix arize-phoenix-otel
- Remove @observe decorators
Update CLAUDE.md:
Commit Changes:
git add -A
git status # Review changes
# Commit with detailed message
git commit -m "$(cat <<'EOF'
feat: Replace Phoenix with Langfuse Cloud (EU) observability
- Add Langfuse SDK and LlamaIndex instrumentation
- Add @observe decorators to 15 workflow/agent functions
- Configure Langfuse Cloud (EU) with GAMP-5 compliance attributes
- Remove Phoenix dependencies and configuration
- Verify trace parity: 131 spans/workflow maintained
- Update documentation (Quick Start, README, CLAUDE.md)
Task: PRP 2.3 (LangFuse Integration and Dashboard)
Validation: All tests passing, traces visible in dashboard
🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
EOF
)"
Quality Gate:
Before marking this skill complete, verify ALL criteria:
langfuse_config.py created with setup functions
@observe decorators added to all critical paths
Symptom:
ModuleNotFoundError: No module named 'langfuse'
Solution:
uv add langfuse llama-index-instrumentation-langfuse
uv sync
Symptom: Workflow runs successfully but no traces in Langfuse Cloud.
Diagnosis:
Check API keys:
import os

# Fall back to "" so that an unset key prints as empty instead of raising TypeError
print(f"Public key: {os.getenv('LANGFUSE_PUBLIC_KEY', '')[:10]}...")
print(f"Secret key configured: {bool(os.getenv('LANGFUSE_SECRET_KEY'))}")
Check flush call:
from langfuse import get_client
client = get_client()
client.flush() # CRITICAL: Must flush before exit
Check network connectivity:
curl -I https://cloud.langfuse.com
Solution:
client.flush() before process exit
Symptom: Traces appear but lack GAMP-5 metadata.
Solution:
# Update the span from inside the decorated function
from langfuse import get_client, observe


@observe()
def my_function():
    # CRITICAL: call this inside the @observe-decorated function so a span is active
    get_client().update_current_span(metadata={"compliance.gamp5.category": 5})
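To catch missing attributes before anyone has to spot them in the dashboard, the metadata of an exported trace can be checked against the expected keys. The following is a sketch under stated assumptions: `compliance_problems` is a hypothetical name, the metadata is assumed to be available as a plain dict, and the rules mirror the attribute list given in the validation phase (category 1-5, attributable true, non-empty user and job IDs).

```python
from typing import Any, Mapping


def compliance_problems(metadata: Mapping[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []

    category = metadata.get("compliance.gamp5.category")
    # bool is excluded explicitly because True == 1 in Python.
    if isinstance(category, bool) or category not in (1, 2, 3, 4, 5):
        problems.append(f"compliance.gamp5.category should be 1-5, got {category!r}")

    if metadata.get("compliance.alcoa_plus.attributable") is not True:
        problems.append("compliance.alcoa_plus.attributable should be true")

    for key in ("user.clerk_id", "job.id"):
        if not metadata.get(key):
            problems.append(f"{key} is missing or empty")

    return problems
```

Running this over a sample of traces after each workflow change gives an early signal that a decorated function stopped setting its attributes.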
Symptom: Langfuse shows fewer spans than Phoenix baseline.
Diagnosis:
@observe decorators are applied
Solution:
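# List async functions together with the line above each one, then review for a missing @observe.
# Filtering the def lines with grep -v "@observe" cannot work, because the decorator
# sits on the preceding line.
grep -rn -B1 "async def" main/src/agents/ --include="*.py"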
Symptom: Workflows slower with Langfuse vs Phoenix.
Diagnosis:
Solution:
# Tune batch settings
from langfuse import Langfuse

client = Langfuse(
    flush_interval=5,  # Flush every 5 seconds instead of 1
    flush_at=50,       # Batch 50 events before flushing
)
See reference/decorator-patterns.md for:
See reference/phoenix-migration-guide.md for:
See reference/compliance-attributes.md for:
For more control than decorators provide:
from langfuse import get_client

langfuse = get_client()


def complex_workflow():
    # validate_data() and process_data() stand in for existing project functions
    with langfuse.start_as_current_span(name="complex-workflow") as span:
        span.update(input={"mode": "batch"})
        # Manual sub-span creation
        with langfuse.start_as_current_span(name="data-validation") as sub_span:
            validate_data()
            sub_span.update(output={"valid": True})
        # Main logic
        result = process_data()
        span.update(output=result)
For discrete events (not spans):
from datetime import datetime

from langfuse import get_client

# Record a point-in-time event under the currently active span
get_client().create_event(
    name="gamp5-category-assigned",
    metadata={
        "category": 5,
        "confidence": 0.95,
        "timestamp": datetime.now().isoformat(),
    },
)
For pharmaceutical companies with multiple users:
from langfuse import get_client, observe


@observe()
async def multi_tenant_workflow(org_id: str, user_id: str):
    # get_org_name() stands in for the project's own organization lookup
    get_client().update_current_trace(
        user_id=user_id,
        tags=[f"org:{org_id}", "gamp5"],
        metadata={
            "organization.id": org_id,
            "organization.name": get_org_name(org_id),
            "compliance.data_residency": "EU",
        },
    )
    # Workflow logic
    ...
Before reporting success to the user, verify:
IMPORTANT: NEVER claim success without user verification. Always ask: "Can you confirm you see traces appearing in the Langfuse dashboard at https://cloud.langfuse.com/project/cmhuwhcfe006yad06cqfub107/traces?"
After successful migration:
Use langfuse-extraction skill to:
Use langfuse-dashboard skill to:
Proceed with PRP tasks:
Skill Version: 1.0.0
Last Updated: 2025-01-17
Compatibility: LlamaIndex 0.12.0+, Langfuse SDK 3.0+
Data Residency: EU (cloud.langfuse.com)
Compliance: GAMP-5, ALCOA+, 21 CFR Part 11 ready
# In API endpoint or workflow entry point
from langfuse import get_client, observe


@observe()
async def generate_test_suite(user_id: str, urs_file: str, job_id: str):
    # Set trace-level attributes
    get_client().update_current_trace(
        user_id=user_id,
        session_id=job_id,
        tags=["pharmaceutical", "gamp5"],
        metadata={
            "compliance.alcoa_plus.attributable": True,
            "user.clerk_id": user_id,
            "job.id": job_id,
        },
    )
    # Nested operations run inside the same trace and share these attributes
    result = await unified_workflow.run(urs_file)
    return result
Verify Decorator Coverage:
# Check all instrumentation points have decorators
grep -r "@observe" main/src/ --include="*.py" | wc -l
# Compare to Phoenix span count (should match or exceed)
# Ensure no regressions
pytest main/tests/ -v
# Check for import errors
mypy main/src/
# Check for Phoenix references
ruff check main/src/