Review backend code for quality, security, maintainability, and best practices based on established checklist rules. Use when the user requests a review, analysis, or improvement of backend files (e.g., `.py`) under the `src/backend/` directory. Do NOT use for frontend files (e.g., `.tsx`, `.ts`, `.js`). Supports pending-change review, code snippets review, and file-focused review.
Use this skill whenever the user asks to review, analyze, or improve backend code (e.g., .py) under the src/backend/ directory. Supports the following review modes:
`src/backend/base/langflow/api/v1/flows.py`).

Do NOT use this skill when:
- The request is about frontend code (e.g., `.tsx`, `.ts`, `.js`, `src/frontend/`).
- The code is outside `src/backend/` (unless the user explicitly asks to review backend-related changes outside `src/backend/`).

Follow these steps when using this skill:
Notes when using this skill:
- Use `File:Line` references when a file path and line numbers are available; otherwise, use the most specific identifier you can.

- If the review scope includes models under `src/backend/base/langflow/services/database/models/` or Alembic migrations under `src/backend/base/langflow/alembic/versions/`, follow `references/db-schema-rule.md` to perform the review.
- If the review scope contains database operations (e.g., `select(...)`, `session.execute(...)`, joins, CRUD) and is not already inside a service under `src/backend/base/langflow/services/`, follow `references/repositories-rule.md` to perform the review.
- If the review scope involves `session_scope()` usage or raw SQL usage, follow `references/sqlalchemy-rule.md` to perform the review.

Check for:
- SQL injection (e.g., `text()` queries with string interpolation). Consequence: an attacker can read, modify, or delete any data in the database.
- Missing authentication (e.g., no `CurrentActiveUser` dependency). Consequence: unauthenticated users can access protected endpoints.
- Missing authorization (e.g., no `user_id` scoping on queries). Consequence: user A can read or modify user B's flows, variables, and API keys.
- Path traversal (e.g., reading `/etc/passwd`, `.env`).

Check for:
- N+1 queries (e.g., a loop that calls `session.execute()` once per item). Consequence: 100 flows = 101 DB queries instead of 2; page load can go from 50 ms to 5 s.
- Blocking calls in async code (e.g., `time.sleep()`, synchronous I/O, CPU-bound work without `run_in_executor`). Consequence: the entire event loop stalls; all concurrent requests hang until the blocking call completes.

Check for:
- Vague names (e.g., `data`, `result`, `obj`, `temp`). Functions should use verbs (`get`, `create`, `validate`). Booleans should use prefixes (`is_`, `has_`, `can_`, `should_`).
- Poor error handling (bare `except`, swallowed exceptions, silent failures)
- Loose type hints (e.g., `Any` where a concrete type is known)
- Modern union syntax (`X | Y` not `Union[X, Y]`, `X | None` not `Optional[X]`)
- `TYPE_CHECKING` guard for imports only needed for type annotations (helps prevent circular imports)
- `Annotated[Type, Depends(...)]` with project aliases (`CurrentActiveUser`, `DbSession`, `DbSessionReadOnly`) for FastAPI DI
- Docstrings with `Args:`, `Returns:`, `Raises:` sections for public functions

Check for:
- Services such as `DatabaseService` that grow beyond this limit should have their CRUD operations extracted to dedicated modules.
- Avoid `utils.py`, `helpers.py`, `misc.py`, `common.py` as standalone files. Why: A file named `utils.py` becomes a dumping ground for unrelated functions. Within months it can hold 50+ functions covering formatting, validation, parsing, and HTTP calls, which violates the single responsibility principle (SRP). Each function group should be in a file named after its responsibility (`formatting.py`, `validation.py`).

Check for:
- `pytest.mark.asyncio` for async tests

Happy path tests are the foundation but are NOT enough. Tests MUST also challenge the code to find real defects:
- Edge-case inputs: `None`, `""`, `[]`, `{}`, `0`, `-1`, `UUID("00000000-0000-0000-0000-000000000000")`
- Error responses (e.g., an invalid `endpoint_name` is rejected with 422).

Write tests based on REQUIREMENTS/SPEC, not on what the source code currently does. This is how you catch bugs where the code diverges from expected behavior.
When a test fails: first ask if the CODE is wrong, not the test. Do NOT silently change a failing assertion to match the current code without understanding WHY.
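The spec-first approach above can be sketched as follows. This is a minimal illustration only: `validate_endpoint_name` and its naming rule are hypothetical stand-ins, not Langflow's actual implementation, and the loops are written with the standard library alone so the snippet runs without extra dependencies.

```python
import re

# Hypothetical spec for illustration only (not Langflow's actual rule):
# an endpoint name is a non-empty string of letters, digits, hyphens,
# or underscores, at most 50 characters long.
_ENDPOINT_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,50}")


def validate_endpoint_name(name: object) -> str:
    """Return the name unchanged if it meets the spec, else raise ValueError."""
    if not isinstance(name, str) or _ENDPOINT_NAME_RE.fullmatch(name) is None:
        raise ValueError(f"invalid endpoint name: {name!r}")
    return name


# Cases derived from the spec above, not from reading the implementation.
VALID_NAMES = ["my-flow", "flow_1", "a", "x" * 50]
INVALID_NAMES = [None, "", [], {}, 0, -1, "has space", "x" * 51, "../etc/passwd"]


def test_valid_names_are_accepted() -> None:
    for name in VALID_NAMES:
        assert validate_endpoint_name(name) == name


def test_invalid_names_are_rejected() -> None:
    for name in INVALID_NAMES:
        try:
            validate_endpoint_name(name)
        except ValueError:
            continue  # expected: the spec forbids this input
        raise AssertionError(f"{name!r} was accepted but the spec forbids it")
```

In a real pytest suite, each loop would usually become a `pytest.mark.parametrize` case list so that every input is reported as its own test. If one of these cases fails, the first question is whether the validator violates the spec, not how to adjust the assertion.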
Check for:
- Use `lfx.log.logger` with a-prefixed methods (`adebug`, `ainfo`, `awarning`, `aerror`, `aexception`). Never use `print()` or stdlib `logging`.
- Include context in log messages: `user_id`, `flow_id`, `session_id`
- No `print()` statements; these go to production logs
- Use `{e!s}` for string representation of exceptions in log messages

For pending-change reviews, verify the author has run:
- `make format_backend` (Ruff formatter): inconsistent formatting creates noisy diffs that hide real changes in code review. Format first, review second.
- `make lint` (MyPy type checking): type errors caught at lint time are far cheaper to fix than runtime crashes in production. Langflow services use duck typing via the `Service` base class; MyPy catches mismatches early.
- `make unit_tests` (pytest): a failing test means the change breaks existing behavior. Never merge with failing tests; investigate whether the code or the test is wrong.

When this skill is invoked, the response must exactly follow one of the two templates:
# Code Review Summary
Found <X> critical issues that need to be fixed:
## 🔴 Critical (Must Fix)
### 1. <brief description of the issue>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<detailed explanation and references of the issue>
#### Suggested Fix
1. <brief description of suggested fix>
2. <code example> (optional, omit if not applicable)
---
... (repeat for each critical issue) ...
Found <Y> suggestions for improvement:
## 🟡 Suggestions (Should Consider)
### 1. <brief description of the suggestion>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<detailed explanation and references of the suggestion>
#### Suggested Fix
1. <brief description of suggested fix>
2. <code example> (optional, omit if not applicable)
---
... (repeat for each suggestion) ...
Found <Z> optional nits:
## 🟢 Nits (Optional)
### 1. <brief description of the nit>
FilePath: <path> line <line>
<relevant code snippet or pointer>
#### Explanation
<explanation and references of the optional nit>
#### Suggested Fix
- <minor suggestions>
---
... (repeat for each nit) ...
## ✅ What's Good
- <Positive feedback on good patterns>
## Code Review Summary
✅ No issues found.