Name: Google Adk
Author: jota-batuta


from google.adk.tools import ToolContext

def query_stock(
    nombre_producto: str,
    tool_context: ToolContext,  # ADK injects this — do NOT include in docstring
) -> dict:
    """Look up the current stock of a product in the store.

    Args:
        nombre_producto: Full or partial product name.

    Returns:
        Dict with the matching products and their stock levels.
    """
    store_code = tool_context.state.get("store_code")
    results = []  # placeholder: fill from the SQL Server query for store_code
    # ... query SQL Server ...
    return {"status": "success", "productos": results}
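
Two of the rules for tool functions can be checked before an agent ever runs: injected parameters stay out of the docstring, and variadic parameters never reach the model. A minimal sketch using only the standard library, assuming hypothetical helpers `llm_visible_params` and `docstring_mentions_injected` (neither is part of ADK, and ADK's real schema generation may differ):

```python
import inspect


def llm_visible_params(func, injected=("tool_context",)):
    """Return the parameter names a schema generator could describe to the LLM.

    Hypothetical helper, not part of ADK. It skips framework-injected names
    and variadic parameters, which have no per-argument name or type.
    """
    visible = []
    for name, param in inspect.signature(func).parameters.items():
        if name in injected:
            continue  # supplied by the framework, never by the model
        if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                          inspect.Parameter.VAR_KEYWORD):
            continue  # *args / **kwargs cannot be listed as named fields
        visible.append(name)
    return visible


def docstring_mentions_injected(func, injected=("tool_context",)):
    """Return True if the docstring names an injected parameter (a red flag)."""
    doc = inspect.getdoc(func) or ""
    return any(name in doc for name in injected)


# Illustrative tool with the same shape as query_stock, plus variadics.
def query_stock(nombre_producto: str, tool_context=None, *args, **kwargs) -> dict:
    """Look up stock for a product.

    Args:
        nombre_producto: Full or partial product name.
    """
    return {"status": "success", "productos": []}
```

Run against this function, `llm_visible_params` reports only `nombre_producto`, and `docstring_mentions_injected` returns `False`. A check like this could sit in a unit test that loops over every registered tool.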

from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner

session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name="batovf", session_service=session_service)

# Create session with store context
session = await session_service.create_session(
    app_name="batovf",
    user_id="system",
    session_id=group_jid,  # "[email protected]"
    state={"store_code": "PQ", "store_name": "Parque", "almacen": "PQ"},
)

from google.genai import types

# Startup: create the runner once
runner = Runner(agent=agent, app_name="batovf", session_service=session_service)

# For each incoming message — reuse runner
async for event in runner.run_async(
    user_id="system",
    session_id=group_jid,
    new_message=types.Content(role="user", parts=[types.Part(text=text)]),
):
    if event.is_final_response() and event.content:
        response = event.content.parts[0].text
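
Reading `event.content.parts[0].text` assumes the first part of the final response carries the text. A more defensive sketch, assuming a part can have `text` unset (for example a non-text part). `final_text` and the stand-in event objects are hypothetical and only mimic the attributes used in the loop above:

```python
# Standard-library types module, unrelated to google.genai.types.
from types import SimpleNamespace


def final_text(event):
    """Join every text part of a final-response event; return None otherwise."""
    if not event.is_final_response() or not event.content:
        return None
    parts = event.content.parts or []
    texts = [part.text for part in parts if getattr(part, "text", None)]
    return "".join(texts) if texts else None


# Stand-in objects for illustration only; real events come from runner.run_async.
fake_final = SimpleNamespace(
    is_final_response=lambda: True,
    content=SimpleNamespace(parts=[
        SimpleNamespace(text=None),          # a part without text
        SimpleNamespace(text="Quedan 12 "),
        SimpleNamespace(text="empanadas."),
    ]),
)
fake_partial = SimpleNamespace(is_final_response=lambda: False, content=None)
```

Inside the `async for` loop, `response = final_text(event)` would replace the direct index, and a `None` result means "keep waiting".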

from fastapi import FastAPI

app = FastAPI()

# Custom endpoint BEFORE any ADK mount
@app.post("/webhook/whatsapp")
async def webhook(payload: dict):
    ...

from google.genai import types

# Text only — what we did before multimodal
content = types.Content(
    role="user",
    parts=[types.Part(text="cuantas empanadas hay")],
)

# Image with caption
content = types.Content(
    role="user",
    parts=[
        types.Part(text="esto es lo que veo en el estante"),  # caption first
        types.Part(
            inline_data=types.Blob(
                mime_type="image/jpeg",
                data=image_bytes,  # raw bytes, NOT base64
            )
        ),
    ],
)

# Audio voice note (no caption)
content = types.Content(
    role="user",
    parts=[
        types.Part(
            inline_data=types.Blob(
                mime_type="audio/ogg",
                data=audio_bytes,
            )
        ),
    ],
)
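
Because `Blob.data` expects raw bytes, media that arrives base64-encoded has to be decoded before the `Part` is built. A small sketch, assuming the webhook delivers media either as bytes or as a base64 string. `to_raw_bytes` is a hypothetical helper name:

```python
import base64
import binascii


def to_raw_bytes(payload):
    """Return raw bytes suitable for Blob.data.

    bytes/bytearray pass through unchanged; str is treated as base64 text.
    """
    if isinstance(payload, (bytes, bytearray)):
        return bytes(payload)
    if isinstance(payload, str):
        try:
            # validate=True rejects characters outside the base64 alphabet
            return base64.b64decode(payload, validate=True)
        except binascii.Error as exc:
            raise ValueError("media payload is not valid base64") from exc
    raise TypeError(f"unsupported media payload type: {type(payload).__name__}")
```

The result can then be passed as `data=to_raw_bytes(incoming_media)`. Failing loudly here is a deliberate choice, since a bad payload would otherwise only show up as a poor model response.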

# pyproject.toml
# google-cloud-aiplatform >= 1.71.0

import vertexai
from vertexai.preview.vision_models import MultiModalEmbeddingModel

vertexai.init(project=GCP_PROJECT_ID, location="us-central1")
model = MultiModalEmbeddingModel.from_pretrained("gemini-embedding-2-preview")

# Text only
result = model.get_embeddings(contextual_text="empanada de carne en estante")
text_vector = result.text_embedding  # list[float], up to 3072 dim

# Image (raw bytes)
from vertexai.preview.vision_models import Image
image = Image(image_bytes=image_bytes)
result = model.get_embeddings(image=image)
img_vector = result.image_embedding
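
The two vectors are only useful once they are compared. A minimal sketch of cosine similarity in plain Python, assuming `text_vector` and `img_vector` are same-length lists of floats produced by the same model. `cosine_similarity` is a hypothetical helper, and the source does not say which metric the author uses:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

A call such as `cosine_similarity(text_vector, img_vector)` gives a score between -1 and 1. Any threshold for "the photo matches the description" would have to be tuned on real data.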

from datetime import datetime

# Proactive: ephemeral session, one per scheduled check
session_id = f"check-{store_code}-{datetime.now().isoformat()}"

# Reactive — persistent session per group
session_id = group_jid  # "[email protected]"
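
The two naming conventions can live in one place so callers never build IDs by hand. A sketch with hypothetical helper names. Using UTC timestamps is an assumption made here; the snippet above uses local `datetime.now()`:

```python
from datetime import datetime, timezone


def proactive_session_id(store_code, now=None):
    """Ephemeral ID: unique per scheduled check, never reused."""
    now = now or datetime.now(timezone.utc)
    return f"check-{store_code}-{now.isoformat()}"


def reactive_session_id(group_jid):
    """Persistent ID: one session per group, so history accumulates."""
    return group_jid
```

Accepting `now` as a parameter keeps the function deterministic in tests while defaulting to the current time in production.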

Gotchas (Verified)

web=True in get_fast_api_app breaks custom endpoints (Issue #51).
InMemorySessionService loses all session data on restart. Acceptable for an MVP only.
DatabaseSessionService with asyncpg has a timezone bug (Issue #4366). Use InMemorySessionService for the MVP.
Modifying session.state directly bypasses event tracking. Always write state through ToolContext.
Mentioning ToolContext in a tool's docstring makes the LLM try to pass it as an argument.
*args and **kwargs are invisible to the LLM schema. Declare every tool parameter explicitly.
adk web agent loading: when launched as adk web src, the CLI adds src/ to sys.path and loads bato/ as the top-level package. Imports inside bato/ MUST use from bato.X import Y, NOT from src.bato.X import Y. The latter works in standalone smoke tests (because cwd is the project root) but fails inside adk web with ModuleNotFoundError: No module named 'src'.
adk web looks for root_agent, not agent: when loading the agent module, ADK Web imports bato.agent.root_agent (or bato.root_agent, or bato/root_agent.yaml). If you defined agent = Agent(...) for internal use, also export root_agent = agent at the bottom of the module so both work.
gemini-flash-lite-latest has text corruption bugs in structured responses (truncated words, missing characters). Use gemini-flash-latest instead. The -latest alias auto-tracks the newest stable version, so you survive Google's quarterly model deprecations without code changes.
Multimodal Part construction: inline_data takes a Blob, not raw bytes. The Blob has mime_type (string) and data (raw bytes, not base64). If you pass a base64 string, the model fails silently, so b64decode it first.
Caption goes BEFORE the media Part: putting the caption Part after the media Part can confuse the model about user intent. Order: text first, then media.
Multimodal API rate limits are stricter than text-only: image, audio, and video requests count against a separate quota. Plan for retries with exponential backoff in production.
gemini-embedding-2-preview is NOT in the genai SDK: it is only accessible via google-cloud-aiplatform (Vertex AI). Setup requires a GCP service account with roles/aiplatform.user. The model is in PUBLIC PREVIEW (since 2026-03-10), so pin a version constant in your wrapper module and monitor the changelog for breaking API changes.
google.genai.types.Content requires a role: passing only parts, without role, raises a confusing TypeError downstream. Always set role="user" for incoming messages.
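
The retry advice for multimodal quotas can be sketched with the standard library alone. This is one possible shape, not an ADK feature: `retry_with_backoff` is a hypothetical helper, and which exception the SDK raises on a quota error is left to the caller through `retry_on`.

```python
import random
import time


def retry_with_backoff(call, *, retries=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` and retry on failure, doubling the delay each attempt.

    `retries` is the number of retries after the first attempt. `sleep` is
    injectable so tests can run without waiting.
    """
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so concurrent callers do not sync up.
            sleep(delay + random.uniform(0, delay / 2))
```

For an async runner the same schedule applies with `asyncio.sleep` in place of `time.sleep`. Narrowing `retry_on` to the quota error type avoids retrying genuine bugs.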

Common Rationalizations

Rationalization	Reality
"InMemorySessionService is fine for the MVP, I'll switch to a DB later"	"Later" usually means after the first production restart that wipes every conversation. If the agent has any kind of session-scoped memory (store_code, user preferences, in-progress task), users will notice the regression immediately. Either commit to ephemeral sessions deliberately or use a persistent service from day one.
"I'll modify session.state directly, ToolContext is just a wrapper"	Direct mutation bypasses ADK's event tracking. State changes do not appear in the runner's event stream, which breaks observability, replay, and any downstream consumer of run events. Always go through `ToolContext.state`; it is the contract.
"state_delta is reliable, I can compute the next state from it"	state_delta is a hint, not a guarantee. ADK may collapse, reorder, or drop deltas depending on the runner mode. Treat session.state as the source of truth and reread it after each tool returns; do not reconstruct state from deltas.
"I'll put `tool_context` in the docstring, the LLM should know about it"	The LLM reads the docstring as the tool contract. If `tool_context` appears there, the model will try to pass a value for it, and the call will fail with a schema mismatch. ADK injects `tool_context` automatically, so it must be invisible to the LLM.
"I'll use `web=True` in get_fast_api_app, it gives me a nice UI for free"	`web=True` mounts static files at `/`, which silently overrides every custom endpoint defined after it (Issue #51). Webhooks stop working. Either use `adk web` separately for development, or build a minimal FastAPI app without `web=True` for production.

Google Adk

Purpose

When to Use

Critical Patterns

Pattern 1: Agent Definition

Pattern 2: Tools as Plain Functions

Pattern 3: Sessions Per Group

Pattern 4: Runner is Stateless and Thread-Safe

Pattern 5: FastAPI Integration

Pattern 6: Multimodal Input (image, audio, video, document)

Pattern 7: Embeddings via Vertex AI

Pattern 8: Proactive Mode (Scheduled Checks)

Gotchas (Verified)

Common Rationalizations

Red Flags

Verification Checklist
