One-Line Summary: The most reliable way to feed an agent is to expose a small set of narrowly-scoped functions that query gold tables — predictable, cheap, and auditable.

Prerequisites: Module 04.

What's the Concept?

When people first wire an agent to a warehouse, the instinct is to expose "SQL access" — let the model write its own queries. This works in demos and breaks in production: hallucinated table names, runaway scans, accidental joins across PII boundaries, no caching.

The production pattern is the opposite: expose a small number of pre-built tools (function definitions in the agent's tool schema) that each wrap one or two parameterized queries against a gold table. The agent doesn't write SQL; it calls a function. You as the data engineer own the query, the cost ceiling, and the schema of the result.

How It Works

The tool schema for a billing agent might look like:

TOOLS = [
    {
        "name": "get_billing_context",
        "description": (
            "Look up a customer's billing summary: plan, recent orders, "
            "90-day spend, refunds, open issues. Use this whenever you "
            "need facts about a customer's billing history."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "The customer's unique ID (UUID).",
                }
            },
            "required": ["customer_id"],
        },
    },
    {
        "name": "search_orders",
        "description": (
            "Find orders matching specific criteria. Use this for "
            "questions like 'show me my failed payments last month.'"
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string"},
                "status": {
                    "type": "string",
                    "enum": ["succeeded", "failed", "refunded", "any"],
                },
                "since_days": {"type": "integer", "minimum": 1, "maximum": 365},
            },
            "required": ["customer_id"],
        },
    },
]

Behind each tool is a small Python function that runs a parameterized BigQuery query:

from google.cloud import bigquery

# One client per process, constructed at module load; assumes
# application-default credentials.
bigquery_client = bigquery.Client()


def get_billing_context(customer_id: str) -> dict:
    """Return the billing context for a single customer."""
    query = """
        SELECT
          customer_id, email, plan_name, signup_date,
          recent_orders, spend_last_90d_cents, total_refunded_cents,
          open_issue, last_activity_at, _refreshed_at
        FROM `myco.gold.billing_agent_context`
        WHERE customer_id = @customer_id
    """
    job = bigquery_client.query(
        query,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("customer_id", "STRING", customer_id)
            ],
            maximum_bytes_billed=10_000_000,   # 10 MB ceiling
        ),
    )
    rows = list(job.result())
    return {"customer": dict(rows[0]) if rows else None}

Notice four guarantees built into this implementation:

  1. Parameterized. Values are bound as query parameters, never interpolated into the SQL string, so injection is impossible.
  2. Bounded. maximum_bytes_billed caps the query cost at the call site. A misconfigured cluster key or accidental cross join can't blow the budget.
  3. Typed. The tool returns a fixed shape the agent can rely on. Schema changes are deployable, not silent.
  4. Single-purpose. The tool answers one question. The agent doesn't decide what to query — it decides what to ask.
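The same four guarantees should hold for search_orders, which has optional filters. A minimal sketch, assuming a hypothetical `myco.gold.orders_flat` table; the clause-building step is split into a pure function so the optional-filter logic can be tested without a warehouse connection, and the BigQuery import is deferred for the same reason:

```python
def build_order_params(customer_id, status="any", since_days=90):
    """Translate tool inputs into a WHERE clause and (name, type, value) bindings."""
    clauses = [
        "customer_id = @customer_id",
        "created_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL @since_days DAY)",
    ]
    params = [
        ("customer_id", "STRING", customer_id),
        ("since_days", "INT64", since_days),
    ]
    if status != "any":  # "any" means no status filter at all
        clauses.append("status = @status")
        params.append(("status", "STRING", status))
    return " AND ".join(clauses), params


def search_orders(customer_id, status="any", since_days=90):
    from google.cloud import bigquery  # deferred so the pure logic above runs offline

    where, bindings = build_order_params(customer_id, status, since_days)
    # Only our own clause fragments are interpolated here, never user input;
    # all values travel as bound parameters.
    query = f"""
        SELECT order_id, status, amount_cents, created_at
        FROM `myco.gold.orders_flat`
        WHERE {where}
        ORDER BY created_at DESC
        LIMIT 50
    """
    job = bigquery.Client().query(  # in production, reuse one module-level client
        query,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter(*b) for b in bindings],
            maximum_bytes_billed=10_000_000,
        ),
    )
    return {"orders": [dict(row) for row in job.result()]}
```

Splitting out build_order_params keeps the part most likely to grow (filter combinations) unit-testable, while the cost ceiling and LIMIT stay fixed at the call site.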

Why It Matters

  • The retrieval shape is part of your product. The agent's behavior — what it can and can't answer — is a direct function of which tools you ship.
  • Costs become predictable. Per-call cost ceiling × calls-per-conversation = a knowable upper bound. Free-form SQL has no such bound.
  • Auditing becomes possible. Every tool call is a structured event with named parameters. Logs show "agent called get_billing_context(customer_id=abc)" — easy to debug, easy to attribute.
  • Schema migrations don't break agents. When you rename a column in gold.billing_agent_context, you update the tool implementation in one place; the agent's contract is unchanged.
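The auditing point can be made concrete with a thin wrapper that emits one structured event per tool call. The logger name and event fields below are illustrative, not a standard:

```python
import json
import logging
import time

logger = logging.getLogger("agent.tools")


def audited(tool_name, fn):
    """Wrap a tool function so every call emits one structured, attributable log event."""
    def wrapper(**kwargs):
        start = time.monotonic()
        result = fn(**kwargs)
        logger.info(json.dumps({
            "event": "tool_call",
            "tool": tool_name,
            "params": kwargs,  # named parameters, not raw SQL
            "latency_ms": round((time.monotonic() - start) * 1000, 1),
        }))
        return result
    return wrapper


# Usage: wrap each handler once at startup, e.g.
# get_billing_context = audited("get_billing_context", get_billing_context)
```

Because every call goes through a named function with named parameters, the log line is the audit trail; there is no free-form SQL to parse after the fact.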

Key Technical Details

  • Keep the tool inventory small — 3 to 8 tools per agent. Beyond ~15, the model starts confusing them.
  • Tool descriptions are part of the prompt. Write them like you'd write API docs for a junior engineer; they're the only thing the model has to go on.
  • Return JSON, not free-form text. The agent will format for humans; the tool's job is to deliver facts.
  • Always include a _freshness or _refreshed_at field in the response. The model can mention staleness in its answer — useful for trust.
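One way to implement the freshness field: copy the gold table's `_refreshed_at` into the response and precompute an age the model can cite directly. A sketch; the helper name and the `_data_age_minutes` field are assumptions, not part of any API:

```python
from datetime import datetime, timedelta, timezone


def with_freshness(payload: dict, refreshed_at: datetime) -> dict:
    """Attach the gold table's refresh timestamp plus a precomputed age in minutes."""
    age_min = (datetime.now(timezone.utc) - refreshed_at).total_seconds() / 60
    return {
        **payload,
        "_refreshed_at": refreshed_at.isoformat(),
        "_data_age_minutes": round(age_min),  # precomputed so the model never does date math
    }
```

Precomputing the age matters: models are unreliable at timestamp arithmetic, but can easily say "as of 30 minutes ago" when handed the number.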

Common Misconceptions

"Let the agent write SQL." This works for sandboxes and exploratory tools, not production. Production agents call typed functions; free-form SQL belongs to engineers.

"More tools = more capability." Past about 10 tools, the model's tool-selection accuracy degrades. Compose by coverage, not by quantity.

"The tool should return everything in case the agent wants it." No — the tool should return what the agent's prompt described. Extra data wastes context and confuses the model.

Connections to Other Concepts

  • 02-semantic-retrieval-embeddings-and-vector-search.md — When the agent doesn't know what to look up by ID.
  • 04-the-retrieval-contract-between-pipeline-and-agent.md — The full interface design.
  • Course 04-refinement-in-bigquery/02-silver-to-gold-modeling-for-agents.md — The tables these tools query.

Further Reading

  • Anthropic, "Tool use with Claude" — Reference for tool-schema design.
  • OpenAI Function Calling docs — Equivalent on the OpenAI side; same patterns apply.
  • "ReAct: Synergizing Reasoning and Acting in Language Models" (Yao et al., 2022) — The paper that established the loop these tools plug into.