The Deployment Shape
Three packages, three Cloud Run services. Each one:
- Has its own URL
- Scales independently
- Has its own deployment cadence
- Can be rolled back without touching the others
Enable the APIs
# Set your project
gcloud config set project YOUR_PROJECT_ID
# Authenticate (interactive)
gcloud auth login
gcloud auth application-default login
# Enable the APIs
gcloud services enable \
run.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com \
aiplatform.googleapis.comStore the Gemini Key in Secret Manager
Don't bake the API key into the container. Use Secret Manager:
# Enable Secret Manager
gcloud services enable secretmanager.googleapis.com
# Create the secret from your local .env
gcloud secrets create gemini-api-key --replication-policy=automatic
echo -n "your-gemini-api-key" | gcloud secrets versions add gemini-api-key --data-file=-
# Give Cloud Run's default service account read access
PROJECT_NUM=$(gcloud projects describe YOUR_PROJECT_ID --format='value(projectNumber)')
gcloud secrets add-iam-policy-binding gemini-api-key \
--member="serviceAccount:${PROJECT_NUM}-compute@developer.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"Deploy the Researcher
The Researcher and Writer are A2A servers — they need a normal gcloud run deploy, not adk deploy cloud_run (that command wraps the ADK web UI, which is the orchestrator's role).
Each agent gets its own tiny Dockerfile:
# Dockerfile.researcher
FROM python:3.12-slim
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN pip install uv && uv sync --frozen --no-dev
COPY researcher ./researcher
# Cloud Run injects PORT at runtime; default to 8080
ENV PORT=8080
CMD uv run uvicorn researcher.agent:a2a_app --host 0.0.0.0 --port ${PORT}Deploy it:
gcloud run deploy researcher \
--source . \
--region=us-central1 \
--platform=managed \
--allow-unauthenticated \
--memory=512Mi \
--set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
--command="uv" \
--args="run,uvicorn,researcher.agent:a2a_app,--host,0.0.0.0,--port,8080"When it finishes you will see something like:
Service URL: https://researcher-abc123-uc.a.run.appSave that URL. The orchestrator needs it.
Deploy the Writer
Same pattern, different package name:
gcloud run deploy writer \
--source . \
--region=us-central1 \
--platform=managed \
--allow-unauthenticated \
--memory=512Mi \
--set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
--command="uv" \
--args="run,uvicorn,writer.agent:a2a_app,--host,0.0.0.0,--port,8080"Save the URL it returns.
Verify Each Agent Card Is Public
Before deploying the orchestrator, confirm the specialists are reachable:
curl https://researcher-abc123-uc.a.run.app/.well-known/agent-card.json | jq .name
curl https://writer-abc123-uc.a.run.app/.well-known/agent-card.json | jq .nameYou should get "researcher" and "writer" back. If you get a 404, your service is up but A2A is not exposing the well-known path — double-check the uvicorn command in your Dockerfile.
Deploy the Orchestrator
The orchestrator gets the specialists' URLs as env vars:
gcloud run deploy orchestrator \
--source . \
--region=us-central1 \
--platform=managed \
--allow-unauthenticated \
--memory=512Mi \
--set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
--set-env-vars="RESEARCHER_URL=https://researcher-abc123-uc.a.run.app" \
--set-env-vars="WRITER_URL=https://writer-abc123-uc.a.run.app" \
--command="uv" \
--args="run,adk,api_server,--host,0.0.0.0,--port,8080"Note we use adk api_server here, not uvicorn, because the orchestrator is consumed by users (or your frontend), not by other agents. adk api_server exposes the standard ADK Runner endpoints.
Call Your Live Stack
# Replace with your orchestrator's URL
ORCH=https://orchestrator-xyz789-uc.a.run.app
curl -X POST $ORCH/run \
-H "Content-Type: application/json" \
-d '{
"app_name": "orchestrator",
"user_id": "test_user",
"session_id": "session_1",
"message": "Brief me on the state of agentic web browsing in 2026."
}'Behind the scenes:
- The request hits the orchestrator's Cloud Run service.
- The orchestrator calls the researcher's Cloud Run service over HTTPS A2A.
- The researcher hits Gemini's
google_searchand returns findings. - The orchestrator calls the writer's Cloud Run service over HTTPS A2A.
- The writer returns the polished brief.
- The orchestrator hands it back to you.
Three independent services, one coherent answer.
Lock Down Inter-Agent Traffic (Optional)
The example above uses --allow-unauthenticated so anyone with the URLs can call the specialists directly. For production, you have two clean options:
- Cloud Run IAM: drop
--allow-unauthenticated, then give the orchestrator's service accountroles/run.invokeron the researcher and writer services.RemoteA2aAgentpicks up the default credentials and signs the requests automatically. - A2A Signed Agent Cards (v1.0): each specialist signs its card with the deployment's key. The orchestrator verifies before trusting. This is the protocol-level approach and works across cloud providers.
Pick the one that matches your trust boundary. IAM is simpler; signed cards are portable.
Costs
Cloud Run scales to zero, so when nobody is asking for briefs, you pay nothing for the services themselves. The active costs are:
| Resource | Notes |
|---|---|
| Cloud Run vCPU + memory | Free tier covers ~180,000 vCPU-seconds/month total |
| Gemini API calls | Researcher + Writer each consume tokens. Flash is cheap. |
| Egress | Inter-service A2A is in-region — minimal |
Three sleeping services cost effectively nothing.
Update the Stack
Each agent redeploys independently with the same gcloud run deploy command. Cloud Run does zero-downtime traffic shifting on every revision.
Key Takeaways
- One Cloud Run service per agent. They scale, deploy, and roll back independently.
- The orchestrator finds the specialists via env vars — no hardcoded URLs in code.
- Use Secret Manager for the Gemini key. Never bake it into the image.
- For production, lock down inter-agent traffic with Cloud Run IAM or A2A signed cards.
Reference: Cloud Run deploy from source · Secret Manager for Cloud Run · Cloud Run IAM invoker