Step 6: Deploy to Cloud Run — Multi-Agent A2A System on Cloud Run

The Deployment Shape

Three packages, three Cloud Run services. Each one:

Has its own URL
Scales independently
Has its own deployment cadence
Can be rolled back without touching the others

Enable the APIs

# Set your project
gcloud config set project YOUR_PROJECT_ID
 
# Authenticate (interactive)
gcloud auth login
gcloud auth application-default login
 
# Enable the APIs
gcloud services enable \
  run.googleapis.com \
  artifactregistry.googleapis.com \
  cloudbuild.googleapis.com \
  aiplatform.googleapis.com

Store the Gemini Key in Secret Manager

Don't bake the API key into the container. Use Secret Manager:

# Enable Secret Manager
gcloud services enable secretmanager.googleapis.com
 
# Create the secret from your local .env
gcloud secrets create gemini-api-key --replication-policy=automatic
echo -n "your-gemini-api-key" | gcloud secrets versions add gemini-api-key --data-file=-
 
# Give Cloud Run's default service account read access
PROJECT_NUM=$(gcloud projects describe YOUR_PROJECT_ID --format='value(projectNumber)')
gcloud secrets add-iam-policy-binding gemini-api-key \
  --member="serviceAccount:${PROJECT_NUM}-compute@developer.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

Deploy the Researcher

The Researcher and Writer are A2A servers — they need a normal gcloud run deploy, not adk deploy cloud_run (that command wraps the ADK web UI, which is the orchestrator's role).

Each agent gets its own tiny Dockerfile:

# Dockerfile.researcher
FROM python:3.12-slim
 
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN pip install uv && uv sync --frozen --no-dev
 
COPY researcher ./researcher
 
# Cloud Run injects PORT at runtime; default to 8080
ENV PORT=8080
CMD uv run uvicorn researcher.agent:a2a_app --host 0.0.0.0 --port ${PORT}

Deploy it:

gcloud run deploy researcher \
  --source . \
  --region=us-central1 \
  --platform=managed \
  --allow-unauthenticated \
  --memory=512Mi \
  --set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
  --command="uv" \
  --args="run,uvicorn,researcher.agent:a2a_app,--host,0.0.0.0,--port,8080"

When it finishes you will see something like:

Service URL: https://researcher-abc123-uc.a.run.app

Save that URL. The orchestrator needs it.

Deploy the Writer

Same pattern, different package name:

gcloud run deploy writer \
  --source . \
  --region=us-central1 \
  --platform=managed \
  --allow-unauthenticated \
  --memory=512Mi \
  --set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
  --command="uv" \
  --args="run,uvicorn,writer.agent:a2a_app,--host,0.0.0.0,--port,8080"

Save the URL it returns.

Verify Each Agent Card Is Public

Before deploying the orchestrator, confirm the specialists are reachable:

curl https://researcher-abc123-uc.a.run.app/.well-known/agent-card.json | jq .name
curl https://writer-abc123-uc.a.run.app/.well-known/agent-card.json | jq .name

You should get "researcher" and "writer" back. If you get a 404, your service is up but A2A is not exposing the well-known path — double-check the uvicorn command in your Dockerfile.

Deploy the Orchestrator

The orchestrator gets the specialists' URLs as env vars:

gcloud run deploy orchestrator \
  --source . \
  --region=us-central1 \
  --platform=managed \
  --allow-unauthenticated \
  --memory=512Mi \
  --set-secrets="GOOGLE_API_KEY=gemini-api-key:latest" \
  --set-env-vars="RESEARCHER_URL=https://researcher-abc123-uc.a.run.app" \
  --set-env-vars="WRITER_URL=https://writer-abc123-uc.a.run.app" \
  --command="uv" \
  --args="run,adk,api_server,--host,0.0.0.0,--port,8080"

Note we use adk api_server here, not uvicorn, because the orchestrator is consumed by users (or your frontend), not by other agents. adk api_server exposes the standard ADK Runner endpoints.

Call Your Live Stack

# Replace with your orchestrator's URL
ORCH=https://orchestrator-xyz789-uc.a.run.app
 
curl -X POST $ORCH/run \
  -H "Content-Type: application/json" \
  -d '{
    "app_name": "orchestrator",
    "user_id": "test_user",
    "session_id": "session_1",
    "message": "Brief me on the state of agentic web browsing in 2026."
  }'

Behind the scenes:

The request hits the orchestrator's Cloud Run service.
The orchestrator calls the researcher's Cloud Run service over HTTPS A2A.
The researcher hits Gemini's google_search and returns findings.
The orchestrator calls the writer's Cloud Run service over HTTPS A2A.
The writer returns the polished brief.
The orchestrator hands it back to you.

Three independent services, one coherent answer.

Lock Down Inter-Agent Traffic (Optional)

The example above uses --allow-unauthenticated so anyone with the URLs can call the specialists directly. For production, you have two clean options:

Cloud Run IAM: drop --allow-unauthenticated, then give the orchestrator's service account roles/run.invoker on the researcher and writer services. RemoteA2aAgent picks up the default credentials and signs the requests automatically.
A2A Signed Agent Cards (v1.0): each specialist signs its card with the deployment's key. The orchestrator verifies before trusting. This is the protocol-level approach and works across cloud providers.

Pick the one that matches your trust boundary. IAM is simpler; signed cards are portable.

Costs

Cloud Run scales to zero, so when nobody is asking for briefs, you pay nothing for the services themselves. The active costs are:

Resource	Notes
Cloud Run vCPU + memory	Free tier covers ~180,000 vCPU-seconds/month total
Gemini API calls	Researcher + Writer each consume tokens. Flash is cheap.
Egress	Inter-service A2A is in-region — minimal

Three sleeping services cost effectively nothing.

Update the Stack

Each agent redeploys independently with the same gcloud run deploy command. Cloud Run does zero-downtime traffic shifting on every revision.

Key Takeaways

One Cloud Run service per agent. They scale, deploy, and roll back independently.
The orchestrator finds the specialists via env vars — no hardcoded URLs in code.
Use Secret Manager for the Gemini key. Never bake it into the image.
For production, lock down inter-agent traffic with Cloud Run IAM or A2A signed cards.

Reference: Cloud Run deploy from source · Secret Manager for Cloud Run · Cloud Run IAM invoker