Google ADK Tutorial: Skills, Parallel Agents & Vertex AI

Every Google ADK tutorial follows the same arc: install the package, spin up one Agent, run a single-turn conversation. The official quickstart does it. Every Colab notebook does it. Most third-party posts do it. Then they stop — right before the part that actually matters.

If you’ve done the quickstart and are wondering how this holds up in production — how to load skills without blowing your context window, how to run sub-agents in parallel, and how to deploy without rewriting half your code — you’re in the right place. This Google ADK tutorial skips the basics and goes straight to the three patterns that separate toy projects from production systems: runtime skill loading with SkillToolset, parallel execution with ParallelAgent and async tools, and deployment to Vertex AI Agent Engine. By the end, you’ll have working, copy-paste-ready Python code for a complete multi-agent system and a clear mental model for when ADK makes sense over LangGraph or CrewAI.

What Makes Google ADK Different (and Why Most Tutorials Miss the Point)

ADK isn’t trying to be LangChain or CrewAI. It’s a Google-native framework built for Gemini models, bidirectional audio/video via the Gemini Live API, and the Agent-to-Agent (A2A) protocol — the open standard Google published for inter-agent communication.

The GitHub numbers reflect genuine traction: over 17,500 stars, 2,500+ forks, and 2,800 dependent projects as of early 2026. TypeScript support officially launched in April 2026, with Go and Java in preview. This is infrastructure, not a research project.

What most tutorials miss is that ADK’s design is built around composable primitives: Agent, SequentialAgent, ParallelAgent, LoopAgent, and the newer SkillToolset. Understanding how those pieces fit together — not just how to instantiate a single agent — is the actual skill. The rest of this post is about exactly that.

Understanding SkillToolset and Progressive Disclosure

The biggest unsolved problem in agentic systems isn’t model quality — it’s context window management. Every tool description loaded into your agent costs tokens. A monolithic system prompt listing 40 tools upfront might consume 8,000–12,000 tokens before the user types a single character.

SkillToolset, available from google-adk v1.25.0 (currently experimental; stable release is v1.26.0), solves this with three-level progressive disclosure:

  • L1 — List: The agent sees only skill names and ~100-token descriptions. Cost: minimal.
  • L2 — Load: When the agent determines it needs a skill, it calls load_skill. The full skill definition enters context.
  • L3 — Resources: If the skill references external data (specs, schemas, example files), those load on demand via load_skill_resource.

The agent only pays the token cost for what it actually uses. For a system with 20 skills, the difference between L1 browsing and loading everything upfront can be an order of magnitude in token spend per turn.
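To make that concrete, here is a back-of-envelope sketch of the token math. The ~100-token L1 description comes from the list above; the 1,200-token full definition is a hypothetical figure chosen for illustration, not a measured value:

```python
# Back-of-envelope token math for a 20-skill system (illustrative numbers).
SKILLS = 20
L1_DESC_TOKENS = 100      # ~100-token description per skill at L1
FULL_SKILL_TOKENS = 1200  # assumed size of a full definition + tool schemas

# Eager loading: every full skill definition enters context upfront.
upfront_cost = SKILLS * FULL_SKILL_TOKENS

# Progressive disclosure: browse all 20 at L1, load only the 2 skills used.
progressive_cost = SKILLS * L1_DESC_TOKENS + 2 * FULL_SKILL_TOKENS

print(upfront_cost)      # 24000
print(progressive_cost)  # 4400
```

Even with these modest assumed sizes, L1 browsing plus selective loading cuts per-turn token spend by roughly 5x; with heavier tool schemas, the gap widens toward the order of magnitude described above.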

One detail worth knowing upfront: ADK Skills follow the agentskills.io open spec. A skill you build for ADK works as-is in Gemini CLI, Claude Code, Cursor, and 40+ other compatible tools. You’re not locked into Google’s ecosystem.

Building Your First Skill — Folder Structure, SKILL.md, and Runtime Loading

An ADK Skill is a folder with a predictable structure:

skills/
└── market_analyzer/
    ├── SKILL.md          # required — name, description, tools
    ├── tool.py           # the actual implementation
    └── references/       # optional — schemas, docs, examples
        └── api_spec.json

SKILL.md is the contract. It tells the agent what this skill does and what tools it exposes. A minimal example:

# market_analyzer

Analyzes financial market data for a given ticker symbol.

## Tools
- `get_price_history`: Fetch OHLCV data for a symbol over a date range
- `compute_indicators`: Calculate RSI, MACD, and Bollinger Bands

Wiring it into an agent takes just a few lines:

from google.adk import Agent
from google.adk.toolsets import SkillToolset

skill_toolset = SkillToolset(skills_dir="./skills")

root_agent = Agent(
    name="financial_assistant",
    model="gemini-2.0-flash",
    instruction="You are a financial analyst assistant.",
    toolsets=[skill_toolset],
)

At runtime, when the agent receives “Analyze AAPL’s momentum over the last 30 days,” it first calls list_skills (L1), sees market_analyzer, calls load_skill("market_analyzer") (L2) to retrieve the full tool definitions, then calls get_price_history and compute_indicators. The references/ folder is fetched only if the tool logic explicitly requests it via load_skill_resource. Nothing loads until it’s needed.
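For completeness, here is what the tool.py behind that flow might look like. This is a minimal sketch: the function names match the SKILL.md above, but the data source is stubbed and the indicator math is simplified to a moving average rather than real RSI/MACD/Bollinger calculations:

```python
# skills/market_analyzer/tool.py — sketch of the two tools SKILL.md advertises.
# The market-data source is stubbed; a real implementation would call an API.

def get_price_history(symbol: str, start: str, end: str) -> list[dict]:
    """Fetch OHLCV rows for `symbol` between `start` and `end` (stubbed)."""
    # Placeholder row; swap in a real market-data API call here.
    return [{"date": start, "open": 100.0, "high": 102.0,
             "low": 99.0, "close": 101.0, "volume": 1_000_000}]

def compute_indicators(closes: list[float], period: int = 14) -> dict:
    """Simple moving average as a stand-in for RSI/MACD/Bollinger math."""
    window = closes[-period:]
    return {"sma": sum(window) / len(window)}
```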

The Agent-as-a-Tool Pattern vs. Native Sub-Agent Delegation

ADK gives you two ways to compose agents: native sub-agent delegation (built into SequentialAgent and ParallelAgent) and agent-as-a-tool (wrapping an agent as a callable function inside another agent). They’re not interchangeable.

Native delegation is right when you have a defined workflow — a pipeline where output from step A feeds step B. The parent agent doesn’t decide to invoke the sub-agent; the orchestrator routes automatically.

from google.adk.agents import SequentialAgent

pipeline = SequentialAgent(
    name="report_pipeline",
    sub_agents=[research_agent, writer_agent, reviewer_agent],
)

Agent-as-a-tool is right when a generalist agent needs to selectively invoke a specialist based on the conversation. The parent treats the specialist like any other tool — calling it with arguments, getting a string back.

from google.adk.tools import agent_tool

@agent_tool
async def research(query: str) -> str:
    """Research a topic thoroughly and return a structured summary."""
    result = await research_agent.run_async(query)
    return result.text

root_agent = Agent(
    name="assistant",
    model="gemini-2.0-flash",
    instruction="Answer questions, delegating complex research as needed.",
    tools=[research, web_search, calculator],
)

The trade-off is real: agent-as-a-tool is more flexible but harder to trace and test in isolation. Native sub-agents give you clean execution graphs that appear in Cloud Trace when you deploy to Vertex AI — which matters a lot when you’re debugging a production incident at 2am.

Adding Parallel Execution with ParallelAgent and Async Tools

Latency is a first-class concern in multi-agent systems. ADK addresses it through two independent mechanisms.

ParallelAgent

ParallelAgent runs its sub-agents concurrently. Per ADK’s official documentation, three sub-agents each taking 2 seconds complete in ~2 seconds total instead of 6 — the same logic that makes any parallel I/O faster than sequential I/O.

from google.adk.agents import ParallelAgent, SequentialAgent

# These three run simultaneously
parallel_research = ParallelAgent(
    name="parallel_research",
    sub_agents=[web_agent, docs_agent, news_agent],
)

# Synthesis runs after all three complete
pipeline = SequentialAgent(
    name="full_pipeline",
    sub_agents=[parallel_research, synthesis_agent],
)

Sub-agents in a ParallelAgent share session.state. This is the most common source of bugs. The fix is simple: give each sub-agent a distinct state key to write to.

# In web_agent's instruction:
# "Write your findings to session.state['web_results']"

# In docs_agent's instruction:
# "Write your findings to session.state['docs_results']"

If two agents write to the same key concurrently, the last write wins — silently. Naming your keys per-agent is not optional.
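You can see the failure mode in a toy asyncio sketch, with a plain dict standing in for session.state (this illustrates the race itself, not ADK's actual session implementation):

```python
import asyncio

async def agent_write(state: dict, key: str, value: str) -> None:
    """Simulates a sub-agent doing work, then writing its result to state."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    state[key] = value

async def main() -> dict:
    state: dict = {}
    # Shared key: one result silently overwrites the other.
    await asyncio.gather(
        agent_write(state, "results", "web findings"),
        agent_write(state, "results", "docs findings"),
    )
    # Distinct keys: both results survive.
    await asyncio.gather(
        agent_write(state, "web_results", "web findings"),
        agent_write(state, "docs_results", "docs findings"),
    )
    return state

state = asyncio.run(main())
print(state)  # "results" holds only one of the two findings
```

No exception, no warning: the shared key simply ends up holding whichever write landed last, which is exactly why per-agent keys are mandatory.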

Parallel tool execution (ADK v1.10.0+)

You don’t always need ParallelAgent. If a single agent calls multiple tools in one turn, ADK can run those tools concurrently as long as they’re defined as async def:

import aiohttp
from google.adk.tools import tool

@tool
async def fetch_price(symbol: str) -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.get(f"https://api.example.com/price/{symbol}") as resp:
            return await resp.json()

@tool
async def fetch_sentiment(symbol: str) -> dict:
    async with aiohttp.ClientSession() as session:
        async with session.get(f"https://api.example.com/sentiment/{symbol}") as resp:
            return await resp.json()

When the model decides to call both tools in the same response, ADK schedules them on the event loop concurrently. No extra configuration needed — the async def signature is the signal.
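The underlying behavior is ordinary asyncio scheduling, which you can demonstrate without ADK at all: two awaitables that each sleep 0.2 seconds finish in roughly 0.2 seconds total, not 0.4. Here `fake_fetch` stands in for the aiohttp calls above:

```python
import asyncio
import time

async def fake_fetch(name: str, delay: float) -> str:
    """Stands in for an aiohttp request that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return name

async def main() -> float:
    start = time.perf_counter()
    # Both "tool calls" run concurrently on the event loop.
    results = await asyncio.gather(
        fake_fetch("price", 0.2),
        fake_fetch("sentiment", 0.2),
    )
    elapsed = time.perf_counter() - start
    print(results, round(elapsed, 1))  # total is ~0.2s, not 0.4s
    return elapsed

elapsed = asyncio.run(main())
```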

Wiring It All Together — A Complete Multi-Agent System in Under 150 Lines

Here’s a condensed but working system that combines SkillToolset, ParallelAgent, and agent-as-a-tool:

from google.adk import Agent
from google.adk.agents import ParallelAgent, SequentialAgent
from google.adk.toolsets import SkillToolset
from google.adk.tools import agent_tool, tool

# --- Skills (loaded progressively) ---
skill_toolset = SkillToolset(skills_dir="./skills")

# --- Async parallel tools ---
@tool
async def search_web(query: str) -> str:
    """Search the web for recent information."""
    # your search implementation
    return f"Web results for: {query}"

@tool
async def search_knowledge_base(query: str) -> str:
    """Search internal knowledge base."""
    # your KB implementation
    return f"KB results for: {query}"

# --- Specialist agents ---
web_research_agent = Agent(
    name="web_researcher",
    model="gemini-2.0-flash",
    instruction=(
        "You research topics using web search. "
        "Store your findings in session.state['web_findings']."
    ),
    tools=[search_web],
)

kb_research_agent = Agent(
    name="kb_researcher",
    model="gemini-2.0-flash",
    instruction=(
        "You research topics using the internal knowledge base. "
        "Store your findings in session.state['kb_findings']."
    ),
    tools=[search_knowledge_base],
)

synthesis_agent = Agent(
    name="synthesizer",
    model="gemini-2.0-flash",
    instruction=(
        "Synthesize research findings from session.state['web_findings'] "
        "and session.state['kb_findings'] into a coherent answer."
    ),
)

# --- Orchestration ---
parallel_research = ParallelAgent(
    name="parallel_research",
    sub_agents=[web_research_agent, kb_research_agent],
)

research_pipeline = SequentialAgent(
    name="research_pipeline",
    sub_agents=[parallel_research, synthesis_agent],
)

# --- Root agent: skills + on-demand deep research ---
@agent_tool
async def deep_research(query: str) -> str:
    """Conduct deep research using parallel web and KB search."""
    result = await research_pipeline.run_async(query)
    return result.text

root_agent = Agent(
    name="root",
    model="gemini-2.0-flash",
    instruction=(
        "You are a professional research assistant. "
        "For quick questions, answer using your skills. "
        "For thorough research requests, use deep_research."
    ),
    toolsets=[skill_toolset],
    tools=[deep_research],
)

This is the full stack: SkillToolset for context-efficient tool loading, ParallelAgent for latency reduction, distinct session.state keys to avoid write conflicts, and agent-as-a-tool for flexible specialist delegation — all composable.

Deploying to Vertex AI Agent Engine (Zero Code Changes Required)

The deployment story is one of ADK’s genuine strengths. Vertex AI Agent Engine wraps your root_agent with managed infrastructure — authentication, Cloud Trace integration, stateful sessions via Memory Bank, and automatic scaling — without a single change to your agent definition.

import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(project="your-gcp-project", location="us-central1")

# Wrap your existing agent — zero modifications
app = reasoning_engines.AdkApp(agent=root_agent)

# Deploy
remote_app = reasoning_engines.ReasoningEngine.create(
    app,
    requirements=[
        "google-adk>=1.26.0",
        "aiohttp>=3.9.0",
    ]
)

print(remote_app.resource_name)
# → projects/123/locations/us-central1/reasoningEngines/456

What you inherit for free once deployed:

  • Authentication: OAuth 2.0 and service account flows handled by Vertex AI — nothing to implement
  • Distributed tracing: Every tool call appears in Cloud Trace with latency breakdowns per agent and per tool
  • Stateful sessions: Memory Bank persists session state across turns without you managing a database
  • Autoscaling: Cold starts are Google’s problem, not yours

The one verification step after deployment: call remote_app.query(input="test") and confirm a trace appears in Cloud Console. If the trace shows up, the managed session and telemetry stack is working. If it doesn’t, check that AdkApp received the correct agent reference — the most common deployment mistake is wrapping the pipeline sub-agent instead of root_agent.

You can develop locally with adk web (which spins up a test UI against the same root_agent) and deploy the identical object. Local and production behavior are the same — no adapter layer, no environment-specific config.

When to Choose ADK Over LangGraph or CrewAI

Honest answer: it depends on your stack, not on which framework has more GitHub stars.

Choose ADK when:
– You’re on Google Cloud or planning to deploy to Vertex AI — the managed infrastructure story is the strongest argument for ADK
– You need bidirectional audio/video — Gemini Live API support is native, not bolted on as an afterthought
– You’re building in the A2A protocol ecosystem — ADK agents are first-class A2A citizens
– You want Skills that are portable across Gemini CLI, Claude Code, Cursor, and other agentskills.io-compatible tools

Choose LangGraph when:
– Your workflow involves genuinely complex conditional graphs — LangGraph’s state machine model is more expressive for branching, checkpointing, and human-in-the-loop pauses
– You need a proven production track record: LangGraph is deployed at 400+ companies, including Uber, LinkedIn, and JPMorgan, making it the current benchmark for production readiness

Choose CrewAI when:
– You want the fastest path from idea to working prototype — CrewAI’s opinionated scaffolding reduces setup friction significantly
– Your team is new to multi-agent systems and values convention over flexibility

The broader context: multi-agent architectures already account for 66.4% of the AI agents market, and Gartner projects 40% of enterprise applications will include task-specific AI agents by end of 2026 — up from under 5% in 2025. Every major framework is competing for that adoption curve. ADK’s bet is that the winning pattern is Google Cloud-native, open-protocol, and streaming-first. For teams already in that orbit, it’s the right bet.

The Patterns That Actually Matter

Every Google ADK tutorial leaves you at the same cliff: a working single-agent demo with no path forward. Four patterns bridge that gap: SkillToolset for context-efficient tool management, ParallelAgent for latency reduction, async tools for concurrency inside a single agent, and AdkApp for zero-code deployment.

The complete multi-agent system above is deployable today. Start with it, swap in your real tools and skill definitions, and deploy with the four-line ReasoningEngine.create call. The framework is solid — the tutorials are just catching up.

Building something with ADK? Drop a comment with how your SkillToolset folder structure turned out — that’s consistently where people hit their first real wall, and specific examples help everyone else avoid the same detours.
