The AI landscape is shifting beneath our feet. We've moved past the "God Model" era, where one massive LLM tries to do everything, and into the age of Multi-Agent Systems. We have specialized agents for coding, reviewing, designing, and testing. It's a beautiful vision of digital collaboration.
But there's a problem.
They don't speak the same language. Your Coding Agent speaks JSON-RPC, your Review Agent expects gRPC, and your Design Agent just wants a REST API. It's the Tower of Babel all over again. Instead of a symphony, we get a cacophony of 400 Bad Request errors.
This is where Google's A2A (Agent-to-Agent) Protocol comes in: a universal translator for the AI age.
In this deep dive, we're not just reading documentation. We're going to build A2A StoryLab, a collaborative storytelling system where three distinct AI agents work together to create, critique, and refine stories. It's practical, it's standardized, and it's how you future-proof your AI architecture.
To demonstrate the power of A2A, we need a team. A single agent is just a script; a team is a system.
Our StoryLab consists of three specialized roles:

- The Orchestrator, which runs the workflow, routes messages between agents, and enforces the quality gate.
- The Creator, which drafts and refines story adaptations.
- The Critic, which scores each draft and explains what to fix.
It starts with a simple user request: "Adapt 'Bear Loses Roar' as a scientist who lost formulas."
The Orchestrator spins up a session and pings the Creator. The Creator drafts a version. The Orchestrator passes that draft to the Critic. The Critic hates it (score: 4/10) and explains why. The Orchestrator passes that feedback back to the Creator.
They iterate. Once the score hits 8/10, the Orchestrator ships the final story.
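From the user's point of view, all of that iteration hides behind a single HTTP call to the Orchestrator. Here's a minimal sketch of that call; the exact request fields are assumptions based on the payloads you'll see later in this article.

```python
# Hypothetical client call to the Orchestrator's /adapt-story endpoint.
# Field names mirror the A2A payload shown later; treat them as assumptions.
import httpx

response = httpx.post(
    "http://localhost:8000/adapt-story",
    json={
        "story_id": "bear_loses_roar",
        "variation": "scientist who lost formulas",
    },
    timeout=300.0,  # the iterative refinement loop can take a while
)
result = response.json()
print(result["score"], result["story"][:200])
```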
In a messy microservices world, finding the right service is half the battle. A2A solves this with Agent Cards. Think of them as a standardized business card that lives at `/.well-known/agent.json`.
When the Orchestrator needs a writer, it doesn't need to know the internal API schema of the Creator. It just checks the card.
```python
# src/creator_agent/main.py
@app.get("/.well-known/agent.json")
async def get_agent_card():
    return {
        "name": "Story Creator Agent",
        "description": "Creates and refines story adaptations",
        "url": "http://localhost:8001",
        "protocolVersion": "a2a/1.0",
        "capabilities": ["remix_story", "refine_story"],
        "skills": [
            {
                "id": "remix_story",
                "name": "Remix Story",
                "description": "Create a story variation from base story",
                "inputModes": ["text", "data"],
                "outputModes": ["text"]
            }
        ]
    }
```
This simple endpoint allows for dynamic discovery. You could swap out the Creator agent for a completely different model or service, and as long as it presents this card, the system keeps humming.
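On the Orchestrator side, discovery can be as simple as fetching the card and checking the advertised capabilities before dispatching work. Here's a minimal sketch; the `discover_agent` helper is our own convenience, not part of the A2A spec.

```python
# Minimal discovery sketch: fetch an Agent Card and verify a capability.
# `discover_agent` is a hypothetical helper, not mandated by A2A.
import httpx

async def discover_agent(base_url: str, required_capability: str) -> dict:
    async with httpx.AsyncClient() as client:
        resp = await client.get(f"{base_url}/.well-known/agent.json")
        resp.raise_for_status()
        card = resp.json()

    if required_capability not in card.get("capabilities", []):
        raise RuntimeError(
            f"{card.get('name', base_url)} does not advertise '{required_capability}'"
        )
    return card

# Usage: the Orchestrator only needs the card, never the Creator's internals.
# card = await discover_agent("http://localhost:8001", "remix_story")
```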
How do they actually talk? A2A enforces a strict Message Envelope. No more guessing whether the data lives in `body`, `payload`, or `data`.
Here is a real message captured from our StoryLab logs. This is the Orchestrator asking the Creator to get to work:
json{ "protocol": "google.a2a.v1", "message_id": "msg_abc123xyz789", "conversation_id": "conv_def456uvw012", "timestamp": "2025-12-07T10:30:45.123456Z", "sender": { "agent_id": "orchestrator-001", "agent_type": "orchestrator", "instance": "http://localhost:8000" }, "recipient": { "agent_id": "creator-agent-001", "agent_type": "creator" }, "message_type": "request", "payload": { "action": "remix_story", "parameters": { "story_id": "bear_loses_roar", "variation": "scientist who lost formulas" } } }
Notice `conversation_id`. This ID persists across the entire back-and-forth between the Orchestrator, Creator, and Critic. In a distributed system, this is your lifeline. It allows you to trace a single user request across dozens of agent interactions.
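In practice, that means every outgoing message gets stamped with the session's `conversation_id`. A small envelope builder, a hypothetical helper rather than anything mandated by the spec, makes this hard to get wrong:

```python
# Hypothetical envelope builder: every message in a session reuses the
# same conversation_id, so a single user request can be traced end to end.
import uuid
from datetime import datetime, timezone

def create_a2a_message(conversation_id: str, sender: dict, recipient: dict,
                       action: str, parameters: dict) -> dict:
    return {
        "protocol": "google.a2a.v1",
        "message_id": f"msg_{uuid.uuid4().hex[:12]}",
        "conversation_id": conversation_id,  # constant across the whole exchange
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "sender": sender,
        "recipient": recipient,
        "message_type": "request",
        "payload": {"action": action, "parameters": parameters},
    }
```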
Talking about protocols is dry; let's look at the implementation. We use Python and FastAPI to build these agents, with Ollama providing local LLM inference for both story generation and evaluation.
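For context, the Ollama side is just HTTP against the local daemon. A minimal generation helper might look like this; the model name is a placeholder, so swap in whatever you've pulled locally.

```python
# Minimal sketch of local LLM inference via Ollama's REST API.
# The model name is a placeholder assumption, not a project requirement.
import httpx

async def ollama_generate(prompt: str, model: str = "llama3") -> str:
    async with httpx.AsyncClient(timeout=120.0) as client:
        resp = await client.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
        )
        resp.raise_for_status()
        return resp.json()["response"]
```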
The Orchestrator is the brain of the operation. It implements an iterative refinement loop. It doesn't just fire and forget; it mediates a conversation.
```python
# src/orchestrator/main.py
@app.post("/adapt-story")
async def adapt_story(request: AdaptStoryRequest):
    # ... setup session ...
    for iteration in range(1, MAX_ITERATIONS + 1):
        # Step 1: Ask Creator to generate (or refine)
        if iteration == 1:
            story_result, msg_id = await _call_creator_remix(
                conversation_id, story_id, variation, session_id
            )
        else:
            story_result, msg_id = await _call_creator_refine(
                conversation_id, session_id, current_version,
                current_story_text, feedback=evaluation
            )
        current_story_text = story_result["story_text"]

        # Step 2: Ask Critic to judge
        eval_result, msg_id = await _call_critic_evaluate(
            conversation_id, session_id, current_story_text,
            original_id, iteration
        )

        # Step 3: The Quality Gate
        if eval_result["approved"] and eval_result["score"] >= APPROVAL_THRESHOLD:
            logger.info(f"✓ Story approved at iteration {iteration}")
            break

    return {"story": current_story_text, "score": eval_result["score"]}
```
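The `_call_creator_*` helpers are thin wrappers: they pack the parameters into an A2A envelope and POST it to the Creator. Here's a sketch, where the `/message` path and the response shape are assumptions about this project's layout rather than anything A2A prescribes:

```python
# Hypothetical sketch of _call_creator_remix: wrap the parameters in an
# A2A envelope and POST it to the Creator agent.
import httpx

CREATOR_URL = "http://localhost:8001"  # matches the URL on the Creator's Agent Card

async def _call_creator_remix(conversation_id, story_id, variation, session_id):
    message = create_a2a_message(  # envelope builder sketched earlier
        conversation_id=conversation_id,
        sender={"agent_id": "orchestrator-001", "agent_type": "orchestrator",
                "instance": "http://localhost:8000"},
        recipient={"agent_id": "creator-agent-001", "agent_type": "creator"},
        action="remix_story",
        parameters={"story_id": story_id, "variation": variation,
                    "session_id": session_id},
    )
    async with httpx.AsyncClient(timeout=300.0) as client:
        resp = await client.post(f"{CREATOR_URL}/message", json=message)  # path is an assumption
        resp.raise_for_status()
        reply = resp.json()
    return reply["payload"], reply["message_id"]
```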
This Generate, Evaluate, Iterate pattern is a fundamental building block of agentic workflows. A2A makes it robust because every step is tracked and standardized.
The Critic agent is interesting because it uses an LLM not to generate, but to analyze. It evaluates the story on four dimensions: Moral Preservation, Structure, Creativity, and Coherence.
```python
# src/critic_agent/main.py
EVALUATION_WEIGHTS = {
    "moral_preservation": 0.30,
    "structure_quality": 0.25,
    "creativity": 0.25,
    "coherence": 0.20
}

async def evaluate_story(message_data: dict):
    # ... unpack A2A message ...

    # LLM-powered evaluation
    eval_result = await ollama_client.evaluate_story(
        story_text=story_text,
        original_story=original_story.text,
        original_moral=original_moral
    )

    # Calculate weighted score
    overall_score = (
        eval_result["moral_preservation"] * EVALUATION_WEIGHTS["moral_preservation"] +
        eval_result["structure_quality"] * EVALUATION_WEIGHTS["structure_quality"] +
        eval_result["creativity"] * EVALUATION_WEIGHTS["creativity"] +
        eval_result["coherence"] * EVALUATION_WEIGHTS["coherence"]
    )
    scaled_score = overall_score * 10.0  # Scale to 0-10
    approved = scaled_score >= APPROVAL_THRESHOLD

    # Return A2A Response
    return create_response_message(..., payload={"score": scaled_score, "approved": approved})
```
By separating the Critic from the Creator, we avoid "hallucination myopia," where a model fails to see its own mistakes. It's pair programming, but for AI.
We are moving towards a world where you will buy a "Research Agent" from one vendor, a "Coding Agent" from another, and a "Security Agent" from a third. Without a standard like A2A, integrating them would be a nightmare of custom adapters.
With A2A, they just… talk.
A2A StoryLab is a proof of concept, but the pattern is production-ready.
The future of AI isn't a bigger model. It's a better team.